# Release 0.3.26 (#1373)
# Description

Prefix the PR title with the Jira issue number in the form
`[CDF-12345]`.

Please describe the change you have made.

## Checklist

- [ ] Tests added/updated.
- [ ] Run Demo Job Locally.
- [ ] Documentation updated.
- [ ] Changelogs updated in
[CHANGELOG.cdf-tk.md](https://github.com/cognitedata/toolkit/blob/main/CHANGELOG.cdf-tk.md).
- [ ] Template changelogs updated in
[CHANGELOG.templates.md](https://github.com/cognitedata/toolkit/blob/main/CHANGELOG.templates.md).


[CDF-12345]:
https://cognitedata.atlassian.net/browse/CDF-12345?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
doctrino authored Jan 16, 2025
2 parents 6ae46ee + 8a66e3e commit 84ca660
Showing 80 changed files with 945 additions and 284 deletions.
5 changes: 2 additions & 3 deletions .github/pull_request_template.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,7 @@
# Description

Prefix the PR title with the Jira issue number in the form `[CDF-12345]`.

Please describe the change you have made.

## Checklist
@@ -9,6 +11,3 @@ Please describe the change you have made.
- [ ] Documentation updated.
- [ ] Changelogs updated in [CHANGELOG.cdf-tk.md](https://github.com/cognitedata/toolkit/blob/main/CHANGELOG.cdf-tk.md).
- [ ] Template changelogs updated in [CHANGELOG.templates.md](https://github.com/cognitedata/toolkit/blob/main/CHANGELOG.templates.md).
- [ ] Version bumped.
[_version.py](https://github.com/cognitedata/toolkit/blob/main/cognite/cognite_toolkit/_version.py) and
[pyproject.toml](https://github.com/cognitedata/toolkit/blob/main/pyproject.toml) per [semantic versioning](https://semver.org/).
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -12,7 +12,7 @@ repos:
- --fixable=E,W,F,I,T,RUF,TID,UP
- --target-version=py39
- id: ruff-format
rev: v0.8.6
rev: v0.9.1

- repo: https://github.com/igorshubovych/markdownlint-cli
rev: v0.43.0
29 changes: 29 additions & 0 deletions CHANGELOG.cdf-tk.md
@@ -15,6 +15,35 @@ Changes are grouped as follows:
- `Fixed` for any bug fixes.
- `Security` in case of vulnerabilities.

## [0.3.26] - 2025-01-16

### Added

- [alpha feature] `cdf import transformation-cli` now has a new flag `--clean` to remove the
source files after importing.

### Fixed

- All groups are now correctly deployed before resources that have authentication tied to them (`Transformation`,
  `FunctionSchedule`, `WorkflowTrigger`).

### Changed

- Running `cdf auth init/verify` no longer automatically activates Cognite Functions on private-link projects.

### Improved

- You now get a warning if you use the `$FILENAME` template incorrectly in the `CogniteFile`/`FileMetadata` resource.
- If a `{{ variable }}` replacement causes a `YAMLFormatError`, the Toolkit now gives you a hint on how to fix it.
- If you use a `dataSetId`, the Toolkit now gives you a hint to use `dataSetExternalId` instead.
- The Toolkit now falls back to reading a file as `utf-8` if the initial read fails.
- The Toolkit no longer gives `UnusedParameterWarning` for a `WorkflowVersion` using a `subworkflow` task.
- If a `Transformation`/`FunctionSchedule`/`WorkflowTrigger` fails to deploy due to missing environment variables,
  the Toolkit now gives a hint on how to fix it.
- [alpha feature] If you get a duplicated item due to using the `repeated-module` feature, the Toolkit now gives
  you a hint on how to fix it.

## [0.3.25] - 2025-01-10

### Added
4 changes: 4 additions & 0 deletions CHANGELOG.templates.md
@@ -15,6 +15,10 @@ Changes are grouped as follows:
- `Fixed` for any bug fixes.
- `Security` in case of vulnerabilities.

## [0.3.26] - 2025-01-16

No changes to templates.

## [0.3.25] - 2025-01-10

No changes to templates.
92 changes: 52 additions & 40 deletions CONTRIBUTING.md
@@ -2,22 +2,65 @@

## How to contribute

We are always looking for ways to improve the templates and the workflow. You can
[file bugs](https://github.com/cognitedata/toolkit/issues/new/choose) in the repo.
We are always looking for ways to improve the Cognite Toolkit CLI. You can
report bugs and ask questions in [our Cognite Hub group](https://hub.cognite.com/groups/cognite-data-fusion-toolkit-277).

We are also looking for contributions to new modules, especially example modules can be very
useful for others. Please open a PR with your suggested changes or propose a functionality
by creating an issue.
We are also looking for contributions to new modules (content) and to the Toolkit codebase that make configuring
Cognite Data Fusion easier, faster, and more reliable.

## Module ownership
## Improving the codebase

If you want to contribute to the codebase, you can do so by creating a new branch and
[opening a pull request](https://github.com/cognitedata/toolkit/compare). Prefix the PR title with the Jira issue
number in the form `[CDF-12345]`. A good PR should include a clear description of the change to help the reviewer
understand its nature and context.

### Linting and testing

The Cognite Toolkit CLI and modules have an extensive linting and test suite to ensure quality and speed of development.

See [pyproject.toml](pyproject.toml) for the linting and testing configuration.

See [tests](tests/README.md) for more information on how to run and maintain tests.

The `cdf_` prefixed modules are tested as part of the product development.

### Setting up the local environment

Your local environment needs a working Python installation and a virtual environment. We use `poetry` to manage
the environment and its dependencies.

Install pre-commit hooks by running `poetry run pre-commit install` in the root of the repository.

When developing in VS Code, the `cdf-tk-dev.py` script is useful for running the Toolkit. It sets the
environment and paths correctly (to avoid conflicts with an installed `cdf` package) and sets the
`SENTRY_ENABLED` environment variable to `false` to avoid sending errors to Sentry.
In `.vscode/launch.json` you will find several example debugging configurations.

### Essential code

- Main app entry point: [cognite_toolkit/_cdf.py](cognite_toolkit/_cdf.py)
- App subcommands: [cognite_toolkit/_cdf_tk/commands](cognite_toolkit/_cdf_tk/commands)
- Resource loaders: [cognite_toolkit/_cdf_tk/loaders](cognite_toolkit/_cdf_tk/loaders)
- Tests: [tests](tests)
- CI/CD: [.github/workflows](.github/workflows)

### Sentry

When you develop the Cognite Toolkit, you should avoid sending errors to Sentry. You can disable Sentry by setting
the environment variable `SENTRY_ENABLED=false`. This is set automatically when you use `cdf-tk-dev.py`.

## Contributing in modules

### Module ownership

The official `cdf_*` modules are owned by the respective teams in Cognite. Any changes to these
will be reviewed by the teams to ensure that nothing breaks. If you open a PR on these modules,
the PR will be reviewed by the team owning the module.

`cdf_infield_location` is an example of a team-owned module.

## Adding a new module
### Adding a new module

Adding a new module consists of the following steps:

@@ -70,37 +113,6 @@ Of course, where data population of e.g. data model is part of the configuration
The scripts are continuously under development to simplify management of configurations, and
we are pushing the functionality into the Python SDK when that makes sense.

## Testing

The `cdf_` prefixed modules should be tested as part of the product development. Our internal
test framework for scenario based testing can be found in the Cognite private big-smoke repository.

The `cdf-tk deploy` script command will clean configurations if you specify `--drop`, so you can
try to apply the configuration multiple times without having to clean up manually. If you want to delete
everything that is governed by your templates, including data ingested into data models, the `cdf-tk clean`
script command can be used to clean up configurations using the `scripts/delete.py` functions.

See [tests](tests/README.md) for more information on how to run tests.

## Setting up Environment

In order to develop `cdf-tk` you need to set up a development environment. You need a working python
installation and a virtual environment. We recommend using `poetry` to set up the environment as this is
the package tool that the toolkit repo uses also to create the installable python package.

When developing, you should use `cdf-tk-dev.py` to run the toolkit. This script will set the environment and paths
correctly (to avoid running the installed cdf-tk package) and also set the `SENTRY_ENABLED` environment
variable to `false` to avoid sending errors to Sentry.
In .vscode/launch.json you will see a number of examples of debugging configurations that you can use to debug.
If you use VSCode or another IDE supporting devcontainers, the easiest way to set up the environment is to
run in the Dev Container as configured in .devcontainer. It creates a virtual python environment in .venv/ that
will automatically be picked up by VSCode or poetry also if you want to run outside the devcontainer.

### Sentry

When you develop `cdf-tk` you should avoid sending errors to `sentry`. You can control `sentry` by setting
the `environment` variable `SENTRY_ENABLED=false`. This is set automatically when you use the `cdf-tk-dev.py`.

## Releasing

The templates are bundled with the `cdf-tk` tool, so they are released together.
@@ -132,12 +144,12 @@ To release a new version of the `cdf-tk` tool and the templates, you need to do
- deactivate
- run script again

1. Get approval to squash merge the branch into `main`:
1. Get approval to **squash merge** the branch into `main`:
1. Verify that all Github actions pass.
1. Create a release branch: `release-x.y.z` from `main`:
1. Create a new tag on the branch with the version number, e.g. `v0.1.0b3`.
2. Open a PR with the existing `release` branch as base comparing to your new `release-x.y.z` branch.
3. Get approval and merge (do not squash).
3. Get approval and merge (**do not squash**).
4. Verify that the Github action `release` passes and pushes to PyPi.
1. Create a new release on github.com with the tag and release notes:
1. Find the tag you created and create the new release.
2 changes: 1 addition & 1 deletion cdf.toml
@@ -29,4 +29,4 @@ dump = true
[modules]
# This is the version of the modules. It should not be changed manually.
# It will be updated by the 'cdf module upgrade' command.
version = "0.3.25"
version = "0.3.26"
2 changes: 1 addition & 1 deletion cognite_toolkit/_builtin_modules/cdf.toml
@@ -4,7 +4,7 @@ default_env = "<DEFAULT_ENV_PLACEHOLDER>"
[modules]
# This is the version of the modules. It should not be changed manually.
# It will be updated by the 'cdf module upgrade' command.
version = "0.3.25"
version = "0.3.26"


[plugins]
3 changes: 2 additions & 1 deletion cognite_toolkit/_cdf_tk/builders/_base.py
@@ -31,6 +31,7 @@
)
from cognite_toolkit._cdf_tk.utils import (
humanize_collection,
safe_read,
)


@@ -141,7 +142,7 @@ def get_loader(
# If there is a tableName field, it is a table, otherwise it is a database.
if any(
line.strip().startswith("tableName:") or line.strip().startswith("- tableName:")
for line in source_path.read_text().splitlines()
for line in safe_read(source_path).splitlines()
):
return RawTableLoader, None
else:
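The `safe_read` call swapped in above is what implements the changelog's new utf-8 fallback when reading files. The helper's implementation is not part of this diff; the following is a minimal sketch of what such a function might look like, where the exact fallback behavior is an assumption:

```python
from pathlib import Path


def safe_read(path: Path, encoding: str = "utf-8") -> str:
    """Read a text file, falling back to a lossy utf-8 decode on errors.

    Hypothetical sketch of the `safe_read` helper imported above; the real
    Toolkit implementation may differ.
    """
    try:
        return path.read_text(encoding=encoding)
    except UnicodeDecodeError:
        # Re-read as raw bytes and replace undecodable sequences instead of failing.
        return path.read_bytes().decode("utf-8", errors="replace")
```

With this behavior, a file containing bytes that are invalid utf-8 still yields a string (with replacement characters) rather than raising, which matches the changelog's description.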
2 changes: 1 addition & 1 deletion cognite_toolkit/_cdf_tk/builders/_datamodels.py
@@ -71,7 +71,7 @@ def _copy_graphql_to_build(
if "dml" in entry:
expected_filename = entry["dml"]
else:
expected_filename = f'{INDEX_PATTERN.sub("", source_file.source.path.stem.removesuffix(GraphQLLoader.kind).removesuffix("."))}.graphql'
expected_filename = f"{INDEX_PATTERN.sub('', source_file.source.path.stem.removesuffix(GraphQLLoader.kind).removesuffix('.'))}.graphql"
expected_path = source_file.source.path.parent / Path(expected_filename)

if expected_path in graphql_files:
12 changes: 11 additions & 1 deletion cognite_toolkit/_cdf_tk/builders/_file.py
@@ -10,7 +10,7 @@
)
from cognite_toolkit._cdf_tk.exceptions import ToolkitYAMLFormatError
from cognite_toolkit._cdf_tk.loaders import CogniteFileLoader, FileLoader, FileMetadataLoader
from cognite_toolkit._cdf_tk.tk_warnings import ToolkitWarning
from cognite_toolkit._cdf_tk.tk_warnings import LowSeverityWarning, ToolkitWarning


class FileBuilder(Builder):
@@ -55,6 +55,16 @@ def _expand_file_metadata(
and cls.template_pattern in raw_list[0].get("externalId", "")
)
if not is_file_template:
if (isinstance(raw_list, dict) and cls.template_pattern in raw_list.get("externalId", "")) or (
isinstance(raw_list, list)
and any(cls.template_pattern in entry.get("externalId", "") for entry in raw_list)
):
raw_type = "dictionary" if isinstance(raw_list, dict) else "list with multiple entries"
LowSeverityWarning(
f"Invalid file template {cls.template_pattern!r} usage detected in {module.relative_path.as_posix()!r}.\n"
f"The file template is expected in a list with a single entry, but got {raw_type}."
).print_warning()

return raw_list
if not (isinstance(raw_list, list) and raw_list and isinstance(raw_list[0], dict)):
raise ToolkitYAMLFormatError(
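The new branch above emits a `LowSeverityWarning` when the file-template external ID appears in the wrong YAML shape (a dictionary, or a list with multiple entries, instead of a single-entry list). The check can be reduced to a small standalone function; `TEMPLATE_PATTERN` and the return-a-message interface here are illustrative stand-ins, not the Toolkit's actual API:

```python
from __future__ import annotations

# Stand-in for the builder's template marker; the Toolkit's actual constant may differ.
TEMPLATE_PATTERN = "$FILENAME"


def check_template_shape(raw: dict | list) -> str | None:
    """Return a warning message when the file template is used with the wrong shape."""
    is_valid = (
        isinstance(raw, list)
        and len(raw) == 1
        and isinstance(raw[0], dict)
        and TEMPLATE_PATTERN in raw[0].get("externalId", "")
    )
    if is_valid:
        return None
    misused = (
        isinstance(raw, dict) and TEMPLATE_PATTERN in raw.get("externalId", "")
    ) or (
        isinstance(raw, list)
        and any(TEMPLATE_PATTERN in entry.get("externalId", "") for entry in raw)
    )
    if misused:
        raw_type = "dictionary" if isinstance(raw, dict) else "list with multiple entries"
        return f"file template expected in a list with a single entry, but got {raw_type}"
    return None  # No template usage at all: nothing to warn about.
```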
3 changes: 2 additions & 1 deletion cognite_toolkit/_cdf_tk/builders/_transformation.py
@@ -12,6 +12,7 @@
from cognite_toolkit._cdf_tk.exceptions import ToolkitYAMLFormatError
from cognite_toolkit._cdf_tk.loaders import TransformationLoader
from cognite_toolkit._cdf_tk.tk_warnings import ToolkitWarning
from cognite_toolkit._cdf_tk.utils import safe_write


class TransformationBuilder(Builder):
@@ -85,7 +86,7 @@ def _add_query(
)
elif query_file is not None:
destination_path = self._create_destination_path(query_file.source.path, "Query")
destination_path.write_text(query_file.content)
safe_write(destination_path, query_file.content)
relative = destination_path.relative_to(transformation_destination_path.parent)
entry["queryFile"] = relative.as_posix()
extra_sources.append(query_file.source)
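`safe_write` mirrors `safe_read` on the output side. Its implementation is also not part of this diff; a plausible sketch, assuming it simply forces a utf-8 encoding with lossy replacement of unencodable characters:

```python
from pathlib import Path


def safe_write(path: Path, content: str, encoding: str = "utf-8") -> None:
    """Write text with an explicit encoding, replacing unencodable characters.

    Hypothetical sketch of the `safe_write` helper used above; the real
    Toolkit implementation may differ.
    """
    path.write_text(content, encoding=encoding, errors="replace")
```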
3 changes: 1 addition & 2 deletions cognite_toolkit/_cdf_tk/cdf_toml.py
@@ -97,8 +97,7 @@ def load(cls, cwd: Path | None = None, use_singleton: bool = True) -> CDFToml:
alpha_flags = {clean_name(k): v for k, v in raw["alpha_flags"].items()}
if not alpha_flags and "feature_flags" in raw:
MediumSeverityWarning(
"The 'feature_flags' section has been renamed to 'alpha_flags'. "
"Please update your cdf.toml file."
"The 'feature_flags' section has been renamed to 'alpha_flags'. Please update your cdf.toml file."
).print_warning()
alpha_flags = {clean_name(k): v for k, v in raw["feature_flags"].items()}

7 changes: 7 additions & 0 deletions cognite_toolkit/_cdf_tk/client/_toolkit_client.py
@@ -31,6 +31,13 @@ def cloud_provider(self) -> Literal["azure", "aws", "gcp", "unknown"]:
else:
return "unknown"

@property
def is_private_link(self) -> bool:
if "cognitedata.com" not in self.base_url:
return False
subdomain = self.base_url.split("cognitedata.com", maxsplit=1)[0]
return "plink" in subdomain


class ToolkitClient(CogniteClient):
def __init__(self, config: ToolkitClientConfig | None = None) -> None:
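The new `is_private_link` property (used by the `cdf auth init/verify` change in this release) decides by inspecting the subdomain portion of the base URL. Reduced to a free function for illustration, with made-up example URLs:

```python
def is_private_link(base_url: str) -> bool:
    """True when the base URL points at a Cognite private-link (plink) cluster."""
    if "cognitedata.com" not in base_url:
        return False
    # Everything before the "cognitedata.com" suffix is the cluster subdomain.
    subdomain = base_url.split("cognitedata.com", maxsplit=1)[0]
    return "plink" in subdomain
```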
4 changes: 2 additions & 2 deletions cognite_toolkit/_cdf_tk/client/api/lookup.py
@@ -69,7 +69,7 @@ def id(
self._reverse_cache.update({v: k for k, v in lookup.items()})
if len(missing) != len(lookup) and not is_dry_run:
raise ResourceRetrievalError(
f"Failed to retrieve {self.resource_name} with external_id {missing}." "Have you created it?"
f"Failed to retrieve {self.resource_name} with external_id {missing}.Have you created it?"
)
return (
self._get_id_from_cache(external_id, is_dry_run, allow_empty)
@@ -116,7 +116,7 @@ def external_id(
self._cache.update({v: k for k, v in lookup.items()})
if len(missing) != len(lookup):
raise ResourceRetrievalError(
f"Failed to retrieve {self.resource_name} with id {missing}." "Have you created it?"
f"Failed to retrieve {self.resource_name} with id {missing}.Have you created it?"
)
return (
self._get_external_id_from_cache(id)
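Both `id` and `external_id` above populate a pair of mirrored caches, so a single network lookup fills both directions of the id ↔ external_id mapping. A minimal sketch of that pattern (the class and method names are illustrative, and the CDF call that resolves cache misses is omitted):

```python
from __future__ import annotations


class TwoWayCache:
    """Two-way id <-> external_id cache, kept in sync on every update."""

    def __init__(self) -> None:
        self._id_by_external: dict[str, int] = {}
        self._external_by_id: dict[int, str] = {}

    def update(self, lookup: dict[str, int]) -> None:
        """Store external_id -> id pairs and keep the reverse map in sync."""
        self._id_by_external.update(lookup)
        self._external_by_id.update({v: k for k, v in lookup.items()})

    def id(self, external_id: str) -> int | None:
        return self._id_by_external.get(external_id)

    def external_id(self, id: int) -> str | None:
        return self._external_by_id.get(id)
```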
4 changes: 2 additions & 2 deletions cognite_toolkit/_cdf_tk/client/data_classes/sequences.py
@@ -43,7 +43,7 @@ def __init__(
col_length = len(columns)
if wrong_length := [r for r in rows if len(r.values) != col_length]:
raise ValueError(
f"Rows { [r.row_number for r in wrong_length] } have wrong number of values, expected {col_length}"
f"Rows {[r.row_number for r in wrong_length]} have wrong number of values, expected {col_length}"
)
self.rows = rows
self.columns = columns
@@ -108,7 +108,7 @@ def __init__(
col_length = len(columns)
if wrong_length := [r for r in rows if len(r.values) != col_length]:
raise ValueError(
f"Rows { [r.row_number for r in wrong_length] } have wrong number of values, expected {col_length}"
f"Rows {[r.row_number for r in wrong_length]} have wrong number of values, expected {col_length}"
)
self.rows = rows
self.columns = columns
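The reformatted `ValueError` above belongs to a row-length validation that both `__init__` methods share: every row must have exactly as many values as there are columns. Extracted into a standalone sketch (the `Row` dataclass is a stand-in for the SDK's sequence-row class):

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Row:
    row_number: int
    values: list


def validate_rows(rows: list[Row], columns: list[str]) -> None:
    """Raise when any row's value count differs from the number of columns."""
    col_length = len(columns)
    # Walrus operator: bind the offending rows and test truthiness in one step.
    if wrong_length := [r for r in rows if len(r.values) != col_length]:
        raise ValueError(
            f"Rows {[r.row_number for r in wrong_length]} have wrong number of values, expected {col_length}"
        )
```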