Merge pull request #94 from nicholasyager/feature/handle_nested_packages
Feature: Add `excluded_packages` configuration to prevent specific nested packages from being injected
nicholasyager authored Nov 18, 2024
2 parents 9882c1f + 1b8a6cf commit 2b420eb
Showing 9 changed files with 68 additions and 17 deletions.
45 changes: 35 additions & 10 deletions README.md
@@ -66,7 +66,23 @@ manifests:
By default, `dbt-loom` will look for `dbt_loom.config.yml` in your working directory. You can also set the
`DBT_LOOM_CONFIG` environment variable.

### Using dbt Cloud as an artifact source
## How does it work?

As of dbt-core 1.6.0-b8, there now exists a `dbtPlugin` class which defines functions that can
be called by dbt-core's `PluginManager`. During different parts of the dbt-core lifecycle (such as graph linking and
manifest writing), the `PluginManager` will be called and all plugins registered with the appropriate hook will be executed.

dbt-loom implements a `get_nodes` hook, and uses a configuration file to parse manifests, identify public models, and
inject those public models when called by `dbt-core`.
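
The hook mechanism described above can be pictured with a toy model. The sketch below is not dbt-core's actual API: `hook`, `ToyLoomPlugin`, and `ToyPluginManager` are hypothetical stand-ins that only illustrate how a manager might call every plugin that registers a `get_nodes` hook and merge the results.

```python
from typing import Callable, Dict, List


def hook(func: Callable) -> Callable:
    """Mark a method as a hook (hypothetical stand-in for a hook decorator)."""
    func.is_hook = True
    return func


class ToyLoomPlugin:
    """Hypothetical plugin that injects public models from upstream manifests."""

    def __init__(self, public_models: Dict[str, str]) -> None:
        # Maps a model's unique ID to the project that owns it.
        self.public_models = public_models

    @hook
    def get_nodes(self) -> Dict[str, str]:
        return dict(self.public_models)


class ToyPluginManager:
    """Hypothetical manager that calls every plugin registered for a hook."""

    def __init__(self, plugins: List[object]) -> None:
        self.plugins = plugins

    def call(self, hook_name: str) -> Dict[str, str]:
        merged: Dict[str, str] = {}
        for plugin in self.plugins:
            method = getattr(plugin, hook_name, None)
            # Only call methods that were explicitly marked as hooks.
            if method is not None and getattr(method, "is_hook", False):
                merged.update(method())
        return merged


manager = ToyPluginManager([ToyLoomPlugin({"model.revenue.orders": "revenue"})])
injected = manager.call("get_nodes")
```

In this toy version the manager knows nothing about manifests; it only asks each plugin for nodes, which mirrors the division of labour the paragraph describes.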

## Advanced Features

### Loading artifacts from remote sources

`dbt-loom` supports automatically fetching manifest artifacts from a variety
of remote sources.

#### Using dbt Cloud as an artifact source

You can use dbt-loom to fetch model definitions from dbt Cloud by setting up a `dbt-cloud` manifest in your `dbt-loom` config, and setting the `DBT_CLOUD_API_TOKEN` environment variable in your execution environment.

@@ -89,7 +105,7 @@ manifests:
# which to fetch artifacts. Defaults to the last step.
```

### Using an S3-compatible object store as an artifact source
#### Using an S3-compatible object store as an artifact source

You can use dbt-loom to fetch manifest files from S3-compatible object stores
by setting up an `s3` manifest in your `dbt-loom` config. Please note that this
@@ -107,7 +123,7 @@ manifests:
# The object name of your manifest file.
```

### Using GCS as an artifact source
#### Using GCS as an artifact source

You can use dbt-loom to fetch manifest files from Google Cloud Storage by setting up a `gcs` manifest in your `dbt-loom` config.

@@ -129,7 +145,7 @@ manifests:
# The OAuth2 Credentials to use. If not passed, falls back to the default inferred from the environment.
```

### Using Azure Storage as an artifact source
#### Using Azure Storage as an artifact source

You can use dbt-loom to fetch manifest files from Azure Storage
by setting up an `azure` manifest in your `dbt-loom` config. The `azure` type implements
@@ -180,14 +196,23 @@ manifests:
object_name: manifest.json.gz
```

## How does it work?
### Exclude nested packages

As of dbt-core 1.6.0-b8, there now exists a `dbtPlugin` class which defines functions that can
be called by dbt-core's `PluginManager`. During different parts of the dbt-core lifecycle (such as graph linking and
manifest writing), the `PluginManager` will be called and all plugins registered with the appropriate hook will be executed.
In some circumstances, such as running `dbt-project-evaluator`, you may not want a
given package in an upstream project to be imported into a downstream project.
To stop a downstream project from injecting assets from such a package, add the
package name to the `excluded_packages` list of the corresponding manifest in the
downstream project's `dbt-loom` config.

dbt-loom implements a `get_nodes` hook, and uses a configuration file to parse manifests, identify public models, and
inject those public models when called by `dbt-core`.
```yaml
manifests:
  - name: revenue
    type: file
    config:
      path: ../revenue/target/manifest.json
    excluded_packages:
      # Provide the string name of the package to exclude during injection.
      - dbt_project_evaluator
```
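
The effect of `excluded_packages` can be sketched in a few lines of Python. This is a minimal illustration that assumes each node exposes a `package_name`; `ToyNode` and `drop_excluded` are hypothetical names used only for this example, not part of `dbt-loom`.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ToyNode:
    """Hypothetical stand-in for a manifest node; only the fields used here."""

    name: str
    package_name: str


def drop_excluded(
    nodes: Dict[str, ToyNode], excluded_packages: List[str]
) -> Dict[str, ToyNode]:
    """Keep only nodes whose package is not in the exclusion list."""
    return {
        key: node
        for key, node in nodes.items()
        if node.package_name not in excluded_packages
    }


selected_nodes = {
    "model.revenue.orders": ToyNode("orders", "revenue"),
    "model.dbt_project_evaluator.fct_source_fanout": ToyNode(
        "fct_source_fanout", "dbt_project_evaluator"
    ),
}

filtered = drop_excluded(selected_nodes, ["dbt_project_evaluator"])
```

With an empty exclusion list every node passes through unchanged, so omitting `excluded_packages` keeps the previous behaviour.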

## Known Caveats

12 changes: 11 additions & 1 deletion dbt_loom/__init__.py
@@ -247,7 +247,17 @@ def initialize(self) -> None:
            self.manifests[manifest_name] = manifest

            selected_nodes = identify_node_subgraph(manifest)
            self.models.update(convert_model_nodes_to_model_node_args(selected_nodes))

            # Remove nodes from excluded packages.
            filtered_nodes = {
                key: value
                for key, value in selected_nodes.items()
                if value.package_name not in manifest_reference.excluded_packages
            }

            loom_nodes = convert_model_nodes_to_model_node_args(filtered_nodes)

            self.models.update(loom_nodes)

@dbt_hook
def get_nodes(self) -> PluginNodes:
3 changes: 2 additions & 1 deletion dbt_loom/config.py
@@ -4,7 +4,7 @@
from typing import List, Union
from urllib.parse import ParseResult, urlparse

from pydantic import BaseModel, validator
from pydantic import BaseModel, Field, validator

from dbt_loom.clients.az_blob import AzureReferenceConfig
from dbt_loom.clients.dbt_cloud import DbtCloudReferenceConfig
@@ -55,6 +55,7 @@ class ManifestReference(BaseModel):
        S3ReferenceConfig,
        AzureReferenceConfig,
    ]
    excluded_packages: List[str] = Field(default_factory=list)


class dbtLoomConfig(BaseModel):
2 changes: 2 additions & 0 deletions test_projects/customer_success/dbt_loom.config.yml
@@ -3,3 +3,5 @@ manifests:
    type: file
    config:
      path: ../revenue/target/manifest.json
    excluded_packages:
      - dbt_project_evaluator
4 changes: 2 additions & 2 deletions test_projects/customer_success/package-lock.yml
@@ -1,4 +1,4 @@
packages:
- package: dbt-labs/dbt_utils
version: 1.0.0
- package: dbt-labs/dbt_utils
version: 1.0.0
sha1_hash: efa9169fb1f1a1b2c967378c02b60e3d85ae464b
6 changes: 6 additions & 0 deletions test_projects/revenue/dbt_project.yml
@@ -41,3 +41,9 @@ models:
+materialized: view
marts:
+materialized: table
dbt_project_evaluator:
+access: private
marts:
dag:
fct_source_fanout:
+enabled: true
8 changes: 5 additions & 3 deletions test_projects/revenue/package-lock.yml
@@ -1,4 +1,6 @@
packages:
- package: dbt-labs/dbt_utils
version: 1.0.0
sha1_hash: efa9169fb1f1a1b2c967378c02b60e3d85ae464b
- package: dbt-labs/dbt_utils
version: 1.0.0
- package: dbt-labs/dbt_project_evaluator
version: 0.14.3
sha1_hash: 52459ce227fef835e4466cbb12d624b3e1971fae
2 changes: 2 additions & 0 deletions test_projects/revenue/packages.yml
@@ -1,3 +1,5 @@
packages:
- package: dbt-labs/dbt_utils
version: 1.0.0
- package: dbt-labs/dbt_project_evaluator
version: 0.14.3
3 changes: 3 additions & 0 deletions tests/test_dbt_core_execution.py
@@ -46,6 +46,9 @@ def test_dbt_core_runs_loom_plugin():
"revenue.orders.v2",
}

# Excluded packages do not get injected and loaded into a manifest.
assert not any(["dbt_project_evaluator" in item for item in output.result])

os.chdir(starting_path)

assert set(output.result).issuperset(
