Feature: Implement ChangeSet file editing paradigm #107

Merged
merged 66 commits into from
Aug 17, 2023

Collaborator

@nicholasyager nicholasyager commented Aug 8, 2023

Description and Motivations

This PR is a large refactoring effort that implements ChangeSets. A ChangeSet is a standardized interface for describing resource and file changes within a dbt project. Where we previously relied on the DbtMeshConstructor and DbtProjectEditor classes to perform bespoke operations on a dbt project, our operations now yield Changes that describe the specific modifications to perform. Decoupling identification from execution lets us apply the changes at an arbitrary time, such as after we've logged our plan, or not at all (a dry run)!

As a byproduct of this change, there were also a number of opportunities to simplify our codebase considerably:

  1. We no longer need project-specific pathing, which allows us to deprecate the DbtFileManager and simplifies typing across the board.
  2. We can remove bespoke file manipulation code now that our operations condense to a limited set (add, update, remove, copy, move, etc.).
  3. We can remove the DbtMeshConstructor code and move its different operations into dedicated classes.
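
To illustrate the idea, here is a minimal sketch of a change set that separates describing changes from applying them. The names `Operation`, `Change`, `ChangeSet`, and `run`, and all of their fields, are hypothetical and chosen for illustration only; they are not necessarily the classes this PR introduces.

```python
from dataclasses import dataclass, field
from enum import Enum
from pathlib import Path
from typing import Callable, Iterator, List


class Operation(str, Enum):
    """The limited set of operations named in the description."""

    ADD = "add"
    UPDATE = "update"
    REMOVE = "remove"
    COPY = "copy"
    MOVE = "move"


@dataclass(frozen=True)
class Change:
    """Describes one modification without performing it (hypothetical shape)."""

    operation: Operation
    entity_type: str  # e.g. "model", "metric", "code"
    identifier: str  # e.g. "stg_orders"
    path: Path  # absolute path of the file to modify

    def __str__(self) -> str:
        verb = self.operation.value.capitalize()
        return f"{verb} {self.entity_type} `{self.identifier}` at {self.path}"


@dataclass
class ChangeSet:
    """An ordered collection of Changes that can be logged, applied, or skipped."""

    changes: List[Change] = field(default_factory=list)

    def add(self, change: Change) -> None:
        self.changes.append(change)

    def __iter__(self) -> Iterator[Change]:
        return iter(self.changes)

    def __len__(self) -> int:
        return len(self.changes)


def run(
    change_set: ChangeSet,
    apply: Callable[[Change], None],
    dry_run: bool = False,
) -> List[str]:
    """Build the plan first, then apply each change unless this is a dry run."""
    total = len(change_set)
    plan = [f"[{i:>3}/{total}] {change}" for i, change in enumerate(change_set)]
    if not dry_run:
        for change in change_set:
            apply(change)
    return plan
```

Because the plan is built before anything is applied, a dry run is just the same call with `dry_run=True`: the caller gets the full list of planned operations and no file is touched.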

Resolves: #16
Resolves: #67
Resolves: #69
Resolves: #108

Example Run

16:31:27 | INFO | Executing dbt parse...
16:31:27 | INFO | Generating catalog with dbt docs generate...
16:31:28 | INFO | Selected 18 resources: {'metric.split_proj.customers', 'test.split_proj.accepted_values_customers_customer_type__new__returning.d12f0947c8', 'metric.split_proj.expenses', 'test.split_proj.unique_customers_customer_id.c5af1ff4b1', 'test.split_proj.unique_orders_order_id.fed79b3a6e', 'test.split_proj.not_null_orders_order_id.cf6c17daed', 'source.split_proj.ecom.raw_orders', 'metric.split_proj.gross_profit', 'model.split_proj.stg_orders', 'model.split_proj.customers', 'test.split_proj.dbt_utils_expression_is_true_orders_count_food_items_count_drink_items_count_items.57f3cadbad', 'test.split_proj.not_null_stg_orders_order_id.81cfe2fe64', 'test.split_proj.relationships_orders_customer_id__customer_id__ref_stg_customers_.918495ce16', 'model.split_proj.orders', 'test.split_proj.unique_stg_orders_order_id.e3b841c71a', 'test.split_proj.dbt_utils_expression_is_true_orders_subtotal_food_items_subtotal_drink_items_subtotal.40bf6e459d', 'metric.split_proj.revenue', 'test.split_proj.not_null_customers_customer_id.5c9bf9911d'}
16:31:28 | INFO | Creating subproject orders...
16:31:28 | INFO | Identifying operations required to split orders from split_proj.
16:31:28 | [  0/37] STARTING | Add metric `customers` to /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/models/metrics/metrics.yml
16:31:28 | [  0/37] SUCCESS  | Add metric `customers` to /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/models/metrics/metrics.yml
16:31:28 | [  1/37] STARTING | Remove metric `customers` from /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/metrics/metrics.yml
16:31:28 | [  1/37] SUCCESS  | Remove metric `customers` from /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/metrics/metrics.yml
16:31:28 | [  2/37] STARTING | Add metric `expenses` to /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/models/metrics/metrics.yml
16:31:28 | [  2/37] SUCCESS  | Add metric `expenses` to /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/models/metrics/metrics.yml
16:31:28 | [  3/37] STARTING | Remove metric `expenses` from /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/metrics/metrics.yml
16:31:28 | [  3/37] SUCCESS  | Remove metric `expenses` from /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/metrics/metrics.yml
16:31:28 | [  4/37] STARTING | Add source `ecom` to /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/models/staging/__sources.yml
16:31:28 | [  4/37] SUCCESS  | Add source `ecom` to /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/models/staging/__sources.yml
16:31:28 | [  5/37] STARTING | Remove source `raw_orders` from /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/staging/__sources.yml
16:31:28 | [  5/37] SUCCESS  | Remove source `raw_orders` from /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/staging/__sources.yml
16:31:28 | [  6/37] STARTING | Move code `stg_orders` to /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/models/staging/stg_orders.sql
16:31:28 | [  6/37] SUCCESS  | Move code `stg_orders` to /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/models/staging/stg_orders.sql
16:31:28 | [  7/37] STARTING | Add model `stg_orders` to /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/models/staging/__models.yml
16:31:28 | [  7/37] SUCCESS  | Add model `stg_orders` to /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/models/staging/__models.yml
16:31:28 | [  8/37] STARTING | Remove model `stg_orders` from /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/staging/__models.yml
16:31:28 | [  8/37] SUCCESS  | Remove model `stg_orders` from /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/staging/__models.yml
16:31:28 | [  9/37] STARTING | Move code `customers` to /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/models/marts/customers.sql
16:31:28 | [  9/37] SUCCESS  | Move code `customers` to /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/models/marts/customers.sql
16:31:28 | [ 10/37] STARTING | Add model `customers` to /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/models/marts/__models.yml
16:31:28 | [ 10/37] SUCCESS  | Add model `customers` to /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/models/marts/__models.yml
16:31:28 | [ 11/37] STARTING | Remove model `customers` from /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/marts/__models.yml
16:31:28 | [ 11/37] SUCCESS  | Remove model `customers` from /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/marts/__models.yml
16:31:28 | [ 12/37] STARTING | Update code `customers` in /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/models/marts/customers.sql
16:31:28 | [ 12/37] SUCCESS  | Update code `customers` in /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/models/marts/customers.sql
16:31:28 | [ 13/37] STARTING | Move code `orders` to /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/models/marts/orders.sql
16:31:28 | [ 13/37] SUCCESS  | Move code `orders` to /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/models/marts/orders.sql
16:31:28 | [ 14/37] STARTING | Add model `orders` to /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/models/marts/__models.yml
16:31:28 | [ 14/37] SUCCESS  | Add model `orders` to /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/models/marts/__models.yml
16:31:28 | [ 15/37] STARTING | Remove model `orders` from /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/marts/__models.yml
16:31:28 | [ 15/37] SUCCESS  | Remove model `orders` from /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/marts/__models.yml
16:31:28 | [ 16/37] STARTING | Update code `orders` in /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/models/marts/orders.sql
16:31:28 | [ 16/37] SUCCESS  | Update code `orders` in /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/models/marts/orders.sql
16:31:28 | [ 17/37] STARTING | Update code `orders` in /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/models/marts/orders.sql
16:31:28 | [ 17/37] SUCCESS  | Update code `orders` in /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/models/marts/orders.sql
16:31:28 | [ 18/37] STARTING | Update code `orders` in /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/models/marts/orders.sql
16:31:28 | [ 18/37] SUCCESS  | Update code `orders` in /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/models/marts/orders.sql
16:31:28 | [ 19/37] STARTING | Update code `orders` in /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/models/marts/orders.sql
16:31:28 | [ 19/37] SUCCESS  | Update code `orders` in /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/models/marts/orders.sql
16:31:28 | [ 20/37] STARTING | Add metric `gross_profit` to /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/models/metrics/metrics.yml
16:31:28 | [ 20/37] SUCCESS  | Add metric `gross_profit` to /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/models/metrics/metrics.yml
16:31:28 | [ 21/37] STARTING | Remove metric `gross_profit` from /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/metrics/metrics.yml
16:31:28 | [ 21/37] SUCCESS  | Remove metric `gross_profit` from /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/metrics/metrics.yml
16:31:28 | [ 22/37] STARTING | Add metric `revenue` to /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/models/metrics/metrics.yml
16:31:28 | [ 22/37] SUCCESS  | Add metric `revenue` to /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/models/metrics/metrics.yml
16:31:28 | [ 23/37] STARTING | Remove metric `revenue` from /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/metrics/metrics.yml
16:31:28 | [ 23/37] SUCCESS  | Remove metric `revenue` from /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/metrics/metrics.yml
16:31:28 | [ 24/37] STARTING | Update model `stg_customers` in /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/staging/__models.yml
16:31:28 | [ 24/37] SUCCESS  | Update model `stg_customers` in /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/staging/__models.yml
16:31:28 | [ 25/37] STARTING | Update model `stg_customers` in /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/staging/__models.yml
16:31:28 | [ 25/37] SUCCESS  | Update model `stg_customers` in /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/staging/__models.yml
16:31:28 | [ 26/37] STARTING | Update model `stg_products` in /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/staging/__models.yml
16:31:28 | [ 26/37] SUCCESS  | Update model `stg_products` in /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/staging/__models.yml
16:31:28 | [ 27/37] STARTING | Update model `stg_products` in /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/staging/__models.yml
16:31:28 | [ 27/37] SUCCESS  | Update model `stg_products` in /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/staging/__models.yml
16:31:28 | [ 28/37] STARTING | Update model `stg_order_items` in /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/staging/__models.yml
16:31:28 | [ 28/37] SUCCESS  | Update model `stg_order_items` in /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/staging/__models.yml
16:31:28 | [ 29/37] STARTING | Update model `stg_order_items` in /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/staging/__models.yml
16:31:28 | [ 29/37] SUCCESS  | Update model `stg_order_items` in /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/staging/__models.yml
16:31:28 | [ 30/37] STARTING | Update model `stg_supplies` in /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/staging/__models.yml
16:31:28 | [ 30/37] SUCCESS  | Update model `stg_supplies` in /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/staging/__models.yml
16:31:28 | [ 31/37] STARTING | Update model `stg_supplies` in /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/staging/__models.yml
16:31:28 | [ 31/37] SUCCESS  | Update model `stg_supplies` in /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/staging/__models.yml
16:31:28 | [ 32/37] STARTING | Update model `stg_locations` in /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/staging/__models.yml
16:31:28 | [ 32/37] SUCCESS  | Update model `stg_locations` in /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/staging/__models.yml
16:31:28 | [ 33/37] STARTING | Update model `stg_locations` in /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/staging/__models.yml
16:31:28 | [ 33/37] SUCCESS  | Update model `stg_locations` in /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/models/staging/__models.yml
16:31:28 | [ 34/37] STARTING | Add code `dbt_project.yml` to /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/dbt_project.yml
16:31:28 | [ 34/37] SUCCESS  | Add code `dbt_project.yml` to /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/dbt_project.yml
16:31:28 | [ 35/37] STARTING | Copy code `packages.yml` to /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/packages.yml
16:31:28 | [ 35/37] SUCCESS  | Copy code `packages.yml` to /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/packages.yml
16:31:28 | [ 36/37] STARTING | Add code `dependencies.yml` to /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/dependencies.yml
16:31:28 | [ 36/37] SUCCESS  | Add code `dependencies.yml` to /Users/nicholas/projects/nicholasyager/dbt-meshify/test-projects/split/test_project/orders/dependencies.yml
                                                            

nicholasyager and others added 10 commits August 13, 2023 12:18
… file managers

Why get rid of the DbtFileManager? It returned mutually exclusive types! When this
happens, we're stuck doing type checks every time we read a file, which is a nightmare.
Secondly, the DbtFileManager assumed that relative paths were used in operations,
which as of ChangeSets is NO LONGER THE CASE. Now, absolute paths are used everywhere. As a
result, it made sense to split out our concerns such that there is a
RawFileManager that handles I/O of raw data, and a YamlFileManager that adds
a YAML parsing layer to keep our code clean.
Co-authored-by: dave-connors-3 <73915542+dave-connors-3@users.noreply.github.com>
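
The split described in this commit message can be sketched as two small managers that each return exactly one type. This is an illustration under stated assumptions, not the code from the PR: the method names are guesses, and JSON stands in for YAML so that the sketch needs only the standard library. The class playing the role of the YamlFileManager is therefore called `ParsedFileManager` here.

```python
import json
from pathlib import Path
from typing import Any, Dict


class RawFileManager:
    """Handles raw I/O only. read_file always returns a str."""

    @staticmethod
    def read_file(path: Path) -> str:
        return path.read_text()

    @staticmethod
    def write_file(path: Path, content: str) -> None:
        # Create missing parent directories so new files can be added anywhere.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content)


class ParsedFileManager:
    """Adds a parsing layer on top of RawFileManager. read_file always returns a dict.

    Assumption: JSON is used as a stand-in for YAML in this sketch.
    """

    @staticmethod
    def read_file(path: Path) -> Dict[str, Any]:
        return json.loads(RawFileManager.read_file(path))

    @staticmethod
    def write_file(path: Path, content: Dict[str, Any]) -> None:
        RawFileManager.write_file(path, json.dumps(content, indent=2))
```

With one return type per manager, callers no longer need an `isinstance` check after every read.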
@nicholasyager nicholasyager marked this pull request as ready for review August 13, 2023 20:34
Collaborator Author

@nicholasyager nicholasyager left a comment

Oops! All to-dos

dbt_meshify/main.py (outdated, resolved)
dbt_meshify/storage/dbt_project_editors.py (outdated, resolved)
tests/integration/test_contract_command.py (outdated, resolved)
Collaborator Author

nicholasyager commented Aug 14, 2023

@nicholasyager: Feedback from the sync

  • Double-check that the log levels work with --debug
  • Resolve absolute paths to relative paths in logging.
  • Clean up tests to remove exception and stderr output
  • Add validation to changes to ensure that paths are absolute.
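
The two path-related items above could look roughly like this. It is only a sketch: `validate_absolute` and `display_path` are hypothetical helper names, not functions from dbt-meshify.

```python
from pathlib import Path


def validate_absolute(path: Path) -> Path:
    """Reject relative paths early so every change points at an unambiguous file."""
    if not path.is_absolute():
        raise ValueError(f"Change paths must be absolute, got: {path}")
    return path


def display_path(path: Path, root: Path) -> str:
    """Render a path relative to root for log output, else fall back to the full path."""
    try:
        return str(path.relative_to(root))
    except ValueError:
        # path is not located under root, so log it unchanged.
        return str(path)
```

Keeping absolute paths internally and shortening them only at the logging boundary would keep the operations unambiguous while making log lines much shorter.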

Collaborator

@dave-connors-3 dave-connors-3 left a comment

at first glance, nothing really jumping out to me except holy cow is this exceptionally readable. The comments here aren't really blocking at all, mostly questions, clarifications, and notes to self. I'm gonna keep using it a bit more, but at the moment not seeing any reason not to approve

dbt_meshify/change.py (outdated, resolved)
dbt_meshify/dbt_projects.py (outdated, resolved)
dbt_meshify/main.py (resolved)
dbt_meshify/dbt_projects.py (resolved)
dbt_meshify/main.py (resolved)
dbt_meshify/main.py (resolved)
class BaseFileManager(ABC):
    @abc.abstractmethod
    def read_file(self, path: Path) -> Union[Dict[str, Any], str, None]:
class FileManager(Protocol):
Collaborator

total curiosity question: what's a Protocol? similar to an abstract base class?

Collaborator Author

A Protocol is similar to an Abstract Base Class, but it relies on structural (duck) typing instead of inheritance. So, for instance, we don't need a bunch of file managers inheriting from an abstract file manager. Instead, we can define a protocol, and any class that implements the correct methods will be accepted by the type checker. It's rather similar to the trait system in Rust, which I LOVE.
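
As a small illustration of that point, here is a sketch using `typing.Protocol`. The method signatures and the `InMemoryFileManager` class are assumptions made for the example; the actual `FileManager` protocol in this PR may declare different methods.

```python
from pathlib import Path
from typing import Dict, Protocol


class FileManager(Protocol):
    """Any class with these two methods satisfies the protocol."""

    def read_file(self, path: Path) -> str: ...

    def write_file(self, path: Path, content: str) -> None: ...


class InMemoryFileManager:
    """Hypothetical manager. Note that it does not inherit from FileManager."""

    def __init__(self) -> None:
        self.files: Dict[Path, str] = {}

    def read_file(self, path: Path) -> str:
        return self.files[path]

    def write_file(self, path: Path, content: str) -> None:
        self.files[path] = content


def copy_file(manager: FileManager, source: Path, target: Path) -> None:
    """A static type checker accepts any structurally compatible manager here."""
    manager.write_file(target, manager.read_file(source))
```

The compatibility check happens in the static type checker (mypy, for example). At runtime Python does not enforce the protocol unless it is decorated with `typing.runtime_checkable` and tested with `isinstance`.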

}

@staticmethod
def update_refs__sql(model_name, project_name, model_code):
Collaborator

note to future selves: will need to handle optional version args in refs before too long!
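
One way the version handling mentioned in this note might work is sketched below. This is not the project's `update_refs__sql` implementation: `rewrite_refs_sql` is a hypothetical name, and the regular expression only covers plain `{{ ref('model') }}` calls, with an optional `v=` or `version=` keyword argument carried over unchanged.

```python
import re


def rewrite_refs_sql(model_name: str, project_name: str, model_code: str) -> str:
    """Turn one-argument refs to model_name into two-argument cross-project refs."""
    pattern = re.compile(
        r"\{\{\s*ref\(\s*"
        r"(?P<quote>['\"])" + re.escape(model_name) + r"(?P=quote)"
        # Optional keyword argument such as ", v=2" or ", version=2".
        r"(?P<version>\s*,\s*(?:v|version)\s*=\s*[^)]+?)?"
        r"\s*\)\s*\}\}"
    )

    def replace(match: "re.Match[str]") -> str:
        version = (match.group("version") or "").rstrip()
        return "{{ ref('" + project_name + "', '" + model_name + "'" + version + ") }}"

    return pattern.sub(replace, model_code)
```

Refs to other models, and refs that already name a project as the first argument, do not match the pattern and are left untouched.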

dbt_meshify/utilities/linker.py (outdated, resolved)
pyproject.toml (resolved)
Collaborator

@dave-connors-3 dave-connors-3 left a comment

I am comfortable merging these changes -- incredible work on this @nicholasyager, a wholesale rewrite is no small task! can't wait to show off --dry-run!

@nicholasyager nicholasyager merged commit 3c59423 into dbt-labs:main Aug 17, 2023
@nicholasyager nicholasyager deleted the nicholasyager_changesets branch August 17, 2023 13:46
Labels
enhancement New feature or request