Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

models with multiple cross project Parents are not properly updated #147

Closed
1 task done
dave-connors-3 opened this issue Aug 22, 2023 · 0 comments · Fixed by #148
Closed
1 task done

models with multiple cross project Parents are not properly updated #147

dave-connors-3 opened this issue Aug 22, 2023 · 0 comments · Fixed by #148
Assignees
Labels
bug Something isn't working

Comments

@dave-connors-3
Copy link
Collaborator

dave-connors-3 commented Aug 22, 2023

Describe the bug

When a project is split, for each boundary node, we updates the ref functions in the children of that model. If there is a child model that is also the child of another boundary model, the resulting code will only have the last change in the ChangeSet written. Each of the Operation.Update reads from the original file contents, and then performs the update for one parent at a time.

Steps to reproduce

  1. run a split operation that creates a cross project child with two or more cross project parents
$ dbt-meshify --dry-run --debug split test -s +stg_orders +stg_order_items
# orders is downstream of both these models
  1. View logs for Update change operations on orders.sql
FileChange for children of `stg_orders`
FileChange(
    operation=<Operation.Update: 'update'>, 
    entity_type=<EntityType.Code: 'code'>, 
    identifier='orders', 
    path=PosixPath('/Users/daveconnors/dev/dbt-labs/dbt-meshify/test-projects/split/split_proj/models/marts/orders.sql'), 
    data=
    "{{
    config(
        materialized = 'incremental',
        unique_key = 'order_id'
    )
}}

with

{#
     DuckDB will see {{ this }} evaluate to `orders` and a CTE called `orders` as being the same
     so when using DuckDB we append `_set` to any CTEs with the same name as {{ this }} to indicate
     we're not executing a recursive statement
#}

orders_set as (

    select * from {{ ref('test', 'stg_orders') }}

    where
        true

        {% if is_incremental() %}

            and ordered_at >= (
                select max(ordered_at) as most_recent_record from {{ this }}
            )

        {% endif %}

),

order_items as (

    select * from {{ ref('stg_order_items') }}

),

products as (

    select * from {{ ref('stg_products') }}

),

locations as (

    select * from {{ ref('stg_locations') }}

),

supplies as (

    select * from {{ ref('stg_supplies') }}

),

order_items_summary as (

    select

        order_items.order_id,

        sum(products.is_food_item) as count_food_items,
        sum(products.is_drink_item) as count_drink_items,
        count(*) as count_items,
        sum(
            case
                when products.is_food_item = 1 then products.product_price
                else 0
            end
        ) as subtotal_drink_items,
        sum(
            case
                when products.is_drink_item = 1 then products.product_price
                else 0
            end
        ) as subtotal_food_items,
        sum(products.product_price) as subtotal

    from order_items

    left join products on order_items.product_id = products.product_id

    group by 1

),

order_supplies_summary as (

    select

        order_items.order_id,

        sum(supplies.supply_cost) as order_cost

    from order_items

    left join supplies on order_items.product_id = supplies.product_id

    group by 1

),

joined as (

    select

        orders_set.*,

        order_items_summary.count_food_items,
        order_items_summary.count_drink_items,
        order_items_summary.count_items,

        order_items_summary.subtotal_drink_items,
        order_items_summary.subtotal_food_items,
        order_items_summary.subtotal,

        order_supplies_summary.order_cost,
        locations.location_name

    from orders_set

    left join order_items_summary
        on orders_set.order_id = order_items_summary.order_id
    left join order_supplies_summary
        on orders_set.order_id = order_supplies_summary.order_id
    left join locations
        on orders_set.location_id = locations.location_id

),

final as (

    select

        *,
        count_food_items > 0 as is_food_order,
        count_drink_items > 0 as is_drink_order

    from joined

)

select * from final", 
source=None
)
FileChange for children of `stg_order_items`
FileChange(
    operation=<Operation.Update: 'update'>, 
    entity_type=<EntityType.Code: 'code'>, 
    identifier='orders', 
    path=PosixPath('/Users/daveconnors/dev/dbt-labs/dbt-meshify/test-projects/split/split_proj/models/marts/orders.sql'), 
    data="{{
    config(
        materialized = 'incremental',
        unique_key = 'order_id'
    )
}}

with

{#
     DuckDB will see {{ this }} evaluate to `orders` and a CTE called `orders` as being the same
     so when using DuckDB we append `_set` to any CTEs with the same name as {{ this }} to indicate
     we're not executing a recursive statement
#}

orders_set as (

    select * from {{ ref('stg_orders') }}

    where
        true

        {% if is_incremental() %}

            and ordered_at >= (
                select max(ordered_at) as most_recent_record from {{ this }}
            )

        {% endif %}

),

order_items as (

    select * from {{ ref('test', 'stg_order_items') }}

),

products as (

    select * from {{ ref('stg_products') }}

),

locations as (

    select * from {{ ref('stg_locations') }}

),

supplies as (

    select * from {{ ref('stg_supplies') }}

),

order_items_summary as (

    select

        order_items.order_id,

        sum(products.is_food_item) as count_food_items,
        sum(products.is_drink_item) as count_drink_items,
        count(*) as count_items,
        sum(
            case
                when products.is_food_item = 1 then products.product_price
                else 0
            end
        ) as subtotal_drink_items,
        sum(
            case
                when products.is_drink_item = 1 then products.product_price
                else 0
            end
        ) as subtotal_food_items,
        sum(products.product_price) as subtotal

    from order_items

    left join products on order_items.product_id = products.product_id

    group by 1

),

order_supplies_summary as (

    select

        order_items.order_id,

        sum(supplies.supply_cost) as order_cost

    from order_items

    left join supplies on order_items.product_id = supplies.product_id

    group by 1

),

joined as (

    select

        orders_set.*,

        order_items_summary.count_food_items,
        order_items_summary.count_drink_items,
        order_items_summary.count_items,

        order_items_summary.subtotal_drink_items,
        order_items_summary.subtotal_food_items,
        order_items_summary.subtotal,

        order_supplies_summary.order_cost,
        locations.location_name

    from orders_set

    left join order_items_summary
        on orders_set.order_id = order_items_summary.order_id
    left join order_supplies_summary
        on orders_set.order_id = order_supplies_summary.order_id
    left join locations
        on orders_set.location_id = locations.location_id

),

final as (

    select

        *,
        count_food_items > 0 as is_food_order,
        count_drink_items > 0 as is_drink_order

    from joined

)

select * from final", 
source=None)

Each change has updated refs only for the relevant parent, and only the last change is written back to the file.

Expected results

These changes should be additive not destructive!

System information

Which database are you using dbt with?

  • duck
@dave-connors-3 dave-connors-3 added bug Something isn't working triage Tis issue or pull request must be triaged by a project maintainer labels Aug 22, 2023
@nicholasyager nicholasyager removed the triage Tis issue or pull request must be triaged by a project maintainer label Aug 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants