
[Bug Fix] Remove incremental logic #97

Merged
merged 11 commits on Jan 22, 2025
16 changes: 16 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,19 @@
# dbt_shopify v0.16.0
[PR #97](https://github.com/fivetran/dbt_shopify/pull/97) includes the following updates:

## Bug Fixes
- Removed incremental logic in the following end models:
- `shopify__discounts`
- `shopify__order_lines`
- `shopify__orders`
- `shopify__transactions`
- Incremental strategies were removed from these models due to potential inaccuracies with the `merge` strategy on BigQuery and Databricks. For instance, the `new_vs_repeat` field in `shopify__orders` could produce incorrect results during incremental runs. To ensure consistency, this logic was removed across all warehouses. If the previous incremental functionality was valuable to you, please consider opening a feature request to revisit this approach.
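For context on the inaccuracy described above, here is a minimal sketch of how a sequence-dependent field such as `new_vs_repeat` can go wrong when a model is built incrementally. This is illustrative only, not the package's actual SQL; `customer_order_seq_number`, `order_id`, and `customer_id` are assumed column names, while the incremental filter is copied from the logic removed in this PR.

```sql
with orders as (

    select *
    from {{ var('shopify_order') }}

    {% if is_incremental() %}
    -- On an incremental run, only recently changed orders reach this CTE.
    where cast(coalesce(updated_timestamp, created_timestamp) as date) >= {{ shopify.shopify_lookback(
        from_date="max(cast(coalesce(updated_timestamp, created_timestamp) as date))",
        interval=var('lookback_window', 7),
        datepart='day') }}
    {% endif %}
)

select
    order_id,
    customer_id,
    -- The window function only sees the filtered batch, so a customer's fifth
    -- order can be assigned sequence number 1 and be labeled 'new'.
    row_number() over (partition by customer_id order by created_timestamp) as customer_order_seq_number,
    case
        when row_number() over (partition by customer_id order by created_timestamp) = 1 then 'new'
        else 'repeat'
    end as new_vs_repeat
from orders
```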
Contributor

My understanding was that it was the incremental strategy in general and not limited to merge, especially since we had the models in question materializing as tables by default for BQ and Databricks, and delete+insert uses the same unique-key logic (the removed config is quoted just below for reference). Is this not the case? Unless I am mistaken, I would recommend generalizing it a bit:

Suggested change
- Incremental strategies were removed from these models due to potential inaccuracies with the `merge` strategy on BigQuery and Databricks. For instance, the `new_vs_repeat` field in `shopify__orders` could produce incorrect results during incremental runs. To ensure consistency, this logic was removed across all warehouses. If the previous incremental functionality was valuable to you, please consider opening a feature request to revisit this approach.
- Incremental strategies were removed from these models due to potential inaccuracies with the incremental strategy. For instance, the `new_vs_repeat` field in `shopify__orders` could produce incorrect results during incremental runs. If the previous incremental functionality was valuable to you, please consider opening a feature request to revisit this approach.
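For reference, the strategy selection being discussed is the config block removed in the model diffs further down; it is reproduced here (from `shopify__orders`) with an added note, since both branches match existing rows on the same `unique_key`:

```sql
{{
    config(
        materialized='table' if target.type in ('bigquery', 'databricks', 'spark') else 'incremental',
        unique_key='orders_unique_key',
        -- both merge and delete+insert replace existing rows via the unique_key above
        incremental_strategy='delete+insert' if target.type in ('postgres', 'redshift', 'snowflake') else 'merge',
        cluster_by=['order_id']
    )
}}
```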

Contributor Author

Ah, I wasn't certain, as the majority of the conversation centered on merge. I've tweaked this a little; let me know if this looks good!


## [Upstream Under-the-Hood Updates from `shopify_source` Package](https://github.com/fivetran/dbt_shopify_source/releases/tag/v0.15.0)
fivetran-joemarkiewicz marked this conversation as resolved.
- (Affects Redshift only) Creates new `shopify_union_data` macro to accommodate Redshift's treatment of empty tables.
- For each staging model, if the source table is not found in any of your schemas, the package will create a table containing a single row of null values for Redshift destinations. There is no change in behavior for non-Redshift warehouses.
- This is necessary because Redshift ignores explicit data casts when a table is completely empty and materializes every column as a `varchar`, which throws errors in downstream transformations in the `shopify` package. The single row ensures that Redshift respects the package's datatype casts.
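A minimal sketch of the approach described above, assuming a macro that receives the resolved source relation (or `none` when the table is missing) together with the staging model's column list; the macro name, arguments, and column-object shape here are assumptions, and the real `shopify_union_data` implementation in `dbt_shopify_source` may differ:

```sql
-- Hypothetical sketch, not the actual shopify_union_data signature.
{% macro union_or_null_row_sketch(relation, columns) %}

    {% if relation is not none %}
        select * from {{ relation }}

    {% else %}
        -- Source table not found: emit a single all-null row so Redshift keeps
        -- the staging model's explicit casts instead of defaulting to varchar.
        select
        {% for col in columns %}
            cast(null as {{ col.datatype }}) as {{ col.name }}{{ ',' if not loop.last }}
        {% endfor %}
    {% endif %}

{% endmacro %}
```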

# dbt_shopify v0.15.0

[PR #94](https://github.com/fivetran/dbt_shopify/pull/94) includes the following updates:
2 changes: 1 addition & 1 deletion README.md
Contributor

The package version referenced in this README still points to the old v0.15.0 range for dbt_shopify. This needs to be updated to reflect the latest version range.

@@ -254,7 +254,7 @@ This dbt package is dependent on the following dbt packages. These dependencies
```yml
packages:
- package: fivetran/shopify_source
version: [">=0.14.0", "<0.15.0"]
version: [">=0.15.0", "<0.16.0"]

- package: fivetran/fivetran_utils
version: [">=0.4.0", "<0.5.0"]
2 changes: 1 addition & 1 deletion dbt_project.yml
@@ -1,5 +1,5 @@
name: 'shopify'
version: '0.15.0'
version: '0.16.0'
config-version: 2
require-dbt-version: [">=1.3.0", "<2.0.0"]
models:
2 changes: 1 addition & 1 deletion docs/catalog.json

Large diffs are not rendered by default.

47 changes: 37 additions & 10 deletions docs/index.html

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/manifest.json

Large diffs are not rendered by default.

10 changes: 5 additions & 5 deletions integration_tests/ci/sample.profiles.yml
@@ -16,13 +16,13 @@ integration_tests:
pass: "{{ env_var('CI_REDSHIFT_DBT_PASS') }}"
dbname: "{{ env_var('CI_REDSHIFT_DBT_DBNAME') }}"
port: 5439
schema: shopify_integration_tests_12
schema: shopify_integration_tests_15
threads: 8
bigquery:
type: bigquery
method: service-account-json
project: 'dbt-package-testing'
schema: shopify_integration_tests_12
schema: shopify_integration_tests_15
threads: 8
keyfile_json: "{{ env_var('GCLOUD_SERVICE_KEY') | as_native }}"
snowflake:
@@ -33,7 +33,7 @@ integration_tests:
role: "{{ env_var('CI_SNOWFLAKE_DBT_ROLE') }}"
database: "{{ env_var('CI_SNOWFLAKE_DBT_DATABASE') }}"
warehouse: "{{ env_var('CI_SNOWFLAKE_DBT_WAREHOUSE') }}"
schema: shopify_integration_tests_12
schema: shopify_integration_tests_15
threads: 8
postgres:
type: postgres
@@ -42,13 +42,13 @@ integration_tests:
pass: "{{ env_var('CI_POSTGRES_DBT_PASS') }}"
dbname: "{{ env_var('CI_POSTGRES_DBT_DBNAME') }}"
port: 5432
schema: shopify_integration_tests_12
schema: shopify_integration_tests_15
threads: 8
databricks:
catalog: "{{ env_var('CI_DATABRICKS_DBT_CATALOG') }}"
host: "{{ env_var('CI_DATABRICKS_DBT_HOST') }}"
http_path: "{{ env_var('CI_DATABRICKS_DBT_HTTP_PATH') }}"
schema: shopify_integration_tests_12
schema: shopify_integration_tests_15
threads: 8
token: "{{ env_var('CI_DATABRICKS_DBT_TOKEN') }}"
type: databricks
6 changes: 3 additions & 3 deletions integration_tests/dbt_project.yml
@@ -1,5 +1,5 @@
name: 'shopify_integration_tests'
version: '0.15.0'
version: '0.16.0'
profile: 'integration_tests'
config-version: 2

@@ -13,7 +13,7 @@ vars:
# shopify_using_fulfillment_event: true
# shopify_using_all_metafields: true
# shopify__standardized_billing_model_enabled: true
shopify_schema: shopify_integration_tests_12
shopify_schema: shopify_integration_tests_15

shopify_source:
shopify_customer_identifier: "shopify_customer_data"
@@ -57,7 +57,7 @@ dispatch:

models:
+schema: "{{ 'shopify_integrations_tests_sqlw' if target.name == 'databricks-sql' else 'shopify' }}"
# +schema: "shopify_{{ var('directed_schema','dev') }}"
# +schema: "shopify_{{ var('directed_schema','dev') }}" ## To be used for validation tests

seeds:
shopify_integration_tests:
16 changes: 0 additions & 16 deletions models/shopify__discounts.sql
@@ -1,25 +1,9 @@
{{
config(
materialized='table' if target.type in ('bigquery', 'databricks', 'spark') else 'incremental',
unique_key='discounts_unique_key',
incremental_strategy='delete+insert' if target.type in ('postgres', 'redshift', 'snowflake') else 'merge',
cluster_by=['discount_code_id']
)
}}

with discount as (

select
*,
{{ dbt_utils.generate_surrogate_key(['source_relation', 'discount_code_id']) }} as discounts_unique_key
from {{ var('shopify_discount_code') }}

{% if is_incremental() %}
where cast(coalesce(updated_at, created_at) as date) >= {{ shopify.shopify_lookback(
from_date="max(cast(coalesce(updated_at, created_at) as date))",
interval=var('lookback_window', 7),
datepart='day') }}
{% endif %}
),

price_rule as (
13 changes: 0 additions & 13 deletions models/shopify__order_lines.sql
@@ -1,23 +1,10 @@
{{
config(
materialized='table' if target.type in ('bigquery', 'databricks', 'spark') else 'incremental',
unique_key='order_lines_unique_key',
incremental_strategy='delete+insert' if target.type in ('postgres', 'redshift', 'snowflake') else 'merge',
cluster_by=['order_line_id']
)
}}

with order_lines as (

select
*,
{{ dbt_utils.generate_surrogate_key(['source_relation', 'order_line_id']) }} as order_lines_unique_key
from {{ var('shopify_order_line') }}

{% if is_incremental() %}
where cast(_fivetran_synced as date) >= {{ shopify.shopify_lookback(from_date="max(cast(_fivetran_synced as date))", interval=var('lookback_window', 3), datepart='day') }}
{% endif %}

), product_variants as (

select *
18 changes: 1 addition & 17 deletions models/shopify__orders.sql
@@ -1,26 +1,10 @@
{{
config(
materialized='table' if target.type in ('bigquery', 'databricks', 'spark') else 'incremental',
unique_key='orders_unique_key',
incremental_strategy='delete+insert' if target.type in ('postgres', 'redshift', 'snowflake') else 'merge',
cluster_by=['order_id']
)
}}

with orders as (

select
*,
{{ dbt_utils.generate_surrogate_key(['source_relation', 'order_id']) }} as orders_unique_key
from {{ var('shopify_order') }}

{% if is_incremental() %}
where cast(coalesce(updated_timestamp, created_timestamp) as date) >= {{ shopify.shopify_lookback(
from_date="max(cast(coalesce(updated_timestamp, created_timestamp) as date))",
interval=var('lookback_window', 7),
datepart='day') }}
{% endif %}


), order_lines as (

select *
16 changes: 1 addition & 15 deletions models/shopify__transactions.sql
@@ -1,22 +1,8 @@
{{
config(
materialized='table' if target.type in ('bigquery', 'databricks', 'spark') else 'incremental',
unique_key='transactions_unique_id',
incremental_strategy='delete+insert' if target.type in ('postgres', 'redshift', 'snowflake') else 'merge',
cluster_by=['transaction_id']
)
}}

with transactions as (
select
*,
{{ dbt_utils.generate_surrogate_key(['source_relation', 'transaction_id'])}} as transactions_unique_id
from {{ var('shopify_transaction') }}

{% if is_incremental() %}
-- use created_timestamp instead of processed_at since a record could be created but not processed
where cast(created_timestamp as date) >= {{ shopify.shopify_lookback(from_date="max(cast(created_timestamp as date))", interval=var('lookback_window', 7), datepart='day') }}
{% endif %}
from {{ var('shopify_transaction') }}

), tender_transactions as (

5 changes: 3 additions & 2 deletions packages.yml
@@ -1,3 +1,4 @@
packages:
- package: fivetran/shopify_source
version: [">=0.14.0", "<0.15.0"]
- git: https://github.com/fivetran/dbt_shopify_source.git
revision: bugfix/redshift-limit-one
warn-unpinned: false
Contributor

Reminder to swap before release

fivetran-avinash marked this conversation as resolved.
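For reference, swapping back before release would presumably restore a hub pin matching the range already shown in the README diff above; a sketch, since the final range is set at release time:

```yml
packages:
  - package: fivetran/shopify_source
    version: [">=0.15.0", "<0.16.0"]
```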