[Bug Fix] Remove incremental logic #97

Merged on Jan 22, 2025 (11 commits)
21 changes: 21 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,24 @@
# dbt_shopify v0.16.0
This release includes the following updates:

## Bug Fixes
- Removed incremental logic in the following end models ([PR #97](https://github.com/fivetran/dbt_shopify/pull/97)):
  - `shopify__discounts`
  - `shopify__order_lines`
  - `shopify__orders`
  - `shopify__transactions`
- Incremental strategies were removed from these models because incremental runs could produce inaccurate results. For instance, the `new_vs_repeat` field in `shopify__orders` could be incorrect after an incremental run. To keep behavior consistent, this logic was removed across all warehouses. If the previous incremental functionality was valuable to you, please consider opening a feature request to revisit this approach.
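
To make the failure mode concrete, here is a minimal Python sketch; it is not the package's SQL. It assumes that `new_vs_repeat` marks a customer's earliest visible order as `new` and any later order as `repeat`, and that an incremental run only sees rows inside a lookback window. The toy orders, the `label_new_vs_repeat` helper, and the 7-day window are illustrative assumptions.

```python
from datetime import date, timedelta

# Hypothetical toy orders: (order_id, customer_id, created_on).
ORDERS = [
    (1, "cust_a", date(2024, 1, 1)),
    (2, "cust_a", date(2024, 3, 1)),
    (3, "cust_b", date(2024, 3, 2)),
]


def label_new_vs_repeat(orders):
    """Label an order 'new' if it is the customer's earliest order in `orders`, else 'repeat'."""
    seen_customers = set()
    labels = {}
    # Walk orders chronologically (order_id breaks ties on the same day).
    for order_id, customer_id, _ in sorted(orders, key=lambda o: (o[2], o[0])):
        labels[order_id] = "repeat" if customer_id in seen_customers else "new"
        seen_customers.add(customer_id)
    return labels


# Full rebuild: every order is visible.
full_refresh = label_new_vs_repeat(ORDERS)

# Incremental-style run: only rows inside an assumed 7-day lookback are visible.
cutoff = max(o[2] for o in ORDERS) - timedelta(days=7)
incremental = label_new_vs_repeat([o for o in ORDERS if o[2] >= cutoff])

print(full_refresh)
print(incremental)
```

In this sketch, order 2 is labeled `repeat` on a full rebuild but `new` when only the lookback window is visible, because the customer's first order falls outside the window. Rebuilding the whole table on every run avoids that drift.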

## [Upstream Under-the-Hood Updates from `shopify_source` Package](https://github.com/fivetran/dbt_shopify_source/releases/tag/v0.15.0)
- (Affects Redshift only) Creates a new `shopify_union_data` macro to accommodate Redshift's treatment of empty tables.
  - For each staging model, if the source table is not found in any of your schemas, the package creates a table with a single row of null values on Redshift destinations. Behavior on non-Redshift warehouses is unchanged.
  - This is necessary because Redshift ignores explicit data casts when a table is completely empty and materializes every column as a `varchar`, which throws errors in downstream transformations in the `shopify` package. The single row ensures that Redshift respects the package's datatype casts.
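
As a rough illustration of the one-row approach described above, the following Python sketch builds a `select` statement that returns a single row of explicitly typed nulls. The `one_null_row_sql` helper and the example column names and types are hypothetical; the package's `shopify_union_data` macro is written in Jinja and may be implemented quite differently.

```python
def one_null_row_sql(columns):
    """Build a SELECT that returns exactly one row of typed nulls.

    `columns` maps column name -> SQL type. Hypothetical helper for
    illustration only; it is not part of the shopify_source package.
    """
    select_list = ",\n    ".join(
        f"cast(null as {sql_type}) as {name}" for name, sql_type in columns.items()
    )
    return f"select\n    {select_list}"


# Example column names and types are assumptions, not the package's schema.
print(one_null_row_sql({"id": "bigint", "created_at": "timestamp"}))
```

Because the statement has no `from` clause it always yields one row, so each column arrives with an explicit cast attached rather than coming from an empty relation.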

## Documentation
- Added Quickstart model counts to README. ([#96](https://github.com/fivetran/dbt_shopify/pull/96))
- Corrected references to connectors and connections in the README. ([#96](https://github.com/fivetran/dbt_shopify/pull/96))

# dbt_shopify v0.15.0

[PR #94](https://github.com/fivetran/dbt_shopify/pull/94) includes the following updates:
16 changes: 9 additions & 7 deletions README.md
@@ -49,14 +49,16 @@ Curious what these tables can do? Check out example visualizations from the [sho
</a>
</p>

+### Materialized Models
+Each Quickstart transformation job run materializes 89 models if all components of this data model are enabled. This count includes all staging, intermediate, and final models materialized as `view`, `table`, or `incremental`.
<!--section-end-->

## How do I use the dbt package?

### Step 1: Prerequisites
To use this dbt package, you must have the following:

-- At least one Fivetran Shopify connector syncing data into your destination.
+- At least one Fivetran Shopify connection syncing data into your destination.
- One of the following destinations:
- [BigQuery](https://fivetran.com/docs/destinations/bigquery)
- [Snowflake](https://fivetran.com/docs/destinations/snowflake)
@@ -70,7 +72,7 @@ If you are **not** using the [Shopify Holistic reporting package](https://github
```yml
packages:
- package: fivetran/shopify
-version: [">=0.15.0", "<0.16.0"] # we recommend using ranges to capture non-breaking changes automatically
+version: [">=0.16.0", "<0.17.0"] # we recommend using ranges to capture non-breaking changes automatically
```

Do **NOT** include the `shopify_source` package in this file. The transformation package itself has a dependency on it and will install the source package as well.
@@ -84,7 +86,7 @@ dispatch:
```

### Step 3: Define database and schema variables
-#### Single connector
+#### Single connection
By default, this package runs using your destination and the `shopify` schema. If this is not where your Shopify data is (for example, if your Shopify schema is named `shopify_fivetran`), add the following configuration to your root `dbt_project.yml` file:

```yml
@@ -94,8 +96,8 @@ vars:
shopify_database: your_database_name
shopify_schema: your_schema_name
```
-#### Union multiple connectors
-If you have multiple Shopify connectors in Fivetran and would like to use this package on all of them simultaneously, we have provided functionality to do so. The package will union all of the data together and pass the unioned table into the transformations. You will be able to see which source it came from in the `source_relation` column of each model. To use this functionality, you will need to set either the `shopify_union_schemas` OR `shopify_union_databases` variables (cannot do both) in your root `dbt_project.yml` file:
+#### Union multiple connections
+If you have multiple Shopify connections in Fivetran and would like to use this package on all of them simultaneously, we have provided functionality to do so. The package will union all of the data together and pass the unioned table into the transformations. You will be able to see which source it came from in the `source_relation` column of each model. To use this functionality, you will need to set either the `shopify_union_schemas` OR `shopify_union_databases` variables (cannot do both) in your root `dbt_project.yml` file:

```yml
# dbt_project.yml
@@ -110,7 +112,7 @@ To connect your multiple schema/database sources to the package models, follow t

### Step 4: Disable models for non-existent sources

-The package takes into consideration that not every Shopify connector may have the `fulfillment_event`, `metadata`, or `abandoned_checkout` tables (including `abandoned_checkout`, `abandoned_checkout_discount_code`, and `abandoned_checkout_shipping_line`) and allows you to enable or disable the corresponding functionality. To enable/disable the modeling of the mentioned source tables and their downstream references, add the following variable to your `dbt_project.yml` file:
+The package takes into consideration that not every Shopify connection may have the `fulfillment_event`, `metadata`, or `abandoned_checkout` tables (including `abandoned_checkout`, `abandoned_checkout_discount_code`, and `abandoned_checkout_shipping_line`) and allows you to enable or disable the corresponding functionality. To enable/disable the modeling of the mentioned source tables and their downstream references, add the following variable to your `dbt_project.yml` file:

```yml
# dbt_project.yml
@@ -254,7 +256,7 @@ This dbt package is dependent on the following dbt packages. These dependencies
```yml
packages:
- package: fivetran/shopify_source
-version: [">=0.14.0", "<0.15.0"]
+version: [">=0.15.0", "<0.16.0"]

- package: fivetran/fivetran_utils
version: [">=0.4.0", "<0.5.0"]
2 changes: 1 addition & 1 deletion dbt_project.yml
@@ -1,5 +1,5 @@
name: 'shopify'
-version: '0.15.0'
+version: '0.16.0'
config-version: 2
require-dbt-version: [">=1.3.0", "<2.0.0"]
models:
2 changes: 1 addition & 1 deletion docs/catalog.json

Large diffs are not rendered by default.

47 changes: 37 additions & 10 deletions docs/index.html

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/manifest.json

Large diffs are not rendered by default.

10 changes: 5 additions & 5 deletions integration_tests/ci/sample.profiles.yml
@@ -16,13 +16,13 @@ integration_tests:
pass: "{{ env_var('CI_REDSHIFT_DBT_PASS') }}"
dbname: "{{ env_var('CI_REDSHIFT_DBT_DBNAME') }}"
port: 5439
-schema: shopify_integration_tests_12
+schema: shopify_integration_tests_15
threads: 8
bigquery:
type: bigquery
method: service-account-json
project: 'dbt-package-testing'
-schema: shopify_integration_tests_12
+schema: shopify_integration_tests_15
threads: 8
keyfile_json: "{{ env_var('GCLOUD_SERVICE_KEY') | as_native }}"
snowflake:
@@ -33,7 +33,7 @@ integration_tests:
role: "{{ env_var('CI_SNOWFLAKE_DBT_ROLE') }}"
database: "{{ env_var('CI_SNOWFLAKE_DBT_DATABASE') }}"
warehouse: "{{ env_var('CI_SNOWFLAKE_DBT_WAREHOUSE') }}"
-schema: shopify_integration_tests_12
+schema: shopify_integration_tests_15
threads: 8
postgres:
type: postgres
@@ -42,13 +42,13 @@ integration_tests:
pass: "{{ env_var('CI_POSTGRES_DBT_PASS') }}"
dbname: "{{ env_var('CI_POSTGRES_DBT_DBNAME') }}"
port: 5432
-schema: shopify_integration_tests_12
+schema: shopify_integration_tests_15
threads: 8
databricks:
catalog: "{{ env_var('CI_DATABRICKS_DBT_CATALOG') }}"
host: "{{ env_var('CI_DATABRICKS_DBT_HOST') }}"
http_path: "{{ env_var('CI_DATABRICKS_DBT_HTTP_PATH') }}"
-schema: shopify_integration_tests_12
+schema: shopify_integration_tests_15
threads: 8
token: "{{ env_var('CI_DATABRICKS_DBT_TOKEN') }}"
type: databricks
6 changes: 3 additions & 3 deletions integration_tests/dbt_project.yml
@@ -1,5 +1,5 @@
name: 'shopify_integration_tests'
-version: '0.15.0'
+version: '0.16.0'
profile: 'integration_tests'
config-version: 2

@@ -13,7 +13,7 @@ vars:
# shopify_using_fulfillment_event: true
# shopify_using_all_metafields: true
# shopify__standardized_billing_model_enabled: true
-shopify_schema: shopify_integration_tests_12
+shopify_schema: shopify_integration_tests_15

shopify_source:
shopify_customer_identifier: "shopify_customer_data"
@@ -57,7 +57,7 @@ dispatch:

models:
+schema: "{{ 'shopify_integrations_tests_sqlw' if target.name == 'databricks-sql' else 'shopify' }}"
-# +schema: "shopify_{{ var('directed_schema','dev') }}"
+# +schema: "shopify_{{ var('directed_schema','dev') }}" ## To be used for validation tests

seeds:
shopify_integration_tests:
16 changes: 0 additions & 16 deletions models/shopify__discounts.sql
@@ -1,25 +1,9 @@
-{{
-config(
-materialized='table' if target.type in ('bigquery', 'databricks', 'spark') else 'incremental',
-unique_key='discounts_unique_key',
-incremental_strategy='delete+insert' if target.type in ('postgres', 'redshift', 'snowflake') else 'merge',
-cluster_by=['discount_code_id']
-)
-}}

with discount as (

select
*,
{{ dbt_utils.generate_surrogate_key(['source_relation', 'discount_code_id']) }} as discounts_unique_key
from {{ var('shopify_discount_code') }}

-{% if is_incremental() %}
-where cast(coalesce(updated_at, created_at) as date) >= {{ shopify.shopify_lookback(
-from_date="max(cast(coalesce(updated_at, created_at) as date))",
-interval=var('lookback_window', 7),
-datepart='day') }}
-{% endif %}
),

price_rule as (
13 changes: 0 additions & 13 deletions models/shopify__order_lines.sql
@@ -1,23 +1,10 @@
-{{
-config(
-materialized='table' if target.type in ('bigquery', 'databricks', 'spark') else 'incremental',
-unique_key='order_lines_unique_key',
-incremental_strategy='delete+insert' if target.type in ('postgres', 'redshift', 'snowflake') else 'merge',
-cluster_by=['order_line_id']
-)
-}}

with order_lines as (

select
*,
{{ dbt_utils.generate_surrogate_key(['source_relation', 'order_line_id']) }} as order_lines_unique_key
from {{ var('shopify_order_line') }}

-{% if is_incremental() %}
-where cast(_fivetran_synced as date) >= {{ shopify.shopify_lookback(from_date="max(cast(_fivetran_synced as date))", interval=var('lookback_window', 3), datepart='day') }}
-{% endif %}

), product_variants as (

select *
18 changes: 1 addition & 17 deletions models/shopify__orders.sql
@@ -1,26 +1,10 @@
-{{
-config(
-materialized='table' if target.type in ('bigquery', 'databricks', 'spark') else 'incremental',
-unique_key='orders_unique_key',
-incremental_strategy='delete+insert' if target.type in ('postgres', 'redshift', 'snowflake') else 'merge',
-cluster_by=['order_id']
-)
-}}

with orders as (

select
*,
{{ dbt_utils.generate_surrogate_key(['source_relation', 'order_id']) }} as orders_unique_key
from {{ var('shopify_order') }}

-{% if is_incremental() %}
-where cast(coalesce(updated_timestamp, created_timestamp) as date) >= {{ shopify.shopify_lookback(
-from_date="max(cast(coalesce(updated_timestamp, created_timestamp) as date))",
-interval=var('lookback_window', 7),
-datepart='day') }}
-{% endif %}


), order_lines as (

select *
16 changes: 1 addition & 15 deletions models/shopify__transactions.sql
@@ -1,22 +1,8 @@
-{{
-config(
-materialized='table' if target.type in ('bigquery', 'databricks', 'spark') else 'incremental',
-unique_key='transactions_unique_id',
-incremental_strategy='delete+insert' if target.type in ('postgres', 'redshift', 'snowflake') else 'merge',
-cluster_by=['transaction_id']
-)
-}}

with transactions as (
select
*,
{{ dbt_utils.generate_surrogate_key(['source_relation', 'transaction_id'])}} as transactions_unique_id
-from {{ var('shopify_transaction') }}
-
-{% if is_incremental() %}
--- use created_timestamp instead of processed_at since a record could be created but not processed
-where cast(created_timestamp as date) >= {{ shopify.shopify_lookback(from_date="max(cast(created_timestamp as date))", interval=var('lookback_window', 7), datepart='day') }}
-{% endif %}
+from {{ var('shopify_transaction') }}

), tender_transactions as (

2 changes: 1 addition & 1 deletion packages.yml
@@ -1,3 +1,3 @@
packages:
- package: fivetran/shopify_source
-version: [">=0.14.0", "<0.15.0"]
+version: [">=0.15.0", "<0.16.0"]