new docs for on_schema_change #747

Merged

jtcohen6 merged 1 commit into next from update/configuring-incremental-models

Aug 2, 2021

Contributor

matt-winkler commented Jul 25, 2021

Description & motivation

Added documentation for the new `on_schema_change` configuration options on incremental models, per 3387.

Pre-release docs

Is this change related to an unreleased version of dbt?

Yes: please
- update the base branch to next
- add Changelog components: <Changelog>[New/Changed] in v0.x.0</Changelog>
- add links to the "New and changed documentation" section of the latest Migration Guide


          new docs for on_schema_change

e1f6cbe

matt-winkler requested a review from annafil as a code owner

July 25, 2021 23:05

matt-winkler requested a review from jtcohen6

July 25, 2021 23:05

jtcohen6 reviewed

Collaborator

jtcohen6 left a comment

Nice work @matt-winkler!! I always find it so thrilling to update a "this isn't yet possible" section in the docs site with "heck yeah, it is now!"

I left a handful of comments, mostly around refining the language we want to use to talk about this feature. On substance, what you've got is pretty much good to go, so I'm happy to give this a thumbs up when you are.

Thanks for creating the v0.21 migration guide stub as well. I'll fill that out as I add docs for more new-in-21 features.

website/docs/docs/building-a-dbt-project/building-models/configuring-incremental-models.md

Comment on lines +156 to +166

+              ```sql
+              {{
+                  config(
+                      materialized='incremental',
+                      unique_key='date_day',
+                      on_schema_change=['ignore', 'fail', 'append_new_columns', 'sync_all_columns'] --choose one
+                  )
+              }}
+              ```
+              </File>

Collaborator

jtcohen6 Jul 28, 2021

Could you also include an example of setting this from the project file? Similar to how we show both for incremental_strategy below:

models:
  +on_schema_change: sync_all_columns

website/docs/docs/building-a-dbt-project/building-models/configuring-incremental-models.md


		Note: The `on_schema_change` behaviors do not currently include backfill functionality on the target table.

		### For dbt versions <= v0.20.0, refer to the logic below

Collaborator

jtcohen6 Jul 28, 2021

Suggested change

      
- ### For dbt versions <= v0.20.0, refer to the logic below
+ ### Default behavior
+
+ This is the behavior if `on_schema_change: ignore`, and on older versions of dbt.

website/docs/docs/building-a-dbt-project/building-models/configuring-incremental-models.md

+              The behaviors for `on_schema_change` are:
+              * `ignore`: this is the default, and preserves the behavior of dbt versions <= v.0.20.0
+              * `fail`: triggers an error message when the source and target schemas diverge

Collaborator

jtcohen6 Jul 28, 2021

What do you think about using the words "old" and "new" here instead of "target" and "source"? I'm not convinced that's better, just thinking about what would be most intuitive for a person newly grokking incremental models

website/docs/docs/building-a-dbt-project/building-models/configuring-incremental-models.md

+              * `ignore`: this is the default, and preserves the behavior of dbt versions <= v.0.20.0
+              * `fail`: triggers an error message when the source and target schemas diverge
+              * `append_new_columns`: Append new columns identified in the temporary source schema to the target schema. Note that this setting does *not* remove columns from the target that are not present in the source.
+              * `sync_all_columns`: Adds any new columns to the target table and removes them from the temporary source schema. Note that this is *inclusive* of data type changes. On Bigquery, data type changes currently cause a full table scan, so we advise Bigquery users to be mindful of the trade-offs when implementing this setting.

Collaborator

jtcohen6 Jul 28, 2021

Just a bit of wordsmithing:

Suggested change

      
- * `sync_all_columns`: Adds any new columns to the target table and removes them from the temporary source schema. Note that this is *inclusive* of data type changes. On Bigquery, data type changes currently cause a full table scan, so we advise Bigquery users to be mindful of the trade-offs when implementing this setting.
+ * `sync_all_columns`: Adds all new columns, and removes all columns that have been removed. Note that this is *inclusive* of data type changes. On BigQuery, data type changes currently require a full scan of the existing table, so we advise BigQuery users to be mindful of the trade-offs when implementing this setting.
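
To make the options concrete, here is a minimal sketch of a model that selects a single behavior. The value shown is just one of the four documented options, and the `events` model with its `date_day` column is hypothetical, used only for illustration.

```sql
{{
    config(
        materialized='incremental',
        unique_key='date_day',
        on_schema_change='append_new_columns'
    )
}}

-- 'events' is a hypothetical upstream model, not something from this PR
select
    date_day,
    count(*) as event_count
from {{ ref('events') }}

{% if is_incremental() %}
  -- on incremental runs, only reprocess days at or after the latest one already loaded
  where date_day >= (select max(date_day) from {{ this }})
{% endif %}

group by 1
```

With this setting, a column later added to the select list would be appended to the existing table on the next incremental run, while columns dropped from the select list would stay in the table.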

website/docs/docs/building-a-dbt-project/building-models/configuring-incremental-models.md

+              * `append_new_columns`: Append new columns identified in the temporary source schema to the target schema. Note that this setting does *not* remove columns from the target that are not present in the source.
+              * `sync_all_columns`: Adds any new columns to the target table and removes them from the temporary source schema. Note that this is *inclusive* of data type changes. On Bigquery, data type changes currently cause a full table scan, so we advise Bigquery users to be mindful of the trade-offs when implementing this setting.
+              **Note**: The `on_schema_change` behaviors do not currently include backfill functionality on the target table.

Collaborator

jtcohen6 Jul 28, 2021

Suggested change

      
- **Note**: The `on_schema_change` behaviors do not currently include backfill functionality on the target table.
+ **Note**: None of the `on_schema_change` behaviors backfill values in old records for newly added columns. If you need to populate those values, we recommend running manual updates, or triggering a `--full-refresh`.
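
As an illustration of the manual-update route, a rough sketch might look like the statement below. It assumes a hypothetical table `analytics.daily_events` built by an incremental model, to which a column `event_source` has just been added; the names and the fill value are placeholders, not part of the docs change.

```sql
-- Hypothetical names throughout: analytics.daily_events is the incremental
-- model's existing table; event_source is the newly added column.
-- Rows loaded before the column existed hold NULL, so fill only those.
update analytics.daily_events
set event_source = 'unknown'
where event_source is null;
```

The other route named in the note, a `--full-refresh`, rebuilds the whole table instead of patching old rows.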

website/docs/docs/building-a-dbt-project/building-models/configuring-incremental-models.md

@@ -196,6 +230,7 @@ select ...
               <Changelog>
                 - **v0.20.0:** Introduced `merge_update_columns`
+                - **v0.21.0:** Introduced `on_schema_change`

Collaborator

jtcohen6 Jul 28, 2021

Could you move this to a new Changelog entry up under `## What if the columns of my incremental model change?`?

jtcohen6 approved these changes

Collaborator

jtcohen6 left a comment

I'm going to merge this so we can have it live in next in time for releasing v0.21.0-b1. I'll follow up on some of the TODOs in updates to come.

jtcohen6 merged commit bf5d922 into next

jtcohen6 deleted the update/configuring-incremental-models branch

August 2, 2021 21:29

jtcohen6 mentioned this pull request

Prerelease: v0.21.0-b1 #756

Merged

5 tasks

jtcohen6 pushed a commit that referenced this pull request


          new docs for on_schema_change (#747)

e5913a2

jtcohen6 added a commit that referenced this pull request


          [Release] v0.21.0 (#825)

b24c20e

* new docs for on_schema_change (#747)

* Prerelease: v0.21.0-b1 (#756)

* Edits for on_schema_change

* dbt source freshness

* DBT_ENV_SECRET_ env var

* dbt build first cut

* Redshift profile ra3 property

* Beta callout in migration guide

* Self-review build docs

* Prerelease: v0.21.0-b2 (#767)

* state:modified subselectors, modified.macros

* Add build RPC method

* PR feedback

* add dbt deps logging example (#798)

* [Prerelease] Prep for 0.21.0-rc1 (#802)

* Switch --models to --select

* BQ snapshot config aliases

* Configurable postgres connect timeout

* Add list --output-keys. Add list RPC method

* Adapter unique_field dbt-labs/dbt-core#3796

* PR feedback: -s replaces -m

* Add BQ execution_project

* Add default property for yaml selectors

* Update migration guide. New fields in sources.json

* Test where config macro

* Dispatch for global macros

* Update build details

* Some self review

* Greedy flag/property for test selection

* Resolve #803 while we're here

* Fix broken link typo

* Refactor: configs + properties (#766)

* Very incomplete start of first draft

* Second big pass

* Initial self review. Sidebar reorg

* Continue self review. Address #616

* Add note to migration guide

* [Prerelease] v0.21 post-RC updates (#831)

* Artifact version bumps

* Add v0.20 -> v0.21 to Cloud upgrade FAQ

* PR feedback from jasnonaz <3

* [Release] v0.21.0 (#839)

* Update links, info in migration guides

* Fix v0.21 discourse link

Co-authored-by: matt-winkler <75497565+matt-winkler@users.noreply.github.com>
Co-authored-by: Sung Won Chung <sungwonchung3@gmail.com>
