Feature: Add optional `lookback_interval` param to `insert_by_period` materialization #394
* Tidy up changelog
* Add 0.7.0 entry to changelog
* Add order_by argument to get_column_values (#349)
* Add slugify macro to utils, use in pivot macro (#314)
* 0.20.0 compatibility (#371)
* Explicitly redefine Redshift -> default
* Upgrade generic tests
* Rm namespaces macro. New dispatch syntax
* Run tests with 0.20.0rc1
* Update changelog, readme
Co-authored-by: Jeremy Cohen <jeremy@fishtownanalytics.com>
* Simplify concat (#373)
* Postgres also have an alternative concat binary operation (#296)
* Update default implementation of concat macro
Co-authored-by: Christophe Duong <christophe.duong@gmail.com>
Co-authored-by: Jeremy Cohen <jeremy@fishtownanalytics.com>
* power and pow are synonyms (except in TSQL)
* contrib
* power is more crossdb friendly than pow
Co-authored-by: Jeremy Cohen <jeremy@fishtownanalytics.com>
…materialization was in an old version of dbt_utils. not v0.7.0
`lookback_interval` param to `insert by period` materialization for use with dbt Segment package models
`lookback_interval` param to `insert_by_period` materialization
I also see issue dbt-labs/dbt-labs-experimental-features#32 talking about modernizing this materialization. But, it looks like all of the changes discussed there have happened in the Would be great to modernize the original redshift version as well (or a universal materialization). But that work seems like a separate PR.
@jtcohen6 since you have done a fair amount of work with Segment in the past, thought you might have the context to review this -- it's an application of `insert_by_period` that allows for a window to use for sessionization-type queries.
Hi @GClunies! This is a really fantastic addition to the codebase and clearly has a ton of thought and care put into it. We are actually in the process of determining what the future of the `insert_by_period` materialization is. It's less battle-tested, less cross-compatible, and much more experimental than the level of assurance you would normally expect in dbt-utils, and it's quite likely we will be moving it into a repo for experimental functionality in the near future. We'd love this addition, and other things like cross-database functionality, to get added when we do that. In the meantime I'm going to close this as it isn't a good fit for utils right now, but we really, really do appreciate the contribution and look forward to merging this in when we ultimately determine where `insert_by_period` is going to live.
@jasnonaz thanks for the kind words and update! Agree that a separate repo makes a lot of sense. Feel free to ping me in the new repo when you get to it. For now I have this as a macro in our dbt project so no risk of a breaking change for me.
Description & motivation
The `insert_by_period` materialization does not work with sessionization models like `segment_web_page_views__sessionized` from the dbt Segment package due to some unusually complex logic. Since these sessionization models are often very large and complex, I would like to be able to use the `insert_by_period` materialization with them.
This is my first PR to a dbt repo, let alone a dev branch, so please let me know if any additional work is needed for this PR.
Also, huge thanks to Scott Barber on the dbt-labs customer support engineering team for getting me started on refactoring the macro! His input and guidance were a massive help and this was a great learning experience for me.
Describe your changes, and why you're making them.
This PR adds an optional `lookback_interval` parameter to the `insert_by_period` materialization, allowing it to be used with sessionization models like `segment_web_page_views__sessionized` from the dbt Segment package.
This is achieved by allowing the end user to include `__PERIOD_FILTER_WITH_LOOKBACK__` in their model SQL, which the materialization replaces at runtime for each period using this modified macro logic in the materialization code.
In addition to the integration tests I have added in this PR, I have tested the application of these changes to the materialization on my sessionization models @Surfline to ensure idempotency of `session_id`s is still maintained. I have added documentation on usage to the README.
The materialization can also be used on any model where loading by period is desired, but a lookback interval is also required to filter the upstream data (which is typically accomplished using incremental logic).
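To illustrate the placeholder substitution described above, here is a minimal Python sketch of the idea. It is not the materialization's actual Jinja code: the function names, argument names, the `'3 hours'` interval format, and the exact shape of the generated predicate are all assumptions made for illustration. It only shows how a `__PERIOD_FILTER_WITH_LOOKBACK__` token in model SQL could be swapped, per period, for a filter whose lower bound is pushed back by the lookback interval.

```python
# Hypothetical sketch only: names and the predicate shape are assumptions,
# not the code from this PR or from dbt-utils.

PLACEHOLDER = "__PERIOD_FILTER_WITH_LOOKBACK__"


def period_filter_with_lookback(timestamp_field, start_timestamp, stop_timestamp,
                                offset, period, lookback_interval):
    """Build a SQL predicate for one period, widened backwards by the lookback."""
    column = f'"{timestamp_field}"'
    # Start of the period being loaded: overall start plus `offset` periods.
    period_start = f"'{start_timestamp}'::timestamp + interval '{offset} {period}'"
    return (
        f"({column} > {period_start} - interval '{lookback_interval}' and "
        f"{column} <= {period_start} + interval '1 {period}' and "
        f"{column} < '{stop_timestamp}'::timestamp)"
    )


def render_period_sql(model_sql, **filter_args):
    """Replace the placeholder in the model SQL with the per-period predicate."""
    return model_sql.replace(PLACEHOLDER, period_filter_with_lookback(**filter_args))


if __name__ == "__main__":
    model_sql = "select * from page_views where __PERIOD_FILTER_WITH_LOOKBACK__"
    print(render_period_sql(
        model_sql,
        timestamp_field="received_at",
        start_timestamp="2021-01-01",
        stop_timestamp="2021-02-01",
        offset=2,
        period="day",
        lookback_interval="3 hours",
    ))
```

In this sketch the lookback only widens the lower bound, so each period's query can see upstream rows from shortly before the period begins, which is the kind of window a sessionization query would need.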