[CT-1044] Leverage BigQuery table clones for a Low Cost CI environment #270
Comments
I'd like to add to this issue/enhancement. I assume it would be fairly easy to add simple table clones within the current copy materialization framework provided by dbt-bigquery. CopyJobConfig is already incorporated in dbt-bigquery and used for materialized='copy' configurations. This CopyJobConfig is implemented at the following location. We can pass the operation_type parameter to CopyJobConfig; see docs and accepted_strings. The accepted values for this parameter include both COPY and CLONE; my guess is that the default is COPY. Adding operation_type to the copy_bq_table and copy_and_results functions would reconfigure our copy job.
Further down the line, in the BigQuery adapter class file, we could redefine the copy_table function by adding an operation_type input parameter. This way, we can call the BigQuery adapter with either copy or clone (these should be the accepted values; default is set to clone).
We can then duplicate the copy materialization (copy.sql) as clone.sql and change its write_dispositions input to clone.
This would require the config to include an operation_type (logging should be more extensive). I'm not an expert by any means and have never meddled with dbt source code, so I'm probably missing something or underestimating things. However, these are my two cents.
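To make the proposal above concrete, here is a minimal sketch of the validation step in Python. The helper name build_copy_job_kwargs is hypothetical and is not part of dbt-bigquery; the only accepted values are the two mentioned in this thread, and the default follows the guess above that a plain copy is the default. It assumes a google-cloud-bigquery version whose CopyJobConfig exposes operation_type, so treat it as an illustration rather than the actual adapter code.

```python
# Hypothetical helper (not part of dbt-bigquery): validate the requested
# operation type and build keyword arguments for a copy job configuration.

# The two values discussed in this thread; the API accepts others as well.
ACCEPTED_OPERATION_TYPES = ("COPY", "CLONE")


def build_copy_job_kwargs(write_disposition, operation_type="COPY"):
    """Return kwargs for a copy job config, rejecting unknown operation types."""
    normalized = operation_type.upper()
    if normalized not in ACCEPTED_OPERATION_TYPES:
        raise ValueError(
            f"operation_type must be one of {ACCEPTED_OPERATION_TYPES}, "
            f"got {operation_type!r}"
        )
    return {
        "write_disposition": write_disposition,
        "operation_type": normalized,
    }


if __name__ == "__main__":
    kwargs = build_copy_job_kwargs("WRITE_TRUNCATE", operation_type="clone")
    print(kwargs)
    # Assumption: with google-cloud-bigquery installed, these kwargs could be
    # passed on as bigquery.CopyJobConfig(**kwargs) inside copy_bq_table.
```

Keeping the validation in one small function means the materialization config, the adapter's copy_table, and the connection-level copy call could all share the same accepted values.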
I'm moving that out of triage and into refinement.
This would really help us, as our costs for running dbt on BigQuery are quite high given our growing data tables. We are going to do something hacky like this.
Let's do this for real :) and not just on BQ (but leveraging BQ-specific copying/cloning capabilities where possible!) Check out:
This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.
Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.
Describe the feature
The BigQuery table clone feature has been in Preview since February 2022. It would allow us to set up a CI environment in a dedicated GCP project/BigQuery database at zero or low cost (following the zero-copy-clone logic of Snowflake).
Describe alternatives you've considered
Configure CI to build in a separate project.
Additional context
Limitations
You can't create a clone of a:
(Source: Table Clones Limitations)
Since views and external tables are just abstractions over data stored elsewhere, we could recreate them in the CI Google project/BigQuery database.
Workflow
GCP describes how to create table clones in SQL; I just need to figure out a couple of things about the workflow:
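For reference, the SQL form of a clone is a single CREATE TABLE ... CLONE statement. Below is a minimal sketch in Python that builds such a statement for one table; the function name clone_table_sql and the project, dataset, and table names are hypothetical, and a real workflow would still need to decide which tables to clone and when.

```python
import re

# Fully qualified name: project.dataset.table (hyphens are allowed in project IDs).
_TABLE_ID = re.compile(r"[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+")


def clone_table_sql(source, target):
    """Build a BigQuery CREATE TABLE ... CLONE statement for one table."""
    for name in (source, target):
        if not _TABLE_ID.fullmatch(name):
            raise ValueError(f"expected project.dataset.table, got {name!r}")
    return f"CREATE TABLE `{target}` CLONE `{source}`;"


if __name__ == "__main__":
    # Hypothetical names: clone a production table into a CI project.
    print(clone_table_sql("prod-project.analytics.orders",
                          "ci-project.analytics.orders"))
    # prints: CREATE TABLE `ci-project.analytics.orders` CLONE `prod-project.analytics.orders`;
```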
Who will this benefit?
Any user of dbt-bigquery who wishes to set up a secure test environment for CI.
Are you interested in contributing this feature?
Yes.