Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BigQuery KMS should not be used on VIEWS #3682

Closed
1 of 5 tasks
AndreasTA-AW opened this issue Aug 3, 2021 · 2 comments · Fixed by #3691
Closed
1 of 5 tasks

BigQuery KMS should not be used on VIEWS #3682

AndreasTA-AW opened this issue Aug 3, 2021 · 2 comments · Fixed by #3691
Labels
bigquery bug Something isn't working good_first_issue Straightforward + self-contained changes, good for new contributors!

Comments

@AndreasTA-AW
Copy link
Contributor

Describe the bug

If I add one kms key for the entire project that should be used for all models, like below. It will blow up, because you can't add KMS encryption to views. ( Unless I'm missing something here )

models:
  test:
    +kms_key_name: 'projects/<project>/locations/<location>/keyRings/key-ring/cryptoKeys/key'

Steps To Reproduce

Add kms_key_name to the entire project or specific folder that contains views.

Expected behavior

It should skip views, and only add the encryption key to the tables.

System information

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake
  • other (specify: ____________)

The output of dbt --version:

dbt --version
installed version: 0.20.0
   latest version: 0.20.0

Up to date!

Plugins:
  - bigquery: 0.20.0
  - snowflake: 0.20.0
  - redshift: 0.20.0
  - postgres: 0.20.0

The operating system you're using:
MAC

Additional context

Should probably be in its own issue, but I would like to be able to add default encryption key to the entire dataset that DBT creates. Maybe this is possible, but I didn't find anything about it :)

Cheers!

@AndreasTA-AW AndreasTA-AW added bug Something isn't working triage labels Aug 3, 2021
@jtcohen6
Copy link
Contributor

jtcohen6 commented Aug 3, 2021

Thanks for the report @AndreasTA-AW! I agree, dbt-bigquery should be smarter here about which configs can apply to which materialization types.

We have an adapter method called get_table_options:

https://github.com/dbt-labs/dbt/blob/159e79ee6ba86c852f4bf067dd0f1b6356135383/plugins/bigquery/dbt/adapters/bigquery/impl.py#L771-L805

That's called in a Jinja macro, bigquery_table_options, which templates out the appropriate string:

https://github.com/dbt-labs/dbt/blob/159e79ee6ba86c852f4bf067dd0f1b6356135383/plugins/bigquery/dbt/include/bigquery/macros/adapters.sql#L31-L40

Then that stringified version is included in the create view as statement:
https://github.com/dbt-labs/dbt/blob/159e79ee6ba86c852f4bf067dd0f1b6356135383/plugins/bigquery/dbt/include/bigquery/macros/adapters.sql#L61-L70

I can think of two possible resolutions here:

  1. Add conditional logic within the get_table_options method that checks config.get('materialization') for properties like kms_key_name that won't work on views.
  2. Create a new method, get_view_options, that serves the same purpose as get_table_options but excludes any configs that are unsupported on views. Then the bigquery__create_view_as macro would call bigquery_view_options --> adapter.get_view_options.

I'm leaning toward the second option. The first obfuscates differences, and it won't account for custom materializations. By contrast, mapping in the second (create view as --> get_view_options, create table as --> get_table_options) feels right. The only challenge will be implementing get_table_options + get_view_options in a way that doesn't duplicate a bunch of logic.

This feels like a straightforward and self-contained change, so I'm going to mark it a good first issue. Would you be interested in contributing the fix for it? :)

@jtcohen6 jtcohen6 added bigquery good_first_issue Straightforward + self-contained changes, good for new contributors! and removed triage labels Aug 3, 2021
@AndreasTA-AW
Copy link
Contributor Author

Thanks for the quick response! :)
I looked at the code before I submitted the PR and it looks like it worked before a major refactor of the code. Before, I think it only triggered for tables, but now when it’s in python, I think i triggers for both tables and views. I do agree with the second solution though, to keep them isolated.

I will check if I can put some time to submit a PR, I’m a consultant and a father and all that :P but I can try to find some time.

Also, wanna let u know that DBT is a great tool and it makes life a lot easier. Ofc it got some flaws, like everything, but it’s getting there!

cheers!

jtcohen6 pushed a commit that referenced this issue Sep 9, 2021
…fferen… (#3691)

* Changed how tables and views are generated to be able to use different options

* 3682 added unit tests

* 3682 had conflict in changelog and became a bit messy

* 3682 Tested to add default kms to dataset and accidently pushed the changes
TeddyCr pushed a commit to TeddyCr/dbt that referenced this issue Sep 9, 2021
…o use differen… (dbt-labs#3691)

* Changed how tables and views are generated to be able to use different options

* 3682 added unit tests

* 3682 had conflict in changelog and became a bit messy

* 3682 Tested to add default kms to dataset and accidently pushed the changes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bigquery bug Something isn't working good_first_issue Straightforward + self-contained changes, good for new contributors!
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants