Handle schema pattern on BQ #275

clrcrl · 2020-09-04T17:56:21Z

This is a:

bug fix PR with no breaking changes (please change the base branch to master)
new functionality
a breaking change

Description & motivation

We weren't handling the schema_pattern arg correctly. And it turns out that on BQ it's kind of annoying to handle it the right way. I think I got it though.

Checklist

I have verified that these changes work locally on the following warehouses (Note: it's okay if you do not have access to all warehouses, this helps us understand what has been covered)
- BigQuery
- Postgres
- Redshift
- Snowflake
n/a I have updated the README.md (if applicable)
n/a I have added tests & descriptions to my models (and macros if applicable)
I have added an entry to the changelog

Local testing

I added this:
Raw: analysis/foo.sql

{% set tables = dbt_utils.get_relations_by_pattern(
    schema_pattern = '%',
    table_pattern = '%'
) %}

{% for table in tables %}
    select * from {{ table }}
{% endfor %}

Compiled:

    select * from `unique-serenity-254521`.`dbt_claire`.`data_star_expected`

    select * from `unique-serenity-254521`.`dbt_claire`.`data_test_relationships_where_table_1`

    select * from `unique-serenity-254521`.`dbt_claire`.`data_unpivot`
...

LGTM :)

To discuss:

wildcard logic — see comment below
whitespace cleanup — I didn't do any whitespace cleaning up. I think it's okay because these macros use return though

clrcrl · 2020-09-04T17:59:04Z

macros/sql/get_tables_by_pattern_sql.sql

-        where table_schema = '{{schema_pattern}}'
-            and lower(table_name) like lower ('{{table_pattern}}')
-            and lower(table_name) not like lower ('{{exclude}}')
+    {% if '%' in schema_pattern %}


Technically, _ is a wildcard pattern (docs)

This means that we actually introduced a slight regression by switching to using like operators instead of = operators

Not sure I follow how this is a regression?

Question is, should we amend that line to be

{% if '%' in schema_pattern or '_' in schema_pattern %}

knowing we're going to get a lot more false positives...

Not sure I follow how this is a regression?

Let's say we have schemas named a_c and abc.

I think previously for get_relations_by_prefix we used a strict = for the schema name.

where schema_name = 'a_c'

This would return only a_c.

Now, we pass it to a like operator (i.e. here and here)

where schema_name like 'a_c'

This would return both a_c and abc.

It's a teeny tiny regression for anyone that uses get_relations_by_prefix

Ah, I see your point. To match only a_c, the user should pass the pattern as a\\_c. I still think it's okay to ship this in a patch release, maybe link to the BQ doc in the readme.

I think let's just ship this as-is, and if someone notices the regression we can fix it heh

jtcohen6

This is really cool! Updated the integration test to (a) include an input table from a different schema, and (b) actually check for equality with an expected result. Smooth sailing on all databases, including BQ.

Just left one comment/question on your comment/question

jtcohen6 · 2020-09-08T13:03:33Z

macros/sql/get_tables_by_pattern_sql.sql

-        where table_schema = '{{schema_pattern}}'
-            and lower(table_name) like lower ('{{table_pattern}}')
-            and lower(table_name) not like lower ('{{exclude}}')
+    {% if '%' in schema_pattern %}


Not sure I follow how this is a regression?

Question is, should we amend that line to be

{% if '%' in schema_pattern or '_' in schema_pattern %}

knowing we're going to get a lot more false positives...

clausherther · 2020-09-09T01:51:43Z

Just ran into this :)

For context, in our Blue/Green deployment pipeline, the schema changes from Blue to Green, so we'd like to use % for the schema pattern.

clausherther · 2020-09-09T01:59:36Z

FYI, we are currently using {% set relations = dbt_utils.get_relations_by_prefix('xa', '') %} to find all tables in the xa schema, but this returns 0 rows now if the model you're running this in ({{adapter.quote(database)}}.{{schema}}) is in a different schema than schema_pattern. I think this is a bug?

select distinct
            table_schema, table_name

        from {{adapter.quote(database)}}.{{schema}}.INFORMATION_SCHEMA.TABLES
        where table_schema = '{{schema_pattern}}'
            and lower(table_name) like lower ('{{table_pattern}}')
            and lower(table_name) not like lower ('{{exclude}}')

Handle schema pattern on BQ

c02a04e

clrcrl commented Sep 4, 2020

View reviewed changes

clrcrl requested a review from jtcohen6 September 4, 2020 18:47

clrcrl mentioned this pull request Sep 4, 2020

Bump dbt version dbt-labs/dbt-codegen#31

Merged

1 task

Update test with schema pattern

b165de1

jtcohen6 approved these changes Sep 8, 2020

View reviewed changes

Update Changelog

9520785

clausherther mentioned this pull request Sep 9, 2020

get_relations_by_prefix only works in target schema #277

Closed

clrcrl merged commit c838953 into master Sep 14, 2020

clrcrl deleted the fix/bq-schema-pattern branch September 14, 2020 14:26

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Handle schema pattern on BQ #275

Handle schema pattern on BQ #275

clrcrl commented Sep 4, 2020 •

edited

Loading

clrcrl Sep 4, 2020

clrcrl Sep 4, 2020

jtcohen6 Sep 8, 2020

clrcrl Sep 8, 2020 •

edited

Loading

jtcohen6 Sep 8, 2020

clrcrl Sep 8, 2020 •

edited

Loading

jtcohen6 left a comment

jtcohen6 Sep 8, 2020

clausherther commented Sep 9, 2020 •

edited

Loading

clausherther commented Sep 9, 2020

Handle schema pattern on BQ #275

Handle schema pattern on BQ #275

Conversation

clrcrl commented Sep 4, 2020 • edited Loading

Description & motivation

Checklist

Local testing

To discuss:

clrcrl Sep 4, 2020

Choose a reason for hiding this comment

clrcrl Sep 4, 2020

Choose a reason for hiding this comment

jtcohen6 Sep 8, 2020

Choose a reason for hiding this comment

clrcrl Sep 8, 2020 • edited Loading

Choose a reason for hiding this comment

jtcohen6 Sep 8, 2020

Choose a reason for hiding this comment

clrcrl Sep 8, 2020 • edited Loading

Choose a reason for hiding this comment

jtcohen6 left a comment

Choose a reason for hiding this comment

jtcohen6 Sep 8, 2020

Choose a reason for hiding this comment

clausherther commented Sep 9, 2020 • edited Loading

clausherther commented Sep 9, 2020

clrcrl commented Sep 4, 2020 •

edited

Loading

clrcrl Sep 8, 2020 •

edited

Loading

clrcrl Sep 8, 2020 •

edited

Loading

clausherther commented Sep 9, 2020 •

edited

Loading