[CT-1375] [Bug] on snowflake "quoting" is not respected in python models #6103

pquadri · 2022-10-19T15:49:35Z

Is this a new bug in dbt-core?

I believe this is a new bug in dbt-core
I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

Using a quoting: true parameter results in the quoting not being respected.

def model(dbt, session):
    print(type(dbt))
    dbt.config(materialized="incremental", unique_key="id")

    leads_created = dbt.source("products_eventstore", "events").filter(
        "\"type\" = 'LeadCreated'"
    )

    leads_df = leads_created["aggregateId"]

    return leads_df

yields this query being executed:

SELECT  *  FROM ( SELECT  *  FROM (PRODUCTS.EVENTSTORE.events)) WHERE "type" = 'LeadCreated'

when the source specifies quoting the identifier.

Expected Behavior

The identifier is correctly quoted

Steps To Reproduce

As above.

Relevant log output

No response

Environment

- OS: OSX 12.5.1
- Python: 3.10.6
- dbt: 1.3.0

Which database adapter are you using with dbt?

snowflake

Additional Context

I think the issue might be the fact that identifier = false is passed explicitly here: https://github.com/dbt-labs/dbt-core/blob/main/core/dbt/include/global_project/macros/python_model/python.sql, removing those defaults solved the problem on my machine.

The text was updated successfully, but these errors were encountered:

pquadri · 2022-11-03T15:17:42Z

can i take this issue if nobody has time to fix it? :D

jtcohen6 · 2022-11-17T14:15:41Z

Hey @pquadri - sorry for the delay getting back to you!

cc @ChenyuLInx: Can you remember the reason why you opted for .quote(False) in the initial implementation?

First question: How to address objects with quoted components in Snowpark Python, given that object names already need to be passed into session.table() as quoted strings? Following your example:

create or replace table analytics.jerco_src_schema."$$VerySpecialTable" as (
    select 1 as aggregateId, 'LeadCreated' as "type"
);

version: 2

sources:
  - name: jerco_src_schema
    tables:
      - name: very_special_table
        identifier: "Very%Special&Table"
    quoting:
      database: false
      schema: false
      identifier: true

Does this work in Snowpark Python if we escape the " as \"?

# https://docs.snowflake.com/en/sql-reference/stored-procedures-python.html

import snowflake.snowpark as snowpark

def main(session: snowpark.Session):
    df = session.table("analytics.jerco_src_schema.\"Very%Special&Table\"")
    df.filter(
        "\"type\" = 'LeadCreated'"
    )
    return df.collect()

I tried it out in Snowsight, and it does!

Second question: Does this just work if we remove the .quote(False, False, False) modifications from ref and from source?

Here's a dbt model, similar to yours:

def model(dbt, session):
    # switched from 'incremental' to 'table' for simplicity
    dbt.config(materialized="table")

    leads_created = dbt.source("jerco_src_schema", "very_special_table").filter(
        "\"type\" = 'LeadCreated'"
    )

    # switched this to .select() so as to return a DataFrame, rather than a single Column
    leads_df = leads_created.select("aggregateId")

    return leads_df

When I do dbt run, after making those changes to the core macros, I see it succeed! This is what's included in the compiled model code:

def source(*args, dbt_load_df_function):
    sources = {"jerco_src_schema.very_special_table": "analytics.jerco_src_schema.\"Very%Special\u0026Table\""}
    key = ".".join(args)
    return dbt_load_df_function(sources[key])

Third question: Given that the change here would be to dbt-core, this will affect other adapters as well. Does this same approach to escape-quoting work in other databases? Namely Spark/Databricks + BigQuery/Dataproc, where quoting is on by default, and quotes are backticks instead of escaped quotes?

For example, this works in SparkSQL:

select * from `dbt_jcohen`.`orders`

Does this work in PySpark?

df = spark.table("`dbt_jcohen`.`orders`")
df.describe()

It looks like it does!

I'm going to verify that a simple Python model runs successfully on both data platforms, with the quoting enabled for source + ref. If it looks good, then I think the straightforward change here is indeed the right one!

* Respect quoting for 'this' in dbt-py models #6619 * Respect quoting for ref/source in dbt-py models #6103 * Add changelog entries

pquadri added bug Something isn't working triage labels Oct 19, 2022

github-actions bot changed the title ~~[Bug] on snowflake "quoting" is not respected in python models~~ [CT-1375] [Bug] on snowflake "quoting" is not respected in python models Oct 19, 2022

jtcohen6 added python_models Team:Adapters Issues designated for the adapter area of the code labels Oct 20, 2022

lostmygithubaccount self-assigned this Oct 23, 2022

jtcohen6 assigned jtcohen6 and unassigned lostmygithubaccount Nov 14, 2022

jtcohen6 added Team:Execution and removed triage Team:Adapters Issues designated for the adapter area of the code labels Nov 17, 2022

jtcohen6 removed their assignment Nov 18, 2022

jtcohen6 mentioned this issue Jan 16, 2023

[CT-1824] [Bug] Python models do not respect quoting for dbt.this() #6619

Closed

2 tasks

jtcohen6 added a commit that referenced this issue Jan 16, 2023

Respect quoting for ref/source in dbt-py models #6103

e48ba3c

jtcohen6 mentioned this issue Jan 16, 2023

Respect quoting config in dbt-py models #6620

Merged

6 tasks

jtcohen6 closed this as completed in #6620 Jan 19, 2023

jtcohen6 added a commit that referenced this issue Jan 19, 2023

Respect quoting config in dbt-py models (#6620)

fa7c4d1

* Respect quoting for 'this' in dbt-py models #6619 * Respect quoting for ref/source in dbt-py models #6103 * Add changelog entries

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[CT-1375] [Bug] on snowflake "quoting" is not respected in python models #6103

[CT-1375] [Bug] on snowflake "quoting" is not respected in python models #6103

pquadri commented Oct 19, 2022 •

edited

Loading

pquadri commented Nov 3, 2022

jtcohen6 commented Nov 17, 2022 •

edited

Loading

[CT-1375] [Bug] on snowflake "quoting" is not respected in python models #6103

[CT-1375] [Bug] on snowflake "quoting" is not respected in python models #6103

Comments

pquadri commented Oct 19, 2022 • edited Loading

Is this a new bug in dbt-core?

Current Behavior

Expected Behavior

Steps To Reproduce

Relevant log output

Environment

Which database adapter are you using with dbt?

Additional Context

pquadri commented Nov 3, 2022

jtcohen6 commented Nov 17, 2022 • edited Loading

pquadri commented Oct 19, 2022 •

edited

Loading

jtcohen6 commented Nov 17, 2022 •

edited

Loading