Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug: discovery unnecessarily goes through all schemas and tables on discovery #35

Open
nidhi-akkio opened this issue Nov 3, 2023 · 1 comment

Comments

@nidhi-akkio
Copy link
Contributor

We are running into an issue where tap-snowflake is taking about 90 seconds to execute meltano invoke custom-snowflake tap —discover. We are providing a select config, tables, and schema in the config so it should be a relatively quick execution.

A 90 sec execution is far too long. In the tap, it states that when we provide tables in the configuration, it should limit discovery to the schema and tables specified. However, based on logs, the tap is introspecting the entire database. The tap is going through every table & every view in every schema the user has access to. This seems like a tap-snowflake bug and we want to address it asap so our customers can have a better experience.

FYI, it seems like there has been a 6-month outstanding issue that would address this: #23

the configurations for your reference.

- config:
   account: <redacted>
   database: CUSTOMER_DB
   password: <redacted>
   schema: customer_schema
   user: <redacted>
   warehouse: CUSTOMER_WH
   tables:
    - customer_schema.table_name
  inherit_from: tap-snowflake
  metadata: {}
  name: custom-snowflake-tap
  schema: {}
  select:
  - customer_schema-table_name.*
  select_filter: []

Snippet of logs during discovery:

2023-11-02 18:33:56,016 | INFO   | snowflake.connector.cursor | query: [SHOW /* sqlalchemy:get_view_names */ VIEWS IN test_google_analytics]
2023-11-02 18:33:56,085 | INFO   | snowflake.connector.cursor | query execution done
2023-11-02 18:33:56,085 | INFO   | snowflake.connector.cursor | Number of results in first chunk: 0
2023-11-02 18:33:56,086 | INFO   | snowflake.connector.cursor | query: [ROLLBACK]
2023-11-02 18:33:56,136 | INFO   | snowflake.connector.cursor | query execution done
2023-11-02 18:33:56,137 | INFO   | snowflake.connector.cursor | Number of results in first chunk: 1
2023-11-02 18:33:56,138 | INFO   | snowflake.connector.cursor | query: [SHOW /* sqlalchemy:get_table_names */ TABLES IN test_insights_proudmoments]
2023-11-02 18:33:56,252 | INFO   | snowflake.connector.cursor | query execution done
2023-11-02 18:33:56,253 | INFO   | snowflake.connector.cursor | Number of results in first chunk: 0
2023-11-02 18:33:56,253 | INFO   | snowflake.connector.cursor | query: [ROLLBACK]
2023-11-02 18:33:56,307 | INFO   | snowflake.connector.cursor | query execution done
2023-11-02 18:33:56,307 | INFO   | snowflake.connector.cursor | Number of results in first chunk: 1
2023-11-02 18:33:56,308 | INFO   | snowflake.connector.cursor | query: [SHOW /* sqlalchemy:get_view_names */ VIEWS IN test_insights_proudmoments]
2023-11-02 18:33:56,466 | INFO   | snowflake.connector.cursor | query execution done
2023-11-02 18:33:56,467 | INFO   | snowflake.connector.cursor | Number of results in first chunk: 6
2023-11-02 18:33:56,467 | INFO   | snowflake.connector.cursor | query: [ROLLBACK]
2023-11-02 18:33:56,517 | INFO   | snowflake.connector.cursor | query execution done
2023-11-02 18:33:56,517 | INFO   | snowflake.connector.cursor | Number of results in first chunk: 1
2023-11-02 18:33:56,518 | INFO   | snowflake.connector.cursor | query: [SHOW /* sqlalchemy:get_table_names */ TABLES IN test_keyword_kabbage]
2023-11-02 18:33:56,615 | INFO   | snowflake.connector.cursor | query execution done
2023-11-02 18:33:56,615 | INFO   | snowflake.connector.cursor | Number of results in first chunk: 2
2023-11-02 18:33:56,616 | INFO   | snowflake.connector.cursor | query: [ROLLBACK]
2023-11-02 18:33:56,686 | INFO   | snowflake.connector.cursor | query execution done
2023-11-02 18:33:56,687 | INFO   | snowflake.connector.cursor | Number of results in first chunk: 1
2023-11-02 18:33:56,687 | INFO   | snowflake.connector.cursor | query: [SHOW /* sqlalchemy:get_view_names */ VIEWS IN test_keyword_kabbage]
2023-11-02 18:33:56,765 | INFO   | snowflake.connector.cursor | query execution done
2023-11-02 18:33:56,765 | INFO   | snowflake.connector.cursor | Number of results in first chunk: 0
2023-11-02 18:33:56,766 | INFO   | snowflake.connector.cursor | query: [ROLLBACK]
2023-11-02 18:33:56,834 | INFO   | snowflake.connector.cursor | query execution done
2023-11-02 18:33:56,834 | INFO   | snowflake.connector.cursor | Number of results in first chunk: 1
2023-11-02 18:33:56,835 | INFO   | snowflake.connector.cursor | query: [SHOW /* sqlalchemy:get_table_names */ TABLES IN test_mixpanel]
2023-11-02 18:33:56,935 | INFO   | snowflake.connector.cursor | query execution done
2023-11-02 18:33:56,936 | INFO   | snowflake.connector.cursor | Number of results in first chunk: 3
2023-11-02 18:33:56,936 | INFO   | snowflake.connector.cursor | query: [ROLLBACK]
2023-11-02 18:33:57,016 | INFO   | snowflake.connector.cursor | query execution done
2023-11-02 18:33:57,016 | INFO   | snowflake.connector.cursor | Number of results in first chunk: 1
2023-11-02 18:33:57,017 | INFO   | snowflake.connector.cursor | query: [SHOW /* sqlalchemy:get_view_names */ VIEWS IN test_mixpanel]
@SlavaLenskyy
Copy link

I have the same issue with the exception that discovery phase never finishes for me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: No status
Development

No branches or pull requests

2 participants