duckdb as DataSource #14563
Comments
Hi @ravikishorer, this is cool! Do you know how mature the SQLAlchemy dialect for DuckDB is? This thread seems to imply it's still very early: duckdb/duckdb#305. I'm happy to help here.
@srinify I'm not fully aware. According to the documentation, it is based on the Postgres dialect, and in the issue you linked, someone has implemented a very basic engine that wraps the Postgres one: https://pypi.org/project/duckdb-engine/
That someone was me - if you have any questions or suggestions for how I have implemented it, feel free to tag me :)
I forked both duckdb-engine and Superset, and Superset now supports DuckDB in my fork. db_engine_specs/duckdb.py is simply copied and modified from db_engine_specs/sqlite.py. Here is a demo video: https://www.bilibili.com/video/BV1zQ4y1q7ti/
Q. I see that DuckDB has some support for Apache Arrow. Is there any integration between DuckDB and Superset that takes advantage of the Arrow format?
@alitrack Did you send in a pull request for your DuckDB solution?
Yes, I did.
@armando-fandango My PR is merged.
Took the liberty of opening a PR from @alitrack's fork, since this is awesome and it looks like the SQLAlchemy side is unblocked.
* + duckdb support needs the forked version of [duckdb-engine](https://github.com/alitrack/duckdb_engine)
* Update duckdb.py update _time_grain_expressions
* removed superfluous get_all_datasource_names def in duckdb engine spec
* added exception handling for duckdb single-threaded RuntimeError
* fixed linter blips and other stylistic cleanup in duckdb.py
* one last round of linter tweaks in test_connection.py for duckdb support

Co-authored-by: Steven Lee <admin@alitrack.com>
Co-authored-by: Richard Whaling <richardwhaling@Richards-MacBook-Pro.local>
(cherry picked from commit 202e34a)
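For readers unfamiliar with the `_time_grain_expressions` attribute named in the commit message, here is a rough, standalone sketch of the idea: a mapping from a time-grain key to a SQL template that truncates a timestamp column. This is not the actual contents of `duckdb.py`. The constant name, the helper `time_grain_sql`, the choice of ISO 8601 duration keys, and the use of `DATE_TRUNC` are all assumptions made for illustration.

```python
# Hypothetical sketch, not the real engine spec from the PR.
# Keys are ISO 8601 durations; values are SQL templates with a {col} slot.
TIME_GRAIN_EXPRESSIONS = {
    None: "{col}",  # no truncation requested
    "PT1S": "DATE_TRUNC('second', {col})",
    "PT1M": "DATE_TRUNC('minute', {col})",
    "PT1H": "DATE_TRUNC('hour', {col})",
    "P1D": "DATE_TRUNC('day', {col})",
    "P1W": "DATE_TRUNC('week', {col})",
    "P1M": "DATE_TRUNC('month', {col})",
    "P1Y": "DATE_TRUNC('year', {col})",
}


def time_grain_sql(column, grain=None):
    """Return a SQL expression that buckets `column` at the given grain."""
    try:
        template = TIME_GRAIN_EXPRESSIONS[grain]
    except KeyError:
        raise ValueError(f"unsupported time grain: {grain!r}") from None
    return template.format(col=column)


print(time_grain_sql("order_ts", "P1D"))
```

A table-driven mapping like this keeps the per-database differences in one place, which is presumably why a new engine spec can start as a modified copy of an existing one.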
Is your feature request related to a problem? Please describe.
I would like to be able to connect to DuckDB as a data source.
Describe the solution you'd like
Connecting to DuckDB should be as easy as connecting to SQLite is today.
Describe alternatives you've considered
SQLite is an alternative, but since DuckDB is more performant for analytical use cases, it would be a great addition.
Additional context
DuckDB is an in-process SQL OLAP database management system (https://duckdb.org/). It is much faster than SQLite for analytical workloads.
From https://duckdb.org/docs/api/python:
The standard DuckDB Python API provides a SQL interface compliant with the DB-API 2.0 specification described by PEP 249 similar to the SQLite Python API.
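As a rough illustration of what DB-API 2.0 compliance means in practice, the sketch below runs a small aggregation using only PEP 249 calls (`cursor()`, `execute()`, `executemany()`, `fetchall()`). It uses Python's built-in `sqlite3` module so that it runs without extra packages. The assumption is that a connection obtained from `duckdb.connect()` could be passed to the same helper unchanged. The `sales` table and the `run_demo` function are made up for this example.

```python
import sqlite3


def run_demo(conn):
    """Create a tiny table and aggregate it using only DB-API 2.0 calls."""
    cur = conn.cursor()
    cur.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    # Qmark-style placeholders, as used by sqlite3.
    cur.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("north", 10), ("south", 5), ("north", 7)],
    )
    cur.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    )
    rows = cur.fetchall()
    cur.close()
    return rows


# In-memory SQLite database; with DuckDB installed, the assumption is that
# duckdb.connect() could be substituted here.
conn = sqlite3.connect(":memory:")
rows = run_demo(conn)
conn.close()
print(rows)
```

Because the helper only depends on the shared interface, the same calling code can serve either database, which is the property the feature request relies on.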