From 66dfa1b6f5bc0946726835cabdd5860605af1a20 Mon Sep 17 00:00:00 2001
From: Seamus Abshere
Date: Tue, 21 Nov 2023 10:26:02 -0500
Subject: [PATCH 1/4] Update README.md

---
 README.md | 85 ++++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 71 insertions(+), 14 deletions(-)

diff --git a/README.md b/README.md
index 7e95368..9927369 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,71 @@
-# `joinery`: Transpile (some) of BigQuery's "Standard SQL" dialect to other databases
+# `joinery`: Safely transpile between SQL dialects
+
+It was decided to write a greenfield transpiler in Rust due to concerns about correctness of Python-based solutions.
+
+It performs type inference and preserves whitespace.
+
+If you want to run _your_ production workloads, **you will almost certainly need to contribute code.** In particular, our API coverage is limited. See [`tests/sql/`](./tests/sql/) for examples of what we support.
+
+See [ARCHITECTURE.md](./ARCHITECTURE.md) for an overview of the codebase.
+
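+| Dialect | Input | Output | Comments |
+| --- | --- | --- | --- |
+| BigQuery | 🟢 | 🟢 | |
+| Snowflake | 🔴 | 🟢 | "Not bad" |
+| Trino | 🔴 | 🟢 | Best coverage. Easy to run locally under Docker. |
+| Athena 3 (Trino) | 🔴 | 🟢 | Need to convert UDFs |
+| Athena 2 (Presto) | ? | ? | Try it? |
+| Redshift | 🔴 | 🔴 | |
+| Postgres | 🔴 | 🔴 | |
+| SQLite | 🟢 | 🟢 | |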
-**Current status:** Preparing for a quiet public release, but not yet there. This is currently a proof of concept that runs the tests in [`tests/sql/`](./tests/sql/), but which isn't _quite_ ready for anything else. See [ARCHITECTURE.md](./ARCHITECTURE.md) for an overview of the codebase. This code is less than 2 months old, and it was built quickly, so we still have some refactoring to do.

 ## What is this?

@@ -29,18 +94,6 @@ FROM array_select_data

 It even does type inference, which is needed for certain complex transformations! The transformation process makes some effort to preserve whitespace and comments, so the output SQL is still mostly readable.

-## Current status
-
-This is very much a work in progress, though it has enough features to run a large fraction of our production workload. It supports the following databases to some degree:
-
-- Trino: Best coverage. Easy to run locally under Docker.
-  - AWS Athena 3: Mostly works, but we need to port the UDFs.
-  - Presto: Try it and see?
-- Snowflake: Not bad.
-- SQLite3: Will probably be removed soon. Might be replaced with DuckDB?
-
-If you want to run _your_ production workloads, **you will almost certainly need to contribute code.** In particular, our API coverage is limited. See [`tests/sql/`](./tests/sql/) for examples of what we support.
-
 ## Design philosophy

 In an _ideal_ world, `joinery` would do one of two things:
@@ -122,3 +175,7 @@ If you're interested in running analytic SQL queries across multiple databases,
 - [`sqlglot`](https://github.com/tobymao/sqlglot). Transform between many different SQL dialects. Much better feature coverage than we have, though it may generate incorrect SQL in tricky cases. If you're planning on adjusting your translated queries by hand, or if you need to support a wide variety of dialects, this is probably a better choice than `joinery`.
 - [`dbt-core`](https://github.com/dbt-labs/dbt-core).
 - [BigQuery Emulator](https://github.com/goccy/bigquery-emulator). A local emulator for BigQuery. This supports a larger fraction of BigQuery features than we do.
+
+## Corporate support
+
+joinery is open-sourced by [Faraday](https://faraday.ai)

From 3b4b193fb46c0cc4f757de85d587d08c6a13f17b Mon Sep 17 00:00:00 2001
From: Seamus Abshere
Date: Tue, 21 Nov 2023 10:31:50 -0500
Subject: [PATCH 2/4] Update README.md

---
 README.md | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/README.md b/README.md
index 9927369..16626d6 100644
--- a/README.md
+++ b/README.md
@@ -8,6 +8,21 @@ If you want to run _your_ production workloads, **you will almost certainly need

 See [ARCHITECTURE.md](./ARCHITECTURE.md) for an overview of the codebase.

+```
+$ joinery --help
+Usage: joinery
+
+Commands:
+  parse      Parse SQL from a CSV file containing `id` and `query` columns
+  sql-test   Run SQL tests from a directory
+  transpile  Transpile BigQuery SQL to another dialect
+  help       Print this message or the help of the given subcommand(s)
+
+Options:
+  -h, --help  Print help
+```
+
+## Status

From e0dfdf49401cd1feb1f6fc1f43e8460b98356667 Mon Sep 17 00:00:00 2001
From: Seamus Abshere
Date: Tue, 21 Nov 2023 10:33:49 -0500
Subject: [PATCH 3/4] Update README.md

---
 README.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 16626d6..9091dda 100644
--- a/README.md
+++ b/README.md
@@ -2,7 +2,9 @@

 It was decided to write a greenfield transpiler in Rust due to concerns about correctness of Python-based solutions.

-It performs type inference and preserves whitespace.
+[BigQuery "Standard SQL"](https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax) was taken as the reference dialect, but it is anticipated the other input dialects will be supported.
+
+It performs type inference (necessary, for example, to expand `EXCEPT(*)` into a list of columns, because Trino doesn't support it) and preserves whitespace.

 If you want to run _your_ production workloads, **you will almost certainly need to contribute code.** In particular, our API coverage is limited. See [`tests/sql/`](./tests/sql/) for examples of what we support.

From de61359c6b2d789c28824a765382389539b87912 Mon Sep 17 00:00:00 2001
From: Seamus Abshere
Date: Tue, 21 Nov 2023 10:34:39 -0500
Subject: [PATCH 4/4] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 9091dda..4941b11 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,4 @@
-# `joinery`: Safely transpile between SQL dialects
+# `joinery`: Safe SQL transpiler, written in Rust

 It was decided to write a greenfield transpiler in Rust due to concerns about correctness of Python-based solutions.