Use duckdb functions without having to specify the full type signature #531
@@ -9015,6 +9128,45 @@ get_rule_expr(Node *node, deparse_context *context,
appendStringInfoString(buf, " := ");
get_rule_expr(refassgnexpr, context, showimplicit);
}
else if (IsA(sbsref->refexpr, Var) && pgduckdb_var_is_duckdb_row((Var*) sbsref->refexpr)) {
Needs a comment about why this is needed.
@@ -11082,6 +11234,10 @@ get_const_expr(Const *constval, deparse_context *context, int showtype)
char *extval;
bool needlabel = false;

if (pgduckdb_is_unresolved_type(constval->consttype)) {
Needs a comment.
@@ -12121,8 +12277,11 @@ get_from_clause_item(Node *jtnode, Query *query, deparse_context *context)
/* Print the relation alias, if needed */
get_rte_alias(rte, varno, false, context);

if (pgduckdb_func_returns_duckdb_row(rtfunc1)) {
/* Don't print column aliases for functions that return a duckdb.row */
Why don't we do that?
This reverts commit a799f7e.
Currently, when you use `read_parquet`, `read_csv`, etc., you have to specify the full type signature of the function call. This is quite cumbersome, especially when using `SELECT * FROM read_parquet(...) AS (...)`: the main advantage of `SELECT *` is that you don't have to list all the columns.

This PR changes that by allowing a new syntax in which the column definition list is omitted and columns are referenced by subscript, as in `r['age'] > 21`.
This syntax is made possible by using two clever hacks to solve two different problems:
The first problem is that the Postgres parser needs to consider the query valid, which means it has to resolve all the types. Sadly, an extension cannot hook into type resolution. Instead, the types need to be resolved fully from information in the Postgres catalogs. To do so we create a new type for our extension: `duckdb.unresolved_type`. This type only exists for the Postgres parser and should never be used explicitly. We add catalog entries that allow this type to be cast explicitly to any type that's supported by pg_duckdb. Always using explicit casts is quite annoying though, so we also define various operators and functions for this type, so that things like `r['age'] > 21` are allowed.

The second problem is that the Postgres parser replaces the `*` in `SELECT *` with an expanded column list. So by the time we receive a `SELECT *` query on `read_parquet`, the `*` has been replaced with `r`. To resolve this problem we let `read_parquet` return a `duckdb.row`, which we then replace with a `*` again in our `pgduckdb_get_querydef` function. This might sound pretty simple, but some tricks are needed for subqueries that return both a `duckdb.row` type and some other columns.

TODO:
`src/pgduckdb_ruleutils.cpp`, which is shared across versions. As far as I know there are no technical problems with implementing this feature for all PG versions that we support. It has not been done yet because I'd like to hold off on updating all the PG-version-specific ruleutils files until after the first round of review.
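
To make the second hack from the description more concrete, here is a minimal standalone sketch of the idea of turning a `duckdb.row` column back into `*` while rebuilding the SELECT list. It is not the code in this PR: `TargetEntry` and `deparse_target_list` are hypothetical names invented for illustration, and the real logic lives in the ruleutils deparser and works on Postgres parse nodes rather than strings.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for one entry of a SELECT list after the
// Postgres parser has expanded `*` into concrete columns.
struct TargetEntry {
    std::string expr;      // expression text as it would normally be deparsed, e.g. "r"
    std::string type_name; // type the parser resolved for it, e.g. "duckdb.row"
};

// Hypothetical helper: rebuild the SELECT list that is sent to DuckDB.
// Entries typed duckdb.row are emitted as `*` so DuckDB expands them itself;
// every other entry is emitted unchanged.
std::string deparse_target_list(const std::vector<TargetEntry> &targets) {
    std::string out;
    bool first = true;
    for (const TargetEntry &target : targets) {
        if (!first) {
            out += ", ";
        }
        first = false;
        if (target.type_name == "duckdb.row") {
            out += "*"; // undo the parser's `*` -> `r` expansion
        } else {
            out += target.expr;
        }
    }
    return out;
}
```

For a target list holding only `r` of type `duckdb.row`, this helper returns `*`. A mixed list such as `r, now()` comes back as `*, now()`, which is a simplified version of the subquery case mentioned in the description that needs extra care in the real implementation.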