Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Named Unions as function parameters #4501

Open
kgutwin opened this issue May 23, 2024 · 5 comments
Open

Named Unions as function parameters #4501

kgutwin opened this issue May 23, 2024 · 5 comments
Labels
language-design Changes to PRQL-the-language

Comments

@kgutwin
Copy link
Collaborator

kgutwin commented May 23, 2024

What's up?

From the discussion on #4499, we propose allowing functions to take parameters that are typed by named unions (aka enums, #4038).

The core syntax would look like this:

type InvoiceStatus = (
  Paid = 0 ||
  Unpaid = 1 ||
  Canceled = 2 ||
)

let filter_status = func status <InvoiceStatus> tbl <relation> -> (
  filter (this.status == status) tbl
)

from invoices
filter_status InvoiceStatus.Paid

If the type of the function parameter can be resolved, it should be possible to drop the explicit reference to the type name when specifying the parameter value.

from invoices
filter_status Paid

This would be handled by prepending the parameter type to the name resolution order, so InvoiceStatus would be searched first, followed then by the remaining namespace scopes.

The same would also apply for named parameters:

let filter_status = func status <InvoiceStatus>:Paid tbl <relation> -> (
  filter (this.status == status) tbl
)

from invoices
filter_status # defaults to InvoiceStatus.Paid

# or
from invoices
filter_status status:Unpaid

# or, explicitly
from invoices
filter_status status:InvoiceStatus.Unpaid

With this, the join transformation in the standard library would have the updated definition:

type join_side = (
  inner = "inner" ||
  outer = "outer" ||
  left = "left" ||
  right = "right" ||
)

let join = func
  `default_db.with` <relation>
  condition <bool>
  side <join_side>:inner
  tbl <relation>
  -> <relation> internal join

A similar update would also be made for from_text format:...


As an aside, it would also be worth considering a similar feature of named union namespace resolution for typed relations, when those are implemented. I could see something like this working very well:

from invoices (customer_id = text || amount = float || status = InvoiceStatus)
filter status == Paid
# or, from #2930
filter (match this.status {
  Paid => true,
  Unpaid { amount } => amount < 10.0,
  _ => false,
})
@vanillajonathan
Copy link
Collaborator

I think the syntax in other languages such as Rust, Python and C# is much nicer.

In Rust it would be so much cleaner:

enum InvoiceStatus {
    Paid,
    Unpaid,
    Canceled,
}

@max-sixty
Copy link
Member

I think this would be a very nice addition to the language!

--

Do we need the mapping to ints / strings?

Or could we just have the variants like type InvoiceStatus = [Paid, Unpaid, Canceled]?

(possibly the syntax would need to be slightly different — possibly Paid etc doesn't refer to anything despite being on the right-hand side, but conceptually...)

@kgutwin
Copy link
Collaborator Author

kgutwin commented May 23, 2024

I found the discussion of named unions in #2930 compelling, not to mention that the parsing of that syntax is already done (#4038). I would guess that we want to have just one syntax for named unions / enums, not that the existing one is perfect, but it might have certain advantages when parsing... I'm not sure, though, and I don't feel inspired to dive into the parser 😄

As for the necessity of mapping named union members to values, I think it's probably better to make it explicitly defined by users rather than implicit. I don't think there's a single convention that users would expect -- I can imagine that one user would expect [Paid, Unpaid, Canceled] to map to [0, 1, 2] while another would expect string mapping to ["Paid", "Unpaid", "Canceled"].

Requiring explicit definitions would also help a lot if named unions (as enums) end up getting cast to other data types. This casting is probably going to be very common, especially if named unions are used for typing relations, where the underlying database types are disconnected from the types defined in the PRQL context.

@max-sixty
Copy link
Member

Requiring explicit definitions would also help a lot if named unions (as enums) end up getting cast to other data types. This casting is probably going to be very common, especially if named unions are used for typing relations, where the underlying database types are disconnected from the types defined in the PRQL context.

Yes, if we cast them to other types then I totally agree. type InvoiceStatus is indeed going to get cast, since there's no Enum type in the underlying data. Whereas type JoinDirection is only used in PRQL — but maybe there aren't many of these...

@aljazerzen
Copy link
Member

This is a nice feature overview of exactly how I think PRQL should approach sum types!

We will need to invest some time into how this will be implemented in SQL and unfortunately I won't we able to help here for some time, as I have other priorities right now.

@kgutwin kgutwin added the language-design Changes to PRQL-the-language label Jul 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
language-design Changes to PRQL-the-language
Projects
None yet
Development

No branches or pull requests

4 participants