-
Notifications
You must be signed in to change notification settings - Fork 12.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Suggestion: existential unions and intersections #55257
Comments
I'm really not understanding how the site-specific identity is supposed to work. Looking at this function type Fn =
|| ((a: string) => void)
|| ((a: number) => void)
// is equivalent to
type Fn = (a: string && number) => void How do you get a value suitable to call an function fn(arg: () => string || () => number) {
return arg();
}
// declaration emit is...
function fn(arg: () => string || () => number): ????? <- ? |
Looking at declaration emit is indeed something that is still a TODO on my list. I think the only fully backwards compatible method would be to only allow exports of types that are fully closed over. As soon as a In general I cannot immediately think of a use case for a naked |
Compat isn't a big deal IMO; if you use newer language features then those features may appear in the corresponding .d.ts. This has always been the case. |
Addressing your question on how to call that function: type A =
|| { x: string, f: (a: string) => void }
|| { x: number, f: (a: number) => void }
declare const a: A
a.f(a.x)
// `a.f` here has the type `(a: string && number) => void`,
// but the choice here is linked to the same source as the choice of `a.x`, namely A
// So the checker iterates at top level, choosing linked options together:
// > pass `string` to `(a: string) => void`. can string be assigned to string ? yes
// > pass `number` to `(a: number) => void`. can number be assigned to number ? yes However, you can also call this function with a parameter that is valid for both options, namely How to refer to these types is by using their full reference, which for simple references could be either calculated or kept in the existential analog of the But in your specific example case it is not complex: function fn(arg: () => string || () => number) {
return arg();
}
// declaration emit is...
function fn(arg: () => string || () => number): ReturnType<typeof arg> The idea is exactly that type Obj =
| { obj: { foo: string }, key: 'foo' }
| { obj: { bar: string }, key: 'bar' }
declare const o: Obj
const x = o.obj[o.key] The type of In fact in my first attempt at implementing this I just built up a chain of lazy |
Per the code example(s) I've seen in this thread, this all looks an awful lot like #30581, fwiw. |
Interesting. This is indeed the correspondence problem or very similar at least. It looks like the PR that closed that ticket ( #47109 ) solved the correspondence problem for a specific encoding (outlined in the PR description) that explicitly stores the correspondence in a record type (essentially explicitly storing the "links between choices" that I am tracking). I guess my suggestion is attempting to find a more general solution by having the type system manage these links, while also addressing other inconsistencies and other requested features, like being able to call an overloaded function with a union. |
This sounds like it would be extremely useful for React developers (not sure about other frameworks) when trying to implement polymorphic components. I recently spent some time trying to work out how to accomplish this, and the best I could come up with was ugly and only produced a component type signature that worked 1 level deep. So any complex types (such as a parameter list in a function) would be converted to "any". If it's at all helpful, here's some relevant links:
|
Suggestion
🔍 Search Terms
overload existential union intersection set operation correspondence problem switch return problem
extends oneof
generic indexed access type dependent function
generic narrowing
Generic derived value type
conditional type inference enum enumerated narrowing branching generic parameter type guard
generic bounds
narrow generics
correlated union record types
This expression is not callable
none of those signatures are compatible with each other
✅ Viability Checklist
My suggestion meets these guidelines:
⭐ Suggestion
This issue suggests a potential solution to many different issues and cases of unsoundness in TypeScript that have remained unsolved for years (or have been addressed with a suboptimal workaround). These are some relevant issues I found but I'm quite sure this is not exhaustive:
All of these issues ultimately stem from the same problem: TypeScript currently does not have an accurate type representing an index into (or, rather, access of) a union or intersection of types. The type returned by
{ x: string } | { x: number }['x']
isstring | number
but this is not really correct. It has the semantics that any ofstring
ornumber
can be written to it, which would be an invalid assumption in this case. If you want to write to thisx
you need to use a type that is eitherstring & number
, or you need to somehow know that the type you hold is the same as the one in the union. TypeScript currently works around this by reusing the "write types" used by setters for unions' and intersections' properties to set the write type tostring & number
, but this is a suboptimal solution which results in most of the problems above.This suggestion introduces 2 new types representing these "accesses of a union/intersection" that are not only a logical consequent of the structure and behavior of unions/intersections, but also naturally display the real desired behavior and are able to track which types are really inside those unions, allowing them to fan out over other unions/intersections like overloads and fixing the
oneof
problems as well.I have been working on a PoC fork that implements the concepts introduced in this issue for a few months, and it is still a work in progress. However, the semantics have by now cleared up significantly (with a huge shoutout and thanks to the amazing @mkantor whose feedback and input have been invaluable!) to the point where I believe it is useful to create this suggestion issue to get some feedback from the maintainers and the TS community in general.
My apologies for the length of this text. The contents of this description are the result of a few months of R&D work and as the ideas are related to some of the core mechanics of TypeScript it is difficult to condense it all into a very short description.
📃 Motivating Example
Currently unions and intersections form a single unit in the type tree. The core idea of this suggestion is that these can be pulled apart into 2 interconnected elements: a varying type and an iteration point. I.e. the following union:
Can also be written as such:
Where
allof
is a type operator that forms the aforementioned "iteration point", the location in the type tree where the possible variants are "collected", a form of scope for the||
under it, andstring || number
describes the various types that the union can assume. There is an equivalent operator&&
for intersections.For a simple union like above this seems useless, however the power of this idea shows when the varying type is more deeply nested. In the rest of this issue's text, assume that
equals
is an imaginary constraint keyword that tests for type equality. The following equalities would hold:You can see that in that last example inside the function scope the parameter
a
has as its type a nakedstring || number
. It is this naked form of||
and&&
that solves the issues and unsoundnesses mentioned in the introduction. I refer to it as an "existential union/intersection", and to standard unions/intersections as "universal union/intersection". Whether this is entirely correct terminology I am not sure as I am not a type theoretic, but I will refer to them like this regularly in the rest of this issue.My proposal is to implement the following type operators:
||
: Existential union (MVP)&&
: Existential intersection (MVP)oneof
: Selection point / Conversion from universal set operation to existential set operation (maybe in MVP)allof
: Iteration point / Conversion from existential set operation to universal set operation (not in MVP and optional, other approaches may be better, see below)The meaning of a standalone
||
or&&
The motivating problem that started my research is the following common issue encountered by users, which has been dubbed the "correspondence problem" by users in the TS community discord (not to be confused with the undecidable "post correspondence problem"):
Users expect to be able to do this index operation, but it is not allowed. What happens is that TS sees
o.obj
as having type{ foo: string } | { bar: string }
ando.key
as having type'foo' | 'bar'
. When the index occurs TypeScript does not realize that the types of both properties correspond to the same option in the unionObj
, and tries to assigno.key
to its contravariant complement: an intersection of the union's constituents. This fails.The information we lost here is the location of the iteration point of
Obj
. TypeScript should never have takeno.key
to be of type'foo' | 'bar'
. Instead its correct type is the type'foo' || 'bar'
.So what does the type
A || B
mean as a standalone type? It is an existential type that corresponds to "eitherA
orB
", but you don't know which it is. It is a type that holds a choice that was made betweenA
andB
, but where you have forgotten which of the two was chosen. This is different from the unionA | B
. WhileA | B
can be assigned with eitherA
orB
,A || B
can either only be assigned anA
, or only aB
, it's just that you don't know which of the 2 options is correct. It signifies a choice that was made between the 2 somewhere in a higher scope (at the level of the correspondingallof
, or the global scope if there is none), and in your current scope you don't know which of the 2 options was chosen so you have to consider both of them. It is almost like a type of multiple types, with the semantics that it is really only one of them. In that way, it has 2 different perspectives of looking at it:allof
(or the global scope if there is none): A set of types{A,B}
that is iterated over with union semantics.allof
: An unknown type that is eitherA
orB
It is important to understand that to local code this really is a unary type. It is either
A
or it isB
, the fact that there are multiple types that it could be does not change the fact that it is only one of them. This also means thatA | B
cannot be assigned toA || B
.It is actually similar to a type variable, with the difference that a type variable is by definition unbounded on the subtype side, whereas these types are fully bounded.
A || B
is eitherA
orB
, not any of its subtypes. Whereas in type variables you make a choice from an infinite (constrained) set, in these types you make a choice from a finite set, and the choice between||
or&&
adds some extra semantics to the way they can be handled.There is an equivalent type for intersections:
&&
.A && B
is a type that is bothA
andB
, but in the local scope we are only looking at one of them. It is like a type representing a set of multiple types with intersection semantics. This one is harder to wrap your head around at first but it solves multiple soundness issues.Contravariant positions swap between
&&
and||
just like with regular unions and intersections.Converting between universal and existential at the type level
Existential -> Universal:
allof
allof
is a type operator that serves to create an iteration point at which descendant existential types are iterated, e.g.:Existential type declarations are lexically scoped to their surrounding
allof
operator:To retain soundness and referential transparency, it is important that inner allofs do not iterate over existentials that were declared outside their scope. Scoping is lexical:
This
allof
type operator is a useful construct to talk about theoneof
operator and the behavior of the existential set operations in a symbolic way because it is a direct representation of the "iteration point". It could be implemented as described here, but it is not strictly required to implement the existential unions/intersections. It is simply an alternative way to generate distributive unions/intersections, and maybe not the most intuitive.Universal -> Existential:
oneof
oneof
"picks" one of the types of a universal set operation, it creates a new instance of the existential variant of the input union or intersection:A
andB
here are 2 distinct, incompatibleA || B
types.oneof
is sort of the inverse operation ofallof
, but not entirely. You an think of it as lopping off a top levelallof
:But with the caveat that
A
here is a new instance ofA || B
, it is not somehow reaching in to the innerA || B
itself.Typechecking existential unions and intersections
Existential type identity
An important property of these existential types is that each of them has its own identity, and they cannot be assigned to each other even if they have the same structure, just like type parameters with the same constraint can't be assigned to each other. If you have:
The assignment will not work because there is no way to be sure that the choice made in
typeof a
is the same as the choice made intypeof b
. However, if both variables have exactly the same type the assignment is allowed, as now we can be sure that the same choice was made:"infectiousness" and tracking of choices
Another interesting property of these types is that, because they have lost their iteration point, they show "infectious" behavior. Types that include them in their type tree become themselves existential unions/intersections:
With one caveat: their choices are linked together. The above only works if in the latter type
{ a: A }
is chosen when in the former oneA
is chosen, and analogously forB
.This shows a core difference between
|
and||
: while({ a: A } | { a: B })['a']
should not returnA | B
, with the existential type we do have this equivalence:({ a: A } || { a: B })['a']
isA || B
, but with their choices linked together. This can be trivially shown:{ a: A } || { a: B }
is equivalent to{ a: A || B }
, and{ a: A || B }['a']
trivially has the typeA || B
. Once we pass through theallof
barrier we are in a context where all these types are equivalent, as long as their choices are linked.This infectious behavior is the reason why
allof
works at any depth, like a form of scope. You can always rewrite the type below it in a way that puts the existentials at the top level.It also has as a result that during things like typechecking, overload resolution etc. these types need to be iterated over on the top level (in the order of their corresponding
allof
scope nesting). This is especially important because if the same existential type occurs in multiple places it must pick the same type for every location. This is different from universally quantified unions and intersections, which in the type checker are iterated over when their location in the type tree is reached (because they have an iteration point), and thus don't have any interdependence between them.Finding correspondence between types through this choice tracking
If we go back to the problem above, we can see that this inherent tracking of choices solves our problem. If we use
||
instead of|
:Since the type of
o.obj
is{ foo: string } || { bar: string }
with its choices linked to the choice made inObj
and the type ofo.key
is'foo' || 'bar'
with its choices linked to the same choice, it can be worked out that this operation will always succeed! (I refer to this sometimes as "proving correspondence".)Contravariant positions
Existential set operations in contravariant positions swap to their contravariant complement just like normal unions and intersections:
Generalized subtyping behavior on naked existentials and how it naturally leads to mutation soundness
Comparing the subtyping behavior of universal and existential set operations, we find some interesting properties. First, as a reminder:
Now, for the existential variants, the thing here is that when we have a naked existential, we don't know which type was actually chosen. This limits what we can do with it if the choices between the 2 types being checked are not interlinked:
Note how this naturally produces the mutation behavior that we are currently using write types for, but in a more sound way and with more functionality: if the type of
({ a: A } | { a: B })['a']
isA || B
, then there are only 2 options for us to assign to this type:A & B
.Again this naturally produces the desired mutation behavior: if the type of
({ a: A } & { a: B })['a']
isA && B
, then there are only 2 options for us to assign to this type:A & B
.The difference between
&
and&&
is more subtle but note that&&
if we can prove correspondence, which you can't do with universal intersections.&&
'd type create existential unions, which resolve many unsound behaviors in the current implementation.Some even more high level subtyping laws that hold:
Whether
A && B
can really be assigned toA
is not 100% sure yet and has to be sanity checked, but is probably true.Automatic
allof
scoping for functionsOne could wonder what a function definition with existentials in its signature would mean:
Such a function signature is rather useless. It can only be called with either
typeof x
orA & B
, but there is no way to create a value of typetypeof x
, so the signature could as well just beA & B
.However, existentials in the function signature are actually really useful. In fact they solve one of the original problem statements. They can be used to define overloaded functions that require the same choice in multiple places in the signature:
Because I'd prefer avoiding value level syntax though and because raw existential type parameters are useless anyway, all function types are automatically wrapped in an
allof
scope:Interactions in value context
Existential types interact with their universal counterparts in various ways whenever an
allof
border is crossed, which causes a kind of fan-out and fan-in behavior alongside linked choice matching. I have currently identified 4 critical junction points:Passing an existential union/intersection to a parameter
Since the existential unions/intersections represent a unary type, they can be passed to overloaded functions if their existential semantics are met:
Iteration occurs at the top level, even before overload selection. Since both potential instantiations have an acceptable overload available, the call succeeds. Existential intersections would work the same, except they only require one of their instantiations to have a valid overload.
If the function returns different values per overload, these values are collected again into a linked existential (it may be linked to multiple other existentials if there were multiple existential inputs) with the same semantics as the input parameters:
Existential intersections can (probably) be passed to parameters of a union of functions, as they only require one of their options to match (to be sanity checked):
Accesses
When looking at the examples above with the indexes on unions it may seem like an index operation generates a new existential instance. However this is actually not the case. It is the accessing of the union that generates the instance:
Think of it like this: if you have a value that was in
a
, then the actual real type of that valuea
has to be either{ foo: A }
or{ foo: B }
, it can't be both. Of course, if the variable was reassigned then the "real" type might change, because that variable itself is type|
, so without any additional measures every access would have to generate a new existential instance. However, there is a way we can track the existential instance in the compiler: flow types.Currently I am using flow types to set the initial flow type of any identifier with type
A | B
to an initial existential instanceA || B
. So the real type of any non-assignment reference to this identifier is nowA || B
. The control flow analysis tracks these types and since an assignment overrides the initial type with a new type, we can actually track the history of these types (within function boundaries of course):As mentioned in the comments, keeping the flow type as the inferred type is fine for
const
variables, since||
is a subtype of|
and the only difference is in assignments, but forlet
declarations with an inferred existential type I widen it back to the universal version, to prevent confusion and to retain backwards compatibility in e.g. this simple case:Tracking these flow types bridges
|
and||
, e.g. the following now works:The following also works:
Calling an existential union/intersection
This is rather simple to work out:
Unions:
Indexing with an existential union/intersection
This is a complex operation that needs to group the choices for the object type and the choices of the index type and run the check for each of these groups, then output a linked result type. It is essentially the same operation as with the function calls above.
💻 Use Cases
This suggestions fixes many issues that are all variants of the same underlying problem:
The correspondence problem
With this suggestion, correspondence can be proven between o.obj and o.key and this operation no longer fails. Other manifestations of the same issues that are more common:
Switch return problem
This problem stems from the inability to define a function with an overloaded signature without specifying an implementation signature. Often people try to do this:
This fails because S is unbounded on the subtype side, you can't assign a concrete type to it. The problem here is that developers try to use a polymorphic variable because they want to select which type to return:
With this suggestion this is possible though still rather clunky:
If there were a hypothetical
equals
constraint this clunkiness goes away:However this is a different feature entirely.
Issues with mutations
Intersections of arrays
The following is unsound but currently allowed:
What happens here is the different
push
methods get combined into an overloaded function. With our suggestion this doesn't happen:So the
push
function must be passed a{ a: number } || { b: number }
, which is only possible to be satisfied with a type with fitting linked choices or with{ a: number, b: number }
.Failures when through type-level operations the read type is used as a write type:
This occurs because
ColorTypes[COLOR]['result']
returns the read typestring | boolean
instead of the write typestring & boolean
. With this suggestion the type would instead bestring || boolean
, which can both track the type choices and accept the more correctstring & boolean
.A solution to #44373 without concessions
This problem is now worked around in the beta version of the compiler by silently coercing to the supertype
(Fizz | Buzz)[]
.This can cause unexpected errors:
With our suggestion this coercion is not needed:
Another way to derive this:
Note that we are also kind of coercing, because an access creates an existential instance, but to the subtype side, so the resulting type can still be used as usual.
Typing constants like environment variables
One could, as an example, type something like this:
Now
createDb
can only be called with a database name that matches the environment passed in throughNODE_ENV
, whereas the type${'dev' | 'test' | 'prod'}-${string}
would allow it to be called with any valid environment.The text was updated successfully, but these errors were encountered: