Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use GADT constraints in maximiseType #15544

Merged
merged 7 commits into from
Jul 12, 2022
Merged

Conversation

dwijnand
Copy link
Member

@dwijnand dwijnand commented Jun 28, 2022

No description provided.

@dwijnand dwijnand linked an issue Jun 28, 2022 that may be closed by this pull request
@dwijnand dwijnand force-pushed the gadt/unsound-cast branch from f7bb6e1 to 8811725 Compare June 28, 2022 21:34
@dwijnand dwijnand marked this pull request as ready for review June 29, 2022 16:54
@dwijnand dwijnand requested a review from Linyxus June 29, 2022 16:55
Copy link
Contributor

@Linyxus Linyxus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise LGTM. 👍

compiler/src/dotty/tools/dotc/core/GadtConstraint.scala Outdated Show resolved Hide resolved
@dwijnand dwijnand force-pushed the gadt/unsound-cast branch from 6011d24 to b6fc951 Compare June 30, 2022 16:54
@dwijnand dwijnand enabled auto-merge June 30, 2022 17:35
@Linyxus
Copy link
Contributor

Linyxus commented Jul 1, 2022

It seems that since I do not have the write permission yet, my approval alone does not meet the merging requirements.

@dwijnand
Copy link
Member Author

dwijnand commented Jul 1, 2022

@abgruszecki would you be happy to rubberstamp this through, on Linyxus' review approval?

// if it doesn't occur then it's safe to instantiate the tvar
// Eg neg/i14983 the C in Node[+C] is in the GADT lower bound X >: List[C] so maximising to Node[Any] is unsound
// Eg pos/precise-pattern-type the T in Tree[-T] is in no GADT upper bound so can maximise to Tree[Type]
val safeToInstantiate = v != 0 && gadtBounds.forall(tb => !tvar.occursIn(if v == 1 then tb.lo else tb.hi))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't entirely feel correct, we're not taking into account the variance at which tvar is occuring at in tb. I would disallow instantiating if tvar occurs at all in tb.

Copy link
Member Author

@dwijnand dwijnand Jul 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's correct. And I was hoping to find a legitimate case in either bootstrapping or in the community build which would help me figure out what variance flipping I need to be doing. But I didn't encounter any. Do you have in mind an example where this would be negatively impacted?

I proposed this version because it fixes a soundness issue even if it might have this incompleteness issue (correct me if I'm misdiagnosing).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would worry that it's a partial fix, not that it's incomplete. Variables always occur at some variance in types: A is negative in A => Unit, positive in A => Unit => Unit, invariant in A => A. Here we partially take variance in account by only inspecting one bound, which feels very counterintuitive. Unless we have a clear argument that doing what the code is doing right now is OK, we should just disregard variance at all.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This indeed results in soundness problems. Consider the following example, whose main difference with #14983 is that the value encapsulated in Node has a contravariant type argument.

case class Showing[-C](show: C => String)

sealed trait Tree[+A]
final case class Leaf[+B](b: B)       extends Tree[B]
final case class Node[-C](l: Showing[C]) extends Tree[Showing[C]]

object Test:
  def meth[X](tree: Tree[X]): X = tree match
    case Leaf(v) => v
    case Node(x) =>
      // GADT bounds: X >: Showing[C]
      // maximise Node[C] to Node[Nothing].
      //   Since C is contravariant and in GADT lower bound,
      //   instantiation is currently assumed to be safe.
      // After that: X >: Showing[Nothing]
      Showing[String](_ + " boom!")

  def main(args: Array[String]): Unit =
    val tree = Node(Showing[Int](_.toString))
    val res = meth(tree)
    println(res.show(42))  // runtime crash: ClassCastException

The problem here is that the type variable C are contravariant in both places. So maximising Showing[C] also results in maximising the GADT lower bound of X, which makes the constraint unsound.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should disable the instantiation as long as the type variable appears in GADT bounds as suggested by Alex? Or shall we query the type variable variances in GADT bounds using variances, and only instantiate the tvar when its variance conforms with that in GADT bounds?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to take variance into account, we will need an argument why that's sound.

The way I understand things, the type variable is supposed to correspond to a type member in DOT terms: type variables in PatternTypeConstrainer will only come from type arguments to unapply functions, and in the pattern type they will always appear as type arguments to some class, unless we're dealing with an unapply with a very strange signature. If we do instantiate the type variable, we turn it into an abstract type, which makes it clearly correspond to the type member. In certain situations it's OK to approximate the type variable to avoid unnecessary problems with things like implicit search, but if the type variable appears in the constraint, approximating the type variable could be unsound.

@dwijnand dwijnand disabled auto-merge July 4, 2022 12:51
@abgruszecki
Copy link
Contributor

@Linyxus What do you think we should do? Can you come up with an argument why taking variance into account is sound?

@Linyxus
Copy link
Contributor

Linyxus commented Jul 5, 2022

I do not have an argument for the safety regarding the instantiated type variable (since its bounds get narrowed), but I could try to (partially and informally) show that the GADT constraints of other variables will be sound after instantiation.

Assuming that (I have poor knowledge of type inferencing so correct me if I'm wrong :P),

  • We are typing a match case case TypCon() with type member A.

  • After GADT reasoning and other relevant logic, we decide to maximise A, so now the variances of A in both places satisfy the conformance requirement. A is bounded by L <: A <: H.

  • At this point, we have GADT constraints: SR(Xᵢ) = TypeBounds(loᵢ, hiᵢ), i = 1, 2, ..., n.

    Note that loᵢ and hiᵢ can contain reference to A, and A ∉ {X₁, X₂, ..., Xₙ}.

Now, we view A from a type variable perspective, thinking of SR as being parameterized by A (SR(A)(Xᵢ)) and A would range over L .. H.

We have Aₘ = maximise(A) = L if A contra-variant and H if A is variant. We can show that for any A in L .. H, SR(A) entails SR(Cₘ). So instantiating A to Aₘ results in a necessary constraint.

To see this, we can show that instantiating A to Aₘ always loosens GADT bounds as long as the variances conforms. If we maximise A in T to T', it should be ensured that T' always become a supertype of T. And since the variances conforms, it is guaranteed that hiᵢ' becomes the supertype of hiᵢ and loᵢ' becomes a lower type of loᵢ (note that the lower bound in TypeBounds negates the variance). So the type bounds of every variable in SR(Aₘ) is more loose than any other SR(A).

Therefore, given that the GADT constraints before instantiation is sound, then its entailment (or its weaker version) is guaranteed to be sound.

@dwijnand
Copy link
Member Author

dwijnand commented Jul 5, 2022

I'm not fussed. Just checking occurrences is also cheaper.

@Linyxus
Copy link
Contributor

Linyxus commented Jul 6, 2022

Sure! I just wanted to record these thoughts on the soundness of such approximation. Considering the simplicity of checking occurrence, I agree that we should go with this solution too.

A bit off-topic: I am a bit confused about why we would need type variable instantiation in constraint inferencing. Would it be good to always avoid instantiating type variables, turn it into abstract types and record their type bounds?

@dwijnand
Copy link
Member Author

dwijnand commented Jul 6, 2022

Implicit lookup, for example, is impacted. Here's a semi-minimal example:

trait Show[T]:
  def show(x: T): String

object Show:
  def show[T](x: T)(using z: Show[T]): String = z.show(x)
  given Show[String] with { def show(x: String): String = x }

opaque type Email = String

object Email:
  given (using s: Show[String]): Show[Email] = s

class Test:
  def test(opt: Option[Email]): String = opt match
    case Some(email) => Show.show(email)
    case None        => "none"

Without instantiating Some[A] to Some[Email] but to Some[A$1] we get:

-- Error: Email.scala:16:40 ----------------------------------------------------
16 |    case Some(email) => Show.show(email)
   |                                        ^
   |No given instance of type Show[A$1] was found for parameter z of method show in object Show
   |
   |where:    A$1 is a type in method test with bounds <: Email

It's a slightly bad example because it doesn't justify why Show isn't contravariant. And it's manageable with a new instance of Show defined for Show[Email] or an unsafe cast of Show[String]. But presumably it's representative of a corpus of code we can't negatively impact. My minimisation is based on the real tests/run/opaque-inline case. tests/pos/precise-pattern-type.scala also fails, but that's perhaps even worst because it involves a @uncheckedVariance assertion.

dwijnand added 7 commits July 6, 2022 16:12
In a modified version of maximizeType I ended up with picking test
failures, and this seems to be the right position.
... by keeping an memory-optimised path & extracting a ExternalizeMap.

Also a small fixup in ContainsNoInternalTypesAccumulator using itself.
Consider the GADT constraints during Inferencing's maximiseType to avoid
instantiating type variables that lead to GADT casting inserting unsound
casts.
@abgruszecki abgruszecki merged commit 2ea400a into scala:main Jul 12, 2022
@dwijnand dwijnand deleted the gadt/unsound-cast branch July 12, 2022 08:15
@Kordyjan Kordyjan added this to the 3.2.1 milestone Aug 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unsoundness in GADT casting when using a variant type constructor
4 participants