Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compile RBIs for foreign constants #903

Merged
merged 4 commits into from
Jun 13, 2022
Merged

Compile RBIs for foreign constants #903

merged 4 commits into from
Jun 13, 2022

Conversation

egiurleo
Copy link
Contributor

@egiurleo egiurleo commented Apr 25, 2022

Motivation

Part of the work for #890. Heavily based on work I did with @paracycle on this branch.

Currently, Tapioca only generates RBI for constants that are defined in the gem that's currently being compiled. This is the correct behavior in most cases, except that in Ruby, a gem can add functionality to any constant, even one that it doesn't define.

Let's say there's a gem foo that defines the class Foo.
Then let's say there's another gem bar that defines the module Bar and contains the following code:

Foo.prepend(Bar)

or

class Foo
   prepend Bar
end

Currently, Tapioca would not recognize that Foo had Bar mixed in and would generate incomplete RBIs for Foo. This is especially true if Foo is a constant defined in the Ruby standard library.

We're calling these "foreign constants."

Implementation

There are a few steps to this implementation:

  • Modify the Mixin tracker to keep a map between constants and classes that mix them in. The tracker currently keeps a map the other way around, but I think it makes more sense to flip the keys and values in this case (will explain more in next bullet point).
  • Introduce a new listener called ForeignConstants. This listener takes a constant and uses the Mixin tracker to identify all the other constants that have the first constant as a mixin. This will include any constants that are defined in another gem or in the Ruby standard library that have mixed in constants defined in the gem being compiled. All of these constants are then added to the pipeline in a Gem::ForeignConstantFound event.
  • This event can be processed in a simpler way than the Gem::ConstantFound event because there are certain logical branches that don't apply to it and can be skipped entirely. While processing this event, the pipeline knows to skip the defined_in_gem? check.
  • This will lead to a ForeignScopeNodeAdded event being added to the pipeline. The listeners have been modified such that only the Mixins listener will accept the ForeignScopeNodeAdded event. This way, only mixin RBIs are generated for foreign constants and all other RBIs (method definitions, documentation, etc.) are ignored.

Tests

I have added multiple pipeline tests and a gem test that fail on main but pass with these changes.

Tophatting 🎩

I have tested these changes on Shopify core. It compiles all gem RBIs and results in no type errors!

lib/tapioca/gem/pipeline.rb Outdated Show resolved Hide resolved
lib/tapioca/runtime/trackers/mixin.rb Show resolved Hide resolved
lib/tapioca/runtime/trackers/mixin.rb Show resolved Hide resolved
lib/tapioca/runtime/trackers/mixin.rb Outdated Show resolved Hide resolved
Copy link
Member

@paracycle paracycle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks quite complete already. I've added some inline comments.

spec/tapioca/cli/gem_spec.rb Outdated Show resolved Hide resolved
spec/tapioca/cli/gem_spec.rb Outdated Show resolved Hide resolved
spec/tapioca/gem/pipeline_spec.rb Outdated Show resolved Hide resolved
spec/tapioca/gem/pipeline_spec.rb Outdated Show resolved Hide resolved
lib/tapioca/gem/pipeline.rb Outdated Show resolved Hide resolved
@egiurleo egiurleo force-pushed the unreachable-constants branch 2 times, most recently from 656f80b to f4fdbcc Compare April 26, 2022 21:10
@egiurleo egiurleo changed the title Compile RBIs for unreachable constants Compile RBIs for constants not owned by current gem Apr 26, 2022
Copy link
Member

@paracycle paracycle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks much much cleaner. I think one more pass will be enough. I am still not in love with the name we have of the concept, so it might be good to get feedback from other folks by turning this into a proper PR.

lib/tapioca/gem/events.rb Outdated Show resolved Hide resolved
lib/tapioca/gem/pipeline.rb Outdated Show resolved Hide resolved
lib/tapioca/gem/pipeline.rb Outdated Show resolved Hide resolved
@egiurleo egiurleo force-pushed the unreachable-constants branch from f4fdbcc to 86061b7 Compare April 27, 2022 15:10
@egiurleo egiurleo marked this pull request as ready for review April 27, 2022 15:23
Comment on lines 48 to 56
sig { returns(Module).checked(:never) }
attr_reader :constant

sig { params(symbol: String, constant: Module).void.checked(:never) }
def initialize(symbol, constant)
super
@constant = constant
end
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reached the bounds of my Sorbet knowledge 😅

The constant instance variable in NotOwnedConstantFound is always going to be a Module instance (because it has to have other constants mixed into it). In the superclass ConstantFound, constant is a BasicObject, which is an ancestor of Module. So I essentially want to narrow the type of an instance variable on a subclass... Is this the right way to go about it?

@egiurleo egiurleo requested review from Morriar and a team April 27, 2022 15:28
class NotOwnedConstantFound < ConstantFound
extend T::Sig

sig { returns(Module).checked(:never) }
Copy link
Collaborator

@Morriar Morriar Apr 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need the checked(:never) since this redefinition is valid regarding the Liskov principle (you can make a return type narrower).

But you can precise it's an override:

Suggested change
sig { returns(Module).checked(:never) }
sig { override.returns(Module) }

sig { returns(Module).checked(:never) }
attr_reader :constant

sig { params(symbol: String, constant: Module).void.checked(:never) }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
sig { params(symbol: String, constant: Module).void.checked(:never) }
sig { params(symbol: String, constant: Module).void }

lib/tapioca/gem/events.rb Outdated Show resolved Hide resolved
@@ -42,6 +42,7 @@ def initialize(gem, include_doc: false)
@node_listeners << Gem::Listeners::Subconstants.new(self)
@node_listeners << Gem::Listeners::YardDoc.new(self) if include_doc
@node_listeners << Gem::Listeners::RemoveEmptyPayloadScopes.new(self)
@node_listeners << Gem::Listeners::NotOwnedConstants.new(self)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we move this higher in the listeners? I think we need to run Subconstants after this one right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Morriar I completely believe you, but can you explain why? It's not obvious to me.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I think it should be at least before RemoveEmptyPayloadScopes:

Let say we find String as a not owned constant and somehow we do not add anything to String. We would end up adding class String; end in the RBI. But since String already comes with the payload, we shouldn't be adding it to the RBI.

This is the aim of RemoveEmptyPayloadScopes: to make sure we remove the scope if it's already in the payload and it is empty.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that makes complete sense. Thanks! I'll add a test for that as well.

compile_constant(name, constant)

if event.is_a?(Gem::NotOwnedConstantFound)
compile_not_owned_constant(name, T.cast(constant, Module))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you use event.constant you won't need to cast it:

Suggested change
compile_not_owned_constant(name, T.cast(constant, Module))
compile_not_owned_constant(name, event.constant)

I think we could refactor a bit the code to not use a local variable.

@egiurleo egiurleo force-pushed the unreachable-constants branch from 86061b7 to fb4876d Compare April 27, 2022 18:32
Comment on lines 165 to 178
if event.is_a?(Gem::NotOwnedConstantFound)
compile_not_owned_constant(name, event.constant)
else
compile_constant(name, event.constant)
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we turn this into a case/when to make it explicit that we are matching on the type of event?

Suggested change
if event.is_a?(Gem::NotOwnedConstantFound)
compile_not_owned_constant(name, event.constant)
else
compile_constant(name, event.constant)
end
case event
when Gem::NotOwnedConstantFound
compile_not_owned_constant(name, event.constant)
else
compile_constant(name, event.constant)
end

module Tapioca
module Gem
module Listeners
class NotOwnedConstants < Base
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am still not a fan of this name, tbh. What do you think of:

  • ForeignConstant as the name of a constant that we discover that we need to handle but wasn't declared by the current gem.
  • ForeignConstantFound as the event name

Comment on lines 18 to 23
# There are cases where we want to compile RBI for constants not owned by the current
# gem - e.g. constants defined in one gem that then have a mixin applied to them in
# another gem. This is especially common for Ruby standard library classes.
#
# We can identify these cases by keeping track of which classes have mixins, and then
# adding those classes to the pipeline.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is correct but I think we can improve it slightly. What do you think of this:

Suggested change
# There are cases where we want to compile RBI for constants not owned by the current
# gem - e.g. constants defined in one gem that then have a mixin applied to them in
# another gem. This is especially common for Ruby standard library classes.
#
# We can identify these cases by keeping track of which classes have mixins, and then
# adding those classes to the pipeline.
# There are cases where we want to process constants not declared by the current
# gem, i.e. "foreign constant". These are constants defined in another gem
# that this gem is applying a mix-in to. This pattern is especially common
# for gems that add behaviour to Ruby standard library classes by mixing in
# modules to them.
#
# The way we identify these "foreign constants" is by asking the mixin
# tracker which constants have mixed in the current module that we are
# handling. We add all those constants that we discover to the pipeline
# to be processed.

Comment on lines 16 to 17
mixin = event.constant

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can return early if mixin is a Class instance, since no Class can be mixed into something else:

Suggested change
mixin = event.constant
mixin = event.constant
return if Class === mixin # Classes can't be mixed into anything

lib/tapioca/gem/events.rb Outdated Show resolved Hide resolved
@egiurleo
Copy link
Contributor Author

Spent some time pairing with @Morriar to debug the errors that were occurring when trying to compile gem types on Shopify core. This is what we discovered:

  • The errors were occurring because tapoica loads all of Shopify core when compiling gems. This means that when core mixes in certain gems (e.g. ActiveModel), the mixin tracker will track those classes from core. The NotOwnedConstants listener will then add those classes from core to the pipeline, and tapioca will try to compile them. Obviously this is not desired behavior.
  • I think the actual issue is that the mixin tracker is casting too wide of a net. It's tracking even the mixins that happen outside of the gem we're currently compiling. This seems related to the other issue that Better mixin generation for gem RBIs #890 is trying to address, so it might make sense to try to address that issue first, then revisit this PR.

My next step will be to write a failing test to make sure the issue is captured in a repeatable way.

@egiurleo egiurleo force-pushed the unreachable-constants branch 3 times, most recently from dd81c95 to 12e6f1f Compare May 13, 2022 15:23
lib/tapioca/gem/events.rb Show resolved Hide resolved
lib/tapioca/gem/pipeline.rb Show resolved Hide resolved
Comment on lines 28 to 29
location = locations.first
next if location.nil? || !@pipeline.gem.contains_path?(location)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was experiencing an issue where running tapioca gem on Shopify core was generating RBIs for ALL mixins, not just mixins performed in the gem currently being compiled. This is my fix for this issue. The mixin tracker keeps track of the location where this mixin took place, and checks that it occurred within the gem before generating RBIs.

Two points:

  • Currently, the locations are an array of file paths, so I took the first one here, which I believe should correspond to the "closest" location where the mixin took place? Let me know if that sounds right for now.
  • I'm going to create a follow-up PR that changes the notion of what a mixin location is, based on the work @paracycle did in this branch -- basically, taking the most recent "require" location as the correct location. I wanted to save that for another PR because this one is already quite big.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is going to change soon, as you said, but up to this point, our thinking was "this is coming from this gem if any of the locations match the gem require path prefix". So we used to have this method that wasn't being used, that we can bring back again for this: bdf5d45

@egiurleo egiurleo requested review from paracycle and Morriar May 13, 2022 15:36
@egiurleo egiurleo changed the title Compile RBIs for constants not owned by current gem Compile RBIs for foreign constants May 13, 2022
@egiurleo egiurleo force-pushed the unreachable-constants branch from 12e6f1f to aa6a11a Compare May 13, 2022 20:23
@egiurleo egiurleo force-pushed the unreachable-constants branch 6 times, most recently from 6531810 to b1894ce Compare May 27, 2022 19:20
@egiurleo egiurleo force-pushed the unreachable-constants branch 7 times, most recently from 70ddb65 to 47ef642 Compare June 3, 2022 18:15
lib/tapioca/gem/events.rb Show resolved Hide resolved
@@ -42,6 +44,15 @@ def on_scope(event)
sig { params(event: MethodNodeAdded).void }
def on_method(event)
end

sig { params(event: NodeAdded).returns(T::Boolean) }
def should_skip?(event)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Open to feedback here --

Since listeners contain all the event handling logic, I figured they should also be the ones to determine which events get skipped. I've made the default behavior to skip ForeignScopeNodeAdded events, and then I've overridden this method in Mixins listener, such that only the mixin information gets included in foreign constant RBIs.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Base should accept all events by default and this should be overridden in subclasses that do not want to handle such event.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, nitpick I would define the API in the opposite way: accept? and return unless accept?(event).

spec/tapioca/cli/gem_spec.rb Show resolved Hide resolved
spec/tapioca/cli/gem_spec.rb Show resolved Hide resolved
spec/tapioca/cli/gem_spec.rb Show resolved Hide resolved
Comment on lines 4 to 8
# Temporary fix for CI
# A patch as been upstreamed, we're waiting for it to be merged.
# See https://github.com/rubyconfig/config/pull/319
# TODO: remove this one the patch is released.
require "rails"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a patch to help fix CI that will be merged soon (#965). Once it's merged I'll rebase on main.

Copy link
Member

@paracycle paracycle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Thanks for all the work you put into this.

The should_skip? mechanism is a little bit kludgy but I think we can refine it afterwards.

@egiurleo
Copy link
Contributor Author

egiurleo commented Jun 3, 2022

@paracycle Yay! 100% agreed that should_skip? is kludgy (great word) but I am totally happy to make it better later.

I'll merge this once #965 or something similar is merged.

@egiurleo egiurleo force-pushed the unreachable-constants branch 2 times, most recently from 098326e to 34b31a8 Compare June 3, 2022 20:06
@egiurleo egiurleo force-pushed the unreachable-constants branch 2 times, most recently from 3a6cd2a to 434b98b Compare June 9, 2022 20:05
@@ -42,6 +44,15 @@ def on_scope(event)
sig { params(event: MethodNodeAdded).void }
def on_method(event)
end

sig { params(event: NodeAdded).returns(T::Boolean) }
def allow?(event)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Morriar Here is that allow? method! By default, the Base listener allows all events, and then most of the subclasses do not allow ForeignScopeNodeAdded events.

Because most events are allowed, I've implemented this method in the subclasses as:

!event.is_a?(Tapioca::Gem::ForeignScopeNodeAdded)

This logic feels more complicated to me, because the event is called allow? but it's implementation states which events aren't allowed! That seems backwards. What do you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you mean about the implementation. While it's not bothering me I'm not against reverting to a ignore?(event) method so the implementation can be simplified as long as the base class doesn't ignore anything and the specificities are done in the subclasses.

Bikeshedding: I feel that accept? sounded better than allow? 😅

@@ -42,6 +44,15 @@ def on_scope(event)
sig { params(event: MethodNodeAdded).void }
def on_method(event)
end

sig { params(event: NodeAdded).returns(T::Boolean) }
def ignore?(event)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Morriar I have changed this method to ignore? and reversed the logic, but the base class still ignores nothing.

@egiurleo egiurleo force-pushed the unreachable-constants branch from 5f6a2b5 to fb9ce40 Compare June 10, 2022 18:40
@Morriar Morriar merged commit 2de2918 into main Jun 13, 2022
@Morriar Morriar deleted the unreachable-constants branch June 13, 2022 13:13
@shopify-shipit shopify-shipit bot temporarily deployed to production July 7, 2022 17:53 Inactive
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants