Proposal for first class support of conditional dependencies in Pkg #1285

Closed
KristofferC opened this issue Aug 3, 2019 · 48 comments

Comments

@KristofferC
Sponsor Member

This is a proposal for adding first class support in Pkg and the code loading system for conditional dependencies.

What is a conditional dependency

Describing a conditional dependency is easiest with an example. A typical concrete example is a plotting package that adds support for plotting e.g. DataFrames (by adding some method plot(::DataFrame)) without requiring the user to install DataFrames in order to use the plotting package. The plotting package wants to run a bit of extra code (the part that defines the method) when the conditional dependency DataFrames is somehow "available" to the user. The extra code that the package executes when the conditional dependency is active is called "glue code".

Current way of doing conditional dependencies

The way people implement conditional dependencies right now is by using Requires.jl. It works by registering, with the package loading code in Base, a callback that evaluates some code. The callback is executed when the conditional dependency is loaded (identified by e.g. comparing UUIDs); the code from the callback is evaluated into the module, which provides the functionality for the conditional dependency.

As an example usage:

using Requires
function __init__()
    @require DataFrames="c91e804a-d5a3-530f-b6f0-dfbca275c004" plot(df::DataFrame) = ---
end

What is the problem with Requires.jl

There are a few reasons why the current strategy using Requires.jl to deal with this is unsatisfactory.

  1. It doesn't work well with precompilation. The way people tend to use Requires is by include-ing some file when the conditional dependency is available. Requires.jl runs inside __init__, which means the code evaluated by the include call does not end up in the precompile file.
  2. It is "implicit" in the sense that the conditional dependency is only defined in the Julia code. We typically want to put all dependency information inside the Project file.
  3. There is currently no good way to set compat bounds on the conditional dependency.
  4. It has performance problems (Improving @require performance JuliaPackaging/Requires.jl#39)
  5. Basing the activation criterion of the conditional dependency on it simply being loaded means that packages loaded from places other than the current project affect whether the glue code is run. It would be better to base it only on the current active project.

The current proposal

How does one declare a conditional dependency

One declares a conditional dependency by adding an entry to the Project.toml file as:

[conditional-deps]
DataFrames = "$UUID_DATAFRAMES"

[compat]
DataFrames

An alternative possibility is to just put DataFrames inside [deps] and then have a list of names that are conditional.

[deps]
SomeOtherDep = "..."
DataFrames = "$UUID_DATAFRAMES"

[conditional-deps] = ["DataFrames", ]

Where should the glue code be stored?

Precompilation works at module granularity, so we want a module containing the glue code for each conditional dependency. The glue code would be stored (based on a documented convention) in a file inside the package, e.g. src/DataFramesGlue.jl inside Plots, where the exact name of the file is yet to be decided.

An example of a glue file for Plots conditionally depending on DataFrames is:

module DataFramesGlue

using Plots, DataFrames
Plots.plot(df::DataFrame) = ...

end

How is the glue code loaded?

When DataFrames gets loaded, we check all packages that declare a conditional dependency on it. If the loaded version of DataFrames is compatible with the compat entry of a package that has DataFrames as a conditional dependency, we load the glue code, which will act like a normal package and precompile. We need to teach code loading some things about glue packages so it knows how to map the names inside the glue module to the UUIDs in the "main package".

Because we are not trying to resolve a set of versions compatible with the conditional dependency, we avoid cases where we would in general need to resolve arbitrarily many times, with the potential for cycles.
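A minimal sketch of the loading rule described above, written as a toy model and not as Pkg or code-loading internals. The ConditionalDep type, the glue_to_load function, the package name OtherPkg, and the version numbers are all hypothetical, and a half-open version range stands in for a real compat entry.

```julia
# Toy model only: none of these names exist in Pkg or Base.
struct ConditionalDep
    parent::String               # package that declares the conditional dependency
    trigger::String              # name of the conditional dependency
    min_version::VersionNumber   # lowest compatible version (inclusive)
    max_version::VersionNumber   # first incompatible version (exclusive)
end

# Simplified stand-in for checking a loaded version against a compat entry.
is_compatible(dep::ConditionalDep, v::VersionNumber) = dep.min_version <= v < dep.max_version

# When `loaded_name` is loaded at `loaded_version`, return the parents whose glue
# code should be loaded. No version resolution happens: incompatible entries are
# simply skipped.
function glue_to_load(declared::Vector{ConditionalDep}, loaded_name::String,
                      loaded_version::VersionNumber)
    return [dep.parent for dep in declared
            if dep.trigger == loaded_name && is_compatible(dep, loaded_version)]
end

# Illustrative data; the version bounds are made up for the example.
declared = [
    ConditionalDep("Plots", "DataFrames", v"0.19.0", v"0.20.0"),
    ConditionalDep("OtherPkg", "DataFrames", v"0.18.0", v"0.19.0"),
]

glue_to_load(declared, "DataFrames", v"0.19.2")
```

In this model only the glue for Plots is selected, because the loaded version falls outside the range declared by OtherPkg; the glue for OtherPkg is skipped rather than triggering another resolve.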

@tpapp
Contributor

tpapp commented Aug 4, 2019

Thank you very much for writing this up.

I have a use case in LogDensityProblems.jl which I am wondering about. Specifically, the glue code for both ForwardDiff and ReverseDiff relies on DiffResults to extract gradients.

Currently this is handled by code that looks like

function __init__()
    @require DiffResults="163ba53b-c6d8-5494-b064-1a9d43ac40c5" include("DiffResults_helpers.jl")
    @require ForwardDiff="f6369f11-7733-5829-9624-2563aa707210" include("AD_ForwardDiff.jl")
    @require ReverseDiff="37e2e3b7-166d-5795-8a7a-e32c996b4267" include("AD_ReverseDiff.jl")
end

So, for the purposes of Requires.@require, DiffResults is considered available: if the user is using ForwardDiff, then ForwardDiff has already loaded DiffResults, which triggers the shared glue code.

Would this continue to work? For the mechanism you propose, I imagine I could just provide deps information for DiffResults.

Generally, how is it handled when glue code needs other modules which would not themselves trigger glue code of their own? Can we still specify e.g. compat bounds for them?

@KristofferC
Sponsor Member Author

If I understand your example correctly, you would just have to declare a conditional dep on DiffResults (and ForwardDiff + ReverseDiff).

@tpapp
Contributor

tpapp commented Aug 4, 2019

Thanks. So, if I do that, then e.g. it would be triggered by ForwardDiff loading DiffResults, and the latter would not have to be explicitly loaded by the user? That's the way it works now with Requires.

@KristofferC
Sponsor Member Author

Yes.

@tkf
Member

tkf commented Aug 8, 2019

What if I want to define a glue module to be loaded when both CuArrays and OrdinaryDiffEq are imported? That is to say, can there be something equivalent to the following?

function __init__()
    @require CuArrays="..." begin
        @require OrdinaryDiffEq="..." include("glue.jl")
    end
end

I guess a possible API would be to include (say) an [on-import] section in Project.toml to bundle conditional-deps explicitly:

[conditional-deps]
CuArrays = "..."
OrdinaryDiffEq = "..."

[compat]
CuArrays = "..."
OrdinaryDiffEq = "..."

[on-import]
foo = ["CuArrays", "OrdinaryDiffEq"]

which tells the loader to include src/on-import/foo.jl when CuArrays and OrdinaryDiffEq are loaded.

@DilumAluthge
Member

This sounds fantastic. Is this a feature that would be available in a Julia 1.x release, e.g. Julia 1.4 or Julia 1.5? Or would it have to wait until Julia 2.0?

@DilumAluthge
Member

DilumAluthge commented Aug 8, 2019

One declares a conditional dependency by adding an entry to the Project.toml file as:

[conditional-deps]
DataFrames = "$UUID_DATAFRAMES"

[compat]
DataFrames

An alternative possibility is to just put DataFrames inside [deps] and then have a list of names that are conditional.

[deps]
SomeOtherDep = "..."
DataFrames = "$UUID_DATAFRAMES"

[conditional-deps] = ["DataFrames", ]

I like the first one more. Listing the conditional dependencies under [deps] might get a little confusing.

@KristofferC
Sponsor Member Author

This sounds fantastic. Is this a feature that would be available in a Julia 1.x release, e.g. Julia 1.4 or Julia 1.5? Or would it have to wait until Julia 2.0?

Some Julia 1.x.

What if I want to define a glue module to be loaded when both CuArrays and OrdinaryDiffEq are imported?

Yeah, I thought about this a little bit too. A first implementation might not support this, but we should probably make sure that adding it later will not be awkward.

@KristofferC
Sponsor Member Author

Triaging to discuss what to do about multiple conditional dependencies (which kind of starts to sound like "features").

@DilumAluthge
Member

DilumAluthge commented Aug 16, 2019 via email

@Roger-luo

Regarding multiple deps, maybe we could borrow something similar from rust-cargo (features); it was included in this proposal: #977

@KristofferC
Sponsor Member Author

AFAIU, the difference between features and conditional dependencies is that a feature is something that someone opts into from the current active Project, while a conditional dependency is automatically "activated" whenever its requirements are satisfied.

@StefanKarpinski
Sponsor Member

StefanKarpinski commented Aug 28, 2019

I'm going to make an alternate proposal here. First: I think we should not call these conditional dependencies. They're NOT dependencies; they are packages that glue other packages together and are loaded automatically when the set of packages that they glue together are loaded. They depend on the packages that they glue together, not the other way around! This is crucial. So instead, I propose that we call them "glue packages". Here's how we could specify them in a package P's Project.toml file:

name = "P"
uuid = "<uuid>"

[deps]
# P's dependencies here

# glue with a single dependency
[glue]
A = "<uuid>" # source at `glue/A.jl`
B = "<uuid>" # source at `glue/B.jl`

# glue with multiple dependencies
[glue.CD] # source at `glue/CD.jl`
C = "<uuid>"
D = "<uuid>"

There are a few ways that we can go with the glue/{A,B,CD}.jl files. One way is to treat them like normal packages that have to define a module of the appropriate name. This is a little weird, though, because the file A.jl glues together P and A, so it shouldn't define a module named A; it should define a module named A_P or something like that. Also, the module's name doesn't matter at all: no one ever loads it by name. The only reason it needs to exist is so that we can save it in .ji files. So maybe these should be more implicit, as if this is done for you:

module P_A
    import P, A
    include("glue/A.jl")
end

Then inside the file glue/A.jl, all you have to do is define the functionality needed to glue P and A together. Similarly, a multi-dependency glue file like CD.jl would be implicitly loaded like this:

module P_CD
    import P, C, D
    include("glue/CD.jl")
end

Now, the actual loading would work like this: when all of the packages P, C and D have been loaded, however that happens, the glue package P_CD is also loaded. As I mentioned before, since it is a package that depends on P, C and D, it gets its own .ji file, which can be reloaded whenever another Julia process using the same versions of these three modules runs.
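The trigger rule above (a glue package loads once every package it glues together has been loaded) can be sketched as a set-inclusion check. This is only an illustration under assumed names: GLUE_TRIGGERS and newly_ready_glue are hypothetical and not part of Base or Pkg, and the table simply mirrors the P, A, B, C, D example from the Project.toml sketch.

```julia
# Hypothetical table: glue module name => packages that must all be loaded first.
const GLUE_TRIGGERS = Dict(
    "P_A"  => Set(["P", "A"]),
    "P_B"  => Set(["P", "B"]),
    "P_CD" => Set(["P", "C", "D"]),
)

# Return, in sorted order, the glue modules whose trigger packages are all in
# `loaded` and which have not been loaded yet.
function newly_ready_glue(loaded::Set{String}, already_loaded_glue::Set{String})
    ready = [name for (name, needs) in GLUE_TRIGGERS
             if issubset(needs, loaded) && !(name in already_loaded_glue)]
    return sort(ready)
end

# P and C alone are not enough for P_CD; adding D completes the set.
newly_ready_glue(Set(["P", "C"]), Set{String}())
newly_ready_glue(Set(["P", "C", "D"]), Set{String}())
```

How the packages came to be loaded does not enter the check, which matches the "however that happens" wording: only the set of loaded packages matters.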

@tkf
Member

tkf commented Aug 29, 2019

They're NOT dependencies—they are packages that glue other packages together and are loaded automatically when the set of packages that they glue together are loaded.

I agree. I used the on-import section in my earlier comment to emphasize that it is more like a "hook." So "import hook" may be an alternative terminology (not that "glue package" is bad).

I think one benefit of formalizing it as hooks is that code loading can respond to other "conditions" such as feature flags, as @Roger-luo is suggesting. The Project.toml file could look something like:

[extras]
CuArrays = "..."
Zygote = "..."

[hook.GPUImpl]
import = ["CuArrays", "Zygote"]
feature = ["GPU"]

which indicates that hook/GPUImpl.jl will be (precompiled and) loaded when CuArrays and Zygote are imported and feature flag GPU is specified.

I'm not suggesting adding feature flag support right now. I just thought this format is more extensible.

@StefanKarpinski
Sponsor Member

I very intentionally don't want this mechanism to be too flexible. I don't want anything besides loading a glue package to happen when a set of packages are loaded. Of course, that's not very restrictive since loading a package can execute arbitrary code, but it does mean that there's a module that results which can be precompiled and saved, and that not being the case is precisely what's so problematic with the current requires system. Arbitrary import hooks are likely to have all the problems that requires currently has.

I also very much do not want this to be a mechanism for changing the behavior of packages. The only liberty that a glue package should take is that it can define methods (and types, I guess) that depend on the types of the packages that it glues together. So, it would be considered type piracy for a normal package A that depends on B and C to define B.f(::C.T), but if it's a glue package gluing B and C together, then it's perfectly kosher to do that.

I don't know how we should handle package features like what Roger wants, but it must not be this, or we will completely screw up the ability of this feature to fix the current precompilation issues.

@tkf
Member

tkf commented Aug 31, 2019

It was not my intention to suggest introducing any events for the hooks more dynamic than code loading events. If you think the term "hook" suggests features more dynamic than what is already possible with what you are suggesting, it probably is not the best term to use.

But (temporal) dynamism and flexibility are different, and feature flags can be implemented in a very static manner. For example, if MyPackage needs a set of feature flags, Pkg can create (say) ~/.julia/options/$manifest_slag/MyPackage.toml for each environment, where manifest_slag is the hash of the full path of Manifest.toml. This option file can then be tracked as a dependency of the .ji files (using include_dependency) of the glue modules.

@tkf
Member

tkf commented Aug 31, 2019

Actually, let me take back my earlier comment. Feature flags can be turned into consts of MyPackage and then checked inside the glue modules. This approach would waste precompile cache files (i.e., it creates no-op .ji files when a glue module is not needed because certain flags are not set), but it's probably better to keep the glue module and feature flag concepts orthogonal.

@KristofferC
Sponsor Member Author

KristofferC commented Sep 8, 2019

We need to be able to give compat info. How about making the glue packages look a lot like a "mini package", with each glue package under a glue header:

[[glue]]
[glue.deps]
A = "<uuid>" # source at `glue/A.jl`

# Adding compat and file
[[glue]]
file = "glue/B_flue"
[glue.deps]
B = "<uuid>"
[glue.compat]
B = "0.4"

# Multiple
[[glue]]
file = "my_glue_C_D.jl" # source at `glue/my_glue_C_D.jl`
[glue.deps]
C = "<uuid>"
D = "<uuid>"
[glue.compat]
C = "0.2"
D = "0.1"

It's pretty ugly with all the [glue.] though.

@StefanKarpinski
Sponsor Member

Just allow anything that appears in a glue stanza in the normal [compat] section?

@StefanKarpinski
Sponsor Member

To elaborate: I think it would be confusing to use clashing names across glue packages, so it's sane to require the names to match. It also doesn't make sense for compat bounds not to match across glue packages, so we can just put glue bounds in [compat] under the name used in the [glue] stanzas.

@lkapelevich

How would testing "glued packages" look? Could there be something similar for a test/Project.toml and files like test/glue/A.jl etc.?

@gustaphe

gustaphe commented Aug 9, 2021

I'm not sure "Don't care about supporting that use" is resolved by this. If the maintainers don't want to maintain glue code for a specific package, they won't want to do so with more first-class glue code either.

@rafaqz
Contributor

rafaqz commented Aug 9, 2021

I'm not sure "Don't care about supporting that use" is resolved by this. If the maintainers don't want to maintain glue code for a specific package, they won't want to do so with more first-class glue code either.

I think the idea is they wouldn't have to be involved at all? The conditional glue can just be maintained with the lightweight base package, but without the current overheads of doing that.

@timholy
Sponsor Member

timholy commented Aug 9, 2021

I'm not sure this is distinct from the proposals above, but timholy/SnoopCompile.jl#253 wants

JET = "$UUID_JET" "if julia.VERSION >= 1.6"

(SnoopCompile already loads different functionality depending on the Julia version.)

@ericphanson
Contributor

ericphanson commented Aug 10, 2021

Just to add, another place this comes up in the various scenarios Lyndon outlined in #1285 (comment) besides with plot recipes or chain rules is with serialization packages like StructTypes or ArrowTypes. Those are lightweight packages to define how to serialize Julia objects to various targets, but upstream packages may not want them for one of the reasons Lyndon mentioned. I would love it if every package that defines a struct also defined several robust serialization options like JSON or Arrow so that serialization "just worked" the way plotting sometimes does, and I think glue packages can help fill the holes there.

Also, one other reason upstream may not want to add a light dependency is compat; if the light dep requires a later Julia version than the package wants to support, then it might not want to add it. That's a scenario where a glue package would work great, since if you're loading both packages then you must be on a compatible Julia version. (Though I guess this is pretty much what Tim said in #1285 (comment)).

@toollu

toollu commented Aug 4, 2022

So this issue is now 3 years old and still has many other open issues connected. Is there any path or timeline forward on resolving this?

@IanButterworth
Sponsor Member

Just in case anyone here missed it, there's a proposal at JuliaLang/julia#47040

@KristofferC
Sponsor Member Author

I think this is fixed now with the new "extension" functionality in Pkg.
