Support in code loading and precompilation for weak dependencies #47040

KristofferC · 2022-10-04T07:30:00Z

Weak dependencies are a solution to the problem where you want to be able to extend some package (e.g. via a method overload) but you don't really use the package itself so taking on a full dependency can be too expensive (in terms of e.g. load time).

Weak dependencies are a bit similar to optional dependencies in Rust (https://doc.rust-lang.org/cargo/reference/features.html#optional-dependencies), with the difference that the optional dependency is not explicitly opted into, but is based on the presence of the optional dependency in the current environment.

The problem weak dependencies solve is very similar to what Requires.jl solves, but weak dependencies have some advantages:

The "conditional code" is precompiled and does not use runtime eval like Requires.jl
There is support in Pkg for compat on weak dependencies.
Since it is precompiled, it is easier to use with PackageCompiler. Requires.jl right now tries to pattern match against includes and serializes a string of those files so that it can be evaluated during runtime even when the file does not exist.
It is simpler in the sense that there are no runtime callbacks. Everything is static based on the current environment.
The interaction with stacked environments is simpler. Requires.jl runs conditional code no matter which environment a dependency is loaded from, whereas weak dependencies only look at the environment of the package itself (which is also where the compat of the weak dependency is applied).
You have the ability to run code in the "else" case, where the weak dependency is not installed. With Requires, you never know up front if the conditional code will run at some point in the future.

In summary, there is virtually no disadvantage to a dependency being weak. If a dependency can be weak, it pretty much should be. The end goal is to reduce artificial "dissection" of packages (cf. StaticArraysCore.jl) and to make the size of the average dependency graph significantly smaller, leading to shorter load times and smaller artifacts from e.g. PackageCompiler.

An example of packages using weak dependencies can be found in https://github.com/IanButterworth/WeakDepsExamples (where the syntax in the package and the TOML files schema can be seen). The registry format can be seen in https://github.com/IanButterworth/General/tree/ib/weak_deps/H/HasWeakDeps.

For simplicity of testing, this PR also contains a change to the Pkg version that supports weak dependencies (JuliaLang/Pkg.jl#3216).

This work has been done in collaboration with @IanButterworth

Implementation

Base

An API for packages to check if a weak dependency is "active" (available to load). This is used at top-level for packages to determine if they will (can) load and run the code for the weak dependency.
Invalidation of precompile files based on changes to the set of active weak dependencies. This works similarly to compile-time Preferences, where a piece of information is stored in the ji file (based on the Manifest.toml file, or the LocalPreferences.toml file in the Preferences case) and then checked upon loading the ji file.
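
As a rough illustration of the invalidation idea, here is a minimal sketch, assuming the cache file simply records which weak dependencies were active when it was built. The names `CacheHeader`, `active_weakdeps`, and `isstale` are hypothetical and are not the actual Base internals, and the manifest is modeled as a plain set of package names.

```julia
# Hypothetical model: which of a package's weak dependencies are present
# in the manifest (here just a set of package names).
active_weakdeps(weakdeps, manifest) = sort([d for d in weakdeps if d in manifest])

# Hypothetical stand-in for the piece of information stored in the ji file.
struct CacheHeader
    active::Vector{String}
end

# The cache is stale when the set of active weak dependencies has changed
# since the cache was written.
isstale(header, weakdeps, manifest) =
    header.active != active_weakdeps(weakdeps, manifest)

weakdeps = ["CUDA", "Plots"]
header = CacheHeader(active_weakdeps(weakdeps, Set(["CUDA", "JSON"])))

still_valid = !isstale(header, weakdeps, Set(["CUDA", "JSON"]))   # same environment
needs_rebuild = isstale(header, weakdeps, Set(["CUDA", "Plots"])) # Plots was added
```

In this model, adding or removing a weak dependency from the environment changes the recorded set, which is what would trigger recompilation.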

Pkg

Writing and reading projects / manifests with weak dependencies.
Support for compatibility on weak dependencies
Support for reading weak dependencies from the registry
Support for status printing with weak dependencies (pkg> st --weak)

These are implemented in JuliaLang/Pkg.jl#3216.

Registrator

Support for writing packages with weak dependencies to the registry. This is not done yet but should be fairly easy, since it does exactly the same as what is done for normal dependencies, except that it writes to another file.

giordano · 2022-10-04T08:29:02Z

First of all, thanks for addressing this issue!

It is simpler in the sense that there are no runtime callbacks. Everything is static based on the current environment.

Do I understand correctly that in large environments this may lead to accidentally increase loading time, as the weak dependency may be pulled in by some other random packages, instead of being added directly by the user to the environment? As seen in https://github.com/IanButterworth/WeakDepsExamples/blob/adb19bec21d763566c4419bf1d459fcf130ee9ae/HasWeakDeps.jl/src/HasWeakDeps.jl#L9-L19, the code one would typically write is

if Base.@hasdep CUDA
    using CUDA
    # do stuff with CUDA
else
    # do stuff without CUDA
end

so CUDA has always to be loaded if it just happens to be present in the environment. On the other hand I appreciate this works better with precompilation and makes code loading more deterministic.

LilithHafner · 2022-10-04T09:31:02Z

This is lovely! As a style nit, the pattern

if Base.@hasdep CUDA
    using CUDA

seems redundant. Is there any case where one would call Base.@hasdep CUDA and not subsequently load the package?
Perhaps we could roll those two lines into each other as

if Base.@optional using CUDA
    # do stuff with CUDA
else
    # without
end

or

if Base.@optional using CUDA: something_from_cuda
    # do stuff with CUDA
else
    # without
end

or

if Base.@optional import CUDA.something
    # do stuff with CUDA
else
    # without
end

Base.@optional could have a different name like @weak or @conditional.

KristofferC · 2022-10-04T11:40:51Z

Do I understand correctly that in large environments this may lead to accidentally increase loading time

For large environments where only a small amount of the environment is typically loaded in a session, yes. But generally, that is not a great idea in the first place.

As a style nit, the pattern
if Base.@hasdep CUDA
   using CUDA
seems redundant

Kind of yes. But I think the current one is so much simpler that a bit of redundancy is not too bad. And who knows, maybe you don't want to load the package in that block for some reason.

ararslan · 2022-10-04T16:13:57Z

Such awesome work! This is really exciting.

Would it be possible to spell out "dependency," e.g. hasdependency? Just calling it "dep" is somewhat ambiguous since we also use that to mean deprecation in e.g. depwarn. (It could also be nice to note somehow that it's a conditional specific to weak dependencies.)

KristofferC · 2022-10-04T16:18:45Z

Would it be possible to spell out "dependency," e.g. hasdependency? Just calling it "dep" is somewhat ambiguous since we also use that to mean deprecation in e.g. depwarn.

Yeah, that needs some bike shedding. I just went with something quick as to not get stuck thinking about it.

giordano · 2022-10-04T20:22:43Z

For large environments where only a small amount of the environment is typically loaded in a session, yes. But generally, that is not a great idea in the first place.

To elaborate on my comment above, my ultimate question is: how about defining a "weak dependency" as a dependency that is explicitly added to the environment (i.e., it's in the project file) and doesn't simply happen to be in the environment because it's been pulled in by some other packages?

gbaraldi · 2022-10-04T20:55:16Z

I agree with mose. Since we always load the package if it's in the environment, I guess requiring it to be explicitly added makes sense.

KristofferC · 2022-10-04T21:14:07Z

I agree with mose. Since we always load the package if it's in the environment, I guess requiring it to be explicitly added makes sense.

I don't really understand this. If you have a project with only package A in it, A depends on B and C, inside A you call C.compute(B.getobject()) and C has a weak dep on B to provide an optimized compute routine for the object returned by B.getobject(), wouldn't you want that optimized routine to be available in this scenario?
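
The scenario can be sketched with plain modules standing in for the packages. This is only an illustration under stated assumptions: `B.Object`, the return strings, `A.calculate`, and the module `CWeakB` (standing in for the conditional code that C would run when B is available) are all hypothetical names.

```julia
# Package B defines a type and a function returning an object of that type.
module B
    struct Object
        value::Int
    end
    getobject() = Object(21)
end

# Package C has a generic routine and no hard dependency on B.
module C
    compute(x) = "generic"
end

# Stand-in for the code C would run conditionally when B is available:
# it adds an optimized method for B's type.
module CWeakB
    import ..B, ..C
    C.compute(x::B.Object) = "optimized: $(2 * x.value)"
end

# Package A depends on both B and C and passes B's object into C.
module A
    import ..B, ..C
    calculate() = C.compute(B.getobject())
end

result = A.calculate()  # dispatches to the optimized method
```

Neither A nor the top-level project mentions the weak dependency, yet the specialized method is the one that gets called, because the object flows from B through A into C.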

gbaraldi · 2022-10-04T21:30:18Z

To give a clear example, imagine package x has an optional CUDA.jl dep and could internally use GPU processing in a function even if it returns a normal Julia array.
Do we want it to use the CUDA version if it finds CUDA via some other random package, or do we want CUDA to be added explicitly?
I'm not sure if I want a weak dep looking at deps of other packages, it seems a bit odd.

KristofferC · 2022-10-04T22:19:10Z

I am not sure why you didn't comment on my example at all but anyway.

I'm not sure if I want a weak dep looking at deps of other packages, it seems a bit odd.

To rephrase my previous example. Types can "flow" between packages that do not have a dependency relationship by moving up the dependency chain (by returning objects of that type) and then downwards again (by calling functions with those objects) thanks to generic code. You may want to be able to specialize code on the chance that you will be called with some specific type, without having to take on a full dependency on the package that defines the type for that value. This is true even if all of the packages here being considered are all somewhere "deep" in the dependency graph.

I don't think you would use weak dependencies for your example but probably something more like the Preference system and then say use_gpu = true or something along those lines. Weak dependencies would mostly be used for adding new dispatch rules for types defined in other packages.

Seelengrab · 2022-10-05T11:38:22Z

I think the split where you're talking past each other is about whether func(::CuArray) that's placed in an if Base.@hasdep CUDA in a module Foo should already be compiled/available even if CUDA is just a transitive dependency in a project explicitly using Foo.

I'd say that yes, these functions should be available - without large static analysis, it'd be quite hard to determine whether a callchain involving func can end up with a CuArray ahead of calling time. Whether that function will actually be called with a CuArray or not is not the concern of Foo, but of the package that's ultimately really depending on CUDA (either explicitly, or also by having only a weak dependency and the environment containing Foo having CUDA explicitly).

So "opting into" CuArrays in this model either means explicitly adding CUDA to the outermost project environment, with all packages only depending weakly on it (seems preferable to me), or (for compatibility) having a dependency switch between Array and CuArray via Preferences. Ideally, packages would provide both though: weakly depend on CUDA and also have a switch for whether it should be used internally.

antoine-levitt · 2022-10-07T07:19:12Z

For large environments where only a small amount of the environment is typically loaded in a session, yes. But generally, that is not a great idea in the first place.

The implications of this (awesome) pr for this particular use case go over my head, but I, and most people I know, exclusively use one big environment. I know that's probably not optimal and feel vaguely guilty for doing so, but it's the simplest solution and it works sufficiently well that I've never bothered to change. Generally speaking people commenting here probably use more advanced workflows than the average julia user, so I just wanted to point out that yes, this is a prevalent use case.

KristofferC · 2022-10-07T08:48:02Z

I should mention that there is one alternative possible implementation that works better with large environments where typically only a small part of the environment is loaded (but it has other drawbacks). It does this by pushing the loading of conditional code to runtime, based on the loading of other packages (similar to what is described in #43119, but with some tweaks to support precompilation), which is also more similar to Requires.jl.

The implementation would roughly be:

Each code block that should be conditionally executed ("glue code") gets put into separate files.
The Project.toml file lists the files that contain the "glue code" for a set of weak dependencies, for example:
```
name = "Package"

[gluecode]
"glue/AGlue.jl" = ["A"]
"glue/BCGlue.jl" = ["B", "C"]
```

The content of e.g. glue/AGlue.jl is something like:

 module AGlue
 using Package
 using A
 Package.f(x::A) = ...
 end #module

When package A gets loaded, the file glue/AGlue.jl is loaded as well and is treated as its own "mini package". For example, it has its own precompile file.
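
A minimal sketch of that trigger logic, assuming a glue file becomes loadable once all of its listed packages are loaded. The dictionary mirrors the [gluecode] table above, and `ready_glue` is a hypothetical helper, not an existing loader function.

```julia
# Mapping from glue file to the packages that must all be loaded first
# (mirrors the [gluecode] table in the example Project.toml).
const GLUECODE = Dict(
    "glue/AGlue.jl" => ["A"],
    "glue/BCGlue.jl" => ["B", "C"],
)

# Return, in sorted order, the glue files whose trigger packages are all loaded.
function ready_glue(gluecode, loaded)
    return sort([file for (file, triggers) in gluecode if all(in(loaded), triggers)])
end

after_a_and_b = ready_glue(GLUECODE, Set(["A", "B"]))
after_all = ready_glue(GLUECODE, Set(["A", "B", "C"]))
```

With only A and B loaded, just the A glue file qualifies; the B/C glue file has to wait until C is loaded too.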

The advantages of this are:

You only pay the cost of precompiling and loading glue/AGlue.jl when A is actually loaded.
Less need for recompilation. With weak dependencies, if you remove a package you might have to recompile a large part of your environment due to changes in active weak dependencies. With "glue code" you just stop loading that part.

The disadvantages are:

You need to separate out all the glue code in different files.
You need to come up with names for the modules (maybe there could be some convention for this).
The glue code "package" is not really a first-class package but there will still be methods defined in it
It is hard for Package to get a handle to the AGlue module since it is just loaded at some point during runtime. But AGlue could fire some hook in Package in its glue code to tell Package that it got loaded so that might be fine.
It is hard to get a handle to the AGlue module from the REPL. Say if you have a variable defined in there, how do you even access that? There could be some macro, say @gluecodemodule Package A that would give you that module.
You need to list the conditional deps => file mapping in the Manifest.toml, since we do not want to have to look into every package's Project.toml file when packages are loaded, only the project and manifest of the current env.

I know @vtjnash prefers this implementation :P

ViralBShah · 2022-10-30T00:21:20Z

Is this likely to make it into 1.9?

rssdev10 · 2022-11-22T16:55:36Z

Hello, sharing a few words about CUDA. Some time ago we found an issue with CUDA drivers on a machine without NVidia hardware. We had a problem with a build agent located on AWS.

https://github.com/jw3126/ONNXRunTime.jl/pull/23/files

We used ONNXRunTime, and the initial code of that package looked like this:

function __init__()
    @require CUDA="052768ef-5323-5732-b1bb-66c8b64840ba" include("cuda.jl")
end

The issue was that we couldn't control the usage of CUDA in third-party packages, and using CUDA was present in some of them. As a result, @require CUDA sees that the package is already loaded and includes cuda.jl, which tries to activate the drivers and then fails.

The correct way is to do an additional check:

function __init__()
    @require CUDA="052768ef-5323-5732-b1bb-66c8b64840ba" begin
        CUDA.functional() && include("cuda.jl")
    end
end

But CUDA is already loaded in that case. And we should use CUDA.functional() to check the hardware presence.

An additional question is binary compilation with CUDA support. If we want to prepare a docker image with CUDA support for production use, the build agent must be able to activate CUDA drivers too. But that might not be the case on AWS or another cloud platform. The preferable way is to build a package on a machine without NVidia hardware/CUDA support but use the docker image on a production cluster with NVidia hardware.

gbaraldi · 2022-12-08T19:58:03Z

@KristofferC Can we close this since package extensions got merged?

KristofferC · 2022-12-08T20:46:21Z

Yes

add support for weak dependencies in code loading and precompilation

1fc4138

KristofferC added the domain:packages Package management and loading label Oct 4, 2022

KristofferC mentioned this pull request Oct 4, 2022

WIP: Support for weak dependencies JuliaLang/Pkg.jl#3216

Closed

change Pkg to weak deps supported version

ff09903

KristofferC force-pushed the kc/weak_deps branch from 9e9d3e4 to ff09903 Compare October 4, 2022 12:21

IanButterworth mentioned this pull request Oct 7, 2022

Proposal for first class support of conditional dependencies in Pkg JuliaLang/Pkg.jl#1285

Closed

IanButterworth added the status:triage This should be discussed on a triage call label Oct 13, 2022

LilithHafner mentioned this pull request Oct 26, 2022

Automatically resolve most method ambiguities #47325

Open

ToucheSir mentioned this pull request Nov 6, 2022

Support multiple GPU backends FluxML/Flux.jl#1566

Closed

5 tasks

KristofferC mentioned this pull request Nov 24, 2022

Add support for "package extensions" to code loading #47695

Merged

This was referenced Nov 27, 2022

Make Pluto dependency optional? JuliaDocs/DemoCards.jl#130

Closed

[FR] redesign backend dependencies JuliaPlots/Plots.jl#4567

Closed

KristofferC closed this Dec 8, 2022

vchuravy deleted the kc/weak_deps branch December 8, 2022 20:50

oscardssmith removed the status:triage This should be discussed on a triage call label Feb 2, 2023
