glue code (aka conditional dependencies) #43119

Closed
StefanKarpinski opened this issue Nov 17, 2021 · 18 comments

Comments

StefanKarpinski commented Nov 17, 2021

This is related to the discussion in JuliaLang/Pkg.jl#1285 but the plan I propose doesn't touch Pkg at all, so I'm writing it up here. @Wimmerer contacted me about wanting to work on this feature and we had a back and forth about what needs to be done. Here's what I propose as a concrete solution to the glue code problem, aka conditional dependencies.

The problem we want to address occurs when there are packages, say A and B, which don't depend on each other, but when both are loaded at the same time, there are some additional definitions, usually of methods, that are needed to make them work well together. For example, suppose A provides a function A.f and B defines a type B.T and there's a method A.f(::B.T) that would be useful to have. This method cannot be defined in A since B.T isn't available there and it can't be defined in B since A.f isn't available there. We need a way to provide that definition at the point when both A and B have been loaded.

The solution comes in two parts.

Part 1: post-load hooks for sets of packages

The first part is to implement and expose a mechanism whereby you can register a callback to be called when a set of packages has been loaded. The packages should be identified by UUID and it should be an arbitrary set of packages, not just one or two packages. This logic goes right after a new package has been loaded and should check, for each registered hook, whether the set of loaded package UUIDs is a superset of the set of required package UUIDs; if it is, call the hook callback and then delete the hook.
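
To make the mechanism in Part 1 concrete, here is a minimal sketch of such a hook registry. The names `HOOKS`, `LOADED`, `register_hook` and `package_loaded!` are hypothetical and not part of Base; a real implementation would live in the code-loading machinery rather than in user code.

```julia
using UUIDs

# Hypothetical registry: each entry pairs a set of required package UUIDs with a callback.
const HOOKS = Vector{Pair{Set{UUID},Function}}()
# Hypothetical record of every package UUID loaded so far.
const LOADED = Set{UUID}()

"Register `callback` to run once every UUID in `required` has been loaded."
function register_hook(callback::Function, required)
    push!(HOOKS, Set{UUID}(required) => callback)
    return nothing
end

"Record that `uuid` was loaded, then fire and delete every hook whose requirements are met."
function package_loaded!(uuid::UUID)
    push!(LOADED, uuid)
    ready = filter(h -> issubset(first(h), LOADED), HOOKS)
    # Delete the hooks before calling them, so each one runs at most once.
    filter!(h -> !issubset(first(h), LOADED), HOOKS)
    foreach(h -> last(h)(), ready)
    return length(ready)
end
```

A hook registered for the set {A, B} stays dormant after `package_loaded!(uuid_of_A)` and fires exactly once when `package_loaded!(uuid_of_B)` is called, regardless of the order in which the two packages arrive.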

Part 2: glue/*.jl entry-points

The second part leverages the first mechanism and uses it to load some glue code that is included in packages that are loaded. The idea is that package A will have a top-level directory called glue and a file glue/B.jl which contains the code that implements the methods that are useful when both A and B are loaded. In more detail:

  • When Julia loads a new package, say A, it should look at all glue/*.jl files
  • For each one, split the name on commas—those are the names of the glue dependencies
    • e.g. glue/B.jl in the package A would be the file providing glue definitions for A and B
    • If any part of a name isn't a valid package name, ignore the file and continue
  • To resolve each glue dependency name to a package UUID, look it up in a new section of A's project file entitled [glue] with the same format as the [deps] and [extras] sections, i.e. mapping names to UUIDs
    • If a name doesn't exist there, error or maybe ignore?
  • Use the contents of the glue file to define a hook that will trigger when the set of glue dependencies have all been loaded, using the mechanism created in part 1.

So, glue/B.jl in the package A would be the glue for A and B and glue/B,C.jl would be glue for the packages B and C together, i.e. definitions that depend on all three of A, B and C.
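
The name-resolution steps above can be sketched roughly as follows. The function `glue_dependencies` is hypothetical, the two UUIDs are made up for illustration, and treating "valid package name" as "valid Julia identifier" is an assumption. Since the proposal leaves open whether a name missing from [glue] should be an error, this sketch simply ignores the file.

```julia
using TOML, UUIDs

# Hypothetical [glue] section of package A's project file; both UUIDs are invented.
const PROJECT = TOML.parse("""
[glue]
B = "11111111-1111-1111-1111-111111111111"
C = "22222222-2222-2222-2222-222222222222"
""")

"Return the UUIDs named by a glue file such as `B,C.jl`, or `nothing` if the file is ignored."
function glue_dependencies(filename::AbstractString, glue::AbstractDict)
    endswith(filename, ".jl") || return nothing
    # Strip the ".jl" suffix and split the rest on commas.
    names = String.(split(chop(filename; tail=3), ','))
    # Assumption: a valid package name is a valid Julia identifier.
    all(Base.isidentifier, names) || return nothing
    # Open question in the proposal ("error or maybe ignore?"): here we ignore.
    all(n -> haskey(glue, n), names) || return nothing
    return [UUID(glue[n]) for n in names]
end
```

With this, `glue_dependencies("B,C.jl", PROJECT["glue"])` yields the two UUIDs that, together with A's own, make up the set the post-load hook waits for.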

The next question is how to use the glue file to define a hook. We want the behavior of the glue files to be fairly constrained, so I propose that the hook generated look something like this:

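()->begin
    glue_module = Module()                         # fresh anonymous module
    Core.eval(glue_module, :(import A, B))         # make A and B visible to the glue code
    Base.include(glue_module, abspath(glue_file))  # evaluate the glue file in that module
end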

This evaluates the glue code in an anonymous module that imports A and B. I'm not sure how easy it would be to rig it up so that this is done in a context where only A and B can be loaded and if that's really necessary or desirable, but otherwise the code can load anything that can be loaded in Main, which probably isn't ideal or sensible for glue hooks.

This hook could probably be expressed generically and be written as glue_hook(glue_path, A, B) or something like that, which might avoid some code generation. Or in the other direction, we could insert the contents of the glue file into the above template and generate that function. But I suspect that's not what we want. Probably better to be late binding and do less code generation.
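
One possible shape for the generic, late-binding variant is sketched below. Only the name `glue_hook` comes from the text; its signature and behavior here are assumptions. Instead of running `import A, B`, this sketch binds the already-loaded module objects under their own names, which is a simplification that sidesteps package lookup entirely.

```julia
"Hypothetical late-binding hook: evaluate `glue_path` in a fresh module that can see `deps`."
function glue_hook(glue_path::AbstractString, deps::Module...)
    glue_module = Module()
    for dep in deps
        # Bind each already-loaded dependency under its own name,
        # standing in for `import A, B` in the template above.
        Core.eval(glue_module, :(const $(nameof(dep)) = $dep))
    end
    # Evaluate the glue file's contents inside the anonymous module.
    Base.include(glue_module, abspath(glue_path))
    return glue_module
end
```

Calling `glue_hook(path, A, B)` on a file containing `A.f(::B.T) = ...` adds that method to `A.f`. No code is generated per glue file; the same function serves every hook.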

The actual contents of a glue file would be pretty simple. For example, the glue file for defining the method A.f(::B.T) would simply contain that method definition:

A.f(::B.T) = definition

This would either be glue/B.jl in package A or glue/A.jl in package B.

Questions

What happens if glue/B.jl in package A and glue/A.jl in package B both exist? Load them both.

What about the order? Eh, whatever order they happen in is probably fine, but we could maybe have a defined ordering based on UUIDs or something.

What if they have conflicting definitions and it breaks? That's a normal package incompatibility between those versions of A and B, although it's a slightly weird one because neither A nor B depends on the other but their versions are incompatible. I'm not certain if we can express that in [compat]—it's possible that we can't. At the very least, [compat] probably needs to know about the [glue] section, which may end up being a reason to put glue dependencies in the [extras] section since I think [compat] may already know about that.

Why not make glue packages real registered external packages that live outside of both A and B? Because that complicates things massively. In that design, Pkg needs to know about glue packages and needs to make sure that whenever A and B are both in a manifest, then Glue_A_B is also included in the manifest. Also, registering glue packages and versioning them separately seems like a lot of overhead.

What if the glue behavior depends on what features the particular version of B that gets loaded has? The glue code can do whatever metaprogramming it needs to, reflecting on B, its version, and features.

Edit: couple of added questions

Why does the glue code go in A/glue/ instead of somewhere in A/src/—isn't it part of the A package? No, not really. If you just load A then you load what's in A/src and you don't load anything from A/glue at all. The glue stuff is really external lightweight packages that depend on A, which is why it makes sense to put them outside of the main codebase of A.

What's really so bad about making glue packages explicit separate packages? For one thing, we don't even have a way of expressing that to Julia's version resolver and it's unclear how we would even do this. The resolver currently understands two things: (1) that a version of a package depends on some other set of packages and (2) that some versions of packages are incompatible with each other. The conditional dependency pattern can't be expressed in terms of these two features; it's a new kind of thing entirely. You need to express that for some subset of pairs of versions of two packages, you need to load yet another package if you've chosen both those versions. What about other pairs of versions of those two packages? Are those incompatible or are they just pairs for which the glue isn't necessary or doesn't work? Both are valid concepts. So now you need to express several things:

  1. That there are certain n-tuples of packages, which, if all present in a resolver solution, require that yet another package be present in the solution.
  2. Which versions of those packages that reverse dependency applies to. Do you express this as a Cartesian n-product of version specs? There are sets of version tuples that can't be expressed that way. Do you allow a union of Cartesian n-products of version specs? Or allow cutting out n-products in a nested fashion? Note that this is just for specifying which versions of the n-tuple of packages the reverse dependency applies to.
  3. You still need to express compatibility: which versions of each dependency is the glue package compatible with? The good news is that this would just be a normal compatibility constraint that we already know how to express and solve for. The bad news is that this is a different and separate concept from (2): you have to know when you need the glue package at all before you can decide which versions will work.

This is a lot of complicated new crap to cram into the registry and teach the resolver about. How does this proposal avoid this problem? It locks the glue code to the version of one or both reverse dependencies and then it's just the usual problem of picking those such that they're compatible.

vtjnash commented Nov 17, 2021

Duplicate of #6195?

StefanKarpinski commented Nov 17, 2021

No. I'll grant you that it's related, but your definition of "duplicate" is very odd. That issue is closed and has been for four years. It also doesn't have a specific concrete plan of action. Should we really close this issue and reopen that one? Is that in any way more likely to lead to progress on this problem?

rayegun commented Nov 17, 2021

My biggest question as I look into this is whether precompilation and PkgCompiler will work doing this with a little elbow grease?

@StefanKarpinski

My guess is that they might just work, but @vtjnash and @KristofferC might have to weigh in.

@Seelengrab

May I suggest a different title for this issue, like "Environment based conditional glue code"? At first I thought this was also about allowing to specify that a given package is only a dependency if e.g. the current julia version is a concrete or earlier version than some specified version, without having to split the releases & maintenance over two different releases. At least, something like that is what I understand under the term "conditional dependency".

Sounds like a good idea in general though, to allow for special casing when some other package is available in the environment our package is loaded in and we have specialized code for that package.

@bvdmitri

Looks like a nice feature. I have some extra questions though:

  1. What package defines A.f(::B.T) = definition? Is it located in A/glue/B.jl or B/glue/A.jl and why? Probably it needs to be restricted somehow. For me there is no easy way to say what type of glue code should be allowed.
  2. I doubt that in case of conflicts it is "a normal package incompatibility". Returning to the previous bullet: I can imagine situations where package developers couldn't agree on an exact implementation and defined conflicting versions of A.f. This would certainly ruin the user's experience because glue code is loaded implicitly. Why should a user care about conflicting glue functionality if he/she wants to use the packages separately, without glue code, in the same session? This also seems like a legitimate way to break other packages' functionality. Maybe introduce a new keyword to load glue code in addition to using?

KristofferC commented Nov 18, 2021

I feel this proposal should contrast how it works vs how Requires works (which already uses a post-load hook when a package is loaded to load arbitrary "glue" code) as well as the proposal in JuliaLang/Pkg.jl#1285. I must say, this looks very similar to Requires (I actually don't see what the difference is). It should probably also discuss how this fixes the problems identified under "What is the problem with Requires.jl" in JuliaLang/Pkg.jl#1285. In my opinion, this has the same weaknesses.

Personally, I feel that a proposal that is based on running something when packages are "loaded" is set up to fail. It should be declarative and something that can be checked during precompile time (for example, if a package is present in the current manifest) so that the glue code can be precompiled together with the package. (And yes, this requires modifications to Project.toml files to add a conditional dependency there so you have a name -> UUID mapping to load the conditional dependency to run the glue code.)

StefanKarpinski commented Nov 18, 2021

@bvdmitri, I think you're treating a social problem as if it were a technological problem. It's fine for glue definitions to go in either A/glue/B.jl or B/glue/A.jl but one or the other package will probably decide to do it first and it will live there. If it makes sense for some reason to move it elsewhere, then people will move it.

I can imagine situations where package developers couldn't agree on an exact implementation and defined conflicting versions of A.f. This would certainly ruin the user's experience because glue code is loaded implicitly.

Package authors have to not do this. Again, this is a social problem, not a technological one.

This also seems like a legitimate way to break other package functionality. Maybe introduce a new keyword to load glue code in addition to using?

This is already possible and Julia does nothing to prevent it.


@KristofferC: I feel this proposal should contrast how it works vs how Requires works

Ok, here's what you wrote about the issues with Requires over in JuliaLang/Pkg.jl#1285:

  1. It doesn't work well with precompilation. The way people tend to use Requires is by include-ing some file when the conditional dependency is available. Requires.jl runs inside __init__ which means the code evaluated by the include command does not end up in the precompile file.

This doesn't run inside of __init__ so it doesn't have that issue. Regarding precompilation though, it may have the same issue since I think we save precompile files only for loading a single package and there's no actual glue package here. We could, however, modify precompilation to save precompile files for these glue pseudo-packages.

  2. It is "implicit" in the sense that the conditional dependency is only defined in the Julia code. We typically want to put all dependency information inside the Project file.

This is explicit and can be entirely determined from the project source.

  3. There is currently no good way to set compat bounds on the conditional dependency.

In Requires the issue is that the external dependency isn't an explicit dependency. This has a similar issue in that neither A nor B depend on each other, but we can extend [compat] to allow compatibility constraints on [glue] dependencies.

  4. It has performance problems (Improving @require performance JuliaPackaging/Requires.jl#39)

I don't really understand the performance problem with Requires, so it's hard to address this.

  5. Basing the activation criteria of the conditional dependency on it simply being loaded might mean that packages loaded from other places than the current project will affect whether the glue code is run or not. It would be better to base it only on the current active project.

That's a good point that this proposal doesn't address. We could base it on what's in the manifest instead. In that case the glue hook would only be registered if A/glue/B.jl exists and A and B are in the active manifest. If they're only somewhere later in the load path, we could skip loading the glue.

I'm not sure that's the right call though. I think that people would be confused that when they load A and B sometimes they are able to use A.f(::B.T) and other times they aren't. We don't currently worry all that much about the fact that something from later in the load path can type pirate something earlier in the load path. If you don't want that to happen, don't load stuff that's not in the active project. And in general, we discourage people from committing the kind of piracy that makes this a problem.

as well as the proposal in JuliaLang/Pkg.jl#1285.

Man, there's like three or four different morphing proposals over there. Not even sure what to cover. This is why this problem still isn't solved. Frankly, this isn't a pain point for me personally, it's something that @Wimmerer wanted to work on, @ViralBShah connected him with me, and this is the plan I conveyed to him, and then I promised to write it up on GitHub so he could work on it if he wanted to.

Personally, I feel that a proposal that is based on running something when packages are "loaded" is set up to fail. It should be declarative and something that can be checked during precompile time (for example, if a package is present in the current manifest) so that the glue code can be precompiled together with the package.

This seems like it could be precompiled straightforwardly. Precompilation would need to learn to precompile glue "packages" like it precompiles real packages, but that's doable. And it is declarative: you can tell, without running any code, whether you'll need glue code or not, assuming only that all the packages in the manifest are actually loaded. Of course, we already can't accurately make that assumption in any Julia project, but it's fair to just pre-load the entire manifest and then run the code, and that should generally work close to the same. Glue code doesn't affect that.

@StefanKarpinski

I've added some additional Q&A at the bottom of the original post in an attempt to keep the proposal all in one place rather than spreading it out in a long discussion.

StefanKarpinski commented Nov 18, 2021

@Seelengrab: May I suggest a different title for this issue, like "Environment based conditional glue code"?

Well, that's why I don't call these "conditional dependencies"—because I think it's a confusing and bad name. I call it glue code. Other people have called it conditional dependencies, but that's what they'll probably search for, so that's why I used the term here.

KristofferC commented Nov 22, 2021

This is explicit and can be entirely determined from the project source.

You would need to extend the manifest format then to be able to see the "glue dependencies" of packages that are not direct dependencies though? This would require Pkg to emit that type of manifest.

I'm not sure that's the right call though. I think that people would be confused that when they load A and B sometimes they are able to use A.f(::B.T) and other times they aren't.

With the "static" approach, they would always be able to do A.f(::B.T) if A and B are in the same environment which I think is the assumption? Starting to support this in stacked environments sounds hard to get right.

In Requires the issue is that the external dependency isn't an explicit dependency. This has a similar issue in that neither A nor B depend on each other, but we can extend [compat] to allow compatibility constraints on [glue] dependencies.

This requires Pkg support.



Man, there's like three or four different morphing proposals over there. Not even sure what to cover.

I can do some comparisons:

To resolve each glue dependency name to a package UUID, look it up in a new section of A's project file entitled [glue]

That's identical to the linked proposal with the only difference being glue vs conditional-deps:

[conditional-deps]
DataFrames = "$UUID_DATAFRAMES"

[compat]
DataFrames

The idea is that package A will have a top-level directory called glue and a file glue/B.jl

This is identical to the proposal with the difference being the actual path:

The gluecode would be stored (based on a documented convention) in a file inside the package, eg src/DataFramesGlue.jl


The first part is to implement and expose a mechanism whereby you can register a callback to be called when a set of packages have been loaded.

This is the same as in the linked proposal:

When DataFrames gets loaded, we check all packages that declares a conditional dependency with it. If the version of DataFrames loaded is compatible with the compat entry for a package with DataFrames as a conditional dependency, we load the glue code which will act like a normal package and precompile.



So the proposal in JuliaLang/Pkg.jl#1285 looks functionally equivalent to the one here with the only difference being things like names and paths?

To make the difference between this proposal and Requires.jl more clear here are some concrete points:

  • You need to explicitly specify "glue" packages (I can't really get used to this name :P) in the Project.
  • Instead of specifying arbitrary code to run when a set of packages are loaded, you create files based on the names of the glue code packages.
  • Instead of using __init__ to add the package loading callbacks to run the arbitrary code, Julia basically adds "native" support for this. The advantage of this over using __init__ is that this will only run when the package is actually loaded and not run when loading a sysimage with the package (which also runs module initializers).

I think most of this is ok and would be an improvement (obviously since the proposal here is almost identical to JuliaLang/Pkg.jl#1285). I do think it can be slightly annoying to have to split up all the glue code between different files (like https://github.com/KristofferC/PGFPlotsX.jl/blob/master/src/requires.jl) but it allows the expression to be evaluated when the glue code "fires" to be statically known.

What I think is needed is a concrete proposal for the precompilation of the glue code. If the glue code is included as part of the precompilation process (because only the information from the Manifest is used), this works automatically. It basically turns the glue code into a normal file that is part of the package that just happens to be conditionally included. If not, these glue code files need to be fitted into the precompilation system:

  • The files need to be modules (what should the name of the module be?)
  • They need to get a deterministic UUID (uuid5 I guess)
  • Support needs to be added for things like identify_package, locate_package for these modules (alt. the precompilation system needs to special-case these).

KristofferC commented Nov 22, 2021

Another thing that I think needs elaborating on is how the compat on conditional dependencies will work. In JuliaLang/Pkg.jl#1285 I wrote:

If the version of DataFrames loaded is compatible with the compat entry for a package with DataFrames as a conditional dependency, we load the glue code which will act like a normal package and precompile.

So in that proposal, compat on conditional dependencies never influences the resolver. The only thing it does is decide whether the glue code should be loaded or not. So if you have a glue compat on B that is 2.0 and you happen to get B version 3.0 in your environment, the glue code will not be loaded. (This would require teaching code loading a bit about compat.) I think this can be slightly confusing, but if the glue code compat should be enforced, then we would need to have that information in the registry, which seems annoying to do at this point.

@DilumAluthge

So if you have a compat of B that is 2.0 and you happen to get B version 3.0 in your environment, the glue code will not be loaded.

We could print a warning in those cases, just so that the user is aware.
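
As an illustration of the load-time rule being discussed here, the following sketch decides whether glue is loaded from the loaded version alone and optionally warns when it is skipped. Both function names are hypothetical, and `caret_compatible` is a simplified stand-in for Pkg's caret semantics (it ignores pre-release tags and accepts a single version rather than a full compat spec).

```julia
"Hypothetical caret-style check: `v` is at least `spec` and below the next breaking version."
function caret_compatible(v::VersionNumber, spec::VersionNumber)
    v >= spec || return false
    spec.major > 0 && return v.major == spec.major
    spec.minor > 0 && return v.major == 0 && v.minor == spec.minor
    return v.major == 0 && v.minor == 0 && v.patch == spec.patch
end

"Decide whether to load glue under the load-time rule; the resolver is never consulted."
function should_load_glue(loaded::VersionNumber, glue_compat::VersionNumber; warn::Bool=true)
    ok = caret_compatible(loaded, glue_compat)
    # Optionally tell the user why the glue was skipped.
    !ok && warn && @warn "Skipping glue code: loaded version is outside the glue compat" loaded glue_compat
    return ok
end
```

With a glue compat of 2.0 and B at version 3.0, `should_load_glue(v"3.0", v"2.0")` returns `false` and emits the warning; with B at 2.3.1 it returns `true`.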

@StefanKarpinski

You would need to extend the manifest format then to be able to see the "glue dependencies" of packages that are not direct dependencies though? This would require Pkg to emit that type of manifest.

I don't follow this. Why would the manifest need to have anything about glue dependencies in it?

With the "static" approach, they would always be able to do A.f(::B.T) if A and B are in the same environment which I think is the assumption? Starting to support this in stacked environments sounds hard to get right.

Why is it hard to get right? If A or B have glue code for each other and both get loaded, then we load the glue code. End of story. What about stacked environments makes it harder? Yes, this means that we're ignoring potential incompatibility between things in different projects in the load path, but we already ignore those and let the chips fall where they may. I don't think the presence of glue code should change this.

This is the same as in the linked proposal:

When DataFrames gets loaded, we check all packages that declares a conditional dependency with it. If the version of DataFrames loaded is compatible with the compat entry for a package with DataFrames as a conditional dependency, we load the glue code which will act like a normal package and precompile.

Another thing that I think needs elaborating on is how the compat on conditional dependencies will work. In JuliaLang/Pkg.jl#1285 I wrote:

If the version of DataFrames loaded is compatible with the compat entry for a package with DataFrames as a conditional dependency, we load the glue code which will act like a normal package and precompile.

So in that proposal, compat on conditional dependencies never influences the resolver. The only thing it does is decide whether the glue code should be loaded or not. So if you have a glue compat on B that is 2.0 and you happen to get B version 3.0 in your environment, the glue code will not be loaded. (This would require teaching code loading a bit about compat.) I think this can be slightly confusing, but if the glue code compat should be enforced, then we would need to have that information in the registry, which seems annoying to do at this point.

I have a different take on this: I think that [compat] with glue should affect version resolution, but that once you've chosen versions, if there is glue code, we should load it unconditionally. That way worrying about versions is purely an issue for the resolver and by the time we're loading code, we don't care about versions anymore, we only care about package identities and the presence of glue code. If the version of A that we've loaded has glue for B then we should load that glue no matter what version of B is loaded, even if it's one that the resolver would not have considered compatible. Of course, within a manifest, the resolver should always pick a version of B that is compatible—according to current registry information—but if you end up loading a version that isn't compatible, the glue code still gets loaded.

Suppose we do it the other way and we decide whether to load glue code or not based on compatibility information. What do we consider to be the "one true source" of compatibility? Is it the registry or the project files of the packages we are loading? For everything else, we consider the registry to be definitive and only use project files as a source for registry data, but if we later edit the registry, then that is what we use for choosing versions. If we did that here, then a registry change could affect the behavior of a program: we resolved a manifest previously and picked compatible versions of A and B, so the A/glue/B.jl glue gets loaded. But then we realize those versions are actually broken together and we fix the registry compat info after-the-fact (maybe someone failed to follow semver, maybe the compat bounds were too loose). If we're using the registry as the definitive source of compat info, then by updating the registry but not re-resolving the manifest, we could end up in a situation where the same versions of A and B look incompatible and so the glue doesn't get loaded. Of course, mechanistically, this is also unnatural because doing this would require us to look in the registry for compatibility information during code loading, which we would never really do.

Suppose we don't look in the registry and instead consider the compat info in package project files to be the source of compatibility truth for the purposes of deciding to load glue code or not. At least that is immutable since it's inside the package source, so it can't change over time. It still doesn't seem much better to me: it means that there are now two different notions of compatibility in play that both matter, the one in the registry, which is used to decide what versions to install, and the one in the project files, which is used to decide whether to load glue code or not. Again, consider a situation where these have deviated. You could end up in a situation where two versions were marked as incompatible according to one or both of the project files, but that was wrong and the compatibility has since been fixed in the registry to indicate that the versions are actually compatible. Now you can resolve a single, apparently compatible manifest, where there is glue code for A and B but it mysteriously is not loaded. I can just imagine someone tearing their hair out about this. The very confusing reason is that even though the versions of A and B are compatible in the registry, they are not compatible according to their project files, so the glue code doesn't get loaded.

The main potential issue with unconditional glue code loading, regardless of compatibility, is that the glue code might not work as it should or fail to load at all. However, I don't see this as being any different from cases where A depends on B and you end up loading incompatible versions, which can happen if A is later in the load path than B so that they haven't been resolved together. In that case, things break and we just rely on people to update their environments to fix the incompatibility. We could print a warning in those situations but I guess my point is that we already allow this with direct dependencies, so in my view this situation is no different and while it can cause problems it's not a radically different class of problems than things we already allow. Yes, we could print a warning when we load glue code that's incompatible, but then I think we should also print a warning when we try to load any versions that are incompatible. (And again: incompatible according to who?)

In Requires the issue is that the external dependency isn't an explicit dependency. This has a similar issue in that neither A nor B depend on each other, but we can extend [compat] to allow compatibility constraints on [glue] dependencies.

This requires Pkg support.

Fair point. Specifically, what is required is this: if we're resolving a manifest which happens to contain any of the packages in any [glue] stanza, then we need to respect any [compat] restrictions on those packages; if those packages aren't present, then we can ignore compat restrictions on them. In other words: we just apply [compat] restrictions to every package that will be present in the manifest and ignore them for packages that aren't present in the manifest.
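
The rule in the last sentence amounts to a simple filter. The sketch below uses plain package names and a hypothetical `applicable_compat` helper; how Pkg would actually represent these constraints is not specified here.

```julia
"Keep only the [compat] entries whose package is present in the manifest being resolved."
applicable_compat(compat::AbstractDict, manifest_names) =
    filter(kv -> first(kv) in manifest_names, compat)
```

For a package with compat entries on glue dependencies B and C, resolving a manifest that contains only A and B would apply the restriction on B and ignore the one on C.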

So the proposal in JuliaLang/Pkg.jl#1285 looks functionally equivalent to the one here with the only difference being things like names and paths?

Yes, based on your breakdown, it's pretty similar. I think the key differences are these:

  1. Compatibility: resolution vs conditional loading. My proposal puts the compat restrictions on the resolution step and makes glue loading unconditional; your proposal ignores compat restrictions on glue during resolution but makes glue loading conditional.
  2. Glue path: your proposal puts the glue code in src whereas my proposal puts it outside of src in a special glue directory. I addressed why I think this is appropriate in my post: the glue code is not part of the package itself, it's an external thing that depends on the package.
  3. Multiple dependencies: Proposal for first class support of conditional dependencies in Pkg Pkg.jl#1285 says "When DataFrames gets loaded, we check all packages that declares a conditional dependency with it [and] we load the glue code". This doesn't handle cases where the glue code depends on more than one other package; my proposal handles that case since glue code is loaded when a set of dependencies are present.

I think most of the other differences are superficial. However, I do think that the section name is significant even though it's superficial: [conditional-deps] versus [glue]. I've mentioned that I feel that "conditional dependencies" is a confusing and inaccurate name: these aren't dependencies, they are little bits of code that depend on the things they're gluing together. "Conditional dependers" would be more accurate, but I think that "glue" gives the sense of what they do much better. I'd be open to other names but "glue" is short, accurate and suggestive. Anything that calls these dependencies is just incorrect.

What I think is needed is a concrete proposal for the precompilation of the glue code. If the glue code is included as part of the precompilation process (because only the information from the Manifest is used), this works automatically. It basically turns the glue code into a normal file that is part of the package that just happens to be conditionally included. If not, these glue code files need to be fitted into the precompilation system:

  • The files need to be modules (what should the name of the module be?)
  • They need to get a deterministic UUID (uuid5 I guess)
  • Support needs to be added for things like identify_package, locate_package for these modules (alt. the precompilation system needs to special-case these).

Yes, this is the meat of what needs to be decided.

The files need to be modules (what should the name of the module be?)

We could name the module for A/glue/B.jl based on the UUIDs of the packages in question. I.e. something like glue:$(uuidA),$(uuidB). This would also mean that glue:$(uuidA),$(uuidB) and glue:$(uuidB),$(uuidA) can both exist, which is good because they might both exist.

They need to get a deterministic UUID (uuid5 I guess)

Yes, a uuid5 hash of the module name would work.
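A minimal sketch of what that could look like, using `uuid5` from the `UUIDs` standard library. The namespace UUID and the helper names `glue_module_name` and `glue_uuid` are assumptions made up for this example; nothing here is an existing convention.

```julia
using UUIDs

# Hypothetical fixed namespace for glue modules; any agreed-upon UUID would do.
const GLUE_NAMESPACE = UUID("7a1b2c3d-4e5f-4a6b-8c7d-9e0f1a2b3c4d")

# Name built from the package UUIDs, following the glue:$(uuidA),$(uuidB) idea.
glue_module_name(a::UUID, b::UUID) = "glue:$(a),$(b)"

# Deterministic UUID: the same pair in the same order always hashes the same.
glue_uuid(a::UUID, b::UUID) = uuid5(GLUE_NAMESPACE, glue_module_name(a, b))

uuidA = UUID("11111111-2222-4333-8444-555555555555")
uuidB = UUID("66666666-7777-4888-8999-aaaaaaaaaaaa")
```

Because the name is order-sensitive, `glue_uuid(uuidA, uuidB)` and `glue_uuid(uuidB, uuidA)` differ, which matches the point that both A/glue/B.jl and B/glue/A.jl might exist.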

Support needs to be added for things like identify_package, locate_package for these modules (alt. the precompilation system needs to special-case these).

Yes, this is definitely part of what needs to be done. I haven't looked at it enough to know which is the better approach.

@KristofferC

Why would the manifest need to have anything about glue dependencies in it?

For the same reason we have the dependencies entry in the manifest: so that code loading only needs to look at the active Project and Manifest. That this is the case is a bit annoying with devved packages, because it means that the dependencies listed in the manifest and those in the project file of the devved package can get out of sync, but it is probably worth it overall. Of course, we could just read glue dependencies from all the Project files but then we should imo do that for all dependencies.

I think that [compat] with glue should affect version resolution, but that once you've chosen versions,

Okay, then compat for glue dependencies is needed in the registry.

Glue path: your proposal puts the glue code in src whereas my proposal puts it outside of src in a special glue directory

The path I chose was just an example, the core is that the filename is based on a convention of the names of the conditional dependencies:

The gluecode would be stored (based on a documented convention) in a file inside the package, eg src/DataFramesGlue.jl

The exact path is not really interesting at this point in the design because it doesn't influence anything. In the end there will just be a glue_code_path(pkg, glue_pkg::String) function where it is defined.
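For illustration only, such a function could be as small as the sketch below. The `glue/` directory layout is one of the conventions floated in this thread, and taking the package root as a string is an assumption of this example, not a decided signature.

```julia
# Hypothetical convention: <package root>/glue/<GluePkg>.jl
# Changing the convention later would only mean changing this one function.
glue_code_path(pkg_root::AbstractString, glue_pkg::String) =
    joinpath(pkg_root, "glue", glue_pkg * ".jl")

path = glue_code_path("A", "B")
```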

We could name the module for A/glue/B.jl based on the UUIDs of the packages in question. I.e. something like glue:$(uuidA),$(uuidB).

If we want to talk about path names, I think that this proposal would give very long module names in stack traces and would look pretty ugly in the source code (>80 linewidth) (and you can't have commas and colons in module names). Files also tend to have the same name as the module (enforced for packages).

@StefanKarpinski

For the same reason we have the dependencies entry in the manifest: so that code loading only needs to look at the active Project and Manifest. That this is the case is a bit annoying with devved packages, because it means that the dependencies listed in the manifest and those in the project file of the devved package can get out of sync, but it is probably worth it overall. Of course, we could just read glue dependencies from all the Project files but then we should imo do that for all dependencies.

I'm still not getting it—what specifically needs to go in the manifest? Can you give an example?

Okay, then compat for glue dependencies is needed in the registry.

Yes. The big issue here is that we can't really put glue compat in normal Compat.toml files because older versions of Julia will either complain that there are package names they don't know about in there (if we don't put those names in the Deps.toml file) or treat them as normal dependencies (if we do put those names in the Deps.toml file). So we need to create a new Glue.toml file to put glue compatibility in. Annoyingly, if we're going to mimic the design of Deps.toml and Compat.toml then we'd need two files for glue, which seems like a lot. One possibility would be to put glue deps in Deps.toml and Compat.toml and then also in Glue.toml to mark them as "fake" dependencies. That way old versions of Julia would still work and simply treat glue dependencies as real dependencies, while new Julia versions would know they're fake and aren't necessary. That might be acceptable if we generally don't want to make those versions compatible with older Julia versions anyway.

If we want to talk about path names, I think that this proposal would give very long module names in stack traces and would look pretty ugly in the source code (>80 linewidth) (and you can't have commas and colons in module names). Files also tend to have the same name as the module (enforced for packages).

We could call them whatever we want because they're top-level and top-level module names don't have to be unique anyway. So I guess A,B would be a fine name for the module.

@KristofferC

I'm still not getting it—what specifically needs to go in the manifest? Can you give an example?

Since the design is that a certain file will be loaded deterministically based on when other packages are loaded, there is no need for the Requires-style package callback of arbitrary code described in Part 1. Instead, code loading looks at the manifest, sees what glue dependencies each package has, and loads the corresponding file when both a package and its glue dependency happen to be loaded.

I don't see how Part 1 in this proposal has any benefit in a system where the code that will be run is deterministic based on the "input" packages. Might as well put that logic in code loading then instead of having all packages push this as callbacks. Right now, Requires.jl does exactly this except you push arbitrary code as a callback instead of a specific file to be included, as in this proposal.

The big issue here is that we can't really put glue compat in normal Compat.toml...

Yes, many choices in JuliaLang/Pkg.jl#1285 were made to reduce the number of separate places that need to be touched. That's why it doesn't involve the registry, the resolver, or (with environment-based glue code inclusion at precompile time) the precompilation system. Everything gets pushed to code loading.

@Seelengrab

Is this obsolete now with weakdeps?
