
Support for pre-built dependencies #1139

Open
marcbowes opened this issue Jan 9, 2015 · 51 comments
Labels
A-caching - Area: caching of dependencies, repositories, and build artifacts
S-needs-design - Status: Needs someone to work further on the design for the feature or fix. NOT YET accepted.

Comments

@marcbowes (Contributor)

Currently you can add dependencies using path or git. Cargo assumes these point at source code, which it will then proceed to build.

My use-case stems from integrating Cargo into a private build and dependency management system. I need to be able to tell Cargo to only worry about building the current package. That is, I will tell it where the other already-built libraries are.

Consider two projects: a (lib) and b (bin) such that b depends on a:

[package]
name = "b"
version = "0.0.1"
authors = ["me <plain@old.me>"]

[dependencies.a]
path = "/tmp/rust-crates/a"

A clean build will output something like:

> cargo build -v
   Compiling a v0.0.1 (file:///private/tmp/rust-crates/b)
     Running `rustc /tmp/rust-crates/a/src/lib.rs --crate-name a --crate-type lib -g -C metadata=10d34ebdfa7a5b84 -C extra-filename=-10d34ebdfa7a5b84 --out-dir /private/tmp/rust-crates/b/target/deps --emit=dep-info,link -L dependency=/private/tmp/rust-crates/b/target/deps -L dependency=/private/tmp/rust-crates/b/target/deps`
/tmp/rust-crates/a/src/lib.rs:1:1: 3:2 warning: function is never used: `it_works`, #[warn(dead_code)] on by default
/tmp/rust-crates/a/src/lib.rs:1 fn it_works() {
/tmp/rust-crates/a/src/lib.rs:2     println!("a works");
/tmp/rust-crates/a/src/lib.rs:3 }
   Compiling b v0.0.1 (file:///private/tmp/rust-crates/b)
     Running `rustc /private/tmp/rust-crates/b/src/lib.rs --crate-name b --crate-type lib -g -C metadata=429959f67e51bc23 -C extra-filename=-429959f67e51bc23 --out-dir /private/tmp/rust-crates/b/target --emit=dep-info,link -L dependency=/private/tmp/rust-crates/b/target -L dependency=/private/tmp/rust-crates/b/target/deps --extern a=/private/tmp/rust-crates/b/target/deps/liba-10d34ebdfa7a5b84.rlib`
/private/tmp/rust-crates/b/src/lib.rs:1:1: 3:2 warning: function is never used: `it_works`, #[warn(dead_code)] on by default
/private/tmp/rust-crates/b/src/lib.rs:1 fn it_works() {
/private/tmp/rust-crates/b/src/lib.rs:2     println!("b works");
/private/tmp/rust-crates/b/src/lib.rs:3 }

Importantly:

--extern a=/private/tmp/rust-crates/b/target/deps/liba-10d34ebdfa7a5b84.rlib

Would it make sense to expose an extern option (in dependencies.a) for low-level customization?

[dependencies.a]
extern = "/private/tmp/rust-crates/b/target/deps/liba-10d34ebdfa7a5b84.rlib"

This can be worked around by using a build script along the lines of:

use std::fs;
use std::path::Path;

fn main() {
    // Copy the pre-built rlib into a directory that is then passed to rustc via -L.
    let from = Path::new("/tmp/rust-crates/a/target/liba-b2092cdbfc1953bd.rlib");
    let to = Path::new("/tmp/rust-crates/b/blah/liba-b2092cdbfc1953bd.rlib");
    fs::copy(from, to).unwrap();
    println!("cargo:rustc-flags=-L /tmp/rust-crates/b/blah");
}

But it is not ideal to have to do this with every project.

@steveklabnik (Member)

This is a Rust problem even more than a Cargo problem. You can't guarantee that a pre-built Rust library will work unless it's built with the exact same SHA of the compiler.

@alexcrichton (Member)

Yes, unfortunately this would require changes to rustc itself, so it can't be tackled at this time.

The specific restriction I'm referring to is that you're basically limited to only working with binaries generated by the exact revision of the compiler you're using, as well as the exact same set of dependencies.

@marcbowes (Contributor, Author)

Is there something I can read/follow that explains the issues relating to why this requirement is so strict? Is this expected to change over time?

Regardless, assume I can meet the requirement of providing a set of prebuilt libraries with the exact same SHA. Is this a reasonable feature? Even something like letting the build script emit --extern as part of the whitelisted flags that cargo:rustc-flags can configure would help me out (assuming that is easier to implement than another top-level dependency option).

@marcbowes (Contributor, Author)

/cc @aturon: I had a chat with Steve on IRC about this and he suggested getting your input.

One big reason I want to avoid building dependencies over and over is that our build system rebuilds consumers - in my example, a change to a would trigger a rebuild of b (including tests) and if b failed, the new version of a would not be released. The implication of this is that when building b, a would be rebuilt a second time. This becomes really wasteful.

I'm happy to contribute the change required to implement this if the team feels it is a worthwhile feature. I imagine there are plenty of companies out there with their own in-house build systems, so something like this could be an adoption blocker.

@steveklabnik (Member)

I actually meant @alexcrichton not @aturon :)

@alexcrichton (Member)

Is there something I can read/follow that explains the issues relating to why this requirement is so strict?

Unfortunately no :(. We don't have a ton of documentation in this area, just a bunch of cargo-culted knowledge. In general though this is largely because of two primary reasons (that I can think of):

  1. The ABI for a library is not stable between compilations, even when theoretical ABI-compatible modifications are made.
  2. The metadata format for libraries, while extensible, is not currently used in an extensible way, as it regularly breaks backwards compatibility.

Is this expected to change over time?

Certainly! We probably won't invest too much energy into it before 1.0, but I'd love to see progress in this area!

Regardless, assume I can meet the requirement of providing a set of prebuilt libraries with the exact same SHA. Is this a reasonable feature?

I suppose it depends on how much cargo integration you want. In the example you gave in the second comment, the manifest probably says that b depends on a, in which case cargo will already pass --extern for a when it compiles b. Cargo would not only have to forward your --extern flags, it would also have to know to turn off its own --extern. Additionally, it would then have to cut a out of the dependency graph entirely.

In principle allowing --extern from rustc-flags would be possible, but it may have surprising results!

I imagine there are plenty of companies out there with their own in-house build systems, so something like this could be an adoption blocker.

I agree this would definitely be bad! I'd like to hone in on what's going on here first though.

My first question would be: Does Cargo suffice? If you're using cargo build, then Cargo won't build a if it hasn't changed and you've already built it, but it sounds like you're not using Cargo to build libraries?

I suppose my other questions would be based on that answer, so I'll hold off for that :)

@marcbowes (Contributor, Author)

Thanks for the detailed answer Alex!

My first question would be: Does Cargo suffice? If you're using cargo build, then Cargo won't build a if it hasn't changed and you've already built it, but it sounds like you're not using Cargo to build libraries?

Imagine a is built by Travis. It outputs liba, documentation and so forth - a collection of artifacts. People mostly discard these artifacts in practice, but you might imagine a system where those artifacts are retained. I'm sure this is not conceptually dissimilar to what your build bots do - you get some named and versioned output that you can later use either for development of yet more projects (a la rustup) or for deployment purposes.

To integrate with this build system, one only needs to implement the simple contract: provide something that can be executed that will produce build artifacts. This is just a one-line shell script that turns around and calls cargo build, and we're done.

Along comes project b. It starts off the same as a until we decide to use some of the functionality that a provides. In cargo, you just add the dependency to the manifest - name and version. This build system works in the same way, so we add it to its manifest too. The build system uses this manifest to provide the build artifacts of a for b at compile time. Now the only thing left to do is adjust the path attribute under [dependencies.a] to point to the build artifacts ($A_ARTIFACTS/src, if you will) and we're golden.
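
For concreteness, the manifest entry for b at this stage might look something like this (a sketch; the artifact location is hypothetical and stands in for $A_ARTIFACTS/src):

[dependencies.a]
path = "/build/artifacts/a/src"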

(We now have the dependency declared in two places. We can either live with the duplication, or adjust our build script to copy them from one into the other.)

However, we've just hit the first real problem: b needs the source code of a to compile, but this doesn't really fit in with the concept of a build artifact. We can cheat by adjusting our shell script to also copy the src of a into its build directory.

Hopefully, at this point, I've answered the latter question: the intention is to use cargo to build libraries such as a or binaries such as b. The reasons:

  • it does pretty much everything we'd otherwise have to implement (w.r.t. rustc)
  • Cargo.toml is nicer than other interfaces like make for customization
  • it's good to keep things similar to the way the rest of the world does it
  • it makes importing third-party projects easier

The question then, is what happens when a changes? At a high level, the system tracks dependencies, rebuilds them according to the graph and fails if something in the build breaks. This means that b would be rebuilt against the new artifacts of a. If we change our shell script to also execute cargo test, this means that b gets a chance to veto the build as a whole if the change breaks it in some way.

And this brings us to the second problem. a is being built twice. If c depends on b, then the build will build a three times and b twice. This becomes incredibly wasteful pretty quickly. In the context of this specific system, it is also redundant because any changes to a will trigger b to be rebuilt; whereas in the "normal" world, b will be rebuilt if a changes but only when b is explicitly built.

As I mentioned in the issue overview, I can work around this by using a cargo build script that adds a -L option to rustc, provided I just dump the libs output by the builds of all of its (recursive) dependencies. This works and completely solves my problem. Incidentally, it removes the need for the other changes to the build script (no need to copy source code in, no need to declare dependencies in Cargo.toml).

But then we hit the problem of ABI compatibility, for which it sounds like there is no solution yet. This means I'll need to find a way of (effectively) adding the rustc SHA to the tuple that identifies an artifact (similar to disambiguating 32/64 bit builds). Or just going with the aforementioned option of building all dependencies for each consumer.

A question you might also ask is: "would hosting a crates.io mirror help with this?". It doesn't. Not because it doesn't "work", but because it only meets some of the requirements (such as private code, not having direct dependencies on external sources for security reasons).

One huge benefit we get out of a single extensible build system is that adding a dependency on a Rust package is no different to adding a dependency on a C, Ruby, Java, Python or Haskell package - they're just named and versioned artifacts. A big use case for me is going to be enabled by that: for example, authoring Rubygems in Rust to speed up performance-critical code paths.

I hope this detail makes my initial question more clear: cargo does things I'd otherwise have to implement myself, but it also does things I'd like to skip. Specifically, I'd like to be able to use something like path for exact control over where the dependency lives, but I don't want cargo to try to build it.

FWIW, I'm probably going to go with:

  • include source code in build artifacts
  • rebuild dependencies for each consumer when that consumer is built
  • the build system's build script should copy in Rust dependencies from the build system's manifest to Cargo.toml and specify the path

@marcbowes (Contributor, Author)

Alex points out that http://doc.crates.io/build-script.html#overriding-build-scripts could be extended to support overriding rust crates. We could then generate .cargo/config files on the fly.
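
For reference, the existing native-library override that the linked section describes looks roughly like this (a sketch; the target triple, package name, and paths are placeholders), and the idea would be to extend the same mechanism to Rust crates:

# .cargo/config
[target.x86_64-unknown-linux-gnu.foo]
rustc-link-search = ["/path/to/prebuilt"]
rustc-link-lib = ["foo"]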

@alexcrichton (Member)

Alright, after reading that over (thanks for taking the time to write it up!) it sounds like what we discussed on IRC is the best way to move forward with this. Specifically I'd be thinking of something like:

# .cargo/config
[target.$triple.rust.foo]
libs = ["path/to/libfoo.rlib", "path/to/libfoo.so"]
dep_dirs = [ ... ]

Note that I think the current overrides (target.$triple.$lib) may want to be renamed to target.$triple.native.$lib to give us some more leeway. When Cargo detects this new form of override, however, it will not build libfoo but instead just pass --extern foo=... for the paths listed in libs and -L dependency=... for all of the values in dep_dirs.
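
With such an override in place, the rustc invocation for a crate depending on foo would presumably end up looking something like this (hypothetical; compare the --extern line quoted at the top of the issue):

rustc src/lib.rs --crate-name bar --crate-type lib \
    --extern foo=path/to/libfoo.rlib \
    -L dependency=path/to/dep_dir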

One problem I can foresee, however, is what you mentioned about not wanting to share the source code between projects. Cargo would still need the source code to read data such as the Cargo.toml. It doesn't actually need the entire code base, but it will need at least that much.

Does that sound like what would work for you?

@Cxarli commented May 27, 2017

bump?

@Boscop commented Jun 1, 2017

Please support this, it's frustrating that it doesn't work yet.
I have to prebuild the ring crate because on the server where I don't have root, the GCC version is too old to build ring, so I build it on a server where I am root and copy it over.
Is there any way right now to use the prebuilt rlib instead of compiling it from crates.io? Maybe with a build.rs script?

@aturon aturon reopened this Jul 12, 2017
@aturon aturon added the I-nominated-to-discuss To be discussed during issue triage on the next Cargo team meeting label Jul 12, 2017
@aturon (Member) commented Jul 12, 2017

Nominated for discussion at the Cargo team meeting.

@Twey commented Oct 24, 2017

Was progress made on this at the meeting?

@Popog commented Oct 27, 2017

I created DHL as a workaround for this issue.

@joshtriplett (Member)

I don't know about "pre-built" dependencies, but it'd be nice to be able to build a variety of leaf crates without building the dependency crates more than once, if the leaf crates request the same features from the dependencies.

@Twey commented Mar 23, 2018

Bump?

@dwijnand (Member) commented Apr 25, 2018

Probably not: still nominated.

@alexcrichton alexcrichton removed the I-nominated-to-discuss To be discussed during issue triage on the next Cargo team meeting label Apr 25, 2018
@Twey commented Jun 29, 2018

I'm still a bit in the dark about what happened here. Was anything discussed at the team meeting?

@mitchmindtree

TL;DR Would just like to add another +1 for support for pre-built binaries. It would be great to get a follow up on what was discussed at the meeting.

Motivation Story

Last night we ran a workshop on the nannou creative coding framework. Seeing as nannou supports audio, graphics, lasers, etc. along with quite a high-level API in a cross-platform manner, it has a lot of dependencies. It took between 5 and 25 minutes (depending on the user's machine) for users just to build nannou and all of its dependencies for the first time before we could begin working through the examples together. Ideally, in the future we would write a build script that attempted to first fetch pre-built dependencies before falling back to building from source. It seems like the feature described within this issue would help to simplify this.

@robclouth

What about some global cache on disk for both the downloaded source code and the built binaries, so that if the compiler and crate version match it can avoid a rebuild? Similar to yarn.
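
No such cache exists in Cargo today; the closest approximation is a shared target directory, which only helps when the compiler version, features, and flags line up (a sketch using Cargo's build.target-dir setting; the path is arbitrary):

# ~/.cargo/config.toml
[build]
target-dir = "/home/user/.cache/shared-cargo-target"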

@Twey commented Jul 10, 2018

This is also what Nix wants to do. There won't be a compiler mismatch for packages built with Nix, because Nix will use the compiler as a build input to the package.

@Twey commented Feb 20, 2019

Bump. Does anybody know what's happening here?

@jacderida

I would be interested to know what's happening here. Our company has some products in Rust and we've been working on a new build system. If you use Docker to do your builds, it's actually a good way to get around this problem, because you can do a cargo build during the container build process to cache the built dependencies. However, we also have to support building on Windows and macOS, and you obviously can't get those in a container. This issue makes it difficult to have good build times in a build environment that makes use of on-demand slaves.

@miere commented May 11, 2021

Hey, do you have any feedback on this?

I reckon pre-built binaries not only benefit long-term builds but also short-term ones. In my case, our serverless application is getting bigger, so our build times are increasing linearly. Pre-built dependencies would definitely reduce this build time, even if occasionally we have to re-build these libraries from the code.

I understand the argument that, due to Rust's unstable ABI, the generated lib might not be compatible anymore. But we could work around that by pinning a Rust version in the CI/CD - just as others have mentioned. Given this deterministic build scenario, it should be possible for cargo to generate pre-built binaries (rlibs or dylibs) from crates and use them as valid dependencies for our binaries.
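
Pinning the toolchain as described can be done with a rust-toolchain.toml file checked into the repository (a minimal sketch; the version shown is only an example):

[toolchain]
channel = "1.60.0"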

@Apteryks commented Jun 4, 2021

FWIW, such a feature would be highly desirable for distributions based on functional package managers such as GNU Guix, where the whole dependency chain is controlled. In other words, the ABI compatibility problem of rustc is not really a problem for Guix.

@ffimnsr commented Jul 9, 2021

Up, we need support for this to speed up compilation, especially for small computing devices.

@salotz commented Aug 2, 2022

Be a good citizen in making it easy to integrate Rust code into bigger projects and implement this please.

@bjorn3 (Member) commented Aug 2, 2022

How would this help with bigger projects?

@salotz commented Aug 2, 2022

In projects where you want to control all of the dependencies in a uniform, detailed manner, having a build system for some of them that insists on downloading packages itself is quite bothersome. Basically for all of the same reasons the Nix users above have given.

@bjorn3 (Member) commented Aug 2, 2022

In that case you should probably not use cargo, but instead have whichever build system builds those dependencies for you build all Rust crates using rustc. Mixing two build systems at the same time for your dependencies is bound to give trouble. Even having two instances of cargo may result in some dependencies being built twice, which either has unintended effects or fails, depending on whether the two cargo instances used different -Cmetadata values. For Nix there is already a program that converts cargo projects into native Nix build files directly invoking rustc. Similar things exist for other build systems like Bazel. Is that not enough?

@salotz commented Aug 3, 2022

For Nix there is already a program that converts cargo projects into native Nix build files directly invoking rustc. Similar things exist for other build systems like bazel. Is that not enough?

These are all hacks on top of Cargo being both build system and package manager. I don't use Nix; I am trying to build packages with Spack, which is similar. I don't see why I should have to write build scripts for other people's packages. At most, some light patching or a few changed flags should be needed to accommodate a meta-build system, not a rewrite of the whole build script.

As a non-Rust developer just looking to use Rust packages I don't know how to recreate the actual build logic that Cargo performs. Is it as easy as just running rustc ...?

@bjorn3 (Member) commented Aug 3, 2022

I don't see why I should have to write build scripts for other people's packages.

You don't? For Nix, crate2nix takes a cargo project as input and produces a Nix build script for you. You don't need to write a build script yourself. If anything, using pre-built dependencies would require more effort on your end, as you need to perform most of the actions of cargo to build your dependencies and then edit the Cargo.toml to point cargo to them.

As a non-Rust developer just looking to use Rust packages I don't know how to recreate the actual build logic that Cargo performs. Is it as easy as just running rustc ...?

Not really. You can add -v to a cargo invocation to show all of cargo's rustc and build script invocations. You will see options ranging from specifying dependency locations to setting the optimization level and the output location, plus a -Cmetadata value that is unique per invocation so that multiple versions can coexist. Cargo also interprets the output of build script invocations to determine which extra arguments to pass to rustc - for example, to link against a specific C library, or to set an env var pointing to generated source code.

@salotz commented Aug 3, 2022

For nix crate2nix takes a cargo project as input and produces a nix build script for you

Whether it's a manual rewrite or crate2nix rewriting it, what is the difference? The point is that Cargo doesn't provide the mechanism needed, so a translation layer is necessary. If, for instance, it operated (or had the option to operate) more like a traditional build system, no translation would be necessary at all. You would just flip a flag and Cargo would not fetch and build all the dependencies.

If anything using pre-built dependencies would require more effort on your end as you need to perform most of the actions of cargo to build your dependencies and then edit the Cargo.toml to point cargo to them.

I think you do see my point now, though: I want to build everything myself and have Cargo do less. Yes, this means I have to repackage everything in my system, but the end goal is a single tree of dependencies across all language ecosystems without recompilations. It's a tradeoff of control vs. convenience that Cargo currently doesn't support.

Cargo also interprets the output of build script invocations to determine which extra arguments to pass to rustc. For example to link against a specific C library, or to set an env var pointing to generated source code.

I did look through these recommendations and ran into this build script, which seems like quite a tight coupling of rustc to Cargo. I will see if I can run the build scripts without Cargo.

My current idea, though, is to patch the Cargo.toml to use local dependencies as you mentioned above. I tried to look at the Nix solutions, but unfortunately it's quite obtuse to someone who doesn't know that DSL and terminology.

@dpaoliello (Contributor) commented Aug 4, 2022

How would this help with bigger projects?

There are a couple of scenarios that I can think of:

Layering

In a large project it can be useful to divide the code into various layers, both for architectural cleanliness and for build efficiency. Frequently, developers working in the middle or leaf layers will have a huge amount of code in the root layers that they will never touch - being able to use prebuilt libraries from CI instead of having to rebuild those layers is a huge productivity win. Even if you don't have the infrastructure to leverage prebuilt libraries from CI, being able to do a single full build and then manually rebuild only small portions of your code is helpful.

Having layered builds can also help CI by allowing the build to be distributed: once the root layers are built, those libraries can be distributed to multiple other build machines to build the wider graph of middle and leaf layers.

Note that, in this scenario, Cargo would not check if the rlibs were out-of-date (the original source files might not even be available); rustc would be responsible for checking whether the rlibs are compatible with the current build and failing otherwise.

Distributing pre-built libraries

One can think of this like the layering scenario above, but where the layers may cross organizations/projects/companies - namely that pre-built libraries are available (through a package manager, installed in a known directory, etc.) and that no source is available at all.

For example:

FWIW, such a feature would be highly desirable for distributions based on functional package managers such as GNU Guix, where the whole dependency chain is controlled. In other words, the ABI compatibility problem of rustc is not really a problem for Guix.

Cargo

Note that neither of the scenarios I mentioned above requires an additional build system beyond Cargo: there may be some scripts required to grab the prebuilt binaries from somewhere, but one can also imagine just having multiple Cargo.toml files in a repo that are unaware of each other but know that specific rlibs should be in a specific directory.

Even if another build system is involved, because Cargo handles things like build scripts, downloading/building dependencies, and setting up command line arguments to rustc, the idea of avoiding cargo and directly invoking rustc is a non-starter.

@bjorn3 (Member) commented Aug 5, 2022

You would just flip a flag and Cargo would not fetch and build all the dependencies.

Then what would cargo do, if not fetching and building? It already supports separating the fetch and build steps: cargo fetch downloads dependency sources, and cargo metadata downloads them and then prints information you can use in a different build system to build your project. cargo metadata is what crate2nix uses, afaik. You can also use cargo vendor to get the sources in a format that allows checking them into source control or adding them to a source bundle provided for downloads.
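
For illustration, those subcommands are used roughly like this (all are existing Cargo commands; the output file name is arbitrary):

cargo fetch                                         # download dependency sources into the local Cargo cache
cargo metadata --format-version 1 > metadata.json   # print the package/dependency graph as JSON
cargo vendor                                        # copy dependency sources into ./vendor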

A manual rewrite, crate2nix rewrites it what is the difference?

A lot, IMHO. A manual rewrite is a lot of work. crate2nix or an equivalent can be integrated directly into your build system so that it is done transparently for you. One way or another you have to get a build script to be executed by your build system. The crate2nix approach means you don't have to write it manually, but can simply point the build system at a Cargo.toml file, making it just as easy as using cargo.

@bjorn3 (Member) commented Aug 5, 2022

Having layered builds can also help CI by allowing the build to be distributed: once the root layers are built, those libraries can be distributed to multiple other build machines to build the wider graph of middle and leaf layers.

You don't need layers for that, right? You could consider every crate a single task to be scheduled across the build farm.

One can think of this like the layering scenario above, but where the layers may cross organizations/projects/companies - namely that pre-built libraries are available (through a package manager, installed in a known directory, etc.) and that no source is available at all.

If you are doing that with the intent to keep source private, just be aware that the crate metadata lists every private function and type and a whole lot of other information that should theoretically make it possible to reconstruct something that looks a lot like the original source code (modulo regular comments; doc comments do end up in the crate metadata).

In a large project it can be useful to divide the code into various layers, both for architectural cleanliness and for build efficiency. Frequently, developers working in the middle or leaf layers will have a huge amount of code in the root layers that they will never touch - being able to use prebuilt libraries from CI instead of having to rebuild those layers is a huge productivity win. Even if you don't have the infrastructure to leverage prebuilt libraries from CI, being able to do a single full build and then manually rebuild only small portions of your code is helpful.

Have you considered using sccache for that? It supports caching build artifacts in the cloud or on other machines. Using it is a matter of setting the RUSTC_WRAPPER env var to the location of the sccache binary (or just RUSTC_WRAPPER=sccache if it is in your PATH). Note that sccache specifically requires you to trust everyone with access to the cache, as anyone could poison it with malicious artifacts. A similar approach should be possible to do in a more secure way though.
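
To make that concrete, wiring up sccache looks roughly like this (a sketch assuming sccache is installed and on PATH; exporting RUSTC_WRAPPER=sccache in the environment has the same effect):

# .cargo/config.toml
[build]
rustc-wrapper = "sccache"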

@jonhoo (Contributor) commented Oct 12, 2022

A thought occurred to me while thinking of a particular use-case for this: you'll run into difficulty with transitive dependencies that are shared between a pre-compiled rlib and the crate being built on top of it. Consider a crate x that depends on y and z. y also depends on z, and has types from z in its public API. Next, imagine that y is pre-built, and so all we have is its rlib. It was pre-built with z 1.0.0. Now a user wants to build x, but they don't have a lockfile or anything. They configure Cargo to use y.rlib, and run cargo build. Cargo chooses the most recent version of z, which let's say is now z 1.1.1. Suddenly we're trying to do a build that contains both z 1.0.0 and z 1.1.1, which feels like it would lead to trouble, especially if the code in x tries to pass a type from z 1.1.1 into y's APIs, which expect z 1.0.0.

Which is all to say: I think the lockfile needs to be embedded in/passed alongside an rlib so that it can appropriately lock version resolution for builds that try to use that rlib.
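
Until something like that exists, the mismatch in this example can only be avoided by hand, e.g. by pinning the shared dependency in x's manifest to the exact version the pre-built rlib was compiled against (a hypothetical sketch using the crate names from the example):

# x's Cargo.toml
[dependencies]
# must match the z that the pre-built y.rlib was compiled against
z = "=1.0.0"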

@crazyboycjr commented Oct 28, 2022

I am having a problem that may relate to this thread.

I am building my crate B (crateB/target/release/libB_plugin.so), which depends on my crate A (libA.so). My crate A also produces an executable (let's say ./A) which links with crateA/target/release/libA.so. At runtime, A dynamically loads libB_plugin.so via dlopen. However, this gives me a missing-symbol error, since libB_plugin.so links against crateB/target/release/libA.so (which is a fresh build of crate A). The functions it depends on in libA.so are actually the same functions, but the mangled names have different hash suffixes.

Is there any way to make crateB only link with crateA/target/release/libA.so?

I am kind of stuck on this problem and was hoping to get any suggestions. Thanks!

Update: My particular problem was tackled by

  1. switching to building everything as rlibs (rather than depending on dylibs). There is a diamond-dependency issue when mixing rlibs and dylibs, and it is hard to avoid once you have enough crates in your project.
  2. implementing dlopen-like functionality for Rust rlibs by hand (similar to what the Linux kernel does for '.ko' modules and what the Glasgow Haskell Compiler does for its objects).
  3. having a cargo wrapper that captures rustc commands (generated by cargo -vv) and rewrites '--extern' to point at existing rlibs (tracking package dependencies and feature-set compatibility is also required in the wrapper); see the sketch below.
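
A minimal sketch of what point 3 could look like, implemented via Cargo's RUSTC_WRAPPER hook instead of parsing cargo -vv output (the PREBUILT_DIR variable and the liba.rlib file name are hypothetical, and real use would also need the dependency and feature tracking mentioned above):

// Invoked by cargo as `<wrapper> <real-rustc> <args...>` when RUSTC_WRAPPER points at this binary.
use std::env;
use std::process::{exit, Command};

fn main() {
    let mut args: Vec<String> = env::args().skip(1).collect();
    let rustc = args.remove(0);

    // Rewrite `--extern a=<freshly built rlib>` to point at the pre-built rlib instead.
    if let Ok(dir) = env::var("PREBUILT_DIR") {
        let mut i = 0;
        while i + 1 < args.len() {
            if args[i] == "--extern" && args[i + 1].starts_with("a=") {
                args[i + 1] = format!("a={}/liba.rlib", dir);
            }
            i += 1;
        }
    }

    let status = Command::new(&rustc)
        .args(&args)
        .status()
        .expect("failed to spawn rustc");
    exit(status.code().unwrap_or(1));
}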

Relevance to this topic:
A bigger piece of software (hopefully implemented in Rust) often needs to be extensible, or needs part of its code or behavior to be dynamically changeable. This is usually done with a plugin system. What's good about Rust is that if the plugin system is backed by WebAssembly (for which many wasm runtimes are available), then the problem is solved. If the plugin has to be native code, such as dylibs, Cargo's current style of compiling everything from source from scratch is insufficient. Being able to build against pre-built dependencies at some point would be perfect.

@epage epage added A-caching Area: caching of dependencies, repositories, and build artifacts S-needs-design Status: Needs someone to work further on the design for the feature or fix. NOT YET accepted. labels May 3, 2023
@bitdivine (comment marked as off-topic)

@bjorn3 (comment marked as off-topic)

@epage (Contributor) commented Sep 19, 2023

There is a Pre-RFC for a subset of this: https://internals.rust-lang.org/t/pre-rfc-sandboxed-deterministic-reproducible-efficient-wasm-compilation-of-proc-macros/19359. That is blocked on some work from #5720.

There is also #5931. That could be extended with a plugin system to allow something like sccache for distributed caching.

@navi-desu

I'm looking into this considering the possibility of packaging Rust crates in a Gentoo overlay as precompiled rlibs. The strict compiler version is not that big of an issue on Gentoo, because subslotting on rustc can ensure that every lib gets recompiled when the system-provided rustc updates (even though it's kind of a subpar solution, it works).

For now I'll try installing Rust crates system-wide in source format (as one would with Python packages, for example), but cargo supporting pre-built libs would reduce build times for Rust packages and is overall a better solution, imo.

@bjorn3 (Member) commented Sep 26, 2023

I'm looking into this considering the possibility of packaging Rust crates in a Gentoo overlay as precompiled rlibs. The strict compiler version is not that big of an issue on Gentoo, because subslotting on rustc can ensure that every lib gets recompiled when the system-provided rustc updates (even though it's kind of a subpar solution, it works).

That has the issue that crates can have different cargo features enabled depending on who uses them. Each set of enabled cargo features results in a different rlib. Just enabling all cargo features is not really an option, as it would likely pull in a lot of deps that are never used, and some crates have mutually exclusive cargo features.

What you could try, though, is something like picking a couple of fixed sets of cargo features for each crate, building them with sccache as RUSTC_WRAPPER, and then shipping the sccache cache entries and telling sccache to use those shipped cache entries. This way sccache will simply rebuild if the cargo features don't match, and you don't need any cargo modifications.

@navi-desu

That has the issue that crates can have different cargo features enabled depending on who uses them. Each set of enabled cargo features results in a different rlib. Just enabling all cargo features is not really an option, as it would likely pull in a lot of deps that are never used, and some crates have mutually exclusive cargo features.

Gentoo has USE flags as a core functionality, and USE flags are basically features (build-time options that the user can set and ebuilds can require). So I can represent features as USE flags, and an ebuild can pick what should be present: DEPEND="dev-rust/foo[nya]" would say "I need the crate foo with the feature nya compiled in", and Portage would recompile foo if it was previously compiled without the feature nya enabled.

@bjorn3 (Member) commented Sep 26, 2023

How does that work with different programs that need mutually exclusive features of a crate? Or what about a program that needs a crate with the std feature enabled and another one which doesn't want to use libstd? While cargo features are supposed to be additive, many crates don't follow this advice. In some cases they even disable functionality with a cargo feature, following the rationale that being able to compile for more targets is a feature, even though it is effectively a negative feature, as it causes compile errors for crates that need those things.

@navi-desu commented Sep 26, 2023

How does that work with different programs that need mutually exclusive features of a crate?

As far as I know, it conflicts, but:

While cargo features are supposed to be additive, many crates don't follow this advice.

If the common behaviour is for features to be additive, the packager could make an exception for a specific binary that doesn't follow that. That would be the simplest solution.

If there are enough binary programs for this to be quite common, I'd have to poke and see; probably drop the idea of pre-compiling rlibs and stick to installing only the source, since I can't see a better solution yet (still an improvement over the current way Gentoo packages Rust apps imo, but sad). Precompiling libs for each package is the same as just installing the source, but with extra steps, so that's eh.

It's either get packages to share precompiled rlibs, or install the sources and let packages build them.

@dreamcat4

It sounds analogous to re-inventing the wheel with respect to Flatpaks, which are built on top of versioned dependencies, such that multiple versions are installed (several duplicate clones of the same libraries), with OSTree pointing to some git-esque btree structure or whatever it is.

I am not saying re-invent the wheel, but maybe there are already similar solutions out there that could be considered, evaluated, and adapted to fit the Rust/Cargo infrastructure?

@epage (Contributor) commented Nov 16, 2023

One problem I've had with pre-compiled packages is "pre-compiled for what?"

  • Target platform
  • Dependency tree versions (and how they get unified or not with yours)
  • Feature flags

#5931 takes the opposite approach. Instead of trying to have a fixed set of the above that is pre-compiled, it is reactive and caches it on-demand. Of course, plugins could then be extended to upload common sets.

In thinking of this, I thought of a good analogy for a way forward for pre-compiled. I will say that #5931 will likely involve a lot less design work and would be a faster path to something.

We have the build-std project to make std act more like a regular package. What if we allowed doing the opposite and allowed a workspace-of-packages to act like std.

  • It'd have its own lockfile
  • There is one fixed set of features (e.g. lib.required-features)
  • You are responsible for dealing with target platform

To put this in C++ terms, a crate lib would be like a header-only library, while this new concept would be like creating a proper SO/DLL. It would be designed for that purpose, and the types/systems involved would be unique to it.
