Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support llvm stable releases #11362

Open
sbc100 opened this issue Jun 5, 2020 · 40 comments · May be fixed by #23311
Open

Support llvm stable releases #11362

sbc100 opened this issue Jun 5, 2020 · 40 comments · May be fixed by #23311

Comments

@sbc100
Copy link
Collaborator

sbc100 commented Jun 5, 2020

Historically emscripten has been built around always using the very latest upstream version of LLVM. Essentially it depends on the top-of-tree version.

However, for the purposes of packaging, reducing install and download size, and making emscripten more portable, it would be nice to start being compatible with the stable version of LLVM.

We have have had many requests for this feature of the years (e.g. #11313)

This would mean more testing since each emscripten release would need to be tested fully against both LLVM stable and LLVM tot. For this reason it probably makes sense to make this move once we fully remove fastcomp (#11319).

The downside is that we will need to add version checks in the emscripten codebase when it depends on new/tot features.

I suggest that we perhaps give this a try starting with LLVM 11. E.g. we support LLVM 11 and tot until LLVM 12 is releases and which point we support LLVM 12 + tot..

@sbc100 sbc100 changed the title Support llvm stable releaeses Support llvm stable releases Jun 5, 2020
@tlively
Copy link
Member

tlively commented Jun 5, 2020

What will we do when a feature changes to be incompatible with the stable version of LLVM? Should we error out? Have a branch to handle both the old and new behaviors? I guess for unstable features that are changing frequently in upstream LLVM we can just error out eagerly if the LLVM version is not the one we test with.

@kripken
Copy link
Member

kripken commented Jun 5, 2020

Alternatively, we could refer people that want to use llvm 11 release to a specific emscripten version that is known to work with that? I.e. if they want to use an LLVM from the past, they can do so with the emscripten from the same time in the past.

@sbc100
Copy link
Collaborator Author

sbc100 commented Jun 5, 2020

If we take that alternative approach I still think it would be nice to test against the actual release versions, rather than just the nearest emscripten release to the release revision.

@sbc100
Copy link
Collaborator Author

sbc100 commented Jun 5, 2020

What will we do when a feature changes to be incompatible with the stable version of LLVM? Should we error out? Have a branch to handle both the old and new behaviors? I guess for unstable features that are changing frequently in upstream LLVM we can just error out eagerly if the LLVM version is not the one we test with.

I'm imagining there will be some places in emscripten where we need to do:

if LLVM_STABLE:
   error_out
else:
   use_new_feature

Nut hopefully they should be fairly few and far between. The frequency of these divergences should tell is fairly quickly if the approach is a sensible one.

Initially I think the testing burden will be the biggest obstacle we face.

@sbc100
Copy link
Collaborator Author

sbc100 commented Jul 15, 2020

Looks like its happening now: #11637

What do you all think about trying to support 11.0 and 12.0 going forward?

@kripken
Copy link
Member

kripken commented Jul 15, 2020

I think if have the CI resources to also test 11 stable, it makes sense to experiment with this. But I think it should be an experiment we are ok with stopping, since it may become too hard to be worthwhile, like if we start to need lots of ifs for what LLVM version is used, as mentioned earlier.

@sbc100
Copy link
Collaborator Author

sbc100 commented Jul 15, 2020

One advantage to testing against 11, once it is stable, is that we don't need to build our own versions of LLVM to test it.. we can just use the official binaries.

I suggest we try to find a sensible subset of tests to run against llvm 11.. perhaps monitoring the areas where we requires 12 and ensuring the subset includes those areas.

@kripken
Copy link
Member

kripken commented Jul 15, 2020

I'm worried about picking a subset, if we want to claim we support those binaries officially (do we?). Maybe we can check them less frequently or something, though?

@sbc100
Copy link
Collaborator Author

sbc100 commented Jul 15, 2020

I would imagine that as long its only run a subset of tests we would not claim full support.

Once we ditch fastcomp and can do more testing I think it might be worth increasing the level of testing and officially supporting stable llvm. But I'm not suggesting that now.. .so maybe we should still have some kind of warning message about the lack of full support?

@sbc100
Copy link
Collaborator Author

sbc100 commented Jul 15, 2020

Perhaps we can add this to the FYI bot along with the contrib ports tree?

@kripken
Copy link
Member

kripken commented Jul 15, 2020

Last two comments sgtm (not claim full support, use the FYI bot).

@kripken
Copy link
Member

kripken commented Aug 18, 2020

To expand a little on something mentioned earlier without detail: As an approximation to this, I think we can find the LLVM commit closest to a particular release, and then find the emscripten release in which that appeared. That emscripten release should be able to work with the official LLVM release. There are some minor worries like LLVM can change things after branching for release, and the emscripten release may contain more LLVM commits after the release, but it would probably work in most cases.

If we want to try this, then setting up way to test the official release binaries is basically all we need. We can then open a PR here, using the emsdk for the proper emscripten release normally (including binaryen from back then) but replacing LLVM with the release binaries. If that passes I think we can say it's good.

@sbc100
Copy link
Collaborator Author

sbc100 commented Aug 18, 2020

You mean do a one off test that verion X.Y of emscripten works with upstream official binaries from A.B of llvm?

I guess there is some value in that. Setting up testing to bring in the official binaries is main bit of work needed to do that.

I still think there is value is having support on ToT emscripten for at least one stable version of llvm rather than constantly being on ToT LLVM, but I admit that is a lot more work.

@kripken
Copy link
Member

kripken commented Aug 18, 2020

Yeah, a one-off test (here on github CI). Then we can officially "bless" that LLVM version and recommend it in our docs, "if you have LLVM release 9, use it with emscripten x.y.z (and here's how to use those binaries with it)".

(I'm not opposed to the other option, if we think it's worth it.)

@tlively
Copy link
Member

tlively commented Aug 18, 2020

This would be a more principled way to determine when Emscripten version is used on the Rust CI for testing the Rust emscripten targets, too.

@tlively
Copy link
Member

tlively commented Aug 19, 2020

rust-lang/rust#73526 (comment) is a timely example of how this would be beneficial to Rust.

@sbc100
Copy link
Collaborator Author

sbc100 commented Sep 15, 2020

I'd like to give this a try starting with llvm 11 which is due for release soon.

Here is what I think we should try:

  1. Start accepting 11 as will as ToT for llvm versions (that means 11 + 12 and soon 11 + 13)
  2. Use ToT version on github CI and emscripten-release CI
  3. Run all tests against llvm 11 on linux-test-suites builder (We already run the asan test suite and others in this way.)
  4. Perhaps run a small subset of llvm 11 tests on github CI to prevent trivial regressions.

In the long run this could make emscripten install-able without building either llvm or binaryen :)

@kripken
Copy link
Member

kripken commented Sep 15, 2020

What's the plan for when a diverging patch lands on LLVM trunk? For example we may get it to emit invokes differently, or you may change the dynamic linking ABI. That will have corresponding mandatory changes on emscripten (and maybe binaryen too). Would we be forced to support two "backends" in effect in emscripten?

Unless I'm missing something I think it's safer to go the other way - to "certify" a specific fixed emscripten hash for LLVM release 11 by running the full test suite on it once and then documenting that. That approach has no ongoing cost of tot emscripten supporting more than tot LLVM.

@sbc100
Copy link
Collaborator Author

sbc100 commented Sep 15, 2020

There would be an ongoing cost yes, because we would need to support both stable and tot versions.

But divergences should be fairly rare and I imagine will be often be linked to features that we can simply not support with the stable version. e.g we could say that native exception handling only works with tot llvm.

I think I don't like about pinning emscripten is that if the approach becomes popular we end up getting bug reports for older version of emscripten and people who are stuck on old versions. Maybe that cost is lower?

Did you see that #12218 llvm 10 very nearly "just works" with emscripten tot.. indeed if you can build compiler-rt with llvm tot then llvm 10 does seems to work today.

I agree that your solution is simpler though.. and probably less maintenance for us.

@kripken
Copy link
Member

kripken commented Sep 15, 2020

But divergences should be fairly rare and I imagine will be often be linked to features that we can simply not support with the stable version. e.g we could say that native exception handling only works with tot llvm.

If they are indeed very rare maybe this is fine. And I agree we can support new features only on tot LLVM. But I worry about things like us wanting to remove the old exceptions approach at some point, or remove the old dynamic linking, etc. - we'd be forced to keep those around until the next stable release, with your approach. And I also worry about other changes like how reactors work or other stuff (random recent example: #12220). As we do less in binaryen and just use the LLVM output as-is, we become more sensitive to any such changes.

Another issue is code size changes - tot LLVM may improve some optimization and as a result code size tests will fail on stable. We may want to disable those tests when running stable. But that's risky, as we may regress size on the emscripten side in a way that just affects stable - that may happen in practice if we add some optimization that interacts with the new code patterns tot LLVM emits.

Overall this seems like a maintenance burden that may be large and unpredictable, to me.

I think I don't like about pinning emscripten is that if the approach becomes popular we end up getting bug reports for older version of emscripten and people who are stuck on old versions. Maybe that cost is lower?

Good question. I don't know how popular it could end up being.

I suspect this approach's overall cost will be lower. We do already get bug reports with old versions, and those are at least quickly diagnosed as such today.

@sbc100
Copy link
Collaborator Author

sbc100 commented Oct 12, 2020

Apparently llvm 11 was just released so if we want to try your approach perhaps we should tag an emscripten release now to match the llvm release?

@kripken
Copy link
Member

kripken commented Oct 12, 2020

We can tag now, but it won't match, I think? Emscripten is already on newer LLVM code after the branch.

We'd need to go back in history to find where the branch happened, so we find the emscripten tot build that is closest to the release. (And then we should run our tests on it.)

@willcohen
Copy link

@jonassmedegaard For whatever it's worth, as a 2023 data point, I can confirm that emscripten 3.1.36 builds using LLVM 15 with https://github.com/llvm/llvm-project/commit/1532be98f99384990544bd5289ba339bca61e15b.patch backported to LLVM 15 (see willcohen/nixpkgs@14dacfe). As of now I still don't have it working with LLVM 16.0.0 but it's possible it works with a later point release.

@willcohen
Copy link

I can also confirm that latest emscripten 3.1.39 builds with no LLVM backports with LLVM 16.0.1 (ie nothing in 17 is needed yet).

@willcohen
Copy link

As best as I can tell, nixos/nixpkgs is also always going to need to generally pick a stable version of LLVM, since getting LLVM to build in the immutable environment requires a bit of packaging. Due to the immutable store of the various package dependencies, the LLVM maintainers on nix already have to do a lot of massaging to get LLVM to build correctly, and trying to maintain another parallel tip-of-tree LLVM build would unreasonably increase the package size of LLVM.

Any time nix tries to bump the emscripten version, that already basically triggers a check for all downstream dependent packages, so I think that more or less addresses nixpkgs's concerns re compatibility. That's not quite as robust as the full test suite. I'll probably look into trying to make sure the full test suite is checked there going forward during builds, so at least it's not encouraging regressions.

@willcohen
Copy link

As a small update here, nix now has a weekly build of the latest LLVM from tip-of-tree, which emscripten now depends upon. While this doesn't precisely line up with emscripten's use of tip-of-tree at time of release, should allow nixpkgs to stay up to date with emscripten releases, with dramatically reduced moments where LLVM and emscripten have the same mismatch as when trying to depend on LLVM stable. NixOS/nixpkgs#326633

@epozuelo
Copy link

epozuelo commented Dec 6, 2024

Is there a recommended version to use emscripten with llvm 18 releases? And one recommended for llvm 19?

Also, fastcomp was dropped a while ago. Has there been any progress on better supporting at least one llvm release?

@sbc100
Copy link
Collaborator Author

sbc100 commented Dec 9, 2024

No progress that I know of I'm afraid.

@epozuelo can I ask about your use case? What are your requirements here? i.e. why is compatibility with stable llvm releases important to you?

@epozuelo
Copy link

No progress that I know of I'm afraid.

@epozuelo can I ask about your use case? What are your requirements here? i.e. why is compatibility with stable llvm releases important to you?

It's the Debian use case. We build emscripten using Debian's llvm, which is a stable release (and we provide several at a time, e.g. 17, 18 and 19 currently). We can't have every package shipping their own copy of the toolchain, and emscripten is no exception there.

Having emscripten supporting the last stable llvm version at the time would help immensely. Alternatively, knowing what emscripten version we should use if we want to build against llvm x.y (e.g. the emscripten version that used the closest revision to that llvm version) would be useful information as well.

@kripken
Copy link
Member

kripken commented Dec 10, 2024

knowing what emscripten version we should use if we want to build against llvm x.y (e.g. the emscripten version that used the closest revision to that llvm version) would be useful information as well.

@epozuelo For finding an emscripten version closest to an LLVM release, that should be possible like this:

  • Find the LLVM git commit on the main branch from which a release branched. I actually don't know LLVM well enough to know how best to do that (e.g. I see a branch llvmorg-19.1.0 but no 19.0.0..?)
  • Find where that git commit landed in Emscripten. The releases repo contains all the rolls, including from LLVM. grep over git log could find that LLVM git commit, giving us a git commit in the releases repo.
  • You can then use Emscripten from that commit in the releases repo: ./emsdk install [THAT RELEASES HASH] will work, and given how it was picked, the LLVM there should be the one you want. And you can see in the DEPS file the hashes in the other relevant repos, Emscripten and Binaryen. Using those projects from those commits should match that LLVM.

@sbc100
Copy link
Collaborator Author

sbc100 commented Dec 10, 2024

  • Find the LLVM git commit on the main branch from which a release branched. I actually don't know LLVM well enough to know how best to do that (e.g. I see a branch llvmorg-19.1.0 but no 19.0.0..?)

For this you would do git merge-base upstream/release/19.x upstream/main in the llvm tree. The result is f2ccf80136a01ca69f766becafb329db6c54c0c8 BTW.

sbc100 added a commit to sbc100/emscripten that referenced this issue Jan 6, 2025
@sbc100 sbc100 linked a pull request Jan 6, 2025 that will close this issue
sbc100 added a commit to sbc100/emscripten that referenced this issue Jan 6, 2025
sbc100 added a commit to sbc100/emscripten that referenced this issue Jan 6, 2025
sbc100 added a commit to sbc100/emscripten that referenced this issue Jan 6, 2025
sbc100 added a commit to sbc100/emscripten that referenced this issue Jan 6, 2025
sbc100 added a commit to sbc100/emscripten that referenced this issue Jan 6, 2025
sbc100 added a commit to sbc100/emscripten that referenced this issue Jan 6, 2025
sbc100 added a commit to sbc100/emscripten that referenced this issue Jan 6, 2025
sbc100 added a commit to sbc100/emscripten that referenced this issue Jan 6, 2025
sbc100 added a commit to sbc100/emscripten that referenced this issue Jan 6, 2025
sbc100 added a commit to sbc100/emscripten that referenced this issue Jan 6, 2025
sbc100 added a commit to sbc100/emscripten that referenced this issue Jan 6, 2025
Run the majority of the `other` under llvm stable.  There are still
some features that don't seem to work.

Generate and experimental warning for now, at least until we can fix
the known issues.

Fixes: emscripten-core#11362
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants