Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Autodiff Upstreaming - rustc_codegen_ssa, rustc_middle #133429

Merged
merged 1 commit into from
Jan 31, 2025

Conversation

ZuseZ4
Copy link
Contributor

@ZuseZ4 ZuseZ4 commented Nov 25, 2024

This PR should not be merged until the rustc_codegen_llvm part is merged.
I will also alter it a little based on what get's shaved off from the cg_llvm PR,
and address some of the feedback I received in the other PR (including cleanups).

I am putting it already up to

  1. Discuss with @jieyouxu if there is more work needed to add tests to this and
  2. Pray that there is someone reviewing who can tell me why some of my autodiff invocations get lost.

Re 1: My test require fat-lto. I also modify the compilation pipeline. So if there are any other llvm-ir tests in the same compilation unit then I will likely break them. Luckily there are two groups who currently have the same fat-lto requirement for their GPU code which I have for my autodiff code and both groups have some plans to enable support for thin-lto. Once either that work pans out, I'll copy it over for this feature. I will also work on not changing the optimization pipeline for functions not differentiated, but that will require some thoughts and engineering, so I think it would be good to be able to run the autodiff tests isolated from the rest for now. Can you guide me here please?
For context, here are some of my tests in the samples folder: https://github.com/EnzymeAD/rustbook

Re 2: This is a pretty serious issue, since it effectively prevents publishing libraries making use of autodiff: EnzymeAD#173. For some reason my dummy code persists till the end, so the code which calls autodiff, deletes the dummy, and inserts the code to compute the derivative never gets executed. To me it looks like the rustc_autodiff attribute just get's dropped, but I don't know WHY? Any help would be super appreciated, as rustc queries look a bit voodoo to me.

Tracking:

r? @jieyouxu

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Nov 25, 2024
@ZuseZ4
Copy link
Contributor Author

ZuseZ4 commented Nov 25, 2024

To expand on 2)
Assume you have the following code

#[autodiff(bar, Reverse, ...)]
fn foo(x: f32) -> f32 { x*x }

it will expand to

#[rustc_autodiff]
fn foo(x: f32) -> f32 {x*x}
#[rustc_autodiff(Reverse,...)]
fn bar(x: f32, scalar_factor: f32) -> (f32, f32) {
   // some_dummy_code()
}

Now I have some logic in this PR which picks up the rustc_autodiff attributes and passes them onto the backend, where for every single rustc_autodiff attribute with arguments we pick the function (thus bar here) and replace the dummy code with the right code to return the derivative. So bar would afterwards return (x*x, 2.0 * x). But as mentioned above, once you use autodiff in a library and call it in another module, the dummy code get's executed, as shown in the linked PR.
Any hints would be appreciated.

@traviscross traviscross mentioned this pull request Nov 4, 2024
7 tasks
Copy link
Member

@jieyouxu jieyouxu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some interim feedback for this draft PR

@@ -996,6 +997,35 @@ mod parse {
}
}

pub(crate) fn parse_autodiff(slot: &mut Vec<AutoDiff>, v: Option<&str>) -> bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remark: this has no error messages if the autodiff options failed to parse, acceptable for unstable flag but still poor UX, unacceptable when it comes to stabilization time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be handled with a proper description in parse_autodiff above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume there is some rustc function to generate suggestions when users make a typo?
I'll update it for now to print the unrecognized value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked again at all the other options which accept more values (e.g. instrument_xray), and they have no error-handling either. Is there a preferred way to print errors here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do this later imo.

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 25, 2024
@jieyouxu
Copy link
Member

jieyouxu commented Nov 25, 2024

My test require fat-lto.

For codegen/assembly/ui tests you can make it build w/ fat LTO via

//@ compile-flags: -Clto=fat

I also modify the compilation pipeline. So if there are any other llvm-ir tests in the same compilation unit then I will likely break them.

I don't quite understand the implication of this. Is there some small example I can refer to?

I will also work on not changing the optimization pipeline for functions not differentiated, but that will require some thoughts and engineering, so I think it would be good to be able to run the autodiff tests isolated from the rest for now. Can you guide me here please?

Can you elaborate on what test conditions you need? What do you mean exactly by "isolated"? Can you not run autodiff but only if there is autodiff support, or do you mean like don't run by default even if there is autodiff support?

@ZuseZ4
Copy link
Contributor Author

ZuseZ4 commented Nov 25, 2024

https://github.com/rust-lang/rust/pull/130060/files#diff-a56b374664e290a55d70fa80e456b6280913830b382b73fb70c4483d3d4cf246
adjust's the llvm opt pipeline (if autodiff is enabled at build time and used).
We have a first opt run which skips the late llvm opts (which tend to increase code size), and runs opt a second time (now with the full pipeline) once autodiff is done. I will manage to not make it optimize unrelated code in the future, but for now it means other functions are now optimized 1.5 times by llvm. And in reality llvm opts don't really run to a fixpoint, so that is highly likely to change the IR.

@ZuseZ4 ZuseZ4 changed the title upstream rustc_codegen_ssa/rustc_middle changes for enzyme/autodiff Autodiff Upstreaming - rustc_codegen_ssa, rustc_middle Nov 26, 2024
@jieyouxu
Copy link
Member

functions are now optimized 1.5 times by llvm. And in reality llvm opts don't really run to a fixpoint, so that is highly likely to change the IR.

This just means that for now, you'll have to gate autodiff-related tests with //@ needs-autodiff or somehow allow compiletest to determine that the llvm used is built with autodiff support and the test is exercising said autodiff support.

Note that this cannot break and should not modify code that does not use autodiff at all, which is indicated by codegen tests that do not use / opt-in to autodiff support.

@oli-obk oli-obk self-assigned this Dec 6, 2024
@ZuseZ4 ZuseZ4 marked this pull request as ready for review December 13, 2024 00:57
@rustbot
Copy link
Collaborator

rustbot commented Dec 13, 2024

This PR modifies config.example.toml.

If appropriate, please update CONFIG_CHANGE_HISTORY in src/bootstrap/src/utils/change_tracker.rs.

Some changes occurred in coverage instrumentation.

cc @Zalathar

Some changes occurred in cfg and check-cfg configuration

cc @Urgau

These commits modify the Cargo.lock file. Unintentional changes to Cargo.lock can be introduced when switching branches and rebasing PRs.

If this was unintentional then you should revert the changes before this PR is merged.
Otherwise, you can ignore this comment.

@rust-log-analyzer

This comment has been minimized.

@ZuseZ4
Copy link
Contributor Author

ZuseZ4 commented Jan 2, 2025

I rebased now that the other autodiff PR got merged, fixed all conflicts, and got it to compile locally.
I will work through the existing feedback over the next days.

@bors
Copy link
Contributor

bors commented Jan 25, 2025

☔ The latest upstream changes (presumably #136030) made this pull request unmergeable. Please resolve the merge conflicts.

@rust-log-analyzer

This comment has been minimized.

@ZuseZ4
Copy link
Contributor Author

ZuseZ4 commented Jan 25, 2025

@oli-obk Just to keep track, so far we have 3/4 things which should be fixed here. Potentially not all in this specific PR, but preferably before enabling it for default nightly builds.

  1. (Most orthogonal) the macro getting lost when used in dependencies.

  2. Performance: Have two opt runs, with AD being applied at the end of the first run. I need to look again at it, last time I wasn't 100% sure how to trigger these two runs after the refactoring.

  3. Build setup: Don't force users to pass RUSTFLAGS="-Z llvm-plugins=/home/manuel/prog/rust-working/build/x86_64-unknown-linux-gnu/enzyme/build/Enzyme/LLVMEnzyme-19.so -C passes=enzyme. I'll probably ask on zulip/bootstrap for help with that.

  4. (of course I'll also need to apply the other code quality recommendations above).

After thinking about it for a bit I'll probably just do 4) for now, and leave 1-3 for a follow-up PR, just for the sake of having a working version upstream. I'll need to talk to jieyouxu to see if we then add test's already here, or in the follow-up PR where I fix 3).

@rust-log-analyzer

This comment has been minimized.

@ZuseZ4
Copy link
Contributor Author

ZuseZ4 commented Jan 27, 2025

I'll need to add the fn-ptr error message and left a few questions.
I decided to leave the bootstrapping change for an extra PR, so the tests will probably also need to wait for that (since we use a -Z llvm-plugin flag which requires the absolute path to Enzyme, something which we probably don't want to add as a temporary workaround to the test infra).

@ZuseZ4 ZuseZ4 requested a review from jieyouxu January 27, 2025 01:19
@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 27, 2025
@ZuseZ4
Copy link
Contributor Author

ZuseZ4 commented Jan 29, 2025

Let's go!
@bors r=@oli-obk

@bors
Copy link
Contributor

bors commented Jan 29, 2025

📌 Commit 48d4a6c has been approved by oli-obk

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jan 29, 2025
@ZuseZ4
Copy link
Contributor Author

ZuseZ4 commented Jan 30, 2025

Since it hasn't been added to a rollup yet, I quickly removed an unwanted print statement and especially adjusted a type check in my wrapper, which wasn't working correctly by checking the return type of a function, instead of a call. Wrappers are fun. But now even higher-order derivatives from my (not yet upstream) testset pass again!

@ZuseZ4
Copy link
Contributor Author

ZuseZ4 commented Jan 30, 2025

@bors r=@oli-obk

@bors
Copy link
Contributor

bors commented Jan 30, 2025

📌 Commit 1f30517 has been approved by oli-obk

It is now in the queue for this repository.

bors added a commit to rust-lang-ci/rust that referenced this pull request Jan 31, 2025
Rollup of 9 pull requests

Successful merges:

 - rust-lang#132156 (When encountering unexpected closure return type, point at return type/expression)
 - rust-lang#133429 (Autodiff Upstreaming - rustc_codegen_ssa, rustc_middle)
 - rust-lang#136281 (`rustc_hir_analysis` cleanups)
 - rust-lang#136297 (Fix a typo in profile-guided-optimization.md)
 - rust-lang#136300 (atomic: extend compare_and_swap migration docs)
 - rust-lang#136310 (normalize `*.long-type.txt` paths for compare-mode tests)
 - rust-lang#136312 (Disable `overflow_delimited_expr` in edition 2024)
 - rust-lang#136313 (Filter out RPITITs when suggesting unconstrained assoc type on too many generics)
 - rust-lang#136323 (Fix a typo in conventions.md)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit c19c4b9 into rust-lang:master Jan 31, 2025
6 checks passed
@rustbot rustbot added this to the 1.86.0 milestone Jan 31, 2025
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Jan 31, 2025
Rollup merge of rust-lang#133429 - EnzymeAD:autodiff-middle, r=oli-obk

Autodiff Upstreaming - rustc_codegen_ssa, rustc_middle

This PR should not be merged until the rustc_codegen_llvm part is merged.
I will also alter it a little based on what get's shaved off from the cg_llvm PR,
and address some of the feedback I received in the other PR (including cleanups).

I am putting it already up to
1) Discuss with `@jieyouxu` if there is more work needed to add tests to this and
2) Pray that there is someone reviewing who can tell me why some of my autodiff invocations get lost.

Re 1: My test require fat-lto. I also modify the compilation pipeline. So if there are any other llvm-ir tests in the same compilation unit then I will likely break them. Luckily there are two groups who currently have the same fat-lto requirement for their GPU code which I have for my autodiff code and both groups have some plans to enable support for thin-lto. Once either that work pans out, I'll copy it over for this feature. I will also work on not changing the optimization pipeline for functions not differentiated, but that will require some thoughts and engineering, so I think it would be good to be able to run the autodiff tests isolated from the rest for now. Can you guide me here please?
For context, here are some of my tests in the samples folder: https://github.com/EnzymeAD/rustbook

Re 2: This is a pretty serious issue, since it effectively prevents publishing libraries making use of autodiff: EnzymeAD#173. For some reason my dummy code persists till the end, so the code which calls autodiff, deletes the dummy, and inserts the code to compute the derivative never gets executed. To me it looks like the rustc_autodiff attribute just get's dropped, but I don't know WHY? Any help would be super appreciated, as rustc queries look a bit voodoo to me.

Tracking:

- rust-lang#124509

r? `@jieyouxu`
@ZuseZ4 ZuseZ4 deleted the autodiff-middle branch January 31, 2025 15:48
@lqd
Copy link
Member

lqd commented Feb 1, 2025

Hey @ZuseZ4, it looks like this PR had significant perf implications on a small number of benchmarks, as you can see in the post-merge perf results for the rollup where it landed. It could be related to LTO, from a quick look.

I don’t see a pre-merge perf run here, so I assume this is not expected. Could you take a look? Thanks!

Let’s also cc the reviewer @oli-obk.

@oli-obk
Copy link
Contributor

oli-obk commented Feb 1, 2025

Oh yea that was absolutely not expected. looking...

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Feb 1, 2025
Add link attribute for Enzyme's LLVMRust FFI

Since rust-lang#133429 landed, the compiler doesn't build with `-Zcross-crate-inline-threshold=always`. I don't expect anyone else to test or fix issues with that goofy configuration, so I'm fixing it.

This PR adds a link attribute just like rust-lang#118142 for all the new LLVMRust functions. They were actually added in rust-lang#130060 but weren't used until just now.
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Feb 1, 2025
Rollup merge of rust-lang#136374 - saethlin:enzyme-linkage, r=oli-obk

Add link attribute for Enzyme's LLVMRust FFI

Since rust-lang#133429 landed, the compiler doesn't build with `-Zcross-crate-inline-threshold=always`. I don't expect anyone else to test or fix issues with that goofy configuration, so I'm fixing it.

This PR adds a link attribute just like rust-lang#118142 for all the new LLVMRust functions. They were actually added in rust-lang#130060 but weren't used until just now.
bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 2, 2025
…ssion, r=<try>

test autodiff compile time fixes

Tries to fix the regression from rust-lang#133429
@oli-obk
Copy link
Contributor

oli-obk commented Feb 2, 2025

perf regression fixed by #136413

@oli-obk oli-obk added the perf-regression-triaged The performance regression has been triaged. label Feb 2, 2025
bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 3, 2025
…ssion, r=oli-obk

fix autodiff compile time regression

Tries to fix the regression from rust-lang#133429

Tracking:

- rust-lang#124509
github-actions bot pushed a commit to rust-lang/rustc-dev-guide that referenced this pull request Feb 4, 2025
…oli-obk

fix autodiff compile time regression

Tries to fix the regression from rust-lang/rust#133429

Tracking:

- rust-lang/rust#124509
github-actions bot pushed a commit to rust-lang/miri that referenced this pull request Feb 4, 2025
…oli-obk

fix autodiff compile time regression

Tries to fix the regression from rust-lang/rust#133429

Tracking:

- rust-lang/rust#124509
lnicola pushed a commit to lnicola/rust-analyzer that referenced this pull request Feb 10, 2025
…oli-obk

fix autodiff compile time regression

Tries to fix the regression from rust-lang/rust#133429

Tracking:

- rust-lang/rust#124509
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
perf-regression-triaged The performance regression has been triaged. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants