Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cap the maximum basic block padding in Cranelift #6736

Merged
merged 4 commits into from
Jul 19, 2023

Conversation

alexcrichton
Copy link
Member

This commit is a fix for a fuzz failure that came in the other day where a module took more than 2GB of memory to compile, triggering the fuzzer to fail. Investigating locally it looks like the problem lies in the basic-block padding added recently. Turning on that option increases the memory usage of Cranelift from ~70M to ~2.5G. The setting in the test was to add 1<<14 bytes of padding between all basic blocks, and with a lot of basic blocks that can add up quite quickly.

The solution implemented here is to implement a per-function limit of the amount of padding that can be injected. I've set it to 1MB for now which is hopefully enough to continue to stress the various bits of the MachBuffer while not blowing limits accidentally while fuzzing.

After this commit the fuzz test case in question jumps from 70M to 470M with basic block padding enabled, which while still large is a bit more modest and sticks within the limits of the fuzzer.

This commit is a fix for a fuzz failure that came in the other day where
a module took more than 2GB of memory to compile, triggering the fuzzer
to fail. Investigating locally it looks like the problem lies in the
basic-block padding added recently. Turning on that option increases the
memory usage of Cranelift from ~70M to ~2.5G. The setting in the test
was to add `1<<14` bytes of padding between all basic blocks, and with a
lot of basic blocks that can add up quite quickly.

The solution implemented here is to implement a per-function limit of
the amount of padding that can be injected. I've set it to 1MB for now
which is hopefully enough to continue to stress the various bits of the
`MachBuffer` while not blowing limits accidentally while fuzzing.

After this commit the fuzz test case in question jumps from 70M to 470M with
basic block padding enabled, which while still large is a bit more
modest and sticks within the limits of the fuzzer.
@alexcrichton alexcrichton requested a review from a team as a code owner July 17, 2023 14:50
@alexcrichton alexcrichton requested review from abrown and removed request for a team July 17, 2023 14:50
if !bb_padding.is_empty() {
buffer.put_data(&bb_padding);
buffer.align_to(I::LabelUse::ALIGN);
total_bb_padding += bb_padding.len();
if total_bb_padding > (1 << 20) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would limit it to a single conditional branch where the branch destination doesn't fit and thus a veneer needs to be inserted on AArch64. Maybe bump it to 1 << 21?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point yeah and was something I was worried about as well. I'm a bit worried though that bumping this to 21 (predictably) doubles the memory usage.

I've instead opted for a different strategy now:

  • Each function gets at most 64M of padding
  • Padding is only enabled for fuzz test cases with <10 functions

That should help continue to explore interesting cases while avoiding the large memory increases seen prior I think

@alexcrichton alexcrichton requested a review from a team as a code owner July 17, 2023 16:19
@alexcrichton alexcrichton requested review from pchickey and removed request for a team July 17, 2023 16:19
@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:machinst Issues related to instruction selection and the new MachInst backend. fuzzing Issues related to our fuzzing infrastructure labels Jul 17, 2023
@github-actions
Copy link

Subscribe to Label Action

cc @fitzgen

This issue or pull request has been labeled: "cranelift", "cranelift:area:machinst", "fuzzing"

Thus the following users have been cc'd because of the following labels:

  • fitzgen: fuzzing

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

// regardless of how many basic blocks there are.
// * Here it's limited to enable this setting only when there's at most
// 10 functions to ensure that the overhead for all functions is <1G
// ish or otherwise dosn't run away.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*doesn't

Copy link
Contributor

@abrown abrown left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with this but wonder if @cfallin might have an opinion: limiting the Cranelift functions to a padded 64MB makes it possible that some of the larger islands never get fuzzed? IIRC, don't we have 2GB islands on x86?

@cfallin
Copy link
Member

cfallin commented Jul 18, 2023

Ah, yes, we actually do have 26-bit ranges that are relevant on AArch64: direct branches have a range of +/- 128MiB (2^26, times 4 because of PC alignment, signed). A function size up to 128MiB plus epsilon (144MiB, to pick a round number?) would I think give sufficient coverage.

@cfallin
Copy link
Member

cfallin commented Jul 18, 2023

(And x86-64 does have 2GiB ranges for all branches and RIP-relative labels, but we actually don't have façades to extend those; we just don't support a single function body that large!)

@alexcrichton
Copy link
Member Author

Sounds reasonable! I've bumped the per-function limit to 150M and reduced the maximum number of functions to 5 to keep the overhead under 1G here

@alexcrichton alexcrichton added this pull request to the merge queue Jul 19, 2023
Merged via the queue into bytecodealliance:main with commit 475d1ba Jul 19, 2023
19 checks passed
@alexcrichton alexcrichton deleted the fix-fuzz-oom branch July 19, 2023 15:44
alexcrichton added a commit to alexcrichton/wasmtime that referenced this pull request Aug 7, 2023
This commit removes the option to generate padding between basic blocks
when fuzzing Wasmtime. Over the weekend lots of OOMs were discovered
related to this option and its most recent update in bytecodealliance#6736. The new OOMs
appear to be related to:

* If multiple modules are generated then the configured limits in bytecodealliance#6736
  aren't relevant because they only cap one module.
* Each imported function generates a new trampoline which has its own
  set of padding which wasn't previously accounted for.
* Spec tests have a lot of functions and the limits didn't account for
  this.

While each of these is probably individually fixable I think it's
probably not worth the whack-a-mole any more. The `cranelift-fuzzgen`
target should cover the relevant cases for padding without the need for
Wasmtime's fuzzing to cover it as well.
github-merge-queue bot pushed a commit that referenced this pull request Aug 9, 2023
This commit removes the option to generate padding between basic blocks
when fuzzing Wasmtime. Over the weekend lots of OOMs were discovered
related to this option and its most recent update in #6736. The new OOMs
appear to be related to:

* If multiple modules are generated then the configured limits in #6736
  aren't relevant because they only cap one module.
* Each imported function generates a new trampoline which has its own
  set of padding which wasn't previously accounted for.
* Spec tests have a lot of functions and the limits didn't account for
  this.

While each of these is probably individually fixable I think it's
probably not worth the whack-a-mole any more. The `cranelift-fuzzgen`
target should cover the relevant cases for padding without the need for
Wasmtime's fuzzing to cover it as well.
eduardomourar pushed a commit to eduardomourar/wasmtime that referenced this pull request Aug 18, 2023
This commit removes the option to generate padding between basic blocks
when fuzzing Wasmtime. Over the weekend lots of OOMs were discovered
related to this option and its most recent update in bytecodealliance#6736. The new OOMs
appear to be related to:

* If multiple modules are generated then the configured limits in bytecodealliance#6736
  aren't relevant because they only cap one module.
* Each imported function generates a new trampoline which has its own
  set of padding which wasn't previously accounted for.
* Spec tests have a lot of functions and the limits didn't account for
  this.

While each of these is probably individually fixable I think it's
probably not worth the whack-a-mole any more. The `cranelift-fuzzgen`
target should cover the relevant cases for padding without the need for
Wasmtime's fuzzing to cover it as well.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cranelift:area:machinst Issues related to instruction selection and the new MachInst backend. cranelift Issues related to the Cranelift code generator fuzzing Issues related to our fuzzing infrastructure
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants