Miri support #937
Force-pushed from 92ae87d to 5d5098e.
very cool work!!! I love this :)
Force-pushed from 5d5098e to cb8be48.
I wrote many new shims for Miri to handle file system operations and other bits. See my merged-for-sled branch on my fork of Miri for all the updates so far. I'm working on cleaning them up in parallel and submitting PRs to upstream. Currently, when using these two branches and running the […]

Edit: I re-ran the one test and captured the output before the leak report: […]

Success!
Miri is very strict about leaks, and leak reports may be a fact of life with real-world programs. I think we're seeing reports from some combination of […]. For comparison, running […]
Since memory leaks are already covered by running LSan in CI, I won't pay too much attention to Miri leak reports going forward.
If this seems like a good approach to you, in the […]
At this point, crossbeam isn't causing any problems at run time, and it's only contributing a small slice of static variables to the leak report at the end, but I'll keep that in mind.
@divergentdave Very cool work -- love it! I found this right after filing crossbeam-rs/crossbeam#464. Do you think the leak reports are actually misleading then?
These are probably spurious; see crossbeam-rs/crossbeam#464 and spacejam/sled#937 (comment). And besides, we're now running both the leak sanitizer and the address sanitizer.
Just commented over there, but I think I'm seeing objects owned by containers with `'static` lifetimes in the leak report. With that in mind, I don't think it's an issue for sled.
I love this, this is awesome. :) Once this works, I also know how to answer when people ask me "what would be a good benchmark for finding bottlenecks in Miri performance". :D Let me know if there is any way I can help to make progress here.
I got a backtrace out of this from uninitialized data in an `IoBufs` being written to the log. Re-running this test case did not reproduce the report, but I bet another hour of runtime could shake it out again.

Backtrace: […]
Here's a similar backtrace, this time with logging on.

Log and backtrace: […]

Edit: New one with a dump of the relevant allocation.

Log, memory dump, and backtrace: […]
I think I have identified what is going on with the above uninitialized memory read. There are two ways this could be solved. Either we could make this change, to only write the initialized bytes to the file (potentially creating a small "file hole", or leaving old bytes from the last write of this segment untouched):

```diff
- let total_len = if maxed { capacity } else { bytes_to_write };
+ let total_len = if should_pad { capacity } else { bytes_to_write };
```

Alternately, we could zero out the remaining bytes of the iobuf when […]
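A minimal sketch of what that zeroing alternative could look like, assuming a plain byte buffer and a count of initialized bytes (the helper name and signature are hypothetical, not sled's actual iobuf code):

```rust
/// Zero the uninitialized tail of a buffer before it is flushed, so the
/// padding bytes written to the file are well-defined.
/// (Hypothetical helper; sled's real iobuf handling is more involved.)
fn zero_unwritten_tail(buf: &mut [u8], bytes_written: usize) {
    debug_assert!(bytes_written <= buf.len());
    for byte in &mut buf[bytes_written..] {
        *byte = 0;
    }
}
```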
Some of the bigger tests can eat up all available RAM after a few hours and trigger the OOM killer if left unchecked, or fail to allocate memory sooner if I set `ulimit -v`.
Yeah, Stacked Borrows accumulates a lot of metadata. I tried at least deduplicating adjacent identical stacks, but there's still a lot of room for improvement -- and probably also a lot of just inherent things that need to be stored.
Force-pushed from d3745f3 to f2ff764.
This PR is at a good juncture to merge now. Over the last few days, I rewrote and rebased the commit history to get rid of merges and combine like changes, and added a few small things to help with memory consumption when using Miri. f2ff764 cuts the memory limit when […]. I found that the […]. Without modification, […]. I decided to use a feature flag instead of […].

Lastly, here are some further notes on memory overhead. Of the three […]. The interpreter keeps a […]. With my latest changes, Miri's allocations were dominated by short-lived vectors used for sorting struct members inside a visitor function used in the Retag statement, and the peak memory usage, after a slow and steady rise, came from the Stacked Borrows stacks. Over 8 hours of runtime on the size_leak test, it churned through 146 GB of allocations and hit a maximum size of 9 GB before I stopped it. Individual Stacked Borrows stacks were much shorter, with an average size around 256 items (they use Vec, so almost all must have been of the same capacity). As such, my tracing didn't identify any particular part of sled that was inducing disproportionate memory consumption. As mentioned above, […]
Force-pushed from 348909e to f57eb28.
Force-pushed from f57eb28 to 76a8860.
The pointer `self.current_block` is separately turned into a shared reference and a Box in the same method, and moreover, the reference is created before the Box and used after the Box. This results in a unique reference from the Box and a shared reference from a separate pointer-to-reference conversion coexisting, with different tags. To fix this issue, the conversion from a raw pointer to a Box is put off until the shared reference is no longer in use, so that the two different tags won't pop each other off the stack.
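A minimal standalone reproduction of that pattern (illustrative only, not sled's actual code); running it under `cargo miri run` should flag the read through the shared reference:

```rust
fn main() {
    let ptr: *mut u64 = Box::into_raw(Box::new(1));
    // The shared reference is derived from the raw pointer first...
    let shared: &u64 = unsafe { &*ptr };
    // ...then a Box (asserting unique ownership) is created from the
    // same pointer, pushing a unique tag above the shared one.
    let boxed: Box<u64> = unsafe { Box::from_raw(ptr) };
    // Using the shared reference after the Box exists is a Stacked
    // Borrows violation: its tag has been popped off the borrow stack.
    println!("{}", *shared);
    drop(boxed);
}
```

The fix in the commit follows from this: defer `Box::from_raw` until the shared reference is no longer in use, so the two tags never have to coexist.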
Can't use compression under Miri, because zstd relies on calling a native C library
DirEntry::metadata() is built upon dirfd() and openat(), which aren't yet shimmed in Miri. Instead, get the PathBuf for the DirEntry and call std::fs::metadata() on that when using Miri.
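A sketch of that workaround as a small helper (the function itself is hypothetical; only the `DirEntry::metadata()` to `std::fs::metadata()` substitution comes from the commit):

```rust
use std::{fs, io};

// Under Miri, avoid DirEntry::metadata(), which needs the unshimmed
// dirfd()/openat(); stat the entry's path instead.
fn entry_metadata(entry: &fs::DirEntry) -> io::Result<fs::Metadata> {
    if cfg!(miri) {
        fs::metadata(entry.path())
    } else {
        entry.metadata()
    }
}
```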
Early return from debug_delay() and type-punning construction of Histograms, to reduce the amount of Stacked Borrows overhead
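The early return presumably looks something like this sketch (the shape of the body is an assumption, not sled's exact code):

```rust
pub fn debug_delay() {
    // Skip the delay logic entirely under Miri: every extra operation
    // multiplies interpretation time and Stacked Borrows bookkeeping.
    if cfg!(miri) {
        return;
    }
    // ... normal randomized delay used to shake out race orderings ...
}
```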
Force-pushed from 76a8860 to 13e5efe.
All the necessary shims for this are now in the latest nightly builds of Miri! These commands now work.
Great work, @divergentdave :)
This PR adds enough conditional compilation to support running tests with Miri. So far, I have disabled use of the process's PID, turned off the body of `try_lock()`, and ignored the quickcheck tests, since they would take far too long to run. Miri is slow as tar, so using it would only be useful for spot-checking for unsoundness, not a CI job or anything.
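As an illustration of the kind of conditional compilation involved (the function and the fallback value here are made up for the example, keyed off the `--cfg miri` flag used below):

```rust
#[cfg(not(miri))]
fn current_pid() -> u32 {
    // Normal builds ask the OS for the process ID.
    std::process::id()
}

#[cfg(miri)]
fn current_pid() -> u32 {
    // Miri has no getpid shim here, so use a fixed placeholder value.
    0
}
```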
Miri requires additional shims for `mkdir` and `rmdir` before this will work; I have changes in progress for that as well.
The first issue I've discovered thus far is that `crossbeam-epoch` has an unsound use of `std::mem::uninitialized`, creating a `std::mem::ManuallyDrop<T>`. There is a PR at crossbeam-rs/crossbeam#458 that will fix this soon.

Relevant commands:
- `cargo +nightly miri test --features=testing -- -Zmiri-disable-isolation -- bug_22`
- `cargo +nightly install --path . --force --locked --offline`

Edit - updated commands:

- `./rustup-toolchain` one time, then `./miri install`
- `cargo +miri miri test --features=testing -- -Zmiri-disable-isolation -Zmiri-ignore-leaks -- bug_22`
- `RUSTFLAGS="--cfg miri" cargo +nightly expand --lib --tests serialization::qc`

More commands:

- `cargo +miri miri test --features=testing -- -Zmiri-disable-isolation -Zmiri-ignore-leaks --` (set `ulimit -v` first or this may eat all your swap space)
- `cargo +miri miri test --features=testing -- -Zmiri-disable-isolation -Zmiri-ignore-leaks -Zmiri-disable-stacked-borrows -Zmiri-disable-validation --` (takes ~1 hr on my computer)
- `cargo +miri miri test --features=testing -- -Zmiri-disable-isolation -Zmiri-ignore-leaks -- --ignored tree_bug_01`