Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create RFC for MIR linker #1600
Create RFC for MIR linker #1600
Changes from 4 commits
d4cc9f0
1c56998
5091258
9c5d0f6
09a8dc4
ceacdfb
17bae26
ebe3b65
05a42e4
b6a2cd0
b3658e2
4c6d237
ffd8ec6
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How much larger?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approx 26MB bigger for linux 64bits. Kani 0.9 bundle size was 22MB. So this will more than double the size. That said, we also download the entire rustup toolchain as part of
kani setup
and I believe it has more than 50MB.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed the text a bit. Let me know if it's clearer now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we consider one per harness to get even more slicing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Totally. I put that as a future work for now. Let me know if you think this should be in the scope of the initial work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@danielsn is it OK if I leave it like this for the first version of the MIR Linker?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit unclear on this section. What standard library constructs are looked up in
sysroot
? Doessysroot
have MIR lying around?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. The rustup sysroot does have MIR lying around. They are inside the *rlib files shipped with the toolchain (see more details here: https://rustc-dev-guide.rust-lang.org/backend/libs-and-metadata.html#metadata).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zhassan-aws what kind of information would you like me to add here to make this more clear?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can... we just use what MIRI ships?
We install a rust toolchain on install. I'd been thinking about whether we could reduce this to a minimal profile as an optimization, but we should also be able to go the other way and add the miri component?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a great point. I'm assuming the overhead will be the same, but it might save us from maintaining yet another build script.
Let me investigate that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be awkward to use MIRI to grab the sysroot. MIRI only build the sysroot when you first execute
cargo miri [run | test]
. It also requiresxargo
in order to do that.That said, it might not be that bad to build the sysroot instead of downloading it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tedinski please let me know your thoughts on this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shucks, I was hoping MIRI would ship it. I'm ok with adding it as part of
setup
or pre-building it. Though I'm guessing MIRI doesn't pre-build it for a good reason?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@oli-obk what's the motivation to build MIRI sysroot as part of the first execution of MIRI opposed to pre-building it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
miri can cross interpret, so we'd need to deploy all targets somewhere once (instead of once per host). Also sometimes you'll want to change how the sysroot is built I guess.
Also we need the ability to build the sysroot from scratch anyway for when we modify miri out of tree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we were to not monomorphize everything, what would be the implications? Does it make the process impossible or the only side-effect would be to potentially add include too many items?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a great question! Conceptually, I don't see any reason why the reachability code couldn't work with polymorphic definitions. If we decide to contribute this code to upstream or put in separate library, it would be a great thing to do.
For our current goal, it does make sense to monomorphize here since Kani's compiler just wants a list of items to generate code for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes totally agree
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rustc already has logic for this in https://github.com/rust-lang/rust/blob/master/compiler/rustc_monomorphize/src/collector.rs
Maybe it could be generalized to allow you to plug into it and make it work cross crate (rustc only needs cross crate for generic functions and will assume nongeneric functions got codegened upstream)
We have for now given up on mir-only rlibs, but if we want to pick up that work, we'd need exactly what you are describing anyway, so it doesn't seem unreasonable to change the collector to support it optionally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be great! Most of the code I've written so far was based on the collector logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume that under
--test
tests are marked public and that keep them here?I also wonder if we should just... include both by default and not have options here? This avoids the complication of setting/flags/configuration, doesn't seem like a major downside with slightly more code compiled, and makes everything more flexible.
E.g. a few minutes of thought comes up with the case of Rust calling C code and that C calling Rust code that's "extern C" and which we can't inspect the reachability of, but including both harnesses and public items both would handle seamlessly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is two parts to this:
1- I believe the reachability code should be kept generic to take whatever entry point we want.
2- For now, we will only provide two possible choices, harnesses and public functions.
I do see other future usages for the starting point. C + Rust verification will likely be a combination of harness + extern C functions.
For
--tests
, I believe we still want to use the harness mode unless if the user chooses--function
. For the later, I might just keep the legacy mode for now.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tedinski does that make sense? Let me know if you want me to change anything or make the text more clear about the customization points.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with whichever course you choose, but my preference is that we should error on the side of simplicity here. IMO it should just use "public + harnesses" as roots and then we don't need to worry about other options. It seems like the worse case scenario is just codegen'ing a bit more than we need, while the advantage is that everything just works and nobody has to think about this again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, goto-programs will still be generated, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. With
none
we won't generate any goto-program. This will be used while compiling the dependencies.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tedinski and @adpaco-aws I added a bit more information here. Let me know if that makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about when we run
cargo kani --tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great question. At some point I thought about adding this to the "Open Questions", but I totally forgot. I'll do that now.
I don't have a great answer yet. The two possibilities I thought so far are on the hacky side, but I believe they would work:
--reachability
, maybe "on-tests" or something like that. This would behave as either "harness" or "none" depending whether RUSTFLAGS has--tests
.--reachability
argument to also take into account the existence of--tests
.Both would ensure that we only run the reachability and codegen at the last compilation step. Any other suggestions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tedinski any suggestions? I added this to the open questions too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My question was less on the reachability side here, and more on the side of the actual pragmatics of doing the build.
cargo build
becomescargo rustc
but what doescargo test --no-run
become?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will likely stay as is and we might add a
--reachability=on-tests
or something like that.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tedinski I have great news, we can replace
by
I'll add some details here and remove that from the open questions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fantastic!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I understand, we've been primarily using the
KANIFLAGS
environment variable to pass things to thekani-compiler
? If this is correct, any reason to switch to usingcargo rustc
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main reason is that we want to pass an extra flag to the target crate which we wouldn't pass to the dependencies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, this will still keep KANIFLAGS working as is today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about if its used as a potential function pointer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a great point. CBMC should catch that once the following PR gets merged: diffblue/cbmc#6987
Another possibility is to change how we invoke
--goto-instrument
as pointed out here: diffblue/cbmc#6983 (comment)I created #1621 to address that issue and added to https://github.com/model-checking/kani/milestone/12
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The results are amazing! I'd expect memory usage to be significantly lower with the MIR linker as well.
If you still have the log files, can you compare the size of the SAT instance (clauses/literals) between the two alternatives? I'm curious to know how much CBMC's slicer is able to reduce the problem size, and how close it can get to the one generated by the MIR linker.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I totally forgot to include the memory usage here.
Both prototypes run the test described in CI. So you can see the log files for the CI run for MIR linker and the alternative.
The specific data I believe you are asking for is:
and
for MIR Linker and the alternative, respectively.
It just occurred to me that I should try to check the difference under visualize. It might make it more clear to understand the difference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do we handle cases during development when we modify something the sysroot depends on
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
During development, we shall build the sysroot together with all our binaries. I haven't figured out the best mechanism yet though. Building the sysroot as part of the
kani-compiler
build isn't quite working for me. The sysroot will require different cargo configuration, so I am considering splitting our workspace and creating a new workspace for thelibrary
folder.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@danielsn Would you like me to add this information here? I tried to focus on the end user point of view, but it might make sense to clarify this.