model-checking · celinval · Sep 7, 2022 · Aug 26, 2022 · Aug 30, 2022 · Aug 30, 2022
diff --git a/rfcs/src/rfcs/0001-mir-linker.md b/rfcs/src/rfcs/0001-mir-linker.md
@@ -0,0 +1,244 @@
+- **Feature Name:** `MIR` Linker (mir_linker)
+- **RFC Tracking Issue**: <https://github.com/model-checking/kani/issues/1588>
+- **RFC PR:** <https://github.com/model-checking/kani/pull/1600>
+- **Status:** Under Review
+- **Version:** 0
+
+-------------------
+
+## Summary
+
+Fix linking issues with the rust standard library in a scalable manner by only generating `goto-program` for
+code that is reachable from the user harnesses.
+
+## User Impact
+
+The main goal of this RFC is to enable Kani users to link against all supported constructs from the `std` library.
+Currently, Kani will only link to items that are either generic or have an inline annotation.
+
+The approach introduced in this RFC will have the following secondary benefits.
+- Reduce spurious warnings about unsupported features for cases where the feature is not reachable from any harness.
+- In the verification mode, we will likely see a reduction on the compilation time and memory consumption
+ by pruning the inputs of symtab2gb and goto-instrument.
+  - Linking against the standard library goto-models takes more than 5 GB which is not feasible with the current
+  GitHub workers.
+- In a potential assessment mode, only analyze code that is reachable from all public items in the target crate.
+
+One downside is that our release bundle will be bigger.
+We are going to include compilation artifacts for the `std` and `kani` libraries.
+This will negatively impact the time taken to set up Kani
+(triggered by either the first time a user invokes `kani | cargo-kani` , or explicit invoke the subcommand `setup`).
+
+## User Experience
+
+Once this RFC has been stabilized users shall use Kani in the same manner as they have been today.
+Until then, we wil add an unstable option `--mir-linker` to enable the cross-crate reachability analysis
+and the generation of the `goto-program` model only when compiling the target crate.
+
+Kani setup will likely take longer and more disk space as mentioned in the section above.
+This change will not be guarded by `--mir-linker` option above.
+
+## Detailed Design
+
+In a nutshell, we will no longer generate a `goto-program` model for every crate we compile.
+Instead, we will generate the `MIR` for every crate, and we will generate only one `goto-program` model.
+This model will only include items reachable from the target crate's harnesses.
+
+The current system flow for a crate verification is the following (Kani here represents either `kani | cargo-kani`
+executable):
+
+1. Kani compiles the user crate as well as all its dependencies.
+   For every crate compiled, `kani-compiler` will generate a `goto-program` model.
+   This model includes everything reachable from the crate's public functions.
+2. After that, Kani links all models together by invoking `goto-cc`.
+   This step will also link against Kani's `C` library.
+3. For every harness, Kani invokes `goto-instrument` to prune the linked model to only include items reachable from the given harness.
+4. Finally, Kani instruments and verify each harness model via `goto-program` and `cbmc` calls.
+
+After this RFC, the system flow would be slightly different:
+
+1. Kani compiles the user crate dependencies up to the `MIR` translation.
+   I.e., for every crate compiled, `kani-compiler` will generate an artifact that includes the `MIR` representation
+  of all items in the crate.
+2. Kani will generate the `goto-program` only while compiling the target user crate.
+  It will generate one `goto-program` model that includes all items reachable from any harness in the target crate.
+3. `goto-cc` will still be invoked to link the generated model against Kani's `C` library.
+4. Steps #3 and #4 above will be performed without any change.
+
+This feature will require three main changes to Kani which are detailed in the sub-sections below.
+
+### Kani's Sysroot
+
+Kani currently uses `rustup` sysroot to gather information from the standard library constructs.
+The artifacts from this `sysroot` include the `MIR` for generic items as well as for items that may be included in
+a crate compilation (e.g.: functions marked with `#[inline]` annotation).
+The artifacts do not include the `MIR` for items that have already been compiled to the `std` shared library.
+This leaves a gap that cannot be filled by the `kani-compiler`;
+thus, we are unable to translate these items into `goto-program`.
+
+In order to fulfill this gap, we must compile the standard library from scratch.
+This RFC proposes a similar method to what [`MIRI`](https://github.com/rust-lang/miri) implements.
+We will generate our own sysroot using the `-Z always-encode-mir` compilation flag.
+This sysroot will be pre-compiled and included in our release bundle.
+
+We will compile `kani`'s libraries (`kani` and `std`) also with `-Z always-encode-mir`
+and with the new sysroot.
+
+
+### Cross-Crate Reachability Analysis
+
+`kani-compiler` will include a new `reachability` module to traverse over the local and external `MIR` items.
+This module will `monomorphize` all generic code as it's performing the traversal.
+
+The traversal logic will be customizable allowing different starting points to be used.
+The two options to be included in this RFC is starting from all local harnesses
+(tagged with `#[kani::proof]`) and all public functions in the local crate.
+
+The `kani-compiler` behavior will be customizable via a new flag:
+
+  ```
+  --reachability=[ harnesses | pub_fns |  none | legacy* ]
+  ```
+
+where:
+
+ - `harnesses`: Use the local harnesses as the starting points for the reachability analysis.
+ - `pub_fns`: Use the public local functions as the starting points for the reachability analysis.
+ - `none`: This will be the default value if `--reachability` flag is not provided. It will basically skip
+ reachability analysis (and `goto-program` generation).
+   `kani-compiler` will still generate artifacts with the crate's `MIR`.
+ - `legacy`: Keep `kani-compiler` current behavior by using
+   `rustc_monomorphizer::collect_and_partition_mono_items()` which respects the crate boundary.
+   This will generate a `goto-program` for each crate compiled by `kani-compiler`, and it will still have the same
+   `std` linking issues.
+
+*This option will be removed as part of the `rfc` stabilization.
+
+These flags will not be exposed to the final user.
+They will only be used for the communication between `kani-driver` and `kani-compiler`.
+
+### Dependencies vs Target Crate Compilation
+
+The flags described in the section above will be used by `kani-driver` to implement the new system flow.
+For that, we propose the following mechanism:
+
+- For standalone `kani`, we will pass the option `--reachability=harnesses` to `kani-compiler`.
+- For `cargo-kani`, we will replace
+  ```
+  cargo build <FLAGS>
+  ```
+
+  with:
+
+  ```
+  cargo rustc <FLAGS> -- --reachability=harnesses
+  ```
+
+  to build everything.
+  This command will compile all dependencies without the `--reachability` argument, and it will only pass `harnesses`
+  value to the compiler when compiling the target crate.
+
+## Rational and Alternatives
+
+Not doing anything is not an alternative, since this fixes a major gap in Kani's usability.
+
+### Benefits
+
+- The MIR linker will allow us to fix the linking issues with Rust's standard library.
+- Once stabilized, the MIR linker will be transparent to the user.
+- It will enable more powerful and precise static analysis to `kani-compiler`.
+- It won't require any changes to our dependencies.
+- This will fix the harnesses' dependency on the`#[no_mangle]` annotation
+([Issue-661](https://github.com/model-checking/kani/issues/661)).
+
+### Risks
+
+Failures in the linking stage would not impact the tool soundness. I anticipate the following failure scenarios:
+- ICE (Internal compiler error): Some logic is incorrectly implemented and the linking stage crashes.
+  Although this is a bad experience for the user, this will not impact the verification result.
+- Missing items: This would either result in ICE during code generation or a verification failure if the missing
+  item is reachable.
+- Extra items: This shouldn't impact the verification results, and they should be pruned by CBMC's reachability
+  analysis.
+  This is already the case today. In extreme cases, this could include a symbol that we cannot compile and cause an ICE.
+
+The new reachability code would be highly dependent on the `rustc` unstable APIs, which could increase
+the cost of the upstream synchronization.
+That said, the APIs that would be required are already used today.
+
+Finally, this implementation relies on a few unstable options from `cargo` and `rustc`.
+These APIs are used by other tools such as MIRI, so we don't see a high risk that they would be removed.
+
+### Alternatives
+
+The other options explored were:
+1. Pre-compile the standard library and ship the generated `*symtab.json` files.
+2. Pre-compile the standard library, convert them to `goto-programs` (via `symtab2gb`) and link them into
+   one single `goto-program` model. Ship the generated model.
+
+Both alternatives are very similar. They only differ on the artifact that would be shipped.
+They require generating and shipping a custom `sysroot`;
+however, there is no need to implement the reachability algorithm.
+
+We implemented a prototype for the MIR linker and one for the alternatives.
+Both prototypes generate the sysroot as part of the `cargo kani` flow.
+
+We performed a small experiment (on a `c5.4xlarge` ec2 instance running Ubuntu 20.04) to assess the options.
+
+For this experiment, we used the following harness:
+```rust
+#[kani::proof]
+#[kani::unwind(4)]
+pub fn check_format() {
+    assert!("2".parse::<u32>().unwrap() == 2);
+}
+```
+The experiment showed that the MIR linker approach is much more efficient.
+
+See the table bellow for the breakdown of time (in seconds) taken for each major step of
+the harness verification:
+
+
+| Stage                     | MIR Linker | Alternative |
+----------------------------|------------|-------------|
+| compilation               |   22.2s    |    64.7s    |
+| `goto-program` generation |   2.4s     |    90.7s    |
+| `goto-program` linking    |   0.8s     |    33.2s    |
+| code instrumentation      |   0.8s     |    33.1     |
+| verification              |   0.5s     |    8.5s     |
+
+It is possible that `goto-cc` time can be improved, but this would also require further experimentation and
+expertise that we don't have today.
+
+Every option would require a custom sysroot to either be built or shipped with Kani.
+The table below shows the size of the sysroot files for the alternative #2
+(`goto-program` files) vs compiler artifacts (`*.rmeta` files)
+files with `-Z always-encode-mir` for `x86_64-unknown-linux-gnu` (on Ubuntu 18.04).
+
+| File Type      | Raw size | Compressed size |
+|----------------|----------|-----------------|
+| `symtab.json`  |   950M   |     26M         |
+| `symtab.out`   |   84M    |     24M         |
+| `*.rmeta`      |   92M    |     25M         |
+
+These results were obtained by looking at the artifacts generated during the same experiment.
+
+## Open questions
+
+- Should we build or download the sysroot during `kani setup`?
+- What's the best way to enable support to run Kani in the entire `workspace`?
+  - One possibility is to run `cargo rustc` per package.
+- Should we codegen all static items no matter what? Static objects can only be initialized via constant function.
+  Thus, it shouldn't have any side effect.
+  That relies on all constant initializers being evaluated during compilation.
+
+## Future possibilities
+
+- Split `goto-program` models into two or more items to optimize compilation result caching.
+  - Dependencies: One model will include items from all the crate dependencies.
+    This model will likely be more stable and require fewer updates.
+  - Target crate: The model for all items in the target crate.
+- Do the analysis per-harness. This might be adequate once we have a mechanism to cache translations.
+- Add an option to include external functions to the analysis starting point in order to enable verification when
+calls are made from `C` to `rust`.
+- Contribute the reachability analysis code back to upstream.