Add `#[inline]` to functions which were missing it, and `#[track_caller]` to ones with runtime panics from user input #347
FYI since rust-lang/rust#109247, MIR inlining is permitted without |
Yeah that makes sense. Even if it's not a hard block for the MIR inliner now, these are things that really really should be inlined.
I think this is uncontroversial and mostly oversights. That lint is helpful.
Very sensible. It is a shame that Rust does not inline methods by default like C++ does.
By default, Rust has exactly the same inlining behaviors as C++ does. Removing indirect calls from final codegen is not what is being discussed here. The desirable effect here is from the MIR inliner, which runs before rustc hands off codegen to LLVM, because the LLVM optimizer (which was designed for C/C++ and written in C++) seems to de-optimize unless we do this.
Experienced C++ users will generally build the entire project as a single compilation unit If we do a default build with Rust (i.e. large numbers of CUs), we get the same slow compilation time Of course inline semantics are more than just a hint to the compiler. In C++ they promote functions to I know that calling on all functions to be inlinable will fall on deaf ears, but it would be a simple experiment |
In C++ I think it is more that building as a single unit allows not parsing and typechecking headers again for every source file. Rustc already doesn't do this. As for the backend side, rustc in fact intentionally splits a single crate into multiple codegen units to improve compilation time by allowing parallelism. Cross crate inlining without
In rust we put every function in their own section and tell LLVM that it is allowed to merge functions. Unless you enable ICF for the linker this doesn't allow merging identical functions on the linker side, but LLVM does merge functions within a codegen unit and the linker will remove all unused functions because we use |
Rationale for inlining is basically twofold. First: even though `#[inline]` doesn't help the LLVM inliner for generics, it apparently can help the MIR inliner (and the lack of it has been commented on before by @saethlin, at least).

Second, and perhaps more controversially, this library depends on inlining in a way that basically no other part of the stdlib does -- failing to inline most of these functions kind of defeats the point of the design we've chosen.

Concretely, the `simd_foo` intrinsics this library codegens to take advantage of whatever vector operations are allowed in the function where they finally get codegenned (i.e. after all inlining occurs).

This means that if we define a high-level wrapper for some `simd_foo` intrinsic and the wrapper gets called from a downstream user's `#[target_feature(enable = "foo")]` function (for a relevant feature), then failing to inline that wrapper into the call site squanders the advantage we gained by using the `simd_foo` intrinsic design.

Arguably, this justifies use of `#[inline(always)]`. I am in favor of that (it could also help `core::simd`-using code have improved performance in unoptimized builds compared to `core::arch` use, which is absurdly slow without opts). That said, I didn't bother since that can be done later, and might be better as just some extra tuning on the MIR inliner. Not to mention, we support simd ops on unreasonably huge vectors and it seems completely unreasonable to use `#[inline(always)]` on functions of `Simd<f64, 4096>` or whatever.

And I also added a clippy warning for missing inline on public functions (although private functions that are called from public functions should ideally get `#[inline]` too in most cases).

The rationale for `track_caller` should be obvious (much easier for users to track down the issue). While `track_caller` can have a very small perf impact, it does not matter here (the overhead it adds is one extra pointer argument passed into the function, but we expect these to be inlined, so that argument doesn't need to be passed).
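
To make the two attributes concrete, here is a minimal sketch on stable Rust. It is not code from this PR: `add_lanes`, `lane`, `sum_first_lane_avx2` and `sum_first_lane` are hypothetical names, plain `[f32; 8]` arrays stand in for `Simd` values, and AVX2 is just an assumed example of a "relevant feature". The wrapper only benefits from the caller's target feature if it actually gets inlined into the `#[target_feature]` function, which is what `#[inline]` is meant to encourage.

```rust
/// Hypothetical high-level wrapper (assumption: stands in for a SIMD method).
/// `#[inline]` is a hint; if the body is inlined into a caller, the loop is
/// compiled with whatever target features that caller has enabled.
#[inline]
pub fn add_lanes(a: [f32; 8], b: [f32; 8]) -> [f32; 8] {
    let mut out = [0.0f32; 8];
    for i in 0..8 {
        out[i] = a[i] + b[i];
    }
    out
}

/// Hypothetical accessor that panics on bad user input.
/// `#[track_caller]` makes the panic location point at the caller's line
/// instead of at the `assert!` inside this function.
#[inline]
#[track_caller]
pub fn lane(v: [f32; 8], index: usize) -> f32 {
    assert!(index < 8, "lane index {} out of range for 8 lanes", index);
    v[index]
}

/// Downstream-style function that opts into AVX2 (assumed feature).
/// Wrappers inlined here may be lowered to AVX2 instructions.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
pub unsafe fn sum_first_lane_avx2(a: [f32; 8], b: [f32; 8]) -> f32 {
    lane(add_lanes(a, b), 0)
}

/// Safe entry point: uses the AVX2 path only when the CPU supports it.
pub fn sum_first_lane(a: [f32; 8], b: [f32; 8]) -> f32 {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was just checked at runtime.
            return unsafe { sum_first_lane_avx2(a, b) };
        }
    }
    // Portable fallback with baseline target features.
    lane(add_lanes(a, b), 0)
}
```

Calling `lane(v, 8)` from user code panics with a message whose location is the user's call site, which is the debugging benefit described above. Once `lane` is inlined, the hidden location argument no longer needs to be passed at runtime.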