Closures in generic code can create duplicate monomorphizations #83010
Labels
A-typesystem
Area: The type system
I-compiletime
Issue: Problems and improvements with respect to compile times.
I-heavy
Issue: Problems and improvements with respect to binary size of generated code.
T-compiler
Relevant to the compiler team, which will review and decide on the PR/issue.
Summary
When a closure is defined in a generic context like a generic function or a method of a generic class, its type includes, in some sense or another, the parent's generic parameters. If the closure doesn't depend on some of those generic parameters, it can lead to the instantiation of mono items which are functionally identical, but distinct by virtue of false dependencies on unused generic parameters of the parent. I'll call these mono items "duplicates".
These duplicates create needless work and code bloat. At the very least, they go through the codegen and optimization pipeline, and if not optimized away, are included in the binary. I imagine there is other overhead, such as creation of more distinct types.
Example 1
The
map
call is passed a closure|x| x
which does not depend the parent function's generic parameterT
. The code for that closure is identical in bothfoo::<i32>
andfoo::<i64>
, so we should be able to only generate onemap
instantiation andMap
type (the return type ofmap
, which is generic over the closure thatmap
was passed), and use it in bothfoo
instances. Yet this creates twomap
instantiations, and twoMap::new
instantiations, because the closure type is polluted by the parent'sT
parameter, which, again, the closure doesn't use in any way.Excerpt from
-Z print-mono-items=lazy
:(I don't know why, but there weren't any mono items in the output for the closure itself in this example.)
If we actually did anything interesting with the
Map
s, it would create yet more duplicate instantiations. And if the closure captured anything by value that needed dropping, it would create duplicatedrop_in_place
instantiations for the closure andMap
types.Example 2
This isn't a rare event. Here's an example from the standard library which was quickly and easily found. This case is different, because the parent generics come from the struct for which the function is implemented:
The closure passed to
map_or
really only depends onV
, yet it has false dependencies onK
andS
. Creating twoHashMap
s with the same value type will cause duplicate mono items to be created for this closure andmap_or
:Excerpt from
-Z print-mono-items=lazy
:Impact
Okay, so there are some duplicate mono items. Maybe even a lot in a real codebase. But is it impactful? I haven't measured, but I suspect it is. Take a look at the Rust and assembly for
Option::map_or
(for the above closure) in debug mode, and consider the above example:Assembly
(Wow, that's surprisingly complicated.)
In debug mode, this code is included in the binary twice, once for each closure. In release mode, the code may be optimized out, but it's still processed for some part of compilation. This was a trivial example. Imagine if
map_or
was a complicated function!If there are hundreds or thousands of duplicate mono items in moderate to large codebases, which seems quite possible, then this may have a noticeable impact on compile times and binary sizes.
Solution
If possible, we should make closure types depend only on the parent generics which they actually use.
The
ClosureSubsts
comment explains why parent generics are included:See #27087, where this was introduced. #27086 may also be relevant. Here and here are FIXMEs for that issue, later removed.
I have a feeling the case where the parent generics come from a generic struct on which the function including the closure is defined (example 2) may be harder to solve.
@rustbot label T-compiler A-typesystem I-compiletime I-heavy
The text was updated successfully, but these errors were encountered: