Feature request: Use 'trampolines' to make most of the body of a generic function non-generic #77960

jyn514 · 2020-10-15T03:38:53Z

A common pattern in Rust, especially in the standard library, is to have two versions of a function: one generic and one with the actual implementation. For example, Error::new() immediately delegates to a non-generic function _new():

rust/library/std/src/io/error.rs

Lines 250 to 260 in 5565241

    
               #[stable(feature = "rust1", since = "1.0.0")] 
        
               pub fn new<E>(kind: ErrorKind, error: E) -> Error 
        
               where 
        
                   E: Into<Box<dyn error::Error + Send + Sync>>, 
        
               { 
        
                   Self::_new(kind, error.into()) 
        
               } 
        
               fn _new(kind: ErrorKind, error: Box<dyn error::Error + Send + Sync>) -> Error { 
        
                   Error { repr: Repr::Custom(Box::new(Custom { kind, error })) } 
        
               }

This benefits both compile- and run-time: compiletime because LLVM has to do less work compiling the large implementation of the function body many times, and runtime because generic version can safely be marked as #[inline] without bloating the instruction cache.

However, most libraries do not do this, for the simple reason that it's tedious and annoying to do (and you have to already know about it). It would be amazing if rustc could do this transformation itself. I expect this would help greatly with #65031, as well as compile times across the ecosystem.

The text was updated successfully, but these errors were encountered:

jyn514 · 2020-10-15T03:39:14Z

Thanks to @matklad for bringing this up in https://matklad.github.io/2020/10/15/study-of-std-io-error.html !

dlight · 2020-10-15T05:29:43Z

@llogiq wrote a library to perform this translation with a function attribute, momo. It works with a small number of traits (Into, AsRef and AsMut). Having rustc first optimize those three traits would be a good first step before working with arbitrary traits.

In the far future, I wish that rustc were able to automatically transform generics into code that uses trait objects, whenever applicable (the opposite of the "devirtualization" optimization).

dylni · 2020-10-15T05:38:45Z

#77767 is a similar issue for const generics.

thomcc · 2020-10-15T07:39:37Z

This is a specific case of "outlining", and while outlining would help various things in a lot of places, its probably better to handle this specifically.

camelid · 2020-12-18T19:55:11Z

For those (like me!) who don't know what 'trampoline' means in this sense of the word, I think it's related to this part of the Wikipedia article "Trampolines (computing)":

In embedded systems, trampolines are short snippets of code that start up other snippets of code. For example, rather than write interrupt handlers entirely in assembly language, another option is to write interrupt handlers mostly in C, and use a short trampoline to convert the assembly-language interrupt calling convention into the C calling convention.

jyn514 · 2020-12-18T20:13:13Z

@camelid well sort of - I was mostly thinking of how the PLT is a trampoline: https://stackoverflow.com/questions/43048932/why-does-the-plt-exist-in-addition-to-the-got-instead-of-just-using-the-got. But yeah, same idea of having a short, well-known setup address that calls a much larger implementation.

tgnottingham · 2021-01-04T23:28:11Z

The effect on runtime should be elaborated a bit. If I understand correctly, we wouldn't want to apply the transformation indiscriminately.

For example, functions that we actually want to be inlined shouldn't have this applied to them, because that would defeat the purpose of inlining. And I don't think we can expect LLVM to undo the transformation in these cases. At least it doesn't in the Error::_new example in the summary. That happens to be desirable in this case, or it's theorized to be, which is also something to think about. Maybe LLVM would inline if the non-generic impl was marked #[inline] though.

Anyway, there's probably more subtlety involved than that, but I just wanted to point out the general issue since it hadn't been mentioned.

There was also some discussion about this and the implementation of dispatching to the non-generic function in this thread about Momo.

bluss · 2021-02-17T09:14:02Z

One of the culprits in std (but not the only one) - is that closures inherit all type parameters in their scope, not just those that affect it, and solving that separately would allow us to remove some of the workarounds in Iterator specifically.

I don't know the status of polymorphization, is that solving enough to remove the need for the workarounds in Iterator?

jumbatm · 2021-03-15T10:31:08Z

Happy to take this :)

@rustbot claim

cjgillot · 2021-04-17T17:26:30Z

Investigation report

This feature is close to type-erased code generation that other programming languages (Java, OCaml...) do. My approach was the following.

Consider a function

fn foo<T>(a: M<T>) -> R<T> {
  let c: C<T> = T::CONST;
  // stuff with T
  T::bar(a, c)
}

During MIR analyses, the trampoline and the callee version can be computed generically:

gather all constants out a MIR body, and replace them by appropriately-typed parameters;
gather all functions, and replace them by function pointer parameters.

fn foo<T>(a: M<T>) -> R<T> {
  let _2: C<T> = T::CONST;
  let _3: fn(M<T>, C<T>) -> R<T> = T::bar;
  foo_impl<T>(a, _2, _3)
}

fn foo_impl<T>(a: M<T>, c: C<T>, _3: fn(M<T>, C<T>) -> R<T>) -> R<T> {
  // stuff with T
  _3(a, c)
}

The next step is to only have one instance of foo_impl. Next, we introduce a placeholder type T_op. The code becomes:

extern { type T_impl; }

fn foo<T>(a: M<T>) -> R<T> {
  let _2: C<T> = T::CONST;
  let _3: fn(M<T>, C<T>) -> R<T> = T::bar;
  transmute(foo_impl(transmute(a), transmute(_2), transmute(_3)))
}

fn foo_impl(a: M<T_impl>, c: C<T_impl>, _3: fn(M<T_impl>, C<T_impl>) -> R<T_impl>) -> R<T_impl> {
  // stuff with T_impl
  _3(a, c)
}

As a last step, we need to check that all those transmutes are safe.

The problem is that Rust layouts all monomorphized types independently: in general M<T> and M<T_impl> have no reason to have a similar layout if T != U. In special cases, this can happen, for instance if M<T> only points to sized T through a reference.

As a consequence, we need to check that M<T> and M<T_impl> and all recursively accessible types have the same layout. Likewise for C<T> and R<T>. If any such test fails, the type-erasure is unsound, we need to abort and go back to full monomorphization. Actually, we should be able to restrict the check to all types that are involved in the MIR: types of locals, of accessed fields, of dereferenced pointers...

This last checking part is (IMHO) the most subtle part for a MVP of this feature.

jumbatm · 2021-04-19T00:42:48Z

Thank you for documenting your investigation.

For contrast and discussion's sake, I'll describe my approach.

Instead of type erasure, it does something similar to what Momo already does. First, we perform a dataflow analysis to determine whether the arguments' uses converge to a single type. For example:

fn bar<S: Into<String>>(s: S) {
    some_work(); // s: S
    more_work(); // s: S
    let s: String = s.into(); // s: String
    even_more_work(&s); // s: String
    yet_even_more_work(s); // s: String
}

We see in the example above that after the third line of bar, s is a String for all code paths. This means that from this statement onwards, the function is non-generic, so we can extract that out into a single instance. The generic "trampoline", therefore, just becomes the generic start of the function:

#[inline]
fn bar<S: Into<String>>(s: S) {
    some_work();
    more_work();
    let s: String = s.into();
    bar_impl(s);
}

fn bar_impl(s: String) {
    even_more_work(&s);
    yet_even_more_work(s);
}

As you can see, this is basically what Momo does, but generalised out with a dataflow analysis so the optimisation can be applied automatically.

I think our approaches actually target different scenarios, and may be orthogonal: While my approach is, like Momo, specifically targeted to cases like the Error example given above, your approach uses the trampoline to merge monomorphisations of parameters with the same layout. If I'm not mistaken, your approach makes the observation that when two generic functions differ in type parameters, they may really be equivalent on layout (which your approach handles by erasing the type) and only differ on specific handling (which your approach handles with the function pointer parameter). My concern with this approach is that it may not apply to scenarios such as in the Error example above: while two types may both implement Into<Box<_>>, it does not necessarily mean that their layouts will be identical, or even similar. I'm also concerned about the call via function pointer, because we don't want to introduce a dynamic dispatch where a user would expect a static dispatch (though this concern may be moot if const prop is able to resolve the call directly).

Again, thank you for documenting your investigation. I appreciate the write-up -- it's given me a lot to think about. Please do let me know what you think of my approach, and if I've missed anything.

[...] we need to check that M and M<T_impl> and all recursively accessible types have the same layout. [...] This last checking part is (IMHO) the most subtle part for a MVP of this feature.

Ooh! I've actually previously implemented a similar check for layout equivalence, which may be of interest: https://github.com/rust-lang/rust/blob/master/compiler/rustc_lint/src/builtin.rs#L2696-L2704.

jyn514 changed the title ~~Feature request: Use 'trampolines' to de-monomorphize large parts of a function~~ Feature request: Use 'trampolines' to most of the body of a generic function non-generic Oct 15, 2020

jyn514 changed the title ~~Feature request: Use 'trampolines' to most of the body of a generic function non-generic~~ Feature request: Use 'trampolines' to make most of the body of a generic function non-generic Oct 15, 2020

jonas-schievink added the A-mir-opt Area: MIR optimizations label Oct 15, 2020

jyn514 mentioned this issue Dec 15, 2020

[Optimization] Equivalent functions are not merged because their panic information differs. #80021

Closed

jyn514 mentioned this issue Feb 17, 2021

Use internal iteration in Iterator::is_sorted_by #82198

Merged

rustbot assigned jumbatm Mar 15, 2021

cjgillot mentioned this issue May 14, 2021

[EXPERIMENT] Reduction of monomorphization load through type erasure #85308

Closed

marcospb19 mentioned this issue Aug 8, 2023

Optimize generics to a dynamic dispatch for size #114598

Open

Noratrieb added the A-monomorphization Area: Monomorphization label Jul 8, 2024

Noratrieb unassigned jumbatm Jul 8, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature request: Use 'trampolines' to make most of the body of a generic function non-generic #77960

Feature request: Use 'trampolines' to make most of the body of a generic function non-generic #77960

jyn514 commented Oct 15, 2020 •

edited by rustbot

Loading

jyn514 commented Oct 15, 2020

dlight commented Oct 15, 2020

dylni commented Oct 15, 2020

thomcc commented Oct 15, 2020

camelid commented Dec 18, 2020 •

edited

Loading

jyn514 commented Dec 18, 2020

tgnottingham commented Jan 4, 2021 •

edited

Loading

bluss commented Feb 17, 2021

jumbatm commented Mar 15, 2021

cjgillot commented Apr 17, 2021

jumbatm commented Apr 19, 2021 •

edited

Loading

Feature request: Use 'trampolines' to make most of the body of a generic function non-generic #77960

Feature request: Use 'trampolines' to make most of the body of a generic function non-generic #77960

Comments

jyn514 commented Oct 15, 2020 • edited by rustbot Loading

jyn514 commented Oct 15, 2020

dlight commented Oct 15, 2020

dylni commented Oct 15, 2020

thomcc commented Oct 15, 2020

camelid commented Dec 18, 2020 • edited Loading

jyn514 commented Dec 18, 2020

tgnottingham commented Jan 4, 2021 • edited Loading

bluss commented Feb 17, 2021

jumbatm commented Mar 15, 2021

cjgillot commented Apr 17, 2021

Investigation report

jumbatm commented Apr 19, 2021 • edited Loading

jyn514 commented Oct 15, 2020 •

edited by rustbot

Loading

camelid commented Dec 18, 2020 •

edited

Loading

tgnottingham commented Jan 4, 2021 •

edited

Loading

jumbatm commented Apr 19, 2021 •

edited

Loading