Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

format_args!() could 'inline' literal arguments #78356

Closed
m-ou-se opened this issue Oct 25, 2020 · 26 comments
Closed

format_args!() could 'inline' literal arguments #78356

m-ou-se opened this issue Oct 25, 2020 · 26 comments
Assignees
Labels
A-fmt Area: std::fmt C-enhancement Category: An issue proposing an enhancement or a PR with one. I-heavy Issue: Problems and improvements with respect to binary size of generated code. I-slow Issue: Problems and improvements with respect to performance of generated code. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@m-ou-se
Copy link
Member

m-ou-se commented Oct 25, 2020

Arguments::as_str allows code to handle argument-less fmt::Arguments like format_args!("literal") specially by not pulling in any formatting code.

We should investigate if we can improve format_args!("hello: {}", "literal") to result in the same trivial fmt::Arguments as format_args!("hello: literal"), such that any .as_str()-based optimizations are also available for it.

That could also make things like panic!("{}", "message"); work with const_panic, removing the need for panic!(CONST_STR).

This makes panic!("hello") exactly as cheap (and const) as panic!("{}", "hello"), which is also important for #78088 to make sure there are no downsides to its suggestions.

It'd also reduce the need for using concat!(..), since it'd no longer be less efficient to add a {} placeholder to concat literals in a formatting string instead.

As a bonus: format_args!("a: {}", format_args!("b: {}", ..)) could maybe also improved to produce the same as format_args!("a: b: {}", ..).


Assigning to myself to investigate some time Soon™.

@m-ou-se m-ou-se added C-enhancement Category: An issue proposing an enhancement or a PR with one. A-fmt Area: std::fmt labels Oct 25, 2020
@m-ou-se m-ou-se self-assigned this Oct 25, 2020
@m-ou-se m-ou-se added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. I-heavy Issue: Problems and improvements with respect to binary size of generated code. I-slow Issue: Problems and improvements with respect to performance of generated code. labels Oct 25, 2020
@sfackler
Copy link
Member

Do we expect this kind of thing to actually happen right now? It seems like we should if anything just lint against it, and the workaround for the single-arg-panic string could be to do panic::panic_any("{}") instead of panic!("{}", "{}").

@programmerjake
Copy link
Member

Do we expect this kind of thing to actually happen right now? It seems like we should if anything just lint against it, and the workaround for the single-arg-panic string could be to do panic::panic_any("{}") instead of panic!("{}", "{}").

would that actually work in #![no_std]? I don't think there's a core::panic::panic_any...

@m-ou-se
Copy link
Member Author

m-ou-se commented Oct 25, 2020

Do we expect this kind of thing to actually happen right now?

If you're asking if anyone has time to implement this: Yes, I believe I can find the time for this some time soon.

the workaround for the single-arg-panic string could be to do panic::panic_any("{}") instead of panic!("{}", "{}").

panic_any will only exist in std, not core. The bloat problem is mostly a problem in no_std programs.

@sfackler
Copy link
Member

If you're asking if anyone has time to implement this

Sorry, I'm asking if we expect that something like println!("hello: {}", "world") actually exists in codebases in significant numbers.

panic_any will only exist in std, not core. The bloat problem is mostly a problem in no_std programs.

Suggesting core::panicking::panic would be the thing then I suppose.

@m-ou-se
Copy link
Member Author

m-ou-se commented Oct 25, 2020

I'm asking if we expect that something like println!("hello: {}", "world") actually exists in codebases in significant numbers.

Ah. I think it mostly happens as the result of macro expansion. E.g. error!("x") expanding to eprintln!("error: {}", "x"). Though, many macros use concat!() and panic(thing) directly. E.g. println!("Failure: {}", stringify!(expr)) is sometimes println!(concat!("Failure: ", stringify!(expr)) (which then breaks for expressions with braces). Or panic!("error: {}", format_args!(fmt, args...)) is sometimes written panic!(concat!("error: ", fmt), args...), etc.

Suggesting core::panicking::panic would be the thing then I suppose.

That one requires a 'static lifetime, so then the suggestion would be different for 'static strings and non-'static strings. Would be nice if it didn't require a special case. (It'd also require that function to become stable.)


Anyway, I opened this issue just as a reminder to investigate the possibility, not because I think we should definitely do this. There are good ways to avoid the need for this like you mention (core::panicking::panic), but it might be nice if no special cases were necessary while implementing macros like assert!(), and "{}", ".." was just as efficient as any other solution.

@workingjubilee
Copy link
Member

My observations have consistently suggested that people use format! or write! for seemingly "trivial" reasons All The Time because when you start with println! it's what feels like the most natural and easiest way. And then they do wrap it in other macros, yes, so it is best to try to take the effort to reduce the amount of overhead going on here.

@m-ou-se
Copy link
Member Author

m-ou-se commented Jun 16, 2021

Hey @rust-lang/wg-const-eval, I'd like to hear your thoughts about the feasibility of this feature.

Some examples of things it could do:

  1. format_args!("a {}", "b")format_args!("a b"): Implementing this does not need constant evaluation. The format_args macro itself can directly see the literal and inline it.
  2. format_args!("a {}", SOME_STR)format_args!("a b"): The macro can't see the contents or type of the constant (or even see that it is a constant), so this will need to happen at some other point. How could this work?
  3. format_args!("a {}", 123)format_args!("a 123"): The macro could do this, but will have to make assumptions about how integers are formatted. Can work for simple cases. More advanced cases would need constant evaluation of the Display trait, but I'm guessing that's still far away.
  4. format_args!("a {}", SOME_INT)format_args!("a 123"): Just like 2, requires constant evaluation after the macro is already expanded, and like 3 requires knowledge about how to format integers.
  5. format_args!("a {}", format_args!("b"))format_args!("a b"): The macro could do this directly if it can know that the format_args token actually refers to itself. (I think it can, since it can also expand concat!() etc.)
  6. format_args!("a {}", format_args!("b {}", c))format_args!("a b {}", c): Same as the previous one, but needs more complicated logic.

For const_panic specifically, 2 would be very useful to have panic!("{}", MSG) work (to replace panic!(MSG) which won't work in Rust 2021). But 4 could also be very useful to give more useful panic messages during constant evaluation.

Many of these I could implement by modifying how the format_args builtin macro works, but that won't work for the cases where it refers to constants instead of only using literals, which are probably the most useful ones for const_panic.

So my question is: Can that be implemented without making things extremely complicated? Recognizing and reversing+rebuilding the fmt::Arguments::new call that results from the format_args!() invocation when processing the mir sounds complicated to me. Is there an easier way? Could we add special metadata to make it easier to process at a later stage? Could we add a new expression type specifically for 'format args' such that it doesn't appear as a big complicated call expression to fmt::Arguments::new, but simply as a 'format args expression' that gets substituted a fmt::Arguments constructor call much later? Or any other ideas?

@oli-obk
Copy link
Contributor

oli-obk commented Jun 16, 2021

The problem isn't format_args at all. We can make new_v1 const fn today, even new_v1_formatted. The problem is panic_fmt which wants to format the Arguments thing properly. Since at this point the integers or strings to be formatted are only available via

pub struct ArgumentV1<'a> {
value: &'a Opaque,
formatter: fn(&Opaque, &mut Formatter<'_>) -> Result,
}
as part of
pub struct Arguments<'a> {
// Format string pieces to print.
pieces: &'a [&'static str],
// Placeholder specs, or `None` if all specs are default (as in "{}{}").
fmt: Option<&'a [rt::v1::Argument]>,
// Dynamic arguments for interpolation, to be interleaved with string
// pieces. (Every argument is preceded by a string piece.)
args: &'a [ArgumentV1<'a>],
}
, we can't really do anything at CTFE time, unless we start adding some more lang items. Most notably we'd need to make ::fmt a lang item, so we can check whether the pointer in ArgumentsV1 is a pointer to that function, and if it is, print that string to our compile-time-panic.

This scheme does not scale to anything but panicking, because if the formatting arg is not a string (or another potentially hardcoded type like integers), then we have to abort with an error. Since we are already panicking, we can just continue and print something like "{unprintable}".

@RalfJung
Copy link
Member

I'd rather not add support for integers -- if we go down that route we will end up with a (hacky) const-side reimplementation of a subset of the formatting machinery. As much as possible, we should use Rust code for this, not special hacks in the interpreter engine.

Given that the CTFE core engine already supports the entire display machinery (demonstrated by the fact that it works in Miri), we might alternatively poke a small hole into the const checks that allows executing formatting machinery from panic_fmt only -- something like "miri unleashed", but more targeted.

@m-ou-se
Copy link
Member Author

m-ou-se commented Jun 16, 2021

The problem isn't format_args at all. We can make new_v1 const fn today, even new_v1_formatted. The problem is panic_fmt which wants to format the Arguments thing properly.

const_panic is not the only reason to this. I want .as_str() to return a Some(..) in more cases, such that a macro like this can be just as efficient as when it somehow had used used concat!() whenever possible:

macro_rules! log_newline {
    ($($t:tt)*) => {
        $crate::log_fmt(format_args!("log: {}\n", format_args!($($t)*)))
    }
}

Even if constant evluation could properly handle this nested format_args, it would not result in log_newline!("literal") avoiding an allocation in situations like this:

#[inline]
pub fn log_fmt(x: fmt::Arguments) {
    if let Some(s) = x.as_str() {
        log_str(s); // No allocation needed. The whole str is already available. \o/
    } else {
        log_str(&x.to_string());
    }
}

Ideally, log_newline!("literal") should result in a fmt::Arguments with a &'static str "log: literal\n" returned from .as_str(). This requires not just being able to display the fmt::Arguments during compile time, but also substituting it with a (partially) evaluated one.

Also, log_newline!("a {}", a) ideally shouldn't have to pay the cost of the extra layer of format_args!() to add the prefix and newline, as shown in example 6 above. (Otherwise we need hacks with concat!() to avoid that, which result in bad edge cases.)

@m-ou-se
Copy link
Member Author

m-ou-se commented Jun 16, 2021

Expanding a bit on this idea I mentioned above:

Could we add special metadata to make it easier to process at a later stage? Could we add a new expression type specifically for 'format args' such that it doesn't appear as a big complicated call expression to fmt::Arguments::new, but simply as a 'format args expression' that gets substituted a fmt::Arguments constructor call much later?

If I understand correctly, getting the values of constants needs to happen at the mir level. So, this idea would mean, I think:

  • Adding a ast::ExprKind::FormatArgs
  • Making the format_args! builtin macro not expand to a ast::ExprKind::Call("fmt::Arguments::new", ..) anymore, but to a ast::ExprKind::FormatArgs instead.
  • Lowering that to a hir::ExprKind::FormatArgs transparently.
  • Lowering that to a mir::??? transparently.
  • Transform it by inlining constants and flattening it, etc.
  • Lowering(?)/transforming it into a call to fmt::Arguments::new.

Then const_panic should just have the FormatArgs available witout having to decode/evaluate core's fmt::Arguments, after inlining constants and flattening. And .as_str() can use every optimization/transformation that happened at the mir level.

(Also, --pretty=expanded would get less noisy when using format args, because the macro would no longer expand directly to the full fmt::Arguments construction.)

I guess this is in some ways similar to what happens for asm!(..).

@RalfJung
Copy link
Member

I am not sure I like the idea of making formatting a language primitive based on MIR primitives.

I guess this is in some ways similar to what happens for asm!(..).

Indeed, and I view that as a rather strong negative.^^ asm! really needs to be a language primitive, formatting really shouldn't.

@oli-obk
Copy link
Contributor

oli-obk commented Jun 16, 2021

I think the better comparison here are generators. We could expand generators on the HIR, it would just be a mess, so we do it on MIR where this is much more manageable. Formatting machinery stuff could be similar, but in any case, we could start with a minimal version that works as a HIR lowering similar to how for loops don't exist in HIR anymore.

@AaronKutch
Copy link
Contributor

Would implementing this fix #87466? It would be great if this were done before const_panic and/or edition 2021 is stabilized.

@jhpratt
Copy link
Member

jhpratt commented Jul 26, 2021

@AaronKutch in the issue description

That could also make things like panic!("{}", "message"); work with const_panic, removing the need for panic!(CONST_STR).

This doesn't really have anything to do with the edition, though. const_panic is still unstable for a reason.

@clarfonthey
Copy link
Contributor

clarfonthey commented Sep 6, 2021

Slowly making my way through some of the tickets for const_panic while investigating stuff. I'm a bit confused why this is so controversial. Surely, at least inlining string literals wouldn't be adding so much code we could consider it identical to reimplementing all of fmt as MIR optimisations.

I could definitely see making format_args!("hello {}", s).as_str().unwrap() a possible const operation as something useful.

@fee1-dead
Copy link
Member

fee1-dead commented Sep 6, 2021

I can come up with a few ways that can make it possible to make format_args! callable from const contexts.

The easiest ways will first make ArgumentV1::new a const function, then we can either

a) in const_check, somehow retrieve the trait from the second argument (FormatTrait::fmt function pointer), and validate the type parameter T implements const FormatTrait.

b) instead of using function pointers, use something else. F: Fn or some type that we define. The new function ensures that we can call the fmt function at compile time. (we could also just split it into different methods e.g. new_display, new_debug etc.) This might affect the performance of runtime though.

@clarfonthey
Copy link
Contributor

format_args! is already callable in const contexts. The issue is actually doing anything with the result.

@fee1-dead
Copy link
Member

You can't call format_args!. Not even when you only have the format string. (format_args!("a") does not work)

@m-ou-se
Copy link
Member Author

m-ou-se commented Sep 7, 2021

format_args!("a") does not work

That's on purpose, to avoid making that part of the stability guarantees for now. We have an unstable/internal const_format_args!("a") that does work, which is used by panic!("a").

@m-ou-se
Copy link
Member Author

m-ou-se commented Jan 13, 2023

some time Soon™.

2½ years later, I finally got around to implementing it: #106824

@m-ou-se
Copy link
Member Author

m-ou-se commented Jan 13, 2023

Opened #106823 to change the documentation of fmt::Arguments::as_str() to allow for inlined/flattened format arguments.

@zhassan-aws
Copy link

Would inlining literal arguments allow format_args to be used in const contexts: #108595?

@m-ou-se
Copy link
Member Author

m-ou-se commented Mar 6, 2023

The inlining logic could play some role if we decide to allow format_args in const, but whether to allow format_args in const is an entirely separate discussion and decision.

bors added a commit to rust-lang-ci/rust that referenced this issue Mar 16, 2023
Flatten/inline format_args!() and (string and int) literal arguments into format_args!()

Implements rust-lang#78356

Gated behind `-Zflatten-format-args=yes`.

Part of rust-lang#99012

This change inlines string literals, integer literals and nested format_args!() into format_args!() during ast lowering, making all of the following pairs result in equivalent hir:

```rust
println!("Hello, {}!", "World");
println!("Hello, World!");
```

```rust
println!("[info] {}", format_args!("error"));
println!("[info] error");
```

```rust
println!("[{}] {}", status, format_args!("error: {}", msg));
println!("[{}] error: {}", status, msg);
```

```rust
println!("{} + {} = {}", 1, 2, 1 + 2);
println!("1 + 2 = {}", 1 + 2);
```

And so on.

This is useful for macros. E.g. a `log::info!()` macro could just pass the tokens from the user directly into a `format_args!()` that gets efficiently flattened/inlined into a `format_args!("info: {}")`.

It also means that `dbg!(x)` will have its file, line, and expression name inlined:

```rust
eprintln!("[{}:{}] {} = {:#?}", file!(), line!(), stringify!(x), x); // before
eprintln!("[example.rs:1] x = {:#?}", x); // after
```

Which can be nice in some cases, but also means a lot more unique static strings than before if dbg!() is used a lot.
@m-ou-se
Copy link
Member Author

m-ou-se commented Mar 19, 2023

This is now available on nightly with -Zflatten-format-args=yes.

@m-ou-se
Copy link
Member Author

m-ou-se commented Apr 24, 2023

This is now implemented and enabled by default in Rust 1.71.0: #109999

@m-ou-se m-ou-se closed this as completed Apr 24, 2023
thomcc pushed a commit to tcdi/postgrestd that referenced this issue Jun 1, 2023
Flatten/inline format_args!() and (string and int) literal arguments into format_args!()

Implements rust-lang/rust#78356

Gated behind `-Zflatten-format-args=yes`.

Part of #99012

This change inlines string literals, integer literals and nested format_args!() into format_args!() during ast lowering, making all of the following pairs result in equivalent hir:

```rust
println!("Hello, {}!", "World");
println!("Hello, World!");
```

```rust
println!("[info] {}", format_args!("error"));
println!("[info] error");
```

```rust
println!("[{}] {}", status, format_args!("error: {}", msg));
println!("[{}] error: {}", status, msg);
```

```rust
println!("{} + {} = {}", 1, 2, 1 + 2);
println!("1 + 2 = {}", 1 + 2);
```

And so on.

This is useful for macros. E.g. a `log::info!()` macro could just pass the tokens from the user directly into a `format_args!()` that gets efficiently flattened/inlined into a `format_args!("info: {}")`.

It also means that `dbg!(x)` will have its file, line, and expression name inlined:

```rust
eprintln!("[{}:{}] {} = {:#?}", file!(), line!(), stringify!(x), x); // before
eprintln!("[example.rs:1] x = {:#?}", x); // after
```

Which can be nice in some cases, but also means a lot more unique static strings than before if dbg!() is used a lot.
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this issue Apr 16, 2024
… r=lcnr

Outline default query and hook provider function implementations

The default query and hook provider functions call `bug!` with a decently long message.
Due to argument inlining in `format_args!` ([`flatten_format_args`](rust-lang#78356)), this ends up duplicating the message for each query, adding ~90KB to `librustc_driver.so` of unreachable panic messages.
To avoid this, we can outline the common `bug!` logic.
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Apr 16, 2024
Rollup merge of rust-lang#124016 - DaniPopes:dedup-default-providers, r=lcnr

Outline default query and hook provider function implementations

The default query and hook provider functions call `bug!` with a decently long message.
Due to argument inlining in `format_args!` ([`flatten_format_args`](rust-lang#78356)), this ends up duplicating the message for each query, adding ~90KB to `librustc_driver.so` of unreachable panic messages.
To avoid this, we can outline the common `bug!` logic.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-fmt Area: std::fmt C-enhancement Category: An issue proposing an enhancement or a PR with one. I-heavy Issue: Problems and improvements with respect to binary size of generated code. I-slow Issue: Problems and improvements with respect to performance of generated code. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests