Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: patchable-function-entry #3543

Merged
merged 13 commits into from
Mar 27, 2024

Conversation

maurer
Copy link
Contributor

@maurer maurer commented Dec 12, 2023

This RFC proposes to add support for the patchable-function-entry family of flags and attributes from clang and gcc in order to support instrumentation and patching as done in the Linux kernel.

I have a candidate implementation and a MCP to cover just the compiler flag.

Rendered

Support adding a user-specified number of nops before functions
and before the prelude.
@maurer maurer force-pushed the patchable-function-entry branch from 3255007 to 83aa13b Compare December 12, 2023 22:28
@ehuss ehuss added T-lang Relevant to the language team, which will review and decide on the RFC. T-compiler Relevant to the compiler team, which will review and decide on the RFC. labels Dec 13, 2023
@WaffleLapkin
Copy link
Member

How much do you want to allow users to depend on this? I.e. is this a guarantee, or just a hint?

Asking because there might potentially be backends that can't/don't support this and it would be useful to document if they should ignore this, or error.

@maurer
Copy link
Contributor Author

maurer commented Dec 13, 2023

How much do you want to allow users to depend on this? I.e. is this a guarantee, or just a hint?

Asking because there might potentially be backends that can't/don't support this and it would be useful to document if they should ignore this, or error.

I think that probably the behavior should be to error if the backend doesn't support it and a non-zero padding has been specified. This would allow users to set #[patchable_function_entry()] or #[patchable_function_entry(prefix(0), entry(0))] on a function to indicate it shouldn't have extra nop padding when built with a -C patchable-function-entry, but still have it compile when -C patchable-function-entry is not passed but using a backend that doesn't support it without needing to cfg throughout the code.

Thanks for the suggestion; I've added it to the design.

@maurer maurer force-pushed the patchable-function-entry branch from 18fa6f0 to 0b0e428 Compare December 13, 2023 17:22
@zachs18
Copy link
Contributor

zachs18 commented Dec 13, 2023

Seems useful!

One thing I'd mention is that seeing #[patchable_function_entry()] on a function at first glance seems to me like it should be enabling "patchability", but since the defaults are prefix(0), entry(0), #[patchable_function_entry()] without either specified actually either does nothing, or disables "patchability" (depending if the compiler flag is in use) . Thus I might suggest that the attribute should require at least one argument to be passed; users can still explicitly specify #[patchable_function_entry(prefix(0), entry(0))] as you mentioned.

@joshtriplett
Copy link
Member

joshtriplett commented Dec 17, 2023

First of all, 👍 for adding this functionality.

Having both the command-line option and the attribute take two numeric parameters, but with different meanings, seems error-prone. I do appreciate that projects might not want to have to compute different numbers in their build systems for rustc, though.

Given that, I do think we should use some syntax that requires the codegen options to specify which parameters they're defining, as suggested in your alternatives section (but without allowing any bare numeric versions), allowing prefix, entry, and total, and requiring that the user specify at most two of those (or just one of prefix and entry if they want the other to be zero). I also think you should avoid having the word "entry" duplicated.

As for the attribute, bikeshedding slightly, I had the same concern as @zachs18 that the attribute with no parameters feels like it should mean patchable rather than unpatchable. I also think it seems awkward to duplicate the word "entry" in patchable_function_entry(entry(...)). And also, based on precedent in other attributes, I think these should be key-value pairs with = rather than using inner parens. Given all that, I think we should spell this:

  • #[patchable(prefix = N, entry = M)], where at least one of prefix or entry is required
  • #[unpatchable] as an alias for #[patchable(prefix = 0, entry = 0)]. (Both for simplicity and because I think projects will want an abbreviated version badly enough that they'll end up defining it as a macro if we don't offer one.)

@joshtriplett joshtriplett added the I-lang-nominated Indicates that an issue has been nominated for prioritizing at the next lang team meeting. label Dec 17, 2023
@tmandry tmandry added T-compiler Relevant to the compiler team, which will review and decide on the RFC. and removed T-compiler Relevant to the compiler team, which will review and decide on the RFC. labels Dec 20, 2023
@tmandry
Copy link
Member

tmandry commented Dec 20, 2023

@rfcbot merge

This seems like a reasonable use case to support, so I think we should do it.

I agree with @joshtriplett that the differing meanings between the CLI flag and the attribute will create confusion. We should do something to mitigate that.

@rfcbot concern difference between flag and attribute count meanings is confusing

@rfcbot
Copy link
Collaborator

rfcbot commented Dec 20, 2023

Team member @tmandry has proposed to merge this. The next step is review by the rest of the tagged team members:

Concerns:

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

cc @rust-lang/lang-advisors: FCP proposed for lang, please feel free to register concerns.
See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. labels Dec 20, 2023
// Code goes here
```

Nop padding may not be supported on all architectures. As of the time of writing, support includes:
Copy link
Member

@nagisa nagisa Dec 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Until this point I was under an impression that this is a pretty trivial feature and that there shouldn’t be any major reasons why it wouldn't be supported for an architecture. Is this entirely an engineering resource issue on the backend side?

The RFC should specify how we’ll handle unsupported platforms (do we need to make it stable on a per-architecture basis? Similar to #[target_feature(enable=...)] flag?) I guess we also need to account for backends like cranelift which might not support the patchable entry feature at all...

EDIT: ah, I see you mentioned errors below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that it's an engineering issue for most architectures/backends. For example, I don't think wasm can meaningfully do this.

The listed supported architectures are based on what LLVM currently supports.

- `-C patchable-function-entry=nop_count=m,offset=n` (`nop_count=m`, `offset=n`, modern format, offset optional)
- `-C patchable-function-entry=prefix=m,entry=n` (`prefix=m`, `entry=n`, modern format, either optional)

This would have the benefit of making it more clear what's being specified and allowing users to employ the simpler format on the command line if not integrating with an existing build.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another benefit is that it would allow a potential design choice of -Cpatchable-function-entry=nop_count=n -Cpatchable-function-entry=offset=m or e.g. overriding only a part of the specification for individual rustc invocations: env RUSTFLAGS=-Cpatchable-function-entry=nop_count=n,offset=m rustc $RUSTFLAGS ... -Cpatchable-function-entry=nop_count=x

Copy link
Contributor Author

@maurer maurer Dec 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The counterpoint here is that -Cpatchable-function-entry=offset=m for m nonzero is illegal without a corresponding nop_count, because the default nop count is 0, so the default offset needs to be smaller.

If we support both styles, then specifying just one could be nice, since prefix and entry can be meaningfully specified separately.


To set this for all functions in a crate, use `-C patchable-function-entry=nop_count,offset` where `nop_count = prefix + entry`, and `offset = prefix`. Usually, you'll want to copy this value from a corresponding `-fpatchable-function-entry=` being passed to the C compiler in your project.

To set this for a specific function, use `#[patchable_function_entry(prefix(m), entry(n))]` to pad with m nops before the symbol and n after the symbol, but before the prelude. This will override the flag value.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is bikeshedding, but is there a reason to have these as a MetaList instead of a NameValue? To me, prefix = n seems more intuitive than a function call-type syntax.

@traviscross
Copy link
Contributor

@rustbot labels -I-lang-nominated

We discussed this in the T-lang triage meeting on 2023-12-20 and that resulted in the proposed FCP merge and also the concern that divergent meanings between the CLI flag and the attribute may create confusion.

Please renominate if there are questions or progress on addressing that concern that should be discussed by T-lang.

@rustbot rustbot removed the I-lang-nominated Indicates that an issue has been nominated for prioritizing at the next lang team meeting. label Dec 21, 2023
@Noratrieb
Copy link
Member

How does this interact with inlining? Isn't this entirely nonsensical if a function gets inlined? I'd a user expected to use #[inline(never)] strategically in the places they care about? Or what's the story here?

maurer added 2 commits January 2, 2024 23:01
This should begin to address concerns about confusions between different
flag vs attribute. Specifically:

* Require command line flag to specify parameter names
* Switch attribute name to `patchable`
* Use metalist format for `patchable`, and require at least one
  parameter
* Add `unpatchable` alias
@tmandry
Copy link
Member

tmandry commented Jan 3, 2024

Since prefix and offset are the same, I think we should pick one term and use it in both places. Otherwise I would have to consult the docs every time I wanted to make sure I remembered the relationship between the two ways of specifying patchable entry.

I do agree that #[unpatchable] (or a similar spelling) is better than the normal attribute with no arguments to mean the same thing, though I would also be comfortable with a version that kept one attribute but required specifying explicit zeroes.

I'm not sure how I feel about #[patchable], and wish I'd read @joshtriplett's comment more closely the first time to respond to it. It seems like we are over-rotating somewhat on the ergonomics of a niche feature, and I would prefer to align the naming of the flag and attribute more closely.

One way of framing this is: "What do we call this feature?" "Patchable" seems pretty vague and high-level. It doesn't provide many "memory hooks" as to what it actually does. "Patchable function entry" is much better in that regard. "Patchable entry" is almost as good, given the context of being applied to a function.

On that note, I think the name nop_count provides a lot more of a memory hook than prefix and entry. I propose that something like prefix_nops and entry_nops would be better. In fact, I think those would be so clear that it would completely remove my concern with #[patchable].

@maurer
Copy link
Contributor Author

maurer commented Jan 4, 2024

Before I update the RFC again, let me put some responses inline to check things.

Since prefix and offset are the same, I think we should pick one term and use it in both places. Otherwise I would have to consult the docs every time I wanted to make sure I remembered the relationship between the two ways of specifying patchable entry.

The logic here at the moment is that it is referred to as the offset in C/C++ toolchains, but offset would make no sense in the prefix/entry world where there is no nop_count.

I do agree that #[unpatchable] (or a similar spelling) is better than the normal attribute with no arguments to mean the same thing, though I would also be comfortable with a version that kept one attribute but required specifying explicit zeroes.

I'm not sure how I feel about #[patchable], and wish I'd read @joshtriplett's comment more closely the first time to respond to it. It seems like we are over-rotating somewhat on the ergonomics of a niche feature, and I would prefer to align the naming of the flag and attribute more closely.

I'm fine naming this either way. I don't think it will be written commonly, so I don't have strong feelings about an ergonomic name.

One way of framing this is: "What do we call this feature?" "Patchable" seems pretty vague and high-level. It doesn't provide many "memory hooks" as to what it actually does. "Patchable function entry" is much better in that regard. "Patchable entry" is almost as good, given the context of being applied to a function.

I called it patchable_function_entry to name it the identical thing as the C/C++ feature. The goal is that when someone searches "fpatchable-function-entry rust" they'll get documentation explaining how to make Rust match up with their existing C/C++ instrumentation.

On that note, I think the name nop_count provides a lot more of a memory hook than prefix and entry. I propose that something like prefix_nops and entry_nops would be better. In fact, I think those would be so clear that it would completely remove my concern with #[patchable].
Again, fine on bikeshedding names to just about anything. My expectation is that the primary workflow for a user of this feature is going to be:

  1. My C/C++ code uses -fpatchable-function-entry, and the Rust functions aren't getting built right.
  2. Ah, the flag in Rust is called -C patchable-function-entry=floozle=m,glorble=n
  3. Oh, I want to put a Rust function into my instrumentation, how do I annotate it? Oh, #[unpatchable].

I think that even for users for whom this feature is extremely important, they will write each of these keywords somewhere between 1 and 3 times.

@tmandry
Copy link
Member

tmandry commented Jan 4, 2024

I think you're doing a good job of weighing the viewpoint of the user who writes the annotation (or flag), @maurer. I'm considering things more from the perspective of a reader of the code. The code will be read many more times than it is written.

Anyway, to that end, I'll make a concrete proposal from the current state of the RFC:

  • Rename prefix and entry in the attribute to prefix_nops and entry_nops
  • Rename offset in the CLI flag to prefix_nops

The reason being that these are more self-explanatory names. We lose the property that offset is the exact name used in C/C++, but the fact that nop_count is already the same plus the self-explanatory (IMO) nature of prefix_nops means that it should hopefully be easy to map.

@nikomatsakis
Copy link
Contributor

@rfcbot reviewed

I am generally aligned with @tmandry here, and I concur with @wesleywiser that we should try not to make monomorphization behavior hard-coded. I'm signalling as reviewed because I trust the thread to reach reasonable conclusions.

My general preferences are:

  • more explicit (and fewer) names = good for niche features
  • align with gcc/clang = good, people should be able to copy/paste, at least for the values if not necessarily the names of the fields

I'm of mixed minds about #[unpatchable], it seems mildly better to have a single attribute with explicit zeros or something like that, but I'm trying to put my finger on why. I guess it seems easier to document, grep for, etc. (Another option would be #[patchable_function_entry(no)] or some similar opt-out.)

@nikomatsakis
Copy link
Contributor

Re-reading, I see @joshtriplett wrote...

(Both for simplicity and because I think projects will want an abbreviated version badly enough that they'll end up defining it as a macro if we don't offer one.)

That makes sense to me, it'd be useful to know how often "unpatchable" functions arise in practice in existing codebases. I had imagined this being a fairly rare thing.

@maurer
Copy link
Contributor Author

maurer commented Jan 4, 2024

Re-reading, I see @joshtriplett wrote...

(Both for simplicity and because I think projects will want an abbreviated version badly enough that they'll end up defining it as a macro if we don't offer one.)

That makes sense to me, it'd be useful to know how often "unpatchable" functions arise in practice in existing codebases. I had imagined this being a fairly rare thing.

To give you an idea of frequency, the way to do this in Linux in C is spelled notrace:

#define notrace			__attribute__((patchable_function_entry(0, 0)))

notrace occurs 416 times in the latest Linux source (with the tools/ directory filtered out, which references it a number of times as a field name, command name, or in documentation rather than referencing the macro, and spaces added to avoid situations where notrace is included in the name of another symbol - actual query rg --stats ' notrace ' -g '!tools' -q)

We could definitely use a more specific name than unpatchable to avoid polluting the global namespace, but I do think Josh is right w/regards to projects defining one of these for themselves if we don't provide one - Linux already does it in C after all!

That said, I don't think it would be a show-stopper to only provide #[patchable_function_entry()] and require someone to specify both zeros, as users could create a macro if it deeply bothered them, and it's still not that common.

@tmandry
Copy link
Member

tmandry commented Jan 5, 2024

I too am generally a fan of fewer names, but there are realistic downsides to doing this with a macro today. You would have to use an attribute proc macro, which is annoying to implement and makes IDE integration worse, or wrap your function definitions inside of a macro_rules! macro, which is annoying to implement and use and makes IDE integration worse.

@maurer
Copy link
Contributor Author

maurer commented Jan 5, 2024

Plan for next draft edits, in case someone thinks something's missing or wants to object:

  1. Define behavior if patchability is specified differently for multiple crates to be
    a. Functions may end up with the patchability configuration specified by any crate in the compilation graph.
    b. The compiler may produce an error if it detects multiple patchability configurations in the crate graph.
  2. Explicitly call out that for some platforms (PPC), only some values of entry/prefix may be legal, and the compiler may error.
  3. entry renamed entry_nops, prefix renamed prefix_nops, nop_count renamed to total_nops, offset renamed to prefix_nops
  4. Docs note explaining the equivalence of prefix_nops and total_nops to offset and nop_count in gcc/clang
  5. patchable renamed back to patchable_function_entry
  6. unpatchable moved to future work
  7. Docs updated to suggest #[patchable_function_entry(entry_nops = 0, prefix_nops = 0)] to disable patchability

Explicitly allow either an error or code generation at any patchability
configuration in the crate graph. This allows all of:
* Disallowing this (detect it and emit an error)
* Doing what will happen by default today (functions will have the
  patchability of their codegen site)
* Adding additional support down the line to use the patchability of the
  declaring crate.

This is unlikely to be needed, but this leaves our options open for the
future.
Some existing codegens (notably ELFv2+GCC+PPC) can only do some values
of `prefix`/`entry`. Call out that it is possible for some values to
cause errors, even if the target/backend combination supports
patchability.
@maurer
Copy link
Contributor Author

maurer commented Jan 23, 2024

I've made the edits I said I'd make in the previous comment.

Notably:

  • Applied the latest (hopefully final?) round of name bikeshedding.
  • Punted #[unpatchable] to future work if users of this feature turn out to need it.
  • Punted compiler behavior around multiple defaults in the crate graph to future work. The compiler is now allowed to throw an error or use any default specified in the graph for any function, which encompasses every behavior we considered being the right choice.

@tmandry
Copy link
Member

tmandry commented Jan 31, 2024

I'm happy with those changes. Thanks @maurer!

@rfcbot resolve difference between flag and attribute count meanings is confusing

@Darksonn
Copy link
Contributor

What is needed to move this forward?

@rfcbot rfcbot added final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. and removed proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. labels Mar 2, 2024
@rfcbot
Copy link
Collaborator

rfcbot commented Mar 2, 2024

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this RFC. and removed final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. labels Mar 12, 2024
@rfcbot
Copy link
Collaborator

rfcbot commented Mar 12, 2024

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

This will be merged soon.

Under "future possibilities", a sentence was missing the word
"crate".  Let's fix that.
RFC 3543 has been accepted.  We've created the tracking issue.  Let's
add that to the RFC.
@traviscross traviscross merged commit c39fdca into rust-lang:master Mar 27, 2024
@traviscross
Copy link
Contributor

This RFC has been accepted by the teams and has now been merged. To track further progress on this, follow the tracking issue:

Thanks to @maurer for authoring this RFC and for pushing it forward and to all those who reviewed it and provided useful feedback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this RFC. T-compiler Relevant to the compiler team, which will review and decide on the RFC. T-lang Relevant to the language team, which will review and decide on the RFC. to-announce
Projects
None yet
Development

Successfully merging this pull request may close these issues.