Tracking Issue for RFC #2972: Constrained Naked Functions #90957

Open
4 of 6 tasks
nikomatsakis opened this issue Nov 16, 2021 · 94 comments
Assignees
Labels
A-naked Area: `#[naked]`, prologue and epilogue-free, functions, https://git.io/vAzzS B-RFC-approved Blocker: Approved by a merged RFC but not yet implemented. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. F-naked_functions `#![feature(naked_functions)]` finished-final-comment-period The final comment period is finished for this PR / Issue. S-tracking-ready-to-stabilize Status: This is ready to stabilize; it may need a stabilization report and a PR T-lang Relevant to the language team, which will review and decide on the PR/issue.

Comments

@nikomatsakis
Contributor

nikomatsakis commented Nov 16, 2021

This is a tracking issue for the RFC "Constrained Naked Functions" (rust-lang/rfcs#2972).
The feature gate for the issue is #![feature(naked_functions)].

About tracking issues

Tracking issues are used to record the overall progress of implementation.
They are also used as hubs connecting to other relevant issues, e.g., bugs or open design questions.
A tracking issue is however not meant for large scale discussion, questions, or bug reports about a feature.
Instead, open a dedicated issue for the specific matter and add the relevant feature gate label.

Steps

Unresolved Questions

None.

Implementation history

@nikomatsakis nikomatsakis added B-RFC-approved Blocker: Approved by a merged RFC but not yet implemented. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC T-lang Relevant to the language team, which will review and decide on the PR/issue. labels Nov 16, 2021
@nikomatsakis nikomatsakis added the F-naked_functions `#![feature(naked_functions)]` label Nov 16, 2021
@safinaskar
Contributor

"produce a clear warning if any of the above suggestions are not heeded" in RFC seems to contain typo

@bstrie bstrie self-assigned this Jan 25, 2022
@bstrie
Contributor

bstrie commented Jan 25, 2022

This is implemented, checking the box.

Adding a new step: "Confirm that all errors and warnings are emitted properly".

@bstrie
Contributor

bstrie commented Feb 2, 2022

Request for Stabilization

A proposal that the naked_functions feature be stabilized for Rust 1.60, to be released 2022-04-07. A stabilization PR can be found at #93587.

Summary

Adds a new attribute, #[naked], which may be applied to functions. When applied to a function, the compiler will not emit a function prologue when generating code for this function. This attribute is analogous to __attribute__((naked)) in C. The use of this feature allows the programmer to have precise control over the assembly that is generated for a given function.

The body of a naked function must consist of a single asm! invocation. This asm! invocation is heavily restricted: the only legal operands are const and sym, and the only legal options are noreturn (which is mandatory) and att_syntax. In lieu of specifying operands, the asm! within a naked function relies on the specified extern calling convention in order to determine the validity of registers.

An example of a naked function:

const THREE: usize = 3;

#[naked]
/// Adds three to a number and returns the result.
pub extern "sysv64" fn add_n(number: usize) -> usize {
    // SAFETY: the validity of these registers is guaranteed according to the "sysv64" ABI
    unsafe {
        std::arch::asm!(
            "add rdi, {}",
            "mov rax, rdi",
            "ret",
            const THREE,
            options(noreturn)
        );
    }
}

Documentation

The Rust Reference: rust-lang/reference#1153

Tests

  • codegen/naked-functions.rs: tests for the absence of a function prologue.
  • codegen/naked-noinline.rs: tests for the presence of naked and noinline LLVM attributes.
  • ui/asm/naked-functions.rs: tests that naked functions cannot accept patterns as function parameters, that referencing their function parameters is prohibited, that their body must consist of a single asm! block, that they must specify the noreturn option, that they must not specify any other option except att_syntax, that they only support const and sym operands, that they warn when used with extern "Rust", and that they are incompatible with the inline attribute.
  • ui/asm/naked-functions-ffi.rs: tests that a warning occurs when the function signature contains types that are not FFI-safe.
  • ui/asm/naked-functions-unused.rs: tests that the unused_variables lint is suppressed within naked functions.
  • ui/asm/naked-invalid-attr.rs: tests that the naked attribute is only valid on function definitions.

History

This feature was originally proposed in RFC 1201, filed on 2015-07-10 and accepted on 2016-03-21. Support for this feature was added in #32410, landing on 2016-03-23. Development languished for several years as it was realized that the semantics given in RFC 1201 were insufficiently specific. To address this, a minimal subset of naked functions was specified by RFC 2972, filed on 2020-08-07 and accepted on 2021-11-16. Prior to the acceptance of RFC 2972, all of the stricter behavior specified by RFC 2972 was implemented as a series of warn-by-default lints that would trigger on existing uses of the naked attribute; these lints became hard errors in #93153 on 2022-01-22. As a result, today RFC 2972 has completely superseded RFC 1201 in describing the semantics of the naked attribute.

Unresolved Questions

  • In C, __attribute__((naked)) prevents the compiler from generating both a function prologue and a function epilogue. However in Rust it was discovered that under some circumstances LLVM will choose to emit instructions following the body of a naked function, which is seemingly non-trivial to prevent. Since most of the utility of naked functions comes from preventing the prologue rather than the epilogue, for the time being this feature has restricted itself to only guaranteeing the absence of the prologue. Instead, the reference will specify that users must not rely on the presence of code following the body of a naked function, and that a future version of Rust reserves the right to guarantee the absence of such code.
  • Prior to stabilization, concerns were raised that the current implementation might not be capable of guaranteeing that a naked function is never inlined in every circumstance. Obviously the correctness of naked functions relies on Rust setting up the callstack as if a function call were being performed, even in cases where the function is ultimately inlined. However, there may be observable differences in behavior based on whether or not a naked function is inlined, e.g. due to exported symbols or named labels. Thus the reference will specify that implementations are not currently required to prevent the inlining of naked functions, but that a future version of Rust reserves the right to guarantee that naked functions are not inlined, and that users must not rely on any observable differences that may arise due to the inlining of a naked function.

bstrie added a commit to bstrie/rust that referenced this issue Feb 2, 2022
This stabilizes the feature described in RFC 2972,
which supersedes the earlier RFC 1201.

Closes rust-lang#32408
Closes rust-lang#90957
@bjorn3
Member

bjorn3 commented Feb 2, 2022

Since most of the utility of naked functions comes from preventing the prologue rather than the epilogue, for the time being this feature has restricted itself to only guaranteeing the absence of the prologue.

There is no observable difference between LLVM adding an epilogue and a following unnamed function starting with instructions identical to a regular epilogue.

@bstrie
Contributor

bstrie commented Feb 2, 2022

@bjorn3 I believe it can cause problems in circumstances such as #32408 (comment) . If someone were to work around that issue by assuming the existence of extra instructions and manually accounting for it, then their code would be broken if Rust ever stopped generating those instructions. It is this sort of edge case that this defensive wording is designed to address.

@roblabla
Contributor

roblabla commented Feb 2, 2022

Since most of the utility of naked functions comes from preventing the prologue rather than the epilogue, for the time being this feature has restricted itself to only guaranteeing the absence of the prologue.

There is no observable difference between LLVM adding an epilogue and a following unnamed function starting with instructions identical to a regular epilogue.

I believe it's possible to observe a difference when naked is coupled with #[link_section]. You could apply that attribute to a naked function to put it in a well-known section that you expect to contain only your function.

@joshtriplett
Member

👍 for stabilizing, though I'd like to make sure that there's consensus on #32408 (comment) .

@joshtriplett
Member

Shall we stabilize constrained naked functions?

@rfcbot merge

@rfcbot

rfcbot commented Feb 2, 2022

Team member @joshtriplett has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Feb 2, 2022
@Amanieu
Member

Amanieu commented Feb 2, 2022

I'd like to have a whitelist of attributes that are allowed to be applied to a naked function. In the future we may want to lower naked functions directly to LLVM module-level assembly instead of using LLVM's native naked function support. For that to happen rustc needs to be able to manually translate any attributes (e.g. #[link_section]) to the corresponding asm directive.

@comex
Contributor

comex commented Feb 3, 2022

In the future we may want to lower naked functions directly to LLVM module-level assembly instead of using LLVM's native naked function support.

Why, because of the epilogue issue? Personally I would much rather see that fixed on LLVM’s end, rather than going to great lengths to work around it in rustc. Or is there another reason?

@bjorn3
Member

bjorn3 commented Feb 3, 2022

Lowering to global_asm! would allow all codegen backends to share the same naked function handling. Also codegen backends without native inline assembly support (like cg_clif) will have to lower to global_asm! even if no code is shared between codegen backends, so it will need this same restriction.

@bstrie
Contributor

bstrie commented Feb 3, 2022

@comex In addition to the epilogue issue, it seems the inlining issue could also be resolved by lowering to global_asm!.

@Amanieu I'd be interested in seeing such a list. Obviously part of the appeal of naked functions is that they can be treated just like a normal function in most cases, which includes the ability to be marked with attributes (e.g. #[doc]), so I'm curious how restrictive this list would be. It occurs to me that, in addition to forbidding #[inline], naked functions also already forbid #[track_caller]; I should probably mention this in the reference.

@roblabla
Contributor

roblabla commented Feb 3, 2022

Most attributes that work with arbitrary functions should work with naked functions IMO. Taking the list here, those make sense to use with naked fns:

  • Conditional compilation: cfg, cfg_attr
  • Diagnostics: allow and company, deprecated
  • ABI...: link_section, export_name, no_mangle, used
  • Documentation: doc

The only ones that I think make sense to disallow are those in the "Code Generation" category: inline, cold, track_caller.

target_feature is an interesting one. I'm not sure what it does, it may make sense to allow it in naked fns.
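The classification proposed in this comment can be sketched as a small lookup. This is only a model of the list above, assuming a hypothetical `classify` helper and `Verdict` enum; it is not rustc's actual allow list, and the set of lint-level attributes standing in for "allow and company" is an assumption.

```rust
/// Verdict for an attribute on a naked function, following the
/// categories proposed in the comment above (hypothetical model).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Verdict {
    Allowed,
    Rejected,
    /// The comment leaves this one open (e.g. `target_feature`).
    Undecided,
}

/// Classify an attribute by name. `classify` and `Verdict` are
/// illustrative names, not part of rustc.
pub fn classify(attr: &str) -> Verdict {
    match attr {
        // Conditional compilation
        "cfg" | "cfg_attr" => Verdict::Allowed,
        // Diagnostics: `allow` and company (assumed set), `deprecated`
        "allow" | "warn" | "deny" | "forbid" | "deprecated" => Verdict::Allowed,
        // ABI, linking, symbols
        "link_section" | "export_name" | "no_mangle" | "used" => Verdict::Allowed,
        // Documentation
        "doc" => Verdict::Allowed,
        // Code generation attributes the comment suggests disallowing
        "inline" | "cold" | "track_caller" => Verdict::Rejected,
        // Everything else, including `target_feature`, is left open here.
        _ => Verdict::Undecided,
    }
}
```

Under this model `classify("link_section")` is `Allowed`, `classify("inline")` is `Rejected`, and `classify("target_feature")` stays `Undecided`, mirroring the open question above.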

github-actions bot pushed a commit to rust-lang/miri that referenced this issue Dec 12, 2024
codegen `#[naked]` functions using global asm

tracking issue: rust-lang/rust#90957

Fixes #124375

This implements the approach suggested in the tracking issue: use the existing global assembly infrastructure to emit the body of `#[naked]` functions. The main advantage is that we now have full control over what gets generated, and are no longer dependent on LLVM not sneakily messing with our output (inlining, adding extra instructions, etc).

I discussed this approach with `@Amanieu` and while I think the general direction is correct, there is probably a bunch of stuff that needs to change or move around here. I'll leave some inline comments on things that I'm not sure about.

Combined with rust-lang/rust#127853, if both are accepted, I think that resolves all steps from the tracking issue.

r? `@Amanieu`
@folkertdev
Contributor

folkertdev commented Dec 12, 2024

Request for Stabilization

Two years later, we're ready to try this again. Even though this issue is already marked as having passed FCP, given the amount of time that has passed and the changes in implementation strategy, we should follow the process again.

Summary

The naked_functions feature has two main parts: the #[naked] function attribute, and the naked_asm! macro.

An example of a naked function:

const THREE: usize = 3;

#[naked]
pub extern "sysv64" fn add_n(number: usize) -> usize {
    // SAFETY: the validity of the used registers 
    // is guaranteed according to the "sysv64" ABI
    unsafe {
        core::arch::naked_asm!(
            "add rdi, {}",
            "mov rax, rdi",
            "ret",
            const THREE,
        )
    }
}

When the #[naked] attribute is applied to a function, the compiler won't emit a function prologue or epilogue when generating code for this function. This attribute is analogous to __attribute__((naked)) in C. The use of this feature allows the programmer to have precise control over the assembly that is generated for a given function.

The body of a naked function must consist of a single naked_asm! invocation, a heavily restricted variant of the asm! macro: the only legal operands are const and sym, and the only legal options are raw and att_syntax. In lieu of specifying operands, the naked_asm! within a naked function relies on the function's calling convention to determine the validity of registers.

Documentation

The Rust Reference: rust-lang/reference#1153

Tests

History

This feature was originally proposed in RFC 1201, filed on 2015-07-10 and accepted on 2016-03-21. Support for this feature was added in #32410, landing on 2016-03-23. Development languished for several years as it was realized that the semantics given in RFC 1201 were insufficiently specific. To address this, a minimal subset of naked functions was specified by RFC 2972, filed on 2020-08-07 and accepted on 2021-11-16. Prior to the acceptance of RFC 2972, all of the stricter behavior specified by RFC 2972 was implemented as a series of warn-by-default lints that would trigger on existing uses of the naked attribute; these lints became hard errors in #93153 on 2022-01-22. As a result, today RFC 2972 has completely superseded RFC 1201 in describing the semantics of the naked attribute.

More recently, the naked_asm! macro was added to replace the earlier use of a heavily restricted asm! invocation. The naked_asm! name is clearer in error messages, and provides a place for documenting the specific requirements of inline assembly in naked functions.

The implementation strategy was changed to emitting a global assembly block. In effect, an extern function

extern "C" fn foo() {
    core::arch::naked_asm!("ret")
}

is emitted as something similar to

core::arch::global_asm!( 
    "foo:",
    "ret"
);

extern "C" {
    fn foo();
}

The codegen approach was chosen over the llvm naked function attribute because:

  • the rust compiler can guarantee the behavior (no sneaky additional instructions, no inlining, etc.)
  • behavior is the same on all backends (llvm, cranelift, gcc, etc)

Finally, there is now an allow list of compatible attributes on naked functions, so that e.g. #[inline] is rejected with an error.

relevant PRs for these recent changes

unresolved questions

None

@Lokathor
Contributor

didn't the "naked asm is implemented using global asm" merge like yesterday?

I'd love to have this feature, but we should probably give it a little time to bake after a change like that.

@Qix-

Qix- commented Dec 12, 2024

It was my understanding that ret is not added to the end of a naked function, but instead something akin to a ud2, and that options(noreturn) was forced when using naked functions. Is that no longer the case (or was it never the case, which is likely)?

@folkertdev
Contributor

folkertdev commented Dec 12, 2024

I'd love to have this feature, but we should probably give it a little time to bake after a change like that.

It will have that time, it would still take months to actually reach stable even when the quickest route is taken (and I suspect it won't be).

It was my understanding that ret is not added to the end of a naked function, but instead something akin to a ud2, and that options(noreturn) was forced when using naked functions. Is that no longer the case (or was it never the case, which is likely)?

the reference says:

noreturn: The asm! block never returns, and its return type is defined as ! (never). Behavior is undefined if execution falls through past the end of the asm code. A noreturn asm block behaves just like a function which doesn’t return; notably, local variables in scope are not dropped before it is invoked.

But naked functions often do actually return, they just do it from assembly rather than from rust code. That is a fundamental difference, and we decided that noreturn is inaccurate (and just annoying to type). We'd rather document the constraints of naked_asm! (one of which is that control flow may not fall through to the end of the function; the assembly must return or diverge) than annotate every call site with an option.

@Qix-

Qix- commented Dec 12, 2024

That's not clear to me though. In #90957 (comment) the generated global_asm! equivalent shows that the empty naked function has a generated ret.

and we decided that noreturn is inaccurate

the assembly must return or diverge

Which is it?

If the naked_asm! block must return or diverge then noreturn is implicit and thus an autogenerated ret makes no sense / can be harmful. So is the example generated empty function global_asm! in the aforementioned comment incorrect, or have the semantics of #[naked] functions changed?

@Lokathor
Contributor

The example in that comment seems fundamentally incomplete, because it didn't have a naked_asm! block at all. I don't believe it would compile because you should have exactly one naked_asm! use in a #[naked] fn.

I agree that the example seems wrong in some form or another.

@folkertdev
Contributor

that example was incorrect, I've fixed it now.

Just to be clear: no implicit instructions are ever added to the body: it's just the asm code as written by the user, and the user must make sure that that assembly upholds the contract of naked_asm!

@oberien
Contributor

oberien commented Dec 12, 2024

Would it make sense to have the compiler insert an additional ud2 after the macro's assembly output as a way to ensure that the naked_asm actually returns, or else the program will error during runtime? Without such an ud2, if the naked function does not correctly handle its own return, the function lying behind it in the binary will be executed, which could lead to very hard to debug situations. An ud2 would ease debugging in such cases during development.

@Lokathor
Contributor

Short answer: no, that is exactly what the compiler must never do (insert any instructions that weren't written in the source program).

@npmccallum
Contributor

@Lokathor I'm not sure I agree. The defined behavior is that the author MUST handle return correctly. Adding a ud2 at the end is both completely legal and will likely prevent at least one CVE in the future.

Are you worried about being able to predict code size?

@Lokathor
Contributor

Lokathor commented Dec 12, 2024

Yes, being able to predict the exact code size is one reason you'd not want ud2.

To quote the stabilization report itself:

The codegen approach was chosen over the llvm naked function attribute because:

  1. the rust compiler can guarantee the behavior (no sneaky additional instructions, no inlining, etc.)

I think it would be reasonable for debug builds, or some opt-in flag, to add the instruction, if you want to "turn on sanitizers". But it should not be the default in --release builds.

EDIT: and "illegal instructions" don't even do anything necessarily on all platforms rust supports. sometimes they are literally nothing but a waste of space for no benefit at all. please no.

@folkertdev
Contributor

I'm in favor of the simpler policy of never generating additional instructions, but as a datapoint, gcc does actually emit a ud2 instruction at the end of a naked function (even with optimizations on). Clang does not do this, not even when compiled with no optimizations.

https://godbolt.org/z/Mr5xY8r31

@Amanieu
Member

Amanieu commented Dec 13, 2024

I also agree with this: naked_asm is specifically useful when you want precise control over what instructions are emitted. We don't want the compiler to insert instructions behind your back.

@bjorn3
Member

bjorn3 commented Dec 14, 2024

Some nop padding will be emitted most of the time anyway without the user having control to keep everything aligned. Might as well replace one nop with an ud2 for some extra safety.

@Lokathor
Contributor

Lokathor commented Dec 14, 2024

That very much depends on your platform, doesn't it? On old ARM, each a32 instruction is 4 bytes and sections are usually aligned to only 4 or maybe 8, so there's maybe one nop between two different functions, but just as likely there's no natural space and this is increasing the size of things. Further, not all embedded platforms even actually do anything with an illegal instruction, so you could be wasting space for no gain at all.

Again, I'm not against the possibility of trap instructions being inserted with a flag or configuration, but it should not be required, and I don't even think it should be default.

@bjorn3
Member

bjorn3 commented Dec 14, 2024

I don't think it should be required, but I think we should do it by default when we can. At least with debug assertions enabled.

@oberien
Contributor

oberien commented Dec 14, 2024

Summarizing a few brought up points:

  • + Adding ud2 can catch bugs and likely vulnerabilities early during development.
  • + The compiler is free to add padding between functions. On x86 (from the x86 Instruction Set Reference): "Other than raising the invalid opcode exception, this instruction is the same as the NOP instruction."
  • + The compiler is free to order functions in any way it sees fit. It could decide to put another function directly after the current one which is just the ud2 instruction.
  • + Adding ud2 on the modern Intel 64, amd64 and ARM is basically zero-cost. On x86 it's 2 bytes which likely get eliminated (e.g. in the case of ret; ud2) in the CPU frontend and don't take up μop cache.
  • - Adding ud2 on older chipsets or in embedded scenarios can be quite costly. For example on the Gameboy Advance (ARMv4T), undef is a 4-byte nop, taking up valuable iwram space.
  • - Adding ud2 can require more padding between functions for alignment if the end of a function is at or very close to the alignment boundary.
  • - ud2 on some platforms might be a simple nop without any actual effect. Generating it will only add cost without any benefit on those platforms.
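The alignment points above can be made concrete with a little arithmetic. The sketch below assumes a simplified, hypothetical layout model: each function starts at a multiple of `align`, and an optional trap of `trap_len` bytes is appended directly after the function body. Real linkers and targets can differ, and the helper names are illustrative only.

```rust
/// Offset at which the next function starts, given that the current
/// function body ends at `end`, an optional trap of `trap_len` bytes is
/// appended, and functions are aligned to `align` bytes (a power of two).
pub fn next_function_start(end: usize, trap_len: usize, align: usize) -> usize {
    assert!(align.is_power_of_two());
    let after_trap = end + trap_len;
    // Round up to the next multiple of `align`.
    (after_trap + align - 1) & !(align - 1)
}

/// Extra bytes that appending the trap costs compared with no trap.
pub fn trap_cost(end: usize, trap_len: usize, align: usize) -> usize {
    next_function_start(end, trap_len, align) - next_function_start(end, 0, align)
}
```

In this model, with 16-byte alignment a body ending at offset 10 absorbs a 2-byte trap into padding that was already there (`trap_cost(10, 2, 16)` is 0), while a body ending exactly at offset 16 pushes the next function out by a full 16 bytes (`trap_cost(16, 2, 16)` is 16). With 4-byte instructions and 4-byte alignment, a 4-byte trap always costs 4 bytes.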

In my opinion rust strives for safety and security even when using low-level mechanisms and unsafe code. Adding ud2 to places rustc requires execution to never reach adds to this security, allowing possible vulnerabilities to be found earlier. On modern platforms, it's practically zero-cost. However, I do agree that it can cause problems on embedded systems and older chipsets.

Would it be a good compromise to put adding the ud2 instruction after naked-functions behind the already existing compiler flag trap-unreachable? You can already use -Z trap-unreachable=no today to make the compiler not emit ud2 instructions in unreachable places.

@Qix-

Qix- commented Dec 14, 2024

@bjorn3 I've not seen nop paddings before with naked asm blocks, which architecture are you seeing that on? IMO it's imperative that you get "WYSIWYG"-like behavior when writing naked_asm!{} as that's sort of the whole point - complete and utter control over the emitted bytecode when it's crucial, especially in embedded and kernel environments.

I also strongly disagree that ud2-like instructions should be emitted in all cases, both for the reasons already stated, but also, not every case of naked_asm!{} is called directly. I have a usecase where naked_asm!{} is used during a build step to generate a byte array of instructions used in a kernel environment to bring up secondary cores that are in 16-bit mode, which means paging is disabled and thus instructions have to be written to a direct-mapped page somewhere in lower memory. The kernel is higher-half mapped, which means it resides in index 511 typically, and there's no guarantee that the bytecode is contiguous in rodata nor that it cleanly lives in a single page that I could use as a direct map, thus I have to copy it to contiguous physical memory first before I can boot it. I wrote a proc-macro that, perhaps a bit sacrilegiously, uses naked_asm!{} to compile the asm on its own and then extracts it from the generated binary, returning a byte array literal - all using normal Rust tooling.

I could imagine that in similar cases the machine code has to be composed in some non-trivial way, meaning that disparate, perhaps "malformed" (as per the noreturn requirement, for example) naked_asm! sections are concatenated to form an otherwise well-defined block of machine code. Silently inserting an opinionated ud2 in there would wreak havoc on such use cases. Yes, perhaps they could use assemblers for this, but it's a massive plus to be able to do this in Rust, already, without any external CLI tools or extensive build systems.

Incredibly niche use case, I know, but this feature is already squarely in the land of niche. It really would be a huge pain to reverse-engineer the extra ud2 (nop obviously wouldn't be as much of a problem, but still not ideal), especially on CISC architectures and more so if it was conditionally emitted based on the build profile (release vs dev).

At the very least, put it behind some flag. Maybe an option(notrap) to disable it, which IMO would be acceptable.

@folkertdev
Contributor

An alternative approach: could a lint be designed to warn on possibly invalid naked asm?

I'm not sure how to make this robust, but in principle it would be a asm_string.ends_with("ret") sort of check. You could then make that more complicated by checking for jumps and invalid instructions like ud2 in the final position. It is not robust, and the warning should be explicit about that, but gets users some safety, even for (embedded) targets that would always compile to get the smallest binary size. The lint can just be ignored in cases where it is inaccurate.
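A minimal sketch of the naive check described above, assuming a hypothetical `looks_terminated` helper that only looks at the mnemonic on the last non-empty, non-comment line of the assembly text. The mnemonic list is illustrative and x86-only; this is a heuristic, not a real analysis, and it is not how any existing rustc lint works.

```rust
/// Mnemonics this toy check treats as "does not fall through".
/// The list is illustrative, not exhaustive.
const TERMINATORS: &[&str] = &["ret", "jmp", "iretq", "sysexit", "ud2"];

/// Returns true if the last non-empty, non-comment line of `asm` starts
/// with a mnemonic from `TERMINATORS` (compared case-insensitively).
pub fn looks_terminated(asm: &str) -> bool {
    asm.lines()
        .map(|line| line.trim())
        .filter(|line| !line.is_empty() && !line.starts_with("//"))
        .last()
        .and_then(|line| line.split_whitespace().next())
        .map(|mnemonic| TERMINATORS.iter().any(|t| t.eq_ignore_ascii_case(mnemonic)))
        .unwrap_or(false)
}
```

The check accepts `"add rdi, 3\nmov rax, rdi\nret"`, but it also shows why the idea is fragile: the AT&T spelling `retq` is rejected because it is not in the list, a data directive placed after the final `ret` makes a well-formed block look unterminated, and any trailing `jmp` is accepted regardless of where it jumps.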

@Qix-

Qix- commented Dec 14, 2024

That's going down an analysis rabbit hole that isn't really feasible (for this case, at least). Not all textual ASM blocks are going to end in ret or something equivalent (especially given that there are at least two syntaxes supported - intel and at&t), and not all generated bytecode (binary) blocks will end in that instruction either, even if they're well formed. Barring subroutine blocks at the end (or the usecase I mentioned, where things are being concatenated), on x86 alone there are most of the jmp instruction variants, ret and iret, sysexit, etc. - all of which would make the asm block well-formed (as it doesn't fall through). Further, some jmp variants wouldn't make it well-formed. For example, in most cases a far JMP might mean the block doesn't return, but in cases where the code selector (CS) has to be changed, you typically far-jump with CS to the next instruction (since mov cs, 0x08 for example isn't valid for some historical reason; to change CS on x86 you have to far jump with CS). That would make it seem like the block is well formed when it isn't.

That level of understanding of bytecode would be an impressive project in its own right (e.g. unicorn engine levels of complexity), but developing all of that just for linting or automatic ud2 emission would be astronomically overengineered when in 99% of cases the engineer already knows whether or not they would want to omit ud2.

I do agree a linting rule here would be preferred, but I don't see how it could feasibly be implemented.

@bjorn3
Member

bjorn3 commented Dec 14, 2024

I've not seen nop paddings before with naked asm blocks, which architecture are you seeing that on?

On x86_64 there is padding after every function to align the next function to 16 bytes.

IMO it's imperative that you get "WYSIWYG"-like behavior when writing naked_asm!{} as that's sort of the whole point - complete and utter control over the emitted bytecode when it's crucial, especially in embedded and kernel environments.

Adding ud2 after every naked function is indistinguishable from a function which happens to compile to ud2 being placed right after the naked function, so you still get as much WYSIWYG. WYSIWYG only applies to the function itself, not the padding after it.

I could imagine that in similar cases the machine code has to be composed in some non-trivial way, meaning that disparate, perhaps "malformed" (as per the noreturn requirement, for example) naked_asm! sections are concatenated to form an otherwise well-defined block of machine code. Silently inserting an opinionated ud2 in there would wreak havoc on such use cases. Yes, perhaps they could use assemblers for this, but it's a massive plus to be able to do this in Rust, already, without any external CLI tools or extensive build systems.

You have to indicate the size of the naked function if you want to copy its bytes. This size would exclude the ud2 instruction inserted by the compiler, so the ud2 would not be copied anyway.

@haraldh
Contributor

haraldh commented Dec 14, 2024

Linting? what is valid? ret? iret? some future TDX op code extension to enter a TEE? Please! What are you trying to guard against? The whole asm code is unsafe.

@folkertdev
Contributor

That's going down an analysis rabbit hole that isn't really feasible

yes, fair point.

I was going to make some argument about most naked_asm! blocks ending in ret anyway, but a quick sampling of GitHub shows that is really not the case. There is a bunch of fun/cursed stuff though: https://github.com/search?q=naked_asm%21&type=code&p=1

Linting? what is valid?

My idea was that it is at least a signal to the programmer to check that their assembly satisfies the constraints: if they deem the lint fired incorrectly, they can just allow it, perhaps with some comment explaining the reasoning.

But I agree that it is hard/impossible to actually make it robust (enough) to work in practice.

@Amanieu
Member

Amanieu commented Dec 14, 2024

Adding ud2 after every naked function is indistinguishable from a function which happens to compile to ud2 being placed right after the naked function, so you still get as much WYSIWYG. WYSIWYG only applies to the function itself, not the padding after it.

That's not entirely true: ELF symbols have a length and typically any padding bytes are not included in this length since they are inserted by the linker. If we append ud2 then this will observably be different when inspecting the symbol table.

@bjorn3
Member

bjorn3 commented Dec 14, 2024

Given that rustc generates inline assembly which defines the symbol, in the inline assembly rustc can specify the length of the symbol to exclude the ud2 instruction.
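To make the symbol-length point concrete, here is a minimal sketch using stable `global_asm!`. It is not what rustc actually emits for `#[naked]` functions; the symbol name `demo_naked`, the section name, and the wrapper `call_demo` are made up for illustration, and the assembly assumes an x86_64 Linux (ELF) target. The `.size` directive is placed before the trailing `ud2`, so the symbol table entry should cover only the `mov` and `ret`.

```rust
// Assumption: x86_64 Linux, ELF output, GNU-style assembler directives.
#[cfg(all(target_arch = "x86_64", target_os = "linux"))]
mod imp {
    use std::arch::global_asm;

    global_asm!(
        ".pushsection .text.demo_naked,\"ax\",@progbits",
        ".globl demo_naked",
        ".type demo_naked,@function",
        "demo_naked:",
        "    mov eax, 42",
        "    ret",
        // The symbol size is computed here, before the trailing trap,
        // so the recorded length excludes the `ud2` below.
        ".size demo_naked, . - demo_naked",
        "    ud2",
        ".popsection",
    );

    unsafe extern "C" {
        // Hypothetical symbol defined by the assembly above.
        fn demo_naked() -> i32;
    }

    pub fn call_demo() -> i32 {
        // Sound only because the assembly follows the C ABI for `fn() -> i32`.
        unsafe { demo_naked() }
    }
}

// Placeholder so the sketch still compiles on other targets; no assembly here.
#[cfg(not(all(target_arch = "x86_64", target_os = "linux")))]
mod imp {
    pub fn call_demo() -> i32 {
        42
    }
}

pub use imp::call_demo;
```

On an ELF target, inspecting the symbol table of the resulting binary is one way to check the recorded length. Note that the `ud2` still occupies bytes in the section, which is the layout concern raised elsewhere in this thread.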

@dancrossnyc

Please don't insert UD2's and things like that (or similar things for non-x86 targets) in naked functions. One of the problems with the earlier iterations of naked functions, mentioned above, was that they did this, which made it impossible to have precise control over layout and size.

Here's a motivating example: consider two naked functions placed into a specific linker segment; both need to occupy exactly 4KiB (so, page sized on x86) and further they are both supposed to be aligned on 4K boundaries. One may not care what order they're put into the resulting binary, A before B or B before A, but the size and alignment requirements are absolute. They're not copied; instead, one relies on the linker to place them correctly. Adding extra UD2's at the end of these NFs throws this off.
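The arithmetic behind this example can be sketched as follows. This is only a model of the layout, assuming each function starts on a 4096-byte boundary and that any compiler-appended trailer (2 bytes for an x86 `ud2`) is emitted directly after the body in the same section; the helper names `align_up` and `section_end` are hypothetical.

```rust
/// Round `offset` up to the next multiple of `align` (a power of two).
fn align_up(offset: usize, align: usize) -> usize {
    assert!(align.is_power_of_two());
    (offset + align - 1) & !(align - 1)
}

/// End offset of a section holding `count` functions of `body` bytes each,
/// every one followed by `trailer` bytes and starting on an `align` boundary.
fn section_end(count: usize, body: usize, trailer: usize, align: usize) -> usize {
    let mut offset = 0;
    for _ in 0..count {
        offset = align_up(offset, align) + body + trailer;
    }
    offset
}

/// Footprint of two page-sized functions, without and with a 2-byte trailer.
fn page_example() -> (usize, usize) {
    (
        section_end(2, 4096, 0, 4096), // exactly two pages: 8192
        section_end(2, 4096, 2, 4096), // spills into a third page: 12290
    )
}
```

Under these assumptions, two bytes of trailer per function are enough to push the second function a full page further out and to make the section end mid-page.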

I appreciate the arguments for safety here, and in Rust generally, but naked functions are already incredibly niche: if one is reaching for them one's doing something so far out of the ordinary for most programmers that trying to add a ud2 to catch bugs here feels superfluous.

A response to this may be to use module-level assembly instead of naked functions for this sort of thing. While that's valid, there are some properties of naked functions that make them attractive versus global asm: symbol naming and visibility and so on.

@Qix-

Qix- commented Dec 14, 2024

Might as well replace one nop with an ud2 for some extra safety.

On x86_64 there is padding after every function to align the next function to 16 bytes.

So in the cases when it's already aligned, there's no extra nop to replace, so no ud2 is emitted. So we're back in the territory of "ud2 sometimes, but not other times, and only in debug mode, and changing the preceding body can affect if a ud2 is there, or not".
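As a rough illustration of that inconsistency, the sketch below computes how many padding bytes follow a function when the next one is aligned to 16 bytes, and whether a 2-byte `ud2` would fit in that padding. It assumes the function itself starts on a 16-byte boundary; the helper names are hypothetical.

```rust
/// Length in bytes of the x86 `ud2` instruction (opcode 0F 0B).
const UD2_LEN: usize = 2;

/// Padding bytes needed after a function of `size` bytes so that the next
/// function starts on an `align`-byte boundary (`align` is a power of two).
fn padding_after(size: usize, align: usize) -> usize {
    assert!(align.is_power_of_two());
    (align - (size % align)) % align
}

/// Whether the padding after a `size`-byte function has room for a `ud2`.
fn ud2_fits_in_padding(size: usize) -> bool {
    padding_after(size, 16) >= UD2_LEN
}
```

With these definitions, a 5-byte function leaves 11 bytes of padding (room for a `ud2`), a 15-byte function leaves 1 byte (no room), and a 16-byte or 32-byte function leaves none at all, so whether the trap is present depends entirely on the size of the preceding body.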

So in such a case, it is functionally impossible to rely on it, and it is therefore going to mask a ton of bugs and cause strange behavior in some cases and not others. Some developers will rely on it being there, erroneously of course, and then be surprised when it's not there, instead of relying on the strictly specified safety constraints that must hold, which are there already. Also, nops are there to indicate padding. ud2 and similar instructions are not padding instructions. They have side effects, and we cannot know how different environments/tools/chips are going to react to them being there, even if they're not hit (e.g. in the case of pipelining).

There is a flimsy at best reason for emitting them, and a multitude of strong reasons not to emit them. Adding in such instructions is subverting expectation for some illusion of safety in a place that is very much opt-in and explicitly, objectively unsafe.

lnicola pushed a commit to lnicola/rust-analyzer that referenced this issue Dec 23, 2024
codegen `#[naked]` functions using global asm

tracking issue: rust-lang/rust#90957

Fixes #124375

This implements the approach suggested in the tracking issue: use the existing global assembly infrastructure to emit the body of `#[naked]` functions. The main advantage is that we now have full control over what gets generated, and are no longer dependent on LLVM not sneakily messing with our output (inlining, adding extra instructions, etc).

I discussed this approach with `@Amanieu` and while I think the general direction is correct, there is probably a bunch of stuff that needs to change or move around here. I'll leave some inline comments on things that I'm not sure about.

Combined with rust-lang/rust#127853, if both accepted, I think that resolves all steps from the tracking issue.

r? `@Amanieu`