-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix target-cpu fpu features on Armv8-R. #130295
Conversation
rustbot has assigned @petrochenkov. Use |
These commits modify compiler targets. |
cc @workingjubilee since you looked at #123159, I'll reroll this PR to you too. |
And since it might help bisection in the future, |
Confirmed that with this change, #![no_main]
#![no_std]
#[no_mangle]
pub fn foo(x: f64, y: f64) -> f64 {
x + y
}
#[no_mangle]
pub fn bar(x: &mut [f32; 4], y: &[f32; 4]) {
for i in 0..4 {
x[i] += y[i];
}
}
use core::panic::PanicInfo;
#[panic_handler]
fn panic(_panic: &PanicInfo<'_>) -> ! {
loop {}
} $ build/aarch64-apple-darwin/stage1/bin/rustc --emit=obj --target armv8r-none-eabihf foo.rs -C target-cpu=cortex-r52 -C opt-level=3
polus:rust$ objdump -d foo.o
foo.o: file format elf32-littlearm
Disassembly of section .text.foo:
00000000 <foo>:
0: ee300b01 vadd.f64 d0, d0, d1
4: e12fff1e bx lr
Disassembly of section .text.bar:
00000000 <bar>:
0: f4610a8f vld1.32 {d16, d17}, [r1]
4: f4602a8f vld1.32 {d18, d19}, [r0]
8: f2400de2 vadd.f32 q8, q8, q9
c: f4400a8f vst1.32 {d16, d17}, [r0]
10: e12fff1e bx lr
Disassembly of section .text.rust_begin_unwind:
00000000 <rust_begin_unwind>:
0: eafffffe b 0x0 <rust_begin_unwind> @ imm = #-8 Output on the immediate ancestor commit: $ build/aarch64-apple-darwin/stage1/bin/rustc --emit=obj --target armv8r-none-eabihf foo.rs -C target-cpu=cortex-r52 -C opt-level=3
$ objdump -d foo.o
foo.o: file format elf32-littlearm
Disassembly of section .text.foo:
00000000 <foo>:
0: e92d4800 push {r11, lr}
4: ec510b10 vmov r0, r1, d0
8: ec532b11 vmov r2, r3, d1
c: ebfffffe bl 0xc <foo+0xc> @ imm = #-8
10: ec410b10 vmov d0, r0, r1
14: e8bd8800 pop {r11, pc}
Disassembly of section .text.bar:
00000000 <bar>:
0: ed910a00 vldr s0, [r1]
4: ed911a01 vldr s2, [r1, #4]
8: ed912a02 vldr s4, [r1, #8]
c: ed913a03 vldr s6, [r1, #12]
10: ed904a00 vldr s8, [r0]
14: ed905a01 vldr s10, [r0, #4]
18: ed906a02 vldr s12, [r0, #8]
1c: ed907a03 vldr s14, [r0, #12]
20: ee300a04 vadd.f32 s0, s0, s8
24: ee311a05 vadd.f32 s2, s2, s10
28: ee322a06 vadd.f32 s4, s4, s12
2c: ee333a07 vadd.f32 s6, s6, s14
30: ed800a00 vstr s0, [r0]
34: ed801a01 vstr s2, [r0, #4]
38: ed802a02 vstr s4, [r0, #8]
3c: ed803a03 vstr s6, [r0, #12]
40: e12fff1e bx lr
Disassembly of section .text.rust_begin_unwind:
00000000 <rust_begin_unwind>:
0: eafffffe b 0x0 <rust_begin_unwind> @ imm = #-8 |
@chrisnc My apologies, for your good deeds will not go unpunished, as now I have to ask: Can you add that as a test? |
Punishment accepted. Could you point me to an example that would be good to follow? The tests I see in Here's roughly where I got, but it doesn't compile because no_core disables everything, and if I try to use core, it won't be available in the test. I assume I'm doing this wrong. // Test that Neon instructions are emitted when a target-cpu supporting them is specified.
//@ assembly-output: emit-asm
//@ compile-flags: --target armv8r-none-eabihf
//@ compile-flags: -C target-cpu=cortex-r52
//@ compile-flags: -C opt-level=3
//@ needs-llvm-components: arm
#![feature(no_core, lang_items)]
#![no_std]
#![no_core]
#![crate_type = "lib"]
#[lang = "sized"]
trait Sized {}
#[lang = "copy"]
trait Copy {}
// CHECK-LABEL: vadd_q:
#[no_mangle]
pub fn vadd_q(x: &mut [f32; 4], y: &[f32; 4]) {
// CHECK: vld{{.*}}
// CHECK: vld{{.*}}
// CHECK: vadd.f32 {{.*}}
// CHECK: vst{{.*}}
// CHECK: bx lr
let mut i = 0usize;
while i < 4usize {
x[i] += y[i];
i += 1usize;
}
}
// CHECK-LABEL: vadd_f64:
#[no_mangle]
pub fn vadd_f64(x: f64, y: f64) -> f64 {
// CHECK: vadd.f64 d0, d0, d1
// CHECK: bx lr
x + y
} output:
|
The problem is that this situation is cross-compiling, but I don't believe we actually have a satisfactory way to run cross-compiling of assembly tests beyond the |
8e38ae3
to
f456598
Compare
This PR modifies cc @jieyouxu |
f456598
to
d7b2790
Compare
Okay, the rmake test is working locally now at least. I'm not sure how to have the same test run for multiple targets, but I've set it up so it will extend to the other target+cpu combinations easily (tested |
no need to test every single possible subarchitecture, but yes, this looks nice. I will see if we can't implement actual cross-compiled assembly testing, but for now, this will do quite well, thank you. @bors r+ rollup |
Great! Will this actually run in CI? I had to provide a matching target with: |
d7b2790
to
d4c7c76
Compare
Sorry, needed to amend because I realized two things: I re-read my comments on the older PR from a few months ago and was reminded that the plan for Armv8-R is that the feature list should be empty, because Arm defines Armv8-R to always have these floating-point features, and so does LLVM, so we don't need to specify them. I've removed the feature setting and updated the comment and commit message to reflect that. |
@bors r- |
Fix target-cpu fpu features on Armv8-R. This is a follow-up to rust-lang#123159, but applied to Armv8-R. This required llvm/llvm-project#88287 to work properly. Now that this change exists in rustc's llvm, we can fix Armv8-R's default fpu features. In Armv8-R's case, the default features from LLVM for floating-point are sufficient, because there is no integer-only variant of this architecture. Add a run-make test that target-cpu=cortex-r52 enables double-precision and neon. try-job: dist-aarch64-linux try-job: dist-various-1 try-job: dist-various-2 try-job: dist-x86_64-linux try-job: dist-x86_64-msvc
💔 Test failed - checks-actions |
This comment has been minimized.
This comment has been minimized.
:( And would it be okay to break off the test part to another PR and merge the Armv8-R change as-is? |
Yeah let's do that. |
I guess we'll be stuck just straight-up invoking what amounts to |
This is a follow-up to rust-lang#123159, but applied to Armv8-R. This required llvm/llvm-project#88287 to work properly. Now that this change exists in rustc's llvm, we can fix Armv8-R's default fpu features. In Armv8-R's case, the default features from LLVM for floating-point are sufficient, because there is no integer-only variant of this architecture.
1df34e4
to
1a0ba01
Compare
Done. Created #130382 for the test. |
coolio. @bors r+ rollup |
…ingjubilee Fix target-cpu fpu features on Armv8-R. This is a follow-up to rust-lang#123159, but applied to Armv8-R. This required llvm/llvm-project#88287 to work properly. Now that this change exists in rustc's llvm, we can fix Armv8-R's default fpu features. In Armv8-R's case, the default features from LLVM for floating-point are sufficient, because there is no integer-only variant of this architecture.
…ingjubilee Fix target-cpu fpu features on Armv8-R. This is a follow-up to rust-lang#123159, but applied to Armv8-R. This required llvm/llvm-project#88287 to work properly. Now that this change exists in rustc's llvm, we can fix Armv8-R's default fpu features. In Armv8-R's case, the default features from LLVM for floating-point are sufficient, because there is no integer-only variant of this architecture.
…ingjubilee Fix target-cpu fpu features on Armv8-R. This is a follow-up to rust-lang#123159, but applied to Armv8-R. This required llvm/llvm-project#88287 to work properly. Now that this change exists in rustc's llvm, we can fix Armv8-R's default fpu features. In Armv8-R's case, the default features from LLVM for floating-point are sufficient, because there is no integer-only variant of this architecture.
…kingjubilee Rollup of 3 pull requests Successful merges: - rust-lang#130295 (Fix target-cpu fpu features on Armv8-R.) - rust-lang#130325 (Use -0.0 in `intrinsics::simd::reduce_add_unordered`) - rust-lang#130371 (Correctly account for niche-optimized tags in rustc_transmute) r? `@ghost` `@rustbot` modify labels: rollup
These ones are really unfortunate, there's a couple of run-make tests that seem to need to use bootstrap cargo to |
…iaskrgr Rollup of 5 pull requests Successful merges: - rust-lang#129195 (Stabilize `&mut` (and `*mut`) as well as `&Cell` (and `*const Cell`) in const) - rust-lang#130118 (move Option::unwrap_unchecked into const_option feature gate) - rust-lang#130295 (Fix target-cpu fpu features on Armv8-R.) - rust-lang#130371 (Correctly account for niche-optimized tags in rustc_transmute) - rust-lang#130381 (library: Compute Rust exception class from its string repr) r? `@ghost` `@rustbot` modify labels: rollup
Rollup merge of rust-lang#130295 - chrisnc:armv8r-feature-fix, r=workingjubilee Fix target-cpu fpu features on Armv8-R. This is a follow-up to rust-lang#123159, but applied to Armv8-R. This required llvm/llvm-project#88287 to work properly. Now that this change exists in rustc's llvm, we can fix Armv8-R's default fpu features. In Armv8-R's case, the default features from LLVM for floating-point are sufficient, because there is no integer-only variant of this architecture.
This is a follow-up to #123159, but applied to Armv8-R.
This required llvm/llvm-project#88287 to work properly. Now that this change exists in rustc's llvm, we can fix Armv8-R's default fpu features. In Armv8-R's case, the default features from LLVM for floating-point are sufficient, because there is no integer-only variant of this architecture.