How to enable Intel AMX in asm! on Linux? #107795
Comments
If the problem is stack size, set the environment variable
In that case this recommendation is incorrect?
This doesn't mean the problem isn't stack size: it's quite plausible that Rust is consuming an abnormally large amount of the stack and that a missed optimization is happening here. I believe you can programmatically increase the stack size via setrlimit to something more than the 8 megabytes that Linux and gcc default to. You may also need to use
Also, while it shouldn't matter here, if you want upgrades to more recent kernels and userlands, which generally have better support for AMX, you may want to switch from CentOS 8 to CentOS Stream 8 via the directions here and then upgrade to CentOS Stream 9. Most notably, CentOS Stream 9 should have
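The setrlimit suggestion could be sketched roughly as follows. This is only an illustration, assuming Linux with glibc-style `getrlimit`/`setrlimit`; `RLimit` and `raise_stack_limit` are hypothetical names, and the 16 MiB target is arbitrary.

```rust
use std::io;
use std::os::raw::{c_int, c_ulong};

// Mirrors C `struct rlimit` (assumption: rlim_t is `unsigned long`, as on Linux).
#[repr(C)]
struct RLimit {
    cur: c_ulong, // soft limit
    max: c_ulong, // hard limit
}

// Linux value of RLIMIT_STACK.
const RLIMIT_STACK: c_int = 3;

extern "C" {
    fn getrlimit(resource: c_int, rlim: *mut RLimit) -> c_int;
    fn setrlimit(resource: c_int, rlim: *const RLimit) -> c_int;
}

/// Raise the soft stack limit to `wanted` bytes, clamped to the hard limit.
/// Returns the soft limit in effect afterwards.
fn raise_stack_limit(wanted: c_ulong) -> io::Result<c_ulong> {
    let mut lim = RLimit { cur: 0, max: 0 };
    if unsafe { getrlimit(RLIMIT_STACK, &mut lim) } != 0 {
        return Err(io::Error::last_os_error());
    }
    if lim.cur >= wanted {
        // Already large enough (this also covers an unlimited soft limit).
        return Ok(lim.cur);
    }
    lim.cur = wanted.min(lim.max);
    if unsafe { setrlimit(RLIMIT_STACK, &lim) } != 0 {
        return Err(io::Error::last_os_error());
    }
    Ok(lim.cur)
}

fn main() {
    match raise_stack_limit(16 << 20) {
        Ok(n) => println!("soft stack limit is now {n} bytes"),
        Err(e) => println!("could not raise stack limit: {e}"),
    }
}
```

Threads spawned through `std::thread` use their own stack size rather than this limit, so `std::thread::Builder::stack_size` is the usual knob for those.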
Yup, I'm wrong. I must have been misremembering the code I used last time I was experimenting with a stack size issue.
Also if you want further help with debugging this, @jczaja, it might help if you describe the message you got from running this code, exactly. You say it exits but... what happens, exactly?
@workingjubilee, @saethlin I must apologize, as I labelled the wrong line with "PROGRAM EXITS HERE". I have updated the code (snippet below) to print the error message:
Error message:
Output of strace of Rust program:
Output of strace of C++ program:
ENOSPC looks to be relevant here: https://github.com/torvalds/linux/blob/master/arch/x86/kernel/fpu/xstate.c#L1579
It looks like sigaltstack(2) was called somewhere with a size that is too small for the AMX state. Then I found #69533. At least, this constant should be replaced by some dynamic value like getauxval(AT_MINSIGSTKSZ).
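To see what the kernel reports on a given machine, the value can be queried directly. A minimal sketch, assuming Linux with a libc that exports `getauxval` (glibc and musl do); the key value 51 is taken from the Linux UAPI `auxvec.h`, and `kernel_min_sigstack` is a hypothetical helper name.

```rust
use std::os::raw::c_ulong;

// Auxiliary-vector key for the minimum signal stack size (Linux UAPI auxvec.h).
const AT_MINSIGSTKSZ: c_ulong = 51;

extern "C" {
    // Returns 0 when the requested key is not present in the auxiliary vector.
    fn getauxval(kind: c_ulong) -> c_ulong;
}

/// Minimum signal stack size reported by the kernel, or `None` when the
/// kernel does not supply AT_MINSIGSTKSZ (older kernels).
fn kernel_min_sigstack() -> Option<usize> {
    let v = unsafe { getauxval(AT_MINSIGSTKSZ) };
    if v == 0 {
        None
    } else {
        Some(v as usize)
    }
}

fn main() {
    match kernel_min_sigstack() {
        Some(n) => println!("AT_MINSIGSTKSZ = {n} bytes"),
        None => println!("kernel did not report AT_MINSIGSTKSZ"),
    }
}
```

On a machine where the reported value exceeds the fixed size a runtime passes to sigaltstack(2), the diagnosis above would apply.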
Ahh, okay, so it's definitely the
Yes, if this diagnosis is correct, we should probably make something like that change. We will want to also be prepared for... "zaniness" like Arm SVE or RV64V_Zvl1024b.
It seems that the solution suggested by @ChangSeokBae works.
@workingjubilee, @ChangSeokBae, @saethlin Here is a full dummy example of using AMX from Rust (as of the stable 1.67.0 toolchain):
main.rs:
Cargo.toml:
Building:
Output:
It is best to make the relevant auxval constant available in libc, in the likely event that other Unix-y platforms introduce a matching constant with either the same or a different value, as they sometimes do for these things when they cannot think of a better interface than the glibc one. So I have opened rust-lang/libc#3125.
Alternatively, recent glibc versions (>=2.34) may work for you, as they have a non-constant (MIN)SIGSTKSZ. I was told that it will eventually reference AT_MINSIGSTKSZ. At the moment, it calculates the size based on CPUID, IIRC.
@ChangSeokBae That probably won't work for us. Rust interacts with C by knowing how to handle the platform's C ABI, so it can call functions, but it is totally blind to macros: rustc does not have the C preprocessor. That's why we redefine these constants in things like our libc crate.
I have opened:
Rollup merge of rust-lang#113525 - workingjubilee:handle-dynamic-minsigstksz, r=m-ou-se
Dynamically size sigaltstk in std
On modern Linux with Intel AMX and 1KiB matrices, Arm SVE with potentially 2KiB vectors, and RISCV Vectors with up to 16KiB vectors, we must handle dynamic signal stack sizes. We can do so unconditionally by using getauxval, but assuming it may return 0 as an answer, thus falling back to the old constant if needed.
Fixes rust-lang#107795
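The fallback described in that commit message can be illustrated outside of std. This is a sketch under stated assumptions, not the code from the pull request: it assumes Linux on x86_64 or aarch64 (for the `stack_t` layout), uses 8192 as the old constant (SIGSTKSZ on x86_64 glibc), and the names `StackT`, `pick_sigstack_size` and `install_alt_stack` are hypothetical.

```rust
use std::os::raw::{c_int, c_ulong, c_void};
use std::{cmp, io, ptr};

const AT_MINSIGSTKSZ: c_ulong = 51; // Linux UAPI auxvec.h
const FALLBACK_SIGSTKSZ: usize = 8192; // assumed "old constant"

// Mirrors C `stack_t` as laid out on x86_64 and aarch64 Linux.
#[repr(C)]
struct StackT {
    ss_sp: *mut c_void,
    ss_flags: c_int,
    ss_size: usize,
}

extern "C" {
    fn getauxval(kind: c_ulong) -> c_ulong;
    fn sigaltstack(ss: *const StackT, old: *mut StackT) -> c_int;
}

/// Take the larger of the kernel-reported minimum and the old constant.
/// A report of 0 (key absent) therefore falls back to the constant.
fn pick_sigstack_size(reported: usize) -> usize {
    cmp::max(reported, FALLBACK_SIGSTKSZ)
}

/// Install an alternate signal stack for the current thread and return its size.
fn install_alt_stack() -> io::Result<usize> {
    let size = pick_sigstack_size(unsafe { getauxval(AT_MINSIGSTKSZ) } as usize);
    // Leaked on purpose: the memory must outlive its use as a signal stack.
    let mem: &'static mut [u8] = Box::leak(vec![0u8; size].into_boxed_slice());
    let ss = StackT { ss_sp: mem.as_mut_ptr().cast(), ss_flags: 0, ss_size: size };
    if unsafe { sigaltstack(&ss, ptr::null_mut()) } != 0 {
        return Err(io::Error::last_os_error());
    }
    Ok(size)
}

fn main() {
    match install_alt_stack() {
        Ok(n) => println!("alternate signal stack of {n} bytes installed"),
        Err(e) => println!("sigaltstack failed: {e}"),
    }
}
```

A production version would typically map the stack with a guard page instead of leaking a heap allocation; the heap buffer just keeps the sketch short.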
Hi,
I want to use the Intel AMX instruction set in Rust (via inline assembly).
AMX support is disabled by default in the Linux kernel because of the significant amount of memory (~10 KB) that has to be saved on the stack during context switches for programs using AMX. To enable AMX we need a processor with this capability (Sapphire Rapids), a recent enough Linux kernel (5.16+), and FPU and sigaltstack stacks large enough to store the AMX tiles (registers). An article on enabling AMX is here.
I have implemented two programs to enable and test AMX: one in C++ and the other in Rust. The C++ one initializes AMX properly, but the Rust program does not (likely because the stacks are not big enough; see "PROGRAM EXITS HERE" in the Rust example). A similar problem was described for the Python programming language. Please advise how to enable AMX support in Rust on Sapphire Rapids under Linux.
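For readers unfamiliar with the mechanism, enabling AMX comes down to a per-process permission request to the kernel. The following is a minimal sketch of that request, not the program from this report. It assumes x86_64 Linux; the numeric constants are copied from the kernel's x86 UAPI headers, and `request_amx_permission` is a hypothetical helper name.

```rust
use std::io;
use std::os::raw::c_long;

// Values from the Linux x86_64 UAPI headers.
const SYS_ARCH_PRCTL: c_long = 158; // arch_prctl syscall number on x86_64
const ARCH_REQ_XCOMP_PERM: c_long = 0x1023; // request permission for an xstate feature
const XFEATURE_XTILEDATA: c_long = 18; // AMX tile data feature number

extern "C" {
    // libc's generic syscall wrapper; returns -1 and sets errno on failure.
    fn syscall(num: c_long, ...) -> c_long;
}

/// Ask the kernel to let this process use AMX tile data.
fn request_amx_permission() -> io::Result<()> {
    if !cfg!(all(target_os = "linux", target_arch = "x86_64")) {
        return Err(io::Error::new(
            io::ErrorKind::Unsupported,
            "AMX permission request needs x86_64 Linux",
        ));
    }
    let rc = unsafe { syscall(SYS_ARCH_PRCTL, ARCH_REQ_XCOMP_PERM, XFEATURE_XTILEDATA) };
    if rc == 0 {
        Ok(())
    } else {
        Err(io::Error::last_os_error())
    }
}

fn main() {
    match request_amx_permission() {
        Ok(()) => println!("AMX tile data permitted for this process"),
        Err(e) => println!("AMX permission request failed: {e}"),
    }
}
```

On kernels or CPUs without AMX support the call simply returns an error, so printing the error, as done here, is what makes the failure diagnosable.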
Details:
C++ program:
Rust:
main.rs:
Cargo.toml:
Building:
RUSTFLAGS='-C target-cpu=sapphirerapids -C target-feature=+amx-int8,+amx-bf16,+amx-tile' cargo build
Toolchains used: 1.67.0, 1.69.0-nightly
Linux kernel: 5.19.0-1.el8.elrepo.x86_64
OS: CentOS 8.5