Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Append disabled LLVM CPU features after enabled ones #21501

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

gaosui
Copy link

@gaosui gaosui commented Sep 23, 2024

This PR attempts to resolve #21473, where LLVM uses FP registers on aarch64 when -fp-armv8 is present.

I traced this down to the llvm_cpu_features property of a resolved target. Features in this string is ordered by target.cpu.arch.allFeaturesList(), resulting in a feature string like ... ,-fp-armv8, ... ,+v8.2a.

I suspect that the way LLVM parses this string makes latter features overwrite earlier ones. +v8.2a appearing after -fp-armv8 would enable FPU again. Appending disabled features after enabled ones has indeed fixed this issue.

Some things to consider with this fix:

  1. Is +feature after -feature ever required? This is a problem only when -feature is a superset of +feature. In general we want subsets appear after supersets. Zig CPU features have the notion of dependency, but that doesn't fully encapsulate the superset and subset relations, especially not for using with LLVM.
  2. Please review general coding style and patterns.
  3. Should we add a test case for this?

Frist time PR 🤞

@alexrp
Copy link
Member

alexrp commented Sep 24, 2024

I suspect that the way LLVM parses this string makes latter features overwrite earlier ones. +v8.2a appearing after -fp-armv8 would enable FPU again. Appending disabled features after enabled ones has indeed fixed this issue.

From a glance at MCSubtargetInfo, this seems right. Features are parsed in the order they're given. When enabling a feature, dependency features are enabled; when disabling a feature, dependent features are disabled.

This would also explain some weird behavior we've seen previously where the CPU features should be correct, but some things end up being unexpectedly enabled regardless.

@alexrp
Copy link
Member

alexrp commented Sep 24, 2024

Some extra nuance here: llvm/llvm-project@72a895336b1e4

@alexrp
Copy link
Member

alexrp commented Sep 26, 2024

So a problem here is that Zig will process +v8_2a-fp_armv8 into ...,+ccpp,+pan-rwv,-fp-armv8,+v8.2a,.... That is, v8_2a is still enabled even though one of its dependencies (fp_armv8) is not. The order also doesn't matter; -fp_armv8+v8_2a will give the same result.

Meanwhile, as mentioned above, LLVM will process +v8.2a,-fp-armv8 into ...,+ccpp,+pan-rwv,-fp-armv8,-v8.2a,.... That is, v8.2a itself is disabled, but all of its dependencies are enabled except for fp-armv8. The order matters; -fp-armv8,+v8.2a will result in ...,+ccpp,+pan-rwv,+fp-armv8,+v8.2a,....

My point being: While this PR will improve the situation significantly, it won't completely solve the problem as you can still get into situations where a feature is enabled in Zig but disabled in LLVM.

I personally don't like the Zig behavior here. If you disable feature A that feature B depends on to work, you can't actually say that you have feature B enabled - that's nonsense. I think LLVM actually gets this right, not just in terms of handling dependencies, but also in terms of order; I find it highly unintuitive that +v8_2a-fp_armv8 and -fp_armv8+v8_2a mean the same thing in Zig, and I'd wager that most users would agree.

@andrewrk I'm curious to hear your thoughts on this. The CPU feature parsing was implemented quite a few years ago, so I don't know why it was done this way in particular. I would much prefer if we handled dependencies and ordering the same way as LLVM, if you're open to that. It would also help towards #12227 (comment).

@rohlem
Copy link
Contributor

rohlem commented Oct 2, 2024

The order matters; -fp-armv8,+v8.2a will result in ...,+ccpp,+pan-rwv,+fp-armv8,+v8.2a,.... [...]
I find it highly unintuitive that +v8_2a-fp_armv8 and -fp_armv8+v8_2a mean the same thing in Zig, and I'd wager that most users would agree.

I agree from your comment that LLVM's handling seems more useful overall, except for this behavior:
If I pass -element,+set I don't think it's clear that element should be re-enabled by +set. (Maybe the user wasn't aware that the set includes element.)
Keeping it disabled could equally be a footgun / incorrect though.

IMO the more robust option would be to error if a set includes an option that was explicitly specified conflicting prior. (So -element,+set as well as +element,-set.)
Either the option shouldn't have been declared disabled, or the order should be reversed by the user, to opt-in to prioritizing -element over the earlier +set.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

aarch64 bare metal uses FP&SIMD registers when neon and fp are disabled
3 participants