Append disabled LLVM CPU features after enabled ones #21501

gaosui · 2024-09-23T22:47:51Z

This PR attempts to resolve #21473, where LLVM uses FP registers on aarch64 when -fp-armv8 is present.

I traced this down to the llvm_cpu_features property of a resolved target. Features in this string is ordered by target.cpu.arch.allFeaturesList(), resulting in a feature string like ... ,-fp-armv8, ... ,+v8.2a.

I suspect that the way LLVM parses this string makes latter features overwrite earlier ones. +v8.2a appearing after -fp-armv8 would enable FPU again. Appending disabled features after enabled ones has indeed fixed this issue.

Some things to consider with this fix:

Is +feature after -feature ever required? This is a problem only when -feature is a superset of +feature. In general we want subsets appear after supersets. Zig CPU features have the notion of dependency, but that doesn't fully encapsulate the superset and subset relations, especially not for using with LLVM.
Please review general coding style and patterns.
Should we add a test case for this?

Frist time PR 🤞

alexrp · 2024-09-24T07:58:42Z

I suspect that the way LLVM parses this string makes latter features overwrite earlier ones. +v8.2a appearing after -fp-armv8 would enable FPU again. Appending disabled features after enabled ones has indeed fixed this issue.

From a glance at MCSubtargetInfo, this seems right. Features are parsed in the order they're given. When enabling a feature, dependency features are enabled; when disabling a feature, dependent features are disabled.

This would also explain some weird behavior we've seen previously where the CPU features should be correct, but some things end up being unexpectedly enabled regardless.

alexrp · 2024-09-24T08:12:16Z

Some extra nuance here: llvm/llvm-project@72a895336b1e4

alexrp · 2024-09-26T01:58:39Z

So a problem here is that Zig will process +v8_2a-fp_armv8 into ...,+ccpp,+pan-rwv,-fp-armv8,+v8.2a,.... That is, v8_2a is still enabled even though one of its dependencies (fp_armv8) is not. The order also doesn't matter; -fp_armv8+v8_2a will give the same result.

Meanwhile, as mentioned above, LLVM will process +v8.2a,-fp-armv8 into ...,+ccpp,+pan-rwv,-fp-armv8,-v8.2a,.... That is, v8.2a itself is disabled, but all of its dependencies are enabled except for fp-armv8. The order matters; -fp-armv8,+v8.2a will result in ...,+ccpp,+pan-rwv,+fp-armv8,+v8.2a,....

My point being: While this PR will improve the situation significantly, it won't completely solve the problem as you can still get into situations where a feature is enabled in Zig but disabled in LLVM.

I personally don't like the Zig behavior here. If you disable feature A that feature B depends on to work, you can't actually say that you have feature B enabled - that's nonsense. I think LLVM actually gets this right, not just in terms of handling dependencies, but also in terms of order; I find it highly unintuitive that +v8_2a-fp_armv8 and -fp_armv8+v8_2a mean the same thing in Zig, and I'd wager that most users would agree.

@andrewrk I'm curious to hear your thoughts on this. The CPU feature parsing was implemented quite a few years ago, so I don't know why it was done this way in particular. I would much prefer if we handled dependencies and ordering the same way as LLVM, if you're open to that. It would also help towards #12227 (comment).

rohlem · 2024-10-02T09:14:21Z

The order matters; -fp-armv8,+v8.2a will result in ...,+ccpp,+pan-rwv,+fp-armv8,+v8.2a,.... [...]
I find it highly unintuitive that +v8_2a-fp_armv8 and -fp_armv8+v8_2a mean the same thing in Zig, and I'd wager that most users would agree.

I agree from your comment that LLVM's handling seems more useful overall, except for this behavior:
If I pass -element,+set I don't think it's clear that element should be re-enabled by +set. (Maybe the user wasn't aware that the set includes element.)
Keeping it disabled could equally be a footgun / incorrect though.

IMO the more robust option would be to error if a set includes an option that was explicitly specified conflicting prior. (So -element,+set as well as +element,-set.)
Either the option shouldn't have been declared disabled, or the order should be reversed by the user, to opt-in to prioritizing -element over the earlier +set.

Append disabled LLVM CPU features after enabled ones

552af8c

gaosui force-pushed the llvm-cpu-feature-order branch from 1e91c6b to 552af8c Compare September 24, 2024 22:38

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Append disabled LLVM CPU features after enabled ones #21501

Append disabled LLVM CPU features after enabled ones #21501

gaosui commented Sep 23, 2024

alexrp commented Sep 24, 2024

alexrp commented Sep 24, 2024

alexrp commented Sep 26, 2024 •

edited

Loading

rohlem commented Oct 2, 2024 •

edited

Loading

Append disabled LLVM CPU features after enabled ones #21501

Are you sure you want to change the base?

Append disabled LLVM CPU features after enabled ones #21501

Conversation

gaosui commented Sep 23, 2024

alexrp commented Sep 24, 2024

alexrp commented Sep 24, 2024

alexrp commented Sep 26, 2024 • edited Loading

rohlem commented Oct 2, 2024 • edited Loading

alexrp commented Sep 26, 2024 •

edited

Loading

rohlem commented Oct 2, 2024 •

edited

Loading