Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Require the cx16 feature on Intel CPUs when threading is enabled #15219

Merged
merged 1 commit into from
Mar 6, 2016

Conversation

eschnett
Copy link
Contributor

This makes 128-bit atomics work on Intel. Without this feature, LLVM does not always know how to generate the respective code.

The CPU feature cx16 describes whether the CPU supports the cmpxchg16b instruction. All modern Intel CPUs support this feature; see the discussion in #14818.

When Julia generates code for a native 64-bit CPU, this flag is already set correctly. However, when Julia generates code for either a 32-bit or a generic Intel CPU, then LLVM assumes pessimistically that this feature is not present. According to Wikipedia https://en.wikipedia.org/wiki/X86-64, this is only relevant for "early AMD64 processors", and "the 64-bit version of Windows 8.1 requires the instruction". I thus suggest to require this instruction as well when threading is enabled.

The alternative is to disable support for 128-bit atomics. The generic CPU target is apparently specified at many occasions, including for 32-bit Intel CPUs and in Travis.

Closes #14818.

@tkelman
Copy link
Contributor

tkelman commented Feb 24, 2016

What is the earliest 32 bit x86 processor family that implements this?

@eschnett
Copy link
Contributor Author

@tkelman I found this: http://www.tomshardware.com/answers/id-2396559/cmpxchg16b-compatible-cpu.html (this forum usually has reliable information):

"Intel introduced the CMPXCHG16B (aka cx16) instruction first with the P4 Prescott/Pentium D and then added it to the P6 derivative CPUs with the Core 2 Duo. [...]

"AMD added the instruction to Bulldozer, Jaguar, and later CPU cores."

@eschnett eschnett mentioned this pull request Feb 27, 2016
@tkelman
Copy link
Contributor

tkelman commented Feb 28, 2016

I think we may as well try this and see whether it causes any issues on old hardware.

@eschnett
Copy link
Contributor Author

eschnett commented Mar 5, 2016

@tkelman Can I interpret this as "LGTM"? (I'd like to repeat this only affects builds with threading enabled.)

@tkelman
Copy link
Contributor

tkelman commented Mar 6, 2016

The fact that we don't always build with threading enabled means it would probably take longer to notice any ill effects of this.

@eschnett
Copy link
Contributor Author

eschnett commented Mar 6, 2016

True. So do you lean towards "try to break it early and visibly", "try it timidly in a corner", or "leave out support for In128"?

In the latter case, we can still detect at run time how Julia has been configured, and enable Int128 atomics if they work, but this would then see no Travis coverage.

@tkelman
Copy link
Contributor

tkelman commented Mar 6, 2016

While master is still in an unstable dev period I lean towards the former, let's find out if there are users on nightlies where this might break things (if we can work out our nightly frequency issues). Hardware dependent conditions just need lots of users to try things out and report back.

This makes 128-bit atomics work on Intel. Without this feature, LLVM does not always know how to generate the respective code.

The CPU feature `cx16` describes whether the CPU supports the `cmpxchg16b` instruction. All modern Intel CPUs support this feature; see the discussion in JuliaLang#14818.

When Julia generates code for a `native` 64-bit CPU, this flag is already set correctly. However, when Julia generates code for either a 32-bit or a `generic` Intel CPU, then LLVM assumes pessimistically that this feature is not present. According to Wikipedia <https://en.wikipedia.org/wiki/X86-64>, this is only relevant for "early AMD64 processors", and "the 64-bit version of Windows 8.1 requires the instruction". I thus suggest to require this instruction as well when threading is enabled.

The alternative is to disable support for 128-bit atomics. The generic CPU target is apparently specified at many occasions, including for 32-bit Intel CPUs and in Travis.

Closes JuliaLang#14818.
@eschnett eschnett force-pushed the eschnett/cmpxchg16b branch from 980a47f to b87d54c Compare March 6, 2016 03:48
@nalimilan
Copy link
Member

I agree, if we want to rely on this from now on, better make it visible so that we get feedback.

@tkelman
Copy link
Contributor

tkelman commented Mar 6, 2016

this function should be re indented to 4 spaces, but that wasn't your doing

tkelman added a commit that referenced this pull request Mar 6, 2016
Require the `cx16` feature on Intel CPUs when threading is enabled
@tkelman tkelman merged commit 8a7eac2 into JuliaLang:master Mar 6, 2016
@eschnett eschnett deleted the eschnett/cmpxchg16b branch March 6, 2016 14:01
@tkelman
Copy link
Contributor

tkelman commented Mar 10, 2016

This feature requirement should probably also be conditional on LLVM version. We're still kind of supporting building against old LLVM until we've fixed the compiler performance regressions, and on 3.3 this gives a warning on startup that '+cx16' is not a recognized feature for this target.

@vtjnash
Copy link
Member

vtjnash commented Mar 10, 2016

i have that fixed locally and plan to make a PR soon.

@oscardssmith oscardssmith mentioned this pull request Apr 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants