Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Atomic C functions are not atomic. #41

Open
TravisWhitaker opened this issue Jun 25, 2020 · 6 comments
Open

Atomic C functions are not atomic. #41

TravisWhitaker opened this issue Jun 25, 2020 · 6 comments

Comments

@TravisWhitaker
Copy link

TravisWhitaker commented Jun 25, 2020

hs_atomic_read and hs_atomic_write aren't atomic at all on machines without strong memory models (total read and store ordering). To see an example of the kind of havoc this bad assumption can cause, see this (thankfully now fixed) GHC issue https://gitlab.haskell.org/ghc/ghc/-/issues/15449

Here's the aarch64 code that GCC 8.2 yields for these functions: https://godbolt.org/z/giVdtx No barriers or acq/rel lda/sta emitted...

I'm curious why C functions are used for these atomic operations at all. Are these functions really that much faster than atomicModifyIORef?

@TravisWhitaker
Copy link
Author

The Atomic values that we operate on with these C functions are in a ForeignPtr https://github.com/tibbe/ekg-core/blob/f8d26b9a3806125694c25b9459ec0c4a0b2e87ce/Data/Atomic.hs#L22

Since IORef is just a MutVar#, using these C functions might even be slower than an IORef.

I'd be happy to take a crack at moving over to IORef, but perhaps there's something I'm missing (or I'm standing on the grave of some really tricky bug...).

@tibbe
Copy link
Collaborator

tibbe commented Jun 25, 2020

IIRC I measured it and it was much faster. In particular a MutVar# update implies allocation for the heap value it refers to.

@TravisWhitaker
Copy link
Author

Yikes, for some reason I thought that a MutVar's contents would be unboxed if the contained type was unboxed, but it seems that's not the case: https://gitlab.haskell.org/ghc/ghc/-/blob/master/includes/rts/storage/Closures.h#L181

MutableByteArray# has the required atomic operations, though. So we could sprinkle __sync__synchronize() around, or give MutableByteArray# a spin.

@TravisWhitaker
Copy link
Author

@tibbe is there any interest in fixing this bug (via #42 or otherwise)? As-is, ekg is totally broken on aarch64.

CC @23Skidoo

@23Skidoo
Copy link
Collaborator

I'll look into cutting new releases during the weekend.

@TravisWhitaker
Copy link
Author

I’d definitely like to get more eyes on #42 before it lands in a release. I’ve been using it since I opened the PR, but my use cases might not hit certain issues.

awjchen added a commit to hasura/ekg-core that referenced this issue May 23, 2021
This commit attempts to address issue haskell-github-trust#41 of tibbe/ekg-core by
replaceing the C code for `Atomic` with GHC prim ops
(`fetchAddIntArray`).

However, we insist on a 64-bit counter, so if machine does not
support 64-bit prim ops, we fall back to using an `IORef` and
`atomicModifyIORefCAS`.

The performance of the 64-bit prim ops implementation is somewhat slower
than the existing C code, perhaps due to the additional conversion
between `Int` and `Int64`.
awjchen added a commit to hasura/ekg-core that referenced this issue May 29, 2021
This commit attempts to address issue haskell-github-trust#41 of tibbe/ekg-core by
replacing the C code for the distribution metric with GHC prim ops.

The performance of this implementation is about half that of the
existing C code in a single-threaded benchmark; without masking the
performance is comparable.

This commit is based on the work of Travis Whitaker in PR haskell-github-trust#42 of
tibbe/ekg-core.
awjchen added a commit to hasura/ekg-core that referenced this issue Jun 23, 2021
This commit attempts to address issue haskell-github-trust#41 of tibbe/ekg-core by
replaceing the C code for `Atomic` with GHC prim ops
(`fetchAddIntArray`).

However, we insist on a 64-bit counter, so if machine does not
support 64-bit prim ops, we fall back to using an `IORef` and
`atomicModifyIORefCAS`.

The performance of the 64-bit prim ops implementation is somewhat slower
than the existing C code, perhaps due to the additional conversion
between `Int` and `Int64`.
awjchen added a commit to hasura/ekg-core that referenced this issue Jun 23, 2021
This commit attempts to address issue haskell-github-trust#41 of tibbe/ekg-core by
replacing the C code for the distribution metric with GHC prim ops.

The performance of this implementation is about half that of the
existing C code in a single-threaded benchmark; without masking the
performance is comparable.

This commit is based on the work of Travis Whitaker in PR haskell-github-trust#42 of
tibbe/ekg-core.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants