Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARC/ORC loop string allocation slowdown #15392

Closed
VP8M8 opened this issue Sep 23, 2020 · 8 comments
Closed

ARC/ORC loop string allocation slowdown #15392

VP8M8 opened this issue Sep 23, 2020 · 8 comments

Comments

@VP8M8
Copy link

VP8M8 commented Sep 23, 2020

My proof of work function encounters a bug when running it under arc/orc. I'm using a libsodium wrapper but if you replace crypto_generichash() with SHA1 in the standard library or a SHA2 library (or any other hashing library I think) the bug is also triggered. It runs fine for the default refc gc but when I enable arc or orc I only get ~50% cpu usage instead of 100% with refc. Something weird must be happening under the hood since while htop reports ~90% cpu usage, only ~50% of the load bars are green while the other half is red. This means ~50% of cpu time is used for user processes and ~50% is used for kernel processes. The refc version uses 100% cpu (all green user time) and takes 2.1 seconds to complete while arc/orc takes 17.5 seconds on my Haswell cpu.

Example

Edit: removed unnecessary code and added SHA1 version in comments.

import threadpool, strutils
import libsodium/sodium
# import std/sha1

let
  payload = "just testing?0!7801"
#  initialHashHex = $secureHash(payload)
  initialHashHex = crypto_generichash(payload, 16)
  target = 0xFFFFFFFFFF'u64

func pow(target: uint64, initialHashHex: string, nonce: uint64): (uint64, string) {.thread.} =
  var
    trialValue = high(uint64)
    nonceThread = nonce
    resultHash = newStringOfCap(64)
  while trialValue > target:
#    resultHash = $secureHash($nonceThread & initialHashHex)
    resultHash = crypto_generichash(crypto_generichash($nonceThread, 32, initialHashHex).bin2hex, 32, initialHashHex).bin2hex
    trialValue = fromHex[uint64](resultHash[0..15])
    inc(nonceThread)
  return (nonceThread-1, resultHash)

proc multithread(totalThreads: range[1..256], thread: range[1..256], initialHashHex: string, target: uint64): uint64 {.thread.} =
  let nonce = high(uint64) div uint64(totalThreads) * uint64(thread-1)
  echo "thread: ", thread, " initialNonce: ", nonce.toHex  # Debugging
  let (finalNonce, finalHash) = pow(target, initialHashHex, nonce)
  echo "thread: ", thread, " finalNonce: ", finalNonce.toHex, " finalHash: ", finalHash  # Debugging
  result = finalNonce

proc callMulti(threads: range[0..256], initialHashHex: string, target: uint64) =
  var responses = newSeq[FlowVarBase](threads)
  for thread in 1..threads:
    responses[thread-1] = spawn multithread(threads, thread, initialHashHex, target)
  var index = blockUntilAny(responses)

let threads = 4
callMulti(threads, initialHashHex, target)

Current Output

Used time <program>

thread: 1 initial-nonce: 0000000000000000
thread: 2 initial-nonce: 3FFFFFFFFFFFFFFF
thread: 3 initial-nonce: 7FFFFFFFFFFFFFFE
thread: 4 initial-nonce: BFFFFFFFFFFFFFFD
thread: 3 finalNonce: 8000000000149A1E finalHash: 000000a571afd62996c32001f8f84ecc1d178d1e4937f64bd2db65d840e76fa5

real    0m17.544s
user    0m38.656s
sys     0m25.423s

Expected Output

Used time <program>

thread: 1 initial-nonce: 0000000000000000
thread: 2 initial-nonce: 3FFFFFFFFFFFFFFF
thread: 3 initial-nonce: 7FFFFFFFFFFFFFFE
thread: 4 initial-nonce: BFFFFFFFFFFFFFFD
thread: 3 finalNonce: 8000000000149A1E finalHash: 000000a571afd62996c32001f8f84ecc1d178d1e4937f64bd2db65d840e76fa5

real    0m2.058s
user    0m7.896s
sys     0m0.009s

Possible Solution

I posted this on the forum and mratsim chimed in saying:

  • crypto_generichash($nonceThread, 32, initialHashHex), this converts (and so allocate) to a Nim string, and then convert to C-string implicitly as libsodium is using a C backend. It's possible that ARC/ORC is then reclaiming this somehow.

While that might be the case, it still happens when using a hash algorithm implemented in nim.

Additional Information

The bug exists in the latest devel and 1.2.6.

$ nim -v
Nim Compiler Version 1.3.7 [Linux: amd64]
Compiled at 2020-09-22
git hash: 4b9eea2fcc5d030393ca9020fbbee33fcb48d41a
@Araq
Copy link
Member

Araq commented Sep 23, 2020

Please provide the version that imports std / sha1

@VP8M8
Copy link
Author

VP8M8 commented Sep 23, 2020

It's the same as above except add import std/sha1 at the top and replace

resultHash = crypto_generichash(crypto_generichash($nonceThread, 32, initialHashHex).bin2hex, 32, initialHashHex).bin2hex

with

resultHash = $secureHash($nonceThread & initialHashHex)

@Araq
Copy link
Member

Araq commented Sep 25, 2020

On my machine the runtimes are 6 minutes for ARC, 4 minutes for the default GC. For unknown reasons.

@VP8M8
Copy link
Author

VP8M8 commented Sep 28, 2020

Isn't it an unknown ARC reason? Why remove the ARC label?

@VP8M8
Copy link
Author

VP8M8 commented Jan 12, 2022

Just an update saying the ARC/ORC bug is still present in Nim 1.6.2.

@ringabout
Copy link
Member

It seems to be related to threads + orc. Related: #18146

image

@ringabout
Copy link
Member

--mm:orc -d:useMalloc

image

--mm:refc
image

@ringabout
Copy link
Member

fixed on the devel

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants