Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add jl_getaffinity and jl_setaffinity #53402

Merged
merged 18 commits into from
May 25, 2024
Merged

Conversation

carstenbauer
Copy link
Member

@carstenbauer carstenbauer commented Feb 20, 2024

This PR adds two functions jl_getaffinity and jl_setaffinity to the runtime, which are slim wrappers around uv_thread_getaffinity and uv_thread_setaffinity and can be used to set the affinity of Julia threads.

This will

  • simplify thread pinning (ThreadPinning.jl currently pins threads by spawning tasks that run the necessary ccalls) and
  • enable users to also pin GC threads (or, more generally, all Julia threads).

Example:

bauerc@n2lcn0146 julia git:(cb/affinity)  
➜ ./julia -q --startup-file=no --threads 2,3 --gcthreads 4,1
julia> cpumasksize = @ccall uv_cpumask_size()::Cint
1024

julia> mask = zeros(Cchar, cpumasksize);

julia> jl_getaffinity(tid, mask, cpumasksize) = ccall(:jl_getaffinity, Int32, (Int16, Ptr{Cchar}, Int32), tid, mask, cpumasksize)
jl_getaffinity (generic function with 1 method)

julia> jl_getaffinity(1, mask, cpumasksize)
0

julia> print(mask[1:Sys.CPU_THREADS])
Int8[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

julia> mask[1] = 0;

julia> jl_setaffinity(tid, mask, cpumasksize) = ccall(:jl_setaffinity, Int32, (Int16, Ptr{Cchar}, Int32), tid, mask, cpumasksize)
jl_setaffinity (generic function with 1 method)

julia> jl_setaffinity(1, mask, cpumasksize)
0

julia> fill!(mask, 0);

julia> jl_getaffinity(1, mask, cpumasksize)
0

julia> print(mask[1:Sys.CPU_THREADS])
Int8[0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

(cc @vchuravy, @gbaraldi)

Would be great to get this into 1.11 (despite feature freeze) because otherwise we won't be able to pin GC threads until 1.12 (likely not until the end of the year).

Closes #53073

@carstenbauer carstenbauer added the multithreading Base.Threads and related functionality label Feb 20, 2024
@carstenbauer
Copy link
Member Author

carstenbauer commented Feb 20, 2024

Open questions:

  • Currently I use jl_atomic_load_acquire to query jl_n_threads and jl_atomic_load_relaxed to get jl_all_tls_states. Is this ok?
  • I put the declaration at the end of julia_threads.h. Is this fine?
  • I added the two new functions to jl_exported_funcs.inc. Is this how I should do it, or is this auto-generated somehow?
  • To what extend do we test/document these functions?

Copy link
Member

@vchuravy vchuravy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs at least some minimal tests

src/threading.c Outdated Show resolved Hide resolved
src/threading.c Outdated Show resolved Hide resolved
@carstenbauer
Copy link
Member Author

Added minimal tests and incorporated the changes suggested by @vchuravy.

src/threading.c Outdated Show resolved Hide resolved
@carstenbauer
Copy link
Member Author

carstenbauer commented Feb 21, 2024

Adding backport label for 1.11 for the reasons mentioned above (should also be very straightforward to backport). Please chime in if you disagree.

@carstenbauer carstenbauer added the backport 1.11 Change should be backported to release-1.11 label Feb 21, 2024
@KristofferC KristofferC mentioned this pull request Feb 26, 2024
28 tasks
test/threads.jl Outdated Show resolved Hide resolved
KristofferC added a commit that referenced this pull request Mar 1, 2024
Backported PRs:
- [x] #53361 <!-- 🤖 [master] Bump the SparseArrays stdlib from c9f7293
to cb602d7 -->
- [x] #53300 <!-- allow external absint to hold custom data in
`codeinst.inferred` -->
- [x] #53342 <!-- Add `Base.wrap` to docs -->
- [x] #53372 <!-- Silence warnings in `test/file.jl` -->
- [x] #53357 <!-- 🤖 [master] Bump the Pkg stdlib from 6dd0e7c9e to
76070d295 -->
- [x] #53373 <!-- fix sysimage-native-code=no option with pkgimages -->
- [x] #53333 <!-- More consistent return value for annotations, take 2
-->
- [x] #53354 <!-- fix code coverage bug in tail position and `else` -->
- [x] #53407 <!-- fix sysimage-native-code=yes option -->
- [x] #53388 <!-- Fix documentation: thread pool of main thread -->
- [x] #53355 <!-- Fix synchronization issues on the GC scheduler. -->
- [x] #53429 <!-- Subtype: skip slow-path in `local_∀_∃_subtype` if
inputs contain no ∃ typevar. -->
- [x] #53437 <!-- Add debug variant of loader_trampolines.o -->
- [x] #53284 <!-- Add annotate! method for AnnotatedIOBuffer -->
- [x] #53466 <!-- [MozillaCACerts_jll] Update to v2023-12-12 -->
- [x] #53467 <!-- [LibGit2_jll] Update to v1.7.2 -->
- [x] #53326 <!-- RFC: when loading code for internal purposes, load
stdlib files directly, bypassing DEPOT_PATH, LOAD_PATH, and stale checks
-->
- [x] #53332
- [x] #53320 <!-- Add `Sys.isreadable, Sys.iswriteable`, update `ispath`
-->
- [x] #53476

Contains multiple commits, manual intervention needed:
- [ ] #53285 <!-- Add update mechanism for Terminfo, and common
user-alias data -->

Non-merged PRs with backport label:
- [ ] #53424 <!-- yet more atomics & cache-line fixes on work-stealing
queue -->
- [ ] #53408 <!-- task splitting: change additive accumulation to
multiplicative -->
- [ ] #53403 <!-- Move parallel precompilation to Base -->
- [ ] #53402 <!-- Add `jl_getaffinity` and `jl_setaffinity` -->
- [ ] #53391 <!-- Default to the medium code model in x86 linux -->
- [ ] #53125 <!-- coverage: count coverage where explicitly requested by
inference only -->
- [ ] #52694 <!-- Reinstate similar for AbstractQ for backward
compatibility -->
@KristofferC KristofferC mentioned this pull request Mar 1, 2024
60 tasks
@carstenbauer carstenbauer requested a review from vtjnash March 7, 2024 16:28
src/threading.c Outdated Show resolved Hide resolved
@vtjnash vtjnash added the merge me PR is reviewed. Merge when all tests are passing label Mar 7, 2024
@IanButterworth
Copy link
Member

Windows failures appear related

Test Failed at C:\buildkite-agent\builds\win2k22-amdci6-1\julialang\julia-master\julia-a50eb050b7\share\julia\test\threads.jl:353
--
  | Expression: jl_setaffinity(1, mask, cpumasksize) == 0
  | Evaluated: -4071 == 0

@carstenbauer
Copy link
Member Author

Strange. Looking at the libuv source code and https://docs.libuv.org/en/v1.x/errors.html, this return code seems to refer to "invalid argument". We pass 3 arguments, 1, mask, and cpumasksize. The latter is queried from libuv itself and thus should be safe. I think we can also rule out that 1 is the issue. That leaves us with mask, which is just mask = fill!(zeros(Cchar, cpumasksize), 1). Unfortunately, I don't have a Windows machine to test this live. On Linux, when I try to provide illegal masks, I don't get the error code we're seeing here:

julia> jl_setaffinity(1, fill!(zeros(Cchar, cpumasksize), 0), cpumasksize) # different error code if all mask entries are zero
-22

julia> jl_setaffinity(1, fill!(zeros(Cchar, cpumasksize+3), 1), cpumasksize) # increasing the mask size beyond cpumasksize doesn't seem to be a problem?
0

Of course, we could simply not test this on Windows but, in contrast to macOS, it should work there...

KristofferC added a commit that referenced this pull request Mar 17, 2024
Backported PRs:
- [x] #39071 <!-- Add a lazy `logrange` function and `LogRange` type -->
- [x] #51802 <!-- Allow AnnotatedStrings in log messages -->
- [x] #53369 <!-- Orthogonalize re-indexing for FastSubArrays -->
- [x] #48050 <!-- improve `--heap-size-hint` arg handling -->
- [x] #53482 <!-- add IR encoding for EnterNode -->
- [x] #53499 <!-- Avoid compiler warning about redefining jl_globalref_t
-->
- [x] #53507 <!-- update staled `Core.Compiler.Effects` documentation
-->
- [x] #53408 <!-- task splitting: change additive accumulation to
multiplicative -->
- [x] #53523 <!-- add back an alias for `check_top_bit` -->
- [x] #53377 <!-- add _readdirx for returning more object info gathered
during dir scan -->
- [x] #53525 <!-- fix InteractiveUtils call in Base.runtests on failure
-->
- [x] #53540 <!-- use more efficient `_readdirx` for tab completion -->
- [x] #53545 <!-- use `_readdirx` for `walkdir` -->
- [x] #53551 <!-- revert "Add @create_log_macro for making custom styled
logging macros (#52196)" -->
- [x] #53554 <!-- Always return a value in 1-d circshift! of
abstractarray.jl -->
- [x] #53424 <!-- yet more atomics & cache-line fixes on work-stealing
queue -->
- [x] #53571 <!-- Update Documenter to v1.3 for inventory writing -->
- [x] #53403 <!-- Move parallel precompilation to Base -->
- [x] #53589 <!-- add back `unsafe_convert` to pointer for arrays -->
- [x] #53596 <!-- build: remove extra .a file -->
- [x] #53606 <!-- fix error path in `precompilepkgs` -->
- [x] #53004 <!-- Unexport with, at_with, and ScopedValue from Base -->
- [x] #53629 <!-- typo fix in scoped values docs -->
- [x] #53630 <!-- sroa: Fix incorrect scope counting -->
- [x] #53598 <!-- Use Base parallel precompilation to build stdlibs -->
- [x] #53649 <!-- precompilepkgs: package in boths deps and weakdeps are
in fact only weak -->
- [x] #53671 <!-- Fix bootstrap Base precompile in cross compile
configuration -->
- [x] #52125 <!-- Load Pkg if not already to reinstate missing package
add prompt -->
- [x] #53602 <!-- Handle zero on arrays of unions of number types and
missings -->
- [x] #53516 <!-- permit NamedTuple{<:Any, Union{}} to be created -->
- [x] #53643 <!-- Bump CSL to 1.1.1 to fix libgomp bug -->
- [x] #53679 <!-- move precompile workload back from Base -->
- [x] #53663 <!-- add isassigned methods for reinterpretarray -->
- [x] #53662 <!-- [REPL] fix incorrectly cleared line after completions
accepted -->
- [x] #53611 <!-- Linalg: matprod_dest for Diagonal and adjvec -->
- [x] #53659 <!-- fix #52025, re-allow all implicit pointer casts in
cconvert for Array -->
- [x] #53631 <!-- LAPACK: validate input parameters to throw informative
errors -->
- [x] #53628 <!-- Make some improvements to the Scoped Values
documentation. -->
- [x] #53655 <!-- Change tbaa of ptr_phi to tbaa_value  -->
- [x] #53391 <!-- Default to the medium code model in x86 linux -->
- [x] #53699 <!-- Move `isexecutable, isreadable, iswritable` to
`filesystem.jl` -->
- [x] #41232 <!-- Fix linear indexing for ReshapedArray if the parent
has offset axes -->
- [x] #53527 <!-- Enable analyzegc checks for try catch and fix found
issues -->
- [x] #52092 
- [x] #53682 <!-- Increase build precompilation -->
- [x] #53720 
- [x] #53553 <!-- typeintersect: fix `UnionAll` unaliasing bug caused by
innervars. -->

Contains multiple commits, manual intervention needed:
- [ ] #53305 <!-- Propagate inbounds in isassigned with CartesianIndex
indices -->

Non-merged PRs with backport label:
- [ ] #53736 <!-- fix literal-pow to return the right type when the base
is -1 -->
- [ ] #53707 <!-- Make ScopedValue public -->
- [ ] #53696 <!-- add invokelatest to on_done callback in bracketed
paste -->
- [ ] #53660 <!-- put Logging back in default sysimage -->
- [ ] #53509 <!-- revert moving "creating packages" from Pkg.jl -->
- [ ] #53452 <!-- RFC: allow Tuple{Union{}}, returning Union{} -->
- [ ] #53402 <!-- Add `jl_getaffinity` and `jl_setaffinity` -->
- [ ] #52694 <!-- Reinstate similar for AbstractQ for backward
compatibility -->
- [ ] #51928 <!-- Styled markdown, with a few tweaks -->
- [ ] #51816 <!-- User-themable stacktraces -->
- [ ] #51811 <!-- Make banner size depend on terminal size -->
- [ ] #51479 <!-- prevent code loading from lookin in the versioned
environment when building Julia -->
@KristofferC KristofferC mentioned this pull request Mar 20, 2024
41 tasks
@carstenbauer
Copy link
Member Author

carstenbauer commented Mar 21, 2024

Test are passing now (I choose a mask where only the first CPU thread is marked as 1, i.e. pinning to the first CPU thread).

However, depending on how the CI actually runs (does it have exclusive machine access?) I could imagine that the first CPU thread isn't always guaranteed to be available to the process..... Let me test the original mask (all ones) once again to see if the Windows CI failure was just a fluke. If it wasn't I think I would vote to simply disable this test on Windows.

@carstenbauer
Copy link
Member Author

Alright same test that was passing above is now suddenly failing...

@carstenbauer
Copy link
Member Author

@IanButterworth, given we're fine with not testing on Windows - I don't know why it fails and don't know how to reliably fix the test for it - this can be merged now.

While not ideal, I don't think it's too bad to not test this on Windows. After all, (1) this isn't documented API and (2) pinning threads typically is only really used on Linux (clusters) anyways. But I'm open to suggestions, of course.

@IanButterworth
Copy link
Member

Seems this needs a rebase now. Also, I just wanted to double-check that merging this wouldn't introduce a flaky test?

@IanButterworth
Copy link
Member

Bump (given this has merge me)

@KristofferC KristofferC mentioned this pull request Apr 17, 2024
59 tasks
carstenbauer and others added 4 commits April 17, 2024 11:55
Co-authored-by: Valentin Churavy <vchuravy@users.noreply.github.com>
@carstenbauer
Copy link
Member Author

@IanButterworth I've just rebased. Let's see if the test is flaky. (If it turns out to be problematic, maybe we shouldn't test this - internal feature - at all?)

@giordano
Copy link
Contributor

There's a test failure in 32-bit windows job in "threads" set, but on the phone I can't easily see the error message and tell if it's relevant.

@DilumAluthge
Copy link
Member

CI is now all green. @vtjnash @vchuravy

@vchuravy vchuravy merged commit 065aeb6 into JuliaLang:master May 25, 2024
7 checks passed
@DilumAluthge DilumAluthge removed the merge me PR is reviewed. Merge when all tests are passing label May 25, 2024
KristofferC added a commit that referenced this pull request May 28, 2024
Backported PRs:
- [x] #53665 <!-- use afoldl instead of tail recursion for tuples -->
- [x] #53976 <!-- LinearAlgebra: LazyString in interpolated error
messages -->
- [x] #54005 <!-- make `view(::Memory, ::Colon)` produce a Vector -->
- [x] #54010 <!-- Overload `Base.literal_pow` for `AbstractQ` -->
- [x] #54069 <!-- Allow PrecompileTools to see MI's inferred by foreign
abstract interpreters -->
- [x] #53750 <!-- inference correctness: fields and globals can revert
to undef -->
- [x] #53984 <!-- Profile: fix heap snapshot is valid char check -->
- [x] #54102 <!-- Explicitly compute stride in unaliascopy for SubArray
-->
- [x] #54070 <!-- Fix integer overflow in `skip(s::IOBuffer,
typemax(Int64))` -->
- [x] #54013 <!-- Support case-changes to Annotated{String,Char}s -->
- [x] #53941 <!-- Fix writing of AnnotatedChars to AnnotatedIOBuffer -->
- [x] #54137 <!-- Fix typo in docs for `partialsortperm` -->
- [x] #54129 <!-- use correct size when creating output data from an
IOBuffer -->
- [x] #54153 <!-- Fixup IdSet docstring -->
- [x] #54143 <!-- Fix `make install` from tarballs -->
- [x] #54151 <!-- LinearAlgebra: Correct zero element in
`_generic_matvecmul!` for block adj/trans -->
- [x] #54213 <!-- Add `public` statement to `Base.GC` -->
- [x] #54222 <!-- Utilize correct tbaa when emitting stores of unions.
-->
- [x] #54233 <!-- set MAX_OS_WRITE on unix -->
- [x] #54255 <!-- fix `_checked_mul_dims` in the presence of 0s and
overflow. -->
- [x] #54259 <!-- Fix typo in `readuntil` -->
- [x] #54251 <!-- fix typo in gc_mark_memory8 when chunking a large
array -->
- [x] #54276 <!-- Fix solve for complex `Hermitian` with non-vanishing
imaginary part on diagonal -->
- [x] #54248 <!-- ensure package callbacks are invoked when no valid
precompile file exists for an "auto loaded" stdlib -->
- [x] #54308 <!-- Implement eval-able AnnotatedString 2-arg show -->
- [x] #54302 <!-- Specialised substring equality for annotated strs -->
- [x] #54243 <!-- prevent `package_callbacks` to run multiple time for a
single package -->
- [x] #54350 <!-- add a precompile signature to Artifacts code that is
used by JLLs -->
- [x] #54331 <!-- correctly track freed bytes in
jl_genericmemory_to_string -->
- [x] #53509 <!-- revert moving "creating packages" from Pkg.jl -->
- [x] #54335 <!-- When accessing the data pointer for an array, first
decay it to a Derived Pointer -->
- [x] #54239 <!-- Make sure `fieldcount` constant-folds for `Tuple{...}`
-->
- [x] #54288
- [x] #54067
- [x] #53715 <!-- Add read/write specialisation for IOContext{AnnIO} -->
- [x] #54289 <!-- Rework annotation ordering/optimisations -->
- [x] #53815 <!-- create phantom task for GC threads -->
- [x] #54130 <!-- inference: handle `LimitedAccuracy` in
`handle_global_assignment!` -->
- [x] #54428 <!-- Move ConsoleLogging.jl into Base -->
- [x] #54332 <!-- Revert "add unsetindex support to more copyto methods
(#51760)" -->
- [x] #53826 <!-- Make all command-line options documented in all
related files -->
- [x] #54465 <!-- typeintersect: conservative typevar subtitution during
`finish_unionall` -->
- [x] #54514 <!-- typeintersect: followup cleanup for the nothrow path
of type instantiation -->
- [x] #54499 <!-- make `@doc x` work without REPL loaded -->
- [x] #54210 <!-- attach finalizer in `mmap` to the correct object -->
- [x] #54359 <!-- Pkg REPL: cache `pkg_mode` lookup -->

Non-merged PRs with backport label:
- [ ] #54471 <!-- Actually setup jit targets when compiling
packageimages instead of targeting only one -->
- [ ] #54457 <!-- Make `String(::Memory)` copy -->
- [ ] #54323 <!-- inference: fix too conservative effects for recursive
cycles -->
- [ ] #54322 <!-- effects: add new `@consistent_overlay` macro -->
- [ ] #54191 <!-- make `AbstractPipe` public -->
- [ ] #53957 <!-- tweak how filtering is done for what packages should
be precompiled -->
- [ ] #53882 <!-- Warn about cycles in extension precompilation -->
- [ ] #53707 <!-- Make ScopedValue public -->
- [ ] #53452 <!-- RFC: allow Tuple{Union{}}, returning Union{} -->
- [ ] #53402 <!-- Add `jl_getaffinity` and `jl_setaffinity` -->
- [ ] #53286 <!-- Raise an error when using `include_dependency` with
non-existent file or directory -->
- [ ] #52694 <!-- Reinstate similar for AbstractQ for backward
compatibility -->
- [ ] #51479 <!-- prevent code loading from lookin in the versioned
environment when building Julia -->
@KristofferC KristofferC mentioned this pull request May 29, 2024
60 tasks
DilumAluthge added a commit that referenced this pull request Jun 3, 2024
This PR adds two functions `jl_getaffinity` and `jl_setaffinity` to the
runtime, which are slim wrappers around `uv_thread_getaffinity` and
`uv_thread_setaffinity` and can be used to set the affinity of Julia
threads.

This will
* simplify thread pinning (ThreadPinning.jl currently pins threads by
spawning tasks that run the necessary ccalls) and
* enable users to also pin GC threads (or, more generally, all Julia
threads).

**Example:**
```julia
bauerc@n2lcn0146 julia git:(cb/affinity)  
➜ ./julia -q --startup-file=no --threads 2,3 --gcthreads 4,1
julia> cpumasksize = @CCall uv_cpumask_size()::Cint
1024

julia> mask = zeros(Cchar, cpumasksize);

julia> jl_getaffinity(tid, mask, cpumasksize) = ccall(:jl_getaffinity, Int32, (Int16, Ptr{Cchar}, Int32), tid, mask, cpumasksize)
jl_getaffinity (generic function with 1 method)

julia> jl_getaffinity(1, mask, cpumasksize)
0

julia> print(mask[1:Sys.CPU_THREADS])
Int8[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

julia> mask[1] = 0;

julia> jl_setaffinity(tid, mask, cpumasksize) = ccall(:jl_setaffinity, Int32, (Int16, Ptr{Cchar}, Int32), tid, mask, cpumasksize)
jl_setaffinity (generic function with 1 method)

julia> jl_setaffinity(1, mask, cpumasksize)
0

julia> fill!(mask, 0);

julia> jl_getaffinity(1, mask, cpumasksize)
0

julia> print(mask[1:Sys.CPU_THREADS])
Int8[0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
```
(cc @vchuravy, @gbaraldi)

Would be great to get this into 1.11 (despite feature freeze) because
otherwise we won't be able to pin GC threads until 1.12 (likely not
until the end of the year).

Closes #53073

---------

Co-authored-by: Valentin Churavy <vchuravy@users.noreply.github.com>
Co-authored-by: Dilum Aluthge <dilum@aluthge.com>
KristofferC pushed a commit that referenced this pull request Jun 7, 2024
This PR adds two functions `jl_getaffinity` and `jl_setaffinity` to the
runtime, which are slim wrappers around `uv_thread_getaffinity` and
`uv_thread_setaffinity` and can be used to set the affinity of Julia
threads.

This will
* simplify thread pinning (ThreadPinning.jl currently pins threads by
spawning tasks that run the necessary ccalls) and
* enable users to also pin GC threads (or, more generally, all Julia
threads).

**Example:**
```julia
bauerc@n2lcn0146 julia git:(cb/affinity)
➜ ./julia -q --startup-file=no --threads 2,3 --gcthreads 4,1
julia> cpumasksize = @CCall uv_cpumask_size()::Cint
1024

julia> mask = zeros(Cchar, cpumasksize);

julia> jl_getaffinity(tid, mask, cpumasksize) = ccall(:jl_getaffinity, Int32, (Int16, Ptr{Cchar}, Int32), tid, mask, cpumasksize)
jl_getaffinity (generic function with 1 method)

julia> jl_getaffinity(1, mask, cpumasksize)
0

julia> print(mask[1:Sys.CPU_THREADS])
Int8[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

julia> mask[1] = 0;

julia> jl_setaffinity(tid, mask, cpumasksize) = ccall(:jl_setaffinity, Int32, (Int16, Ptr{Cchar}, Int32), tid, mask, cpumasksize)
jl_setaffinity (generic function with 1 method)

julia> jl_setaffinity(1, mask, cpumasksize)
0

julia> fill!(mask, 0);

julia> jl_getaffinity(1, mask, cpumasksize)
0

julia> print(mask[1:Sys.CPU_THREADS])
Int8[0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
```
(cc @vchuravy, @gbaraldi)

Would be great to get this into 1.11 (despite feature freeze) because
otherwise we won't be able to pin GC threads until 1.12 (likely not
until the end of the year).

Closes #53073

---------

Co-authored-by: Valentin Churavy <vchuravy@users.noreply.github.com>
Co-authored-by: Dilum Aluthge <dilum@aluthge.com>
(cherry picked from commit 065aeb6)
@ararslan
Copy link
Member

This errors on FreeBSD, so CI jobs now all have:

Error in testset threads:
Test Failed at /usr/home/julia/.buildkite-agent/builds/freebsd13-amdci6-3/julialang/julia-master/julia-58927d885a/share/julia/test/threads.jl:366
  Expression: jl_setaffinity(1, mask, cpumasksize) == 0
   Evaluated: -22 == 0

KristofferC added a commit that referenced this pull request Jun 25, 2024
Backported PRs:
- [x] #54361 <!-- [LBT] Upgrade to v5.9.0 -->
- [x] #54474 <!-- Unalias source from dest in copytrito -->
- [x] #54548 <!-- Fixes for bitcast bugs with LLVM 17 / opaque pointers
-->
- [x] #54191 <!-- make `AbstractPipe` public -->
- [x] #53402 <!-- Add `jl_getaffinity` and `jl_setaffinity` -->
- [x] #53356 <!-- Rename at-scriptdir project argument to at-script and
search upwards for Project.toml -->
- [x] #54545 <!-- typeintersect: fix incorrect innervar handling under
circular env -->
- [x] #54586 <!-- Set storage class of julia globals to dllimport on
windows to avoid auto-import weirdness. Forward port of #54572 -->
- [x] #54587 <!-- Accomodate for rectangular matrices in `copytrito!`
-->
- [x] #54617 <!-- CLI: Use `GetModuleHandleExW` to locate libjulia.dll
-->
- [x] #54605 <!-- Allow libquadmath to also fail as it is not available
on all systems -->
- [x] #54634 <!-- Fix trampoline assembly for build on clang 18 on apple
silicon -->
- [x] #54635 <!-- Aggressive constprop in trevc! to stabilize triangular
eigvec -->
- [x] #54645 <!-- ensure we set the right value to gc_first_tid -->
- [x] #54554 <!-- make elsize public -->
- [x] #54648 <!-- Construct LazyString in error paths for tridiag -->
- [x] #54658 <!-- fix missing uuid check on extension when finding the
location of an extension -->
- [x] #54594 <!-- Switch to Pkg mode prompt immediately and load Pkg in
the background -->
- [x] #54669 <!-- Improve error message in inplace transpose -->
- [x] #54671 <!-- Add boundscheck in bindingkey_eq to avoid OOB access
due to data race -->
- [x] #54672 <!-- make: Fix `sed` command for LLVM libraries with no
symbol versioning -->
- [x] #54624 <!-- more precise aliasing checks for SubArray -->
- [x] #54679 <!-- 🤖 [master] Bump the Distributed stdlib from 6a07d98 to
6c7cdb5 -->
- [x] #54604 <!-- Fix tbaa annotation on union selector bytes inside of
structs -->
- [x] #54690 <!-- Fix assertion/crash when optimizing function with dead
basic block -->
- [x] #54704 <!-- LazyString in reinterpretarray error messages -->
- [x] #54718 <!-- fix prepend StackOverflow issue -->
- [x] #54674 <!-- Reimplement dummy pkg prompt as standard prompt -->
- [x] #54737 <!-- LazyString in interpolated error messages involving
types -->
- [x] #54642 <!-- Document GenericMemory and AtomicMemory -->
- [x] #54713 <!-- make: use `readelf` for LLVM symbol version detection
-->
- [x] #54760 <!-- REPL: improve prompt! async function handler -->
- [x] #54606 <!-- fix double-counting and non-deterministic results in
`summarysize` -->
- [x] #54759 <!-- REPL: Fully populate the dummy Pkg prompt -->
- [x] #54702 <!-- lowering: Recognize argument destructuring inside
macro hygiene -->
- [x] #54678 <!-- Don't let setglobal! implicitly create bindings -->
- [x] #54730 <!-- Fix uuidkey of exts in fast path of `require_stdlib`
-->
- [x] #54765 <!-- Handle no-postdominator case in finalizer pass -->
- [x] #54591 <!-- Don't expose guard pages to malloc_stack API consumers
-->
- [x] #54755 <!-- [TOML] remove Dates hack, replace with explicit usage
-->
- [x] #54721 <!-- add try/catch around scheduler to reset sleep state
-->
- [x] #54631 <!-- Avoid concatenating LazyString in setindex! for
triangular matrices -->
- [x] #54322 <!-- effects: add new `@consistent_overlay` macro -->
- [x] #54785
- [x] #54865
- [x] #54815
- [x] #54795
- [x] #54779
- [x] #54837 

Contains multiple commits, manual intervention needed:
- [ ] #52694 <!-- Reinstate similar for AbstractQ for backward
compatibility -->
- [ ] #54649 <!-- Less restrictive copyto! signature for triangular
matrices -->

Non-merged PRs with backport label:
- [ ] #54779 <!-- make recommendation clearer on manifest version
mismatch -->
- [ ] #54739 <!-- finish implementation of upgradable stdlibs -->
- [ ] #54738 <!-- serialization: fix relocatability bug -->
- [ ] #54574 <!-- Make ScopedValues public -->
- [ ] #54457 <!-- Make `String(::Memory)` copy -->
- [ ] #53957 <!-- tweak how filtering is done for what packages should
be precompiled -->
- [ ] #53452 <!-- RFC: allow Tuple{Union{}}, returning Union{} -->
- [ ] #53286 <!-- Raise an error when using `include_dependency` with
non-existent file or directory -->
- [ ] #51479 <!-- prevent code loading from lookin in the versioned
environment when building Julia -->
@KristofferC KristofferC removed the backport 1.11 Change should be backported to release-1.11 label Jun 25, 2024
lazarusA pushed a commit to lazarusA/julia that referenced this pull request Jul 12, 2024
This PR adds two functions `jl_getaffinity` and `jl_setaffinity` to the
runtime, which are slim wrappers around `uv_thread_getaffinity` and
`uv_thread_setaffinity` and can be used to set the affinity of Julia
threads.

This will
* simplify thread pinning (ThreadPinning.jl currently pins threads by
spawning tasks that run the necessary ccalls) and
* enable users to also pin GC threads (or, more generally, all Julia
threads).

**Example:**
```julia
bauerc@n2lcn0146 julia git:(cb/affinity)  
➜ ./julia -q --startup-file=no --threads 2,3 --gcthreads 4,1
julia> cpumasksize = @CCall uv_cpumask_size()::Cint
1024

julia> mask = zeros(Cchar, cpumasksize);

julia> jl_getaffinity(tid, mask, cpumasksize) = ccall(:jl_getaffinity, Int32, (Int16, Ptr{Cchar}, Int32), tid, mask, cpumasksize)
jl_getaffinity (generic function with 1 method)

julia> jl_getaffinity(1, mask, cpumasksize)
0

julia> print(mask[1:Sys.CPU_THREADS])
Int8[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

julia> mask[1] = 0;

julia> jl_setaffinity(tid, mask, cpumasksize) = ccall(:jl_setaffinity, Int32, (Int16, Ptr{Cchar}, Int32), tid, mask, cpumasksize)
jl_setaffinity (generic function with 1 method)

julia> jl_setaffinity(1, mask, cpumasksize)
0

julia> fill!(mask, 0);

julia> jl_getaffinity(1, mask, cpumasksize)
0

julia> print(mask[1:Sys.CPU_THREADS])
Int8[0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
```
(cc @vchuravy, @gbaraldi)

Would be great to get this into 1.11 (despite feature freeze) because
otherwise we won't be able to pin GC threads until 1.12 (likely not
until the end of the year).

Closes JuliaLang#53073

---------

Co-authored-by: Valentin Churavy <vchuravy@users.noreply.github.com>
Co-authored-by: Dilum Aluthge <dilum@aluthge.com>
ararslan added a commit that referenced this pull request Jul 27, 2024
Changes made:
- Use 0 for the thread ID to ensure it's always valid. The function
expects `0 <= tid < jl_n_threads` so 1 is incorrect if `jl_n_threads` is
1.
- After retrieving the affinity mask with `jl_getaffinity`, pass that
same mask back to `jl_setaffinity`. This ensures that the mask is always
valid. Using a mask of all ones results in `EINVAL` on FreeBSD. Based on
the discussion in #53402, this change may also fix Windows, so I've
tried reenabling it here.
- To check whether `jl_getaffinity` actually did something, we can check
that the mask is no longer all zeros after the call.

Fixes #54817
KristofferC pushed a commit that referenced this pull request Aug 2, 2024
Changes made:
- Use 0 for the thread ID to ensure it's always valid. The function
expects `0 <= tid < jl_n_threads` so 1 is incorrect if `jl_n_threads` is
1.
- After retrieving the affinity mask with `jl_getaffinity`, pass that
same mask back to `jl_setaffinity`. This ensures that the mask is always
valid. Using a mask of all ones results in `EINVAL` on FreeBSD. Based on
the discussion in #53402, this change may also fix Windows, so I've
tried reenabling it here.
- To check whether `jl_getaffinity` actually did something, we can check
that the mask is no longer all zeros after the call.

Fixes #54817

(cherry picked from commit 8a7e23d)
lazarusA pushed a commit to lazarusA/julia that referenced this pull request Aug 17, 2024
Changes made:
- Use 0 for the thread ID to ensure it's always valid. The function
expects `0 <= tid < jl_n_threads` so 1 is incorrect if `jl_n_threads` is
1.
- After retrieving the affinity mask with `jl_getaffinity`, pass that
same mask back to `jl_setaffinity`. This ensures that the mask is always
valid. Using a mask of all ones results in `EINVAL` on FreeBSD. Based on
the discussion in JuliaLang#53402, this change may also fix Windows, so I've
tried reenabling it here.
- To check whether `jl_getaffinity` actually did something, we can check
that the mask is no longer all zeros after the call.

Fixes JuliaLang#54817
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
multithreading Base.Threads and related functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Option to pin GC threads
8 participants