Workload Assignment
When a Client requests a workload from the Server, the Server first filters out any workloads that the Client is not qualified to complete. It then keeps only the highest-priority workloads among those that remain. Finally, a workload is selected based on how other workers are distributed across the workloads, to maintain a balanced distribution of workers.
- Remove any workloads which contain a dev or base engine that is not supported by the Client. This can occur if the Client does not have a sufficiently new compiler to build the engine, does not have the requested CPU instruction sets, does not have an Operating System that is compatible with the engine, or does not have a Fine-Grained Access Token for a private engine.
- Remove any workloads with unmet Syzygy requirements. A workload can request up to 7-man Syzygy tablebases, both for adjudication and for WDL during play. Workloads that request specific Syzygy table sizes are uncommon, and should be used carefully to account for different hardware configurations.
- Remove any workloads that require more threads than we have connected. For a workload where both engines have the same number of threads, this means removing any workloads where our machine does not have the capacity to run at least a single concurrent game. For thread-odds workloads, we artificially cut the Client's thread count in half if it is clearly dipping into Hyperthreads, and then make the same determination. As an example, imagine a 4-core/8-thread CPU. It is capable of playing a workload that is 4 threads vs 4 threads with concurrency=2, and also a workload that is 8 threads vs 8 threads with concurrency=1. It is also capable of playing a workload that is 4 threads vs 1 thread with concurrency=1, as we will refuse to use Hyperthreads for smp-odds. Lastly, the worker would refuse a workload that is 8 threads vs 1 thread. This is because one engine would be using Hyperthreads, but the other would be getting full cores.
- We will compute the distribution of workers, relative to the throughput of each workload. This is done by dividing the total number of threads assigned to each workload by the throughput of the workload. We call this the ratio. We also compute the fair-ratio, which would be the ratio if all workers had been distributed evenly with respect to their thread count and each workload's throughput.
- If the worker's most recent workload is in the list of qualified ones, then it will be repeated, so long as no workload is getting less than 75% of the fair-ratio. Otherwise, the worker will be assigned the workload with the lowest ratio. If multiple workloads share the same lowest ratio, then one will be selected at random.
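
The thread check and the balancing steps above can be sketched roughly as follows. This is an illustrative sketch, not the Server's actual code: the function names, the dictionary keys (`id`, `priority`, `throughput`), and the `hyperthreaded` flag are all hypothetical. It also assumes that the fair-ratio is the total number of assigned threads divided by the total throughput of the candidate workloads, and that a hyperthreaded machine has half as many physical cores as threads.

```python
import random


def max_concurrency(threads, dev_threads, base_threads, hyperthreaded=True):
    """Concurrent games a machine could run; 0 means the workload is refused."""
    usable = threads
    # Thread-odds: count physical cores only (assumed to be half of the threads).
    if dev_threads != base_threads and hyperthreaded:
        usable = threads // 2
    return usable // max(dev_threads, base_threads)


def select_workload(workloads, assigned_threads, last_id=None, rng=random):
    """Pick a workload id from the already-qualified `workloads`.

    workloads: list of dicts with hypothetical keys 'id', 'priority', 'throughput'.
    assigned_threads: dict mapping workload id -> threads currently assigned to it.
    """
    # Keep only the highest-priority workloads.
    top = max(w["priority"] for w in workloads)
    pool = [w for w in workloads if w["priority"] == top]

    # Ratio: threads assigned to the workload divided by its throughput.
    ratios = {w["id"]: assigned_threads.get(w["id"], 0) / w["throughput"] for w in pool}

    # Fair-ratio (assumed form): total threads over total throughput.
    total_threads = sum(assigned_threads.get(w["id"], 0) for w in pool)
    fair = total_threads / sum(w["throughput"] for w in pool)

    # Repeat the previous workload unless some workload is below 75% of the fair-ratio.
    if last_id in ratios and all(r >= 0.75 * fair for r in ratios.values()):
        return last_id

    # Otherwise take the lowest ratio, breaking ties at random.
    lowest = min(ratios.values())
    return rng.choice([wid for wid, r in ratios.items() if r == lowest])
```

With the 4-core/8-thread example from the list, `max_concurrency(8, 4, 4)` gives 2, `max_concurrency(8, 8, 8)` and `max_concurrency(8, 4, 1)` both give 1, and `max_concurrency(8, 8, 1)` gives 0, so that workload would be refused.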
The server provides three critical values in the workload JSON response. These are `cutechess-count`, `concurrency-per`, and `games-per-cutechess`. These values are a function of the number of threads and sockets, as well as the nature of the test or tune. They are explained below.
- `cutechess-count` indicates the number of cutechess copies that should be running at one time. For a typical workload, where each engine is playing with one thread, `cutechess-count` will be equal to the number of sockets on the worker, as provided via `--nsockets` or `-N` when starting the Client. If the workload is an SPSA tune, using the `MULTIPLE` method of distributing SPSA-points, then `cutechess-count` will be the maximum number of concurrent games divided by two. Finally, if the previous condition is not true, and the workload uses more than 1 thread for either engine, then `cutechess-count` will be set to `1`.
- `concurrency-per` indicates the number of concurrent games that will be played, for any particular cutechess copy that is running. If the workload is an SPSA tune, using the `MULTIPLE` method of distributing SPSA-points, then this value will be `2`. Otherwise, it will be the maximum number of concurrent games, which is defined as `(threads // cutechess-count) // max(dev_threads, base_threads)`.
- `games-per-cutechess` is the total number of games to play on each particular cutechess copy that is running. Once again, SPSA tunes using the `MULTIPLE` method are a special case, and will play `2 * workload_size` games, i.e. a game-pair for each `workload_size`. The general case will instead play `2 * workload_size * concurrency-per`, i.e. a game-pair for each `workload_size` for each possible concurrent game.
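
As a rough illustration of how the three values fit together, the rules above can be written as a single function. This is a sketch under stated assumptions, not the Server's implementation: the function name and parameters are hypothetical, the SPSA `MULTIPLE` case is assumed to be checked first, and in that case the "maximum number of concurrent games" is assumed to be `threads // max(dev_threads, base_threads)`.

```python
def cutechess_settings(threads, sockets, dev_threads, base_threads,
                       workload_size, spsa_multiple=False):
    """Return the three values for one worker (hypothetical helper)."""
    max_threads = max(dev_threads, base_threads)

    if spsa_multiple:
        # Assumed: max concurrent games here is threads // max_threads.
        cutechess_count = (threads // max_threads) // 2
        concurrency_per = 2
        # One game-pair per workload_size.
        games_per_cutechess = 2 * workload_size
    else:
        # One copy per socket for single-threaded engines, otherwise a single copy.
        cutechess_count = sockets if max_threads == 1 else 1
        concurrency_per = (threads // cutechess_count) // max_threads
        # One game-pair per workload_size for each concurrent game.
        games_per_cutechess = 2 * workload_size * concurrency_per

    return {
        "cutechess-count": cutechess_count,
        "concurrency-per": concurrency_per,
        "games-per-cutechess": games_per_cutechess,
    }
```

For example, a 16-thread, 2-socket worker playing a 1 thread vs 1 thread workload with `workload_size=32` would get 2 cutechess copies, 8 concurrent games per copy, and 512 games per copy under these assumptions. The same worker on a 4 threads vs 4 threads workload would get 1 copy, 4 concurrent games, and 256 games.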