Add the option to Multithread AtriumDB #119

WilliamDixon · 2025-01-02T22:41:10Z

One of AtriumDB's strengths is its ability to use multiple threads at once to encode and decode blocks of data in parallel.

To minimize CPU time (number of CPUs * time spent processing), it was most efficient to run AtriumDB in single-threaded mode, which avoids the extra CPU time spent sharing information between cores.
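For readers unfamiliar with the two metrics, here is a minimal sketch of how CPU time and wall time can be measured side by side in Python. The `measure` helper and the synthetic payload are illustrative assumptions, not part of this PR or of the benchmark harness; `zlib.compress` stands in for an encoder.

```python
import time
import zlib


def measure(func, *args):
    """Return (result, cpu_seconds, wall_seconds) for one call to func.

    Hypothetical helper for illustration only.
    """
    cpu_start = time.process_time()    # CPU time summed over all threads of this process
    wall_start = time.perf_counter()   # elapsed real ("wall clock") time
    result = func(*args)
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    return result, cpu_seconds, wall_seconds


payload = bytes(range(256)) * 4096     # 1 MiB of synthetic data, not a real record
compressed, cpu_s, wall_s = measure(zlib.compress, payload)
print(f"CPU time: {cpu_s:.4f} s, Wall time: {wall_s:.4f} s")
```

With a single thread the two numbers stay close to each other; with many threads, CPU time can exceed wall time because it accumulates across cores.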

However, with the addition of the "Wall Time" metric, I thought it worthwhile to add a new AtriumDB subclass that uses this parallelization feature, which improves Wall Time at the expense of CPU time.
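As a rough sketch of the pattern (settings held as class variables, with a subclass overriding only the thread count), something like the following would work. The class names and the default values are hypothetical and chosen for illustration; they are not the classes added in this PR.

```python
class ExampleCodec:
    """Hypothetical stand-in for a benchmark wrapper; not AtriumDB's real class."""

    block_size = 32768   # illustrative value only
    num_threads = 1      # single-threaded by default: favors CPU time

    def describe(self) -> str:
        # Class variables are looked up on the instance's own class first,
        # so subclasses can override them without touching __init__.
        return (f"{type(self).__name__}: "
                f"block_size={self.block_size}, num_threads={self.num_threads}")


class ExampleCodecMultiThreaded(ExampleCodec):
    """Same codec, overriding only num_threads to favor wall time."""

    num_threads = 40


print(ExampleCodec().describe())
print(ExampleCodecMultiThreaded().describe())
```

Keeping both settings as class variables means a benchmark can compare the two modes by instantiating different classes, with no change to the calling code.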

To illustrate this tradeoff, here are the times it took AtriumDB to write an entire MIMIC record to disk, with and without multithreading, on a 40-core Linux server:

SingleThreaded AtriumDB:
CPU time: 212.1006 s
Wall time: 215.1170 s

MultiThreaded AtriumDB (40 threads):
CPU time: 435.6397 s
Wall time: 27.0486 s

As you can see, CPU time roughly doubles (about 2.05x), while wall time improves by a factor of about 8 (7.95x).
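The ratios can be checked directly from the figures reported above. The short calculation below also derives a parallel efficiency (wall-time speedup divided by thread count), which is an extra metric added here for context and was not part of the original measurements.

```python
# Timings in seconds, copied from the measurements in this description.
single = {"cpu": 212.1006, "wall": 215.1170}
multi = {"cpu": 435.6397, "wall": 27.0486}
num_threads = 40

cpu_overhead = multi["cpu"] / single["cpu"]    # how much more CPU time is spent
wall_speedup = single["wall"] / multi["wall"]  # how much sooner the write finishes
efficiency = wall_speedup / num_threads        # fraction of ideal linear scaling

print(f"CPU time overhead:   {cpu_overhead:.2f}x")
print(f"Wall time speedup:   {wall_speedup:.2f}x")
print(f"Parallel efficiency: {efficiency:.1%}")
```

An efficiency of about 20% on 40 threads is consistent with the tradeoff described: the write finishes much sooner, but each thread spends part of its time on coordination and other non-parallel work.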

These benefits are most apparent for large reads/writes, and disappear completely when the size of the task drops below one AtriumDB block (whose size can now also be adjusted more easily in the source code of this PR).
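To see why the benefit vanishes below one block, consider a minimal sketch of block-parallel encoding. This is not AtriumDB's implementation: `zlib.compress` stands in for the real encoder, `block_size` is assumed to be in bytes, and all function names are hypothetical. The point is only that the number of useful workers can never exceed the number of blocks.

```python
import math
import zlib
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut data into consecutive chunks of at most block_size bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def usable_workers(n_bytes: int, block_size: int, num_threads: int) -> int:
    """Threads that can actually do work: at most one per block, at least one."""
    n_blocks = math.ceil(n_bytes / block_size)
    return max(1, min(num_threads, n_blocks))


def encode_blocks(data: bytes, block_size: int, num_threads: int) -> list[bytes]:
    """Compress each block independently, in parallel when there is more than one."""
    blocks = split_into_blocks(data, block_size)
    workers = usable_workers(len(data), block_size, num_threads)
    if workers == 1:
        # A task of one block or less gains nothing from a thread pool.
        return [zlib.compress(block) for block in blocks]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves block order in the result.
        return list(pool.map(zlib.compress, blocks))


small = b"x" * 100
large = bytes(range(256)) * 40
print(usable_workers(len(small), 1024, 40))   # task smaller than one block
print(usable_workers(len(large), 1024, 40))   # task spanning several blocks
encoded = encode_blocks(large, 1024, 40)
```

Because each block is encoded independently, decoding is the mirror image: decompress the blocks in order and concatenate the results.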

AtriumDB: Make block_size and num_threads class variables, add two new subclasses that use Multithreading.

52a0c24
