This repository has been archived by the owner on Feb 16, 2020. It is now read-only.
Prevent dataset scanning from depleting memory #1970
Merged
Bugfix
With large datasets and many imported markets, the dataset scanner forks a child process for every single exchange and market combination all at once. This depletes memory very quickly and provides no significant gain in processing performance (it can even make performance worse).
AWS and Docker machines often crash because the scanning requires several gigabytes of memory.
The dataset scanner now queues the work and runs only as many forks as the system has CPU cores, reducing the memory footprint to a few tens of megabytes.