
Refactor how chunks are persisted. #95

Merged
woodsaj merged 3 commits into master from issue94 on Jan 5, 2016

Conversation

@woodsaj (Member) commented on Jan 5, 2016

New approach uses a pool of worker goroutines to process all chunks.

The change here removes the per-series channels that were used to limit the number of unsaved chunks per series. Instead of a queue per series, there is now a single writeQueue (a buffered channel) shared by all series. As a result, this one writeQueue needs to be quite large: large enough to hold all outstanding writes. If we want to allow up to 10 unwritten chunks per series, the queue needs to be roughly 10x the number of series. So with 100k series and up to 10 unwritten chunks each, the queue size (cassandraWriteQueueSize) would need to be 1 million. The memory needed for this is actually less than what 100k individual queues of size 10 would use, since each per-series channel carries its own fixed overhead even when empty.
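A minimal sketch of the pattern described above, not the actual code from this PR: a pool of worker goroutines draining one large shared buffered channel. The names chunkWriteRequest, persistChunk, and numWorkers are illustrative assumptions; cassandraWriteQueueSize is taken from the description.

```go
package main

import (
	"log"
	"sync"
)

// chunkWriteRequest is a hypothetical stand-in for whatever the
// aggregators enqueue: a series key plus a serialized chunk.
type chunkWriteRequest struct {
	key  string
	data []byte
}

const (
	// e.g. 100k series * 10 unwritten chunks, as in the description
	cassandraWriteQueueSize = 1000000
	// size of the worker pool (assumed value)
	numWorkers = 10
)

// persistChunk is a placeholder for the actual Cassandra write.
func persistChunk(req chunkWriteRequest) error {
	return nil
}

func main() {
	// One large buffered channel shared by all series,
	// replacing the old per-series queues.
	writeQueue := make(chan chunkWriteRequest, cassandraWriteQueueSize)

	var wg sync.WaitGroup
	for i := 0; i < numWorkers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each worker drains the shared queue until it is closed.
			for req := range writeQueue {
				if err := persistChunk(req); err != nil {
					log.Printf("failed to persist chunk for %s: %v", req.key, err)
				}
			}
		}()
	}

	// Producers (the per-series aggregators) would send here:
	writeQueue <- chunkWriteRequest{key: "some.series", data: []byte{}}

	close(writeQueue)
	wg.Wait()
}
```

A side effect of this sketch's buffered channel: when the queue is full, producers block on the send, which naturally throttles chunk creation until the workers catch up.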

woodsaj pushed a commit that referenced this pull request Jan 5, 2016
Refactor how chunks are persisted.
@woodsaj woodsaj merged commit dd6f106 into master Jan 5, 2016
@woodsaj woodsaj deleted the issue94 branch January 5, 2016 09:25