chunks can be saved to cassandra out of order #100

Dieterbe · 2016-01-08T06:39:44Z

since #95, it looks like chunks can be saved out of order.
IIRC the old code made sure to always save chunks in order (on a per-series basis), because that was a requirement for clustering and mark-as-saved to work properly, I think. Am I mistaken? Is this no longer a problem?
is it more optimal for cassandra to receive them in order?

woodsaj · 2016-01-08T06:43:17Z

chunks are still saved in order.
The core change introduced in #95 is that there is a single shared writeQueue instead of per series write queues. Entries are still added to the queue in the same order.

Dieterbe · 2016-01-08T06:50:03Z

you have multiple concurrent goroutines independently pulling chunks from the queue and saving at timings that are dependent on goroutine scheduling, timing of some of the operations they are performing, and additional sleeps when they hit failures. so it looks like plenty of avenues for out of order saving.

woodsaj · 2016-01-08T06:55:50Z

yes agreed.

Though chunks are added to the writeQueue in order, multiple goroutines process the writeQueue, so chunks could be saved to Cassandra out of order. This is not a problem for Cassandra, as it will just order the chunks when the memtables are flushed to sstables.

After chunks are committed to Cassandra, a metricPersist message is added to a buffer that is flushed every second. If a newer chunk gets saved at the end of the second and its message is immediately flushed, and the node then crashes before an older chunk is saved, the older chunk will never be saved.

Dieterbe · 2016-01-14T02:25:53Z

> After chunks are committed to cassandra, a metricPersist message is added to a buffer that is flushed every second. If a newer chunk gets saved at the end of the second and then immediately flushed, followed by the node crashing before an older chunk is saved, then it will lead to a situation where the older chunk will never be saved.

except if you have another node that you promote to primary and that gets to save the older chunk before purging it, right? IOW: is it ever a problem for a node that you promote from secondary to primary to have a series with chunks in this chronological order: saved, unsaved, saved? as long as it's been up long enough it should save the chunk in the middle, making up for the "crash during out-of-order saving" of the old primary, right?

if so, out of order saving doesn't seem like such a big deal, as long as we can somehow guarantee that inability to save old chunks doesn't drag on for too long.
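To make the "saved, unsaved, saved" case concrete, here is a minimal sketch of how a newly promoted primary could spot the chunks it still has to save. The `chunk` type and `unsavedBeforeNewestSaved` are hypothetical names for illustration only, not taken from the actual code, and the sketch assumes the chunks of one series are held in chronological order.

```go
package main

import "fmt"

// chunk is a hypothetical, simplified view of one chunk of a series.
type chunk struct {
	t0    uint32 // chunk start timestamp
	saved bool   // whether a persist message was seen for this chunk
}

// unsavedBeforeNewestSaved returns the t0 of every chunk that is not yet
// saved but is older than the newest saved chunk. The input must be in
// chronological order. These are the "holes" left by out-of-order saving.
func unsavedBeforeNewestSaved(chunks []chunk) []uint32 {
	newest := -1
	for i, c := range chunks {
		if c.saved {
			newest = i
		}
	}
	var gaps []uint32
	for i := 0; i < newest; i++ {
		if !chunks[i].saved {
			gaps = append(gaps, chunks[i].t0)
		}
	}
	return gaps
}

func main() {
	series := []chunk{{0, true}, {600, false}, {1200, true}}
	fmt.Println(unsavedBeforeNewestSaved(series)) // prints [600]
}
```

As long as the promoted node saves everything this returns before purging, the hole left by the old primary gets filled.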

we can fairly easily guarantee in-order saving on a per-series level (which I think is all we need?) by hashing each chunkWriteRequest's key to pick a worker, so all chunks of a series go through the same worker queue, like i just did in #109

ideally we would need a lightweight hashing function that still provides good distribution though, not sure what to use. however, since our chunkWriteRequest keys are based on the metric keys, which are based on orgId + md5sum of series name and tags, we should be able to employ a very naive hash and get good distribution. if it turns out it's not good enough, or when we start supporting metrics with other forms of keys, we can always revise.
this may also be interesting: https://groups.google.com/forum/#!topic/golang-nuts/msozW-DWets
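
A rough sketch of the idea, not the code from #109: every request for a given series key is routed to the same worker queue, so a single goroutine saves that series' chunks in the order they were queued. The `chunkWriteRequest` fields, `workerFor`, `saveAll` and the choice of FNV-1a from the standard library are all assumptions for illustration; the real code may well use a more naive hash.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

// chunkWriteRequest is a hypothetical stand-in for the real request type.
type chunkWriteRequest struct {
	key string // series key, e.g. "<orgId>.<md5sum>"
	t0  uint32 // chunk start timestamp
}

// workerFor maps a series key to a worker index. FNV-1a is an assumption;
// any cheap hash with decent distribution would do.
func workerFor(key string, numWorkers int) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() % uint32(numWorkers))
}

// saveAll dispatches requests to per-worker queues and returns, per series
// key, the order in which its chunks were "saved".
func saveAll(reqs []chunkWriteRequest, numWorkers int) map[string][]uint32 {
	queues := make([]chan chunkWriteRequest, numWorkers)
	saved := make(map[string][]uint32)
	var mu sync.Mutex
	var wg sync.WaitGroup
	for i := range queues {
		queues[i] = make(chan chunkWriteRequest, 16)
		wg.Add(1)
		go func(q <-chan chunkWriteRequest) {
			defer wg.Done()
			for r := range q {
				// The real code would write the chunk to Cassandra here.
				mu.Lock()
				saved[r.key] = append(saved[r.key], r.t0)
				mu.Unlock()
			}
		}(queues[i])
	}
	// A series always hashes to the same queue, so its chunks stay in order.
	for _, r := range reqs {
		queues[workerFor(r.key, numWorkers)] <- r
	}
	for _, q := range queues {
		close(q)
	}
	wg.Wait()
	return saved
}

func main() {
	reqs := []chunkWriteRequest{
		{"1.aaa", 0}, {"1.bbb", 0}, {"1.aaa", 600}, {"1.bbb", 600}, {"1.aaa", 1200},
	}
	fmt.Println(saveAll(reqs, 4)["1.aaa"]) // prints [0 600 1200]
}
```

Ordering across different series is still not guaranteed, but that is not needed here; only the per-series order matters.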

woodsaj closed this as completed Jan 8, 2016

Dieterbe reopened this Jan 8, 2016

Dieterbe added a commit that referenced this issue Jan 14, 2016

implement cassandra write ordering on a per-series basis. fix #100

4947088

via a very simple hashing-based mapping of series keys to worker queues

Dieterbe closed this as completed in f14f893 Jan 15, 2016

Dieterbe added a commit that referenced this issue Jan 15, 2016

Merge pull request #109 from raintank/cassandra-per-series-ordering

7c66559

implement cassandra write ordering on a per-series basis. fix #100
