Deleting expired shards isn't thread safe #779

jvshahid · 2014-07-24T20:34:27Z

Because of a bug fixed by #769 and c02cff2, shards were being dropped prematurely (using the duration instead of the retention) while users were still writing to them. Although that bug is fixed, it turns out that a shard can still be deleted and closed while data is being written to it, which causes a nil pointer dereference that crashes the entire daemon. The two locations identified as having a race condition (there could be more) are https://github.com/influxdb/influxdb/blob/master/datastore/shard.go#L459 and https://github.com/influxdb/influxdb/blob/master/datastore/shard.go#L93

We should mark shards as deletable and only delete them once no one holds a reference to them, similar to the shard cache that we have in the datastore.
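To illustrate the proposal above, here is a minimal sketch of reference-counted deletion. It is not the code in datastore/shard.go: the type `refShard` and the methods `acquire`, `release`, `markForDeletion` and `write` are hypothetical names made up for this example, and a string slice stands in for the real storage handle.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrShardDeleted is returned to callers that try to use a shard
// after it has been marked for deletion (hypothetical error value).
var ErrShardDeleted = errors.New("shard is marked for deletion")

// refShard is a hypothetical stand-in for a datastore shard.
type refShard struct {
	mu       sync.Mutex
	refs     int      // number of in-flight users
	deleting bool     // set once the retention sweep wants the shard gone
	closed   bool     // set once the underlying storage is released
	points   []string // stand-in for the real storage handle
}

// acquire registers a user. It fails once the shard is marked for deletion,
// so no new writer can start on a shard that is going away.
func (s *refShard) acquire() error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.deleting {
		return ErrShardDeleted
	}
	s.refs++
	return nil
}

// release drops one reference. The last user to leave a shard that is
// marked for deletion performs the actual close.
func (s *refShard) release() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.refs--
	if s.deleting && s.refs == 0 {
		s.closeLocked()
	}
}

// markForDeletion is what the retention sweep would call instead of
// closing the shard directly.
func (s *refShard) markForDeletion() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.deleting = true
	if s.refs == 0 {
		s.closeLocked()
	}
}

// closeLocked releases the storage. The caller must hold s.mu.
func (s *refShard) closeLocked() {
	s.closed = true
	s.points = nil
}

func (s *refShard) isClosed() bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.closed
}

// write holds a reference for the duration of the write, so the
// storage cannot be closed underneath it.
func (s *refShard) write(p string) error {
	if err := s.acquire(); err != nil {
		return err
	}
	defer s.release()
	s.mu.Lock()
	s.points = append(s.points, p)
	s.mu.Unlock()
	return nil
}

func main() {
	s := &refShard{}
	_ = s.acquire()                     // a writer is in flight
	s.markForDeletion()                 // the sweep runs concurrently
	fmt.Println(s.isClosed())           // false: still referenced
	s.release()                         // the writer finishes
	fmt.Println(s.isClosed())           // true: the last reference closed it
	fmt.Println(s.write("cpu value=1")) // shard is marked for deletion
}
```

With this shape a late writer gets an error back instead of dereferencing a nil handle. Whether writes to an expired shard should fail or be silently dropped is a separate design decision.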


nichdiekuh · 2014-07-25T08:40:29Z

Sounds like this issue is related to my issue #747, and fixing it will fix that one as well. Can't wait for it! :)

jvshahid · 2014-07-25T16:05:15Z

@nichdiekuh yes, that could be the cause of the issues you're seeing. That said, as of rc3 InfluxDB incorrectly drops shards when it shouldn't, which is why you hit the bug in the first place. We will release rc4 today with the fix for #769 and #774. This version should not have any problem with the benchmark you provided. I ran the benchmark last night and it wrote 40% before the machine ran out of space.

nichdiekuh · 2014-07-25T16:14:37Z

That sounds promising! And btw: I've set the retention-sweep-period to "1000000m" for testing purposes, started my script and it's still running. Unfortunately the sweep-period cannot be set to anything like "30d" (influx doesn't start with other units, but that's another issue)

shugo · 2014-07-31T05:30:52Z

Thanks for your efforts on shard expiration.

Is this issue related to #767?
It seems that shards are not removed from shardsById even if they are deleted by PeriodicallyDropShardsWithRetentionPolicies().
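A rough sketch of what removing the entry could look like, assuming shardsById is a map guarded by a lock. Only the name shardsById comes from the code; `shardIndex`, `shardEntry` and the methods are hypothetical, and the real field may be structured differently.

```go
package main

import (
	"fmt"
	"sync"
)

// shardEntry is a hypothetical stand-in for the shard metadata.
type shardEntry struct {
	id uint32
}

// shardIndex is a hypothetical container for the shardsById lookup map.
type shardIndex struct {
	mu         sync.RWMutex
	shardsById map[uint32]*shardEntry
}

func newShardIndex() *shardIndex {
	return &shardIndex{shardsById: make(map[uint32]*shardEntry)}
}

func (idx *shardIndex) add(s *shardEntry) {
	idx.mu.Lock()
	defer idx.mu.Unlock()
	idx.shardsById[s.id] = s
}

func (idx *shardIndex) get(id uint32) (*shardEntry, bool) {
	idx.mu.RLock()
	defer idx.mu.RUnlock()
	s, ok := idx.shardsById[id]
	return s, ok
}

// drop removes the shard from the lookup map under the write lock, so
// later lookups cannot hand out a shard that has been deleted.
// It reports whether the shard was present.
func (idx *shardIndex) drop(id uint32) bool {
	idx.mu.Lock()
	defer idx.mu.Unlock()
	if _, ok := idx.shardsById[id]; !ok {
		return false
	}
	delete(idx.shardsById, id)
	return true
}

func main() {
	idx := newShardIndex()
	idx.add(&shardEntry{id: 1})
	idx.add(&shardEntry{id: 2})

	fmt.Println(idx.drop(1)) // true
	_, ok := idx.get(1)
	fmt.Println(ok) // false
	_, ok = idx.get(2)
	fmt.Println(ok) // true
}
```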

pauldix · 2014-08-22T19:09:09Z

Fixed by #866

jvshahid assigned pauldix Jul 24, 2014

jvshahid added this to the 0.8.0 milestone Jul 24, 2014

nichdiekuh mentioned this issue Jul 29, 2014

socket hang up #747

Closed

pauldix closed this as completed Aug 22, 2014
