Fix shard group creation #2610
Conversation
@@ -390,7 +390,6 @@ func (s *Server) StartSelfMonitoring(database, retention string, interval time.D
	// Shard-level stats.
	tags["shardID"] = strconv.FormatUint(s.id, 10)
	s.mu.RLock()
Why are you removing these locks? They are needed to avoid a race condition that was flagged on Circle.
That's odd. Not sure how those got removed. Will revert that.
The change makes sense, except for the removal of the locks.
Force-pushed from be0d522 to 6941d5e
Removed the lock removal changes. Not supposed to be in this PR.
Shard groups were being created for each point, causing a lot of excessive broadcasts for large batches.
End time was included in the check for a point. This meant a point that was truncated and landed on the end time could fall in both the prior and next shard groups. For example, with ranges 17:00-18:00 and 18:00-19:00, a point at exactly 18:00 would be included in both.
@@ -1160,7 +1160,7 @@ func NewRetentionPolicy(name string) *RetentionPolicy {
	// Returns nil group does not exist.
	func (rp *RetentionPolicy) shardGroupByTimestamp(timestamp time.Time) *ShardGroup {
		for _, g := range rp.shardGroups {
-			if timeBetweenInclusive(timestamp, g.StartTime, g.EndTime) {
+			if (g.StartTime.Before(timestamp) || g.StartTime.Equal(timestamp)) && g.EndTime.After(timestamp) {
what about !g.StartTime.After(timestamp) && g.EndTime.After(timestamp), or g.Contains(timestamp, timestamp) (which reads kind of awkwardly but uses the logic encapsulated in ShardGroup)?
@otoolep Updated PR to fix another place where shard group boundaries were calculated incorrectly.
// Overlaps returns whether the shard group contains data for the time range between min and max
func (sg *ShardGroup) Overlaps(min, max time.Time) bool {
	return min.Before(sg.StartTime) && (max.After(sg.StartTime) || max.Equal(sg.StartTime)) ||
		sg.Contains(min) || sg.Contains(max)
i think this can be simplified
return !sg.StartTime.After(max) && sg.EndTime.After(min)
Updated.
👏 for the
+1, great catch.
The logic in this function was wrong when used for shard group bound calculations, so remove it and use functions on shard group instead.
	for _, p := range points {
		times[p.Time.Truncate(rp.ShardGroupDuration)] = struct{}{}
i have a question to try and improve my own understanding of go. i've seen this pattern a couple of times in this codebase so i have to ask... since you only care about the keys of this map, is it any more efficient (from the point of view of memory allocation) to make this a map[time.Time]*struct{} and use nil as the value?
See http://dave.cheney.net/2014/03/25/the-empty-struct for a good explanation.
thank you - that's exactly the info i was wanting.
This PR fixes two issues with shard group creation.
The first is that a shard group create request was sent for every point, which can cause large pauses when sending large batches that require a new shard group to be created.
The second is that the shard group end time was included in the point calculation, which allowed a point to fall into two shard group ranges if it matched the end time of a shard group.