-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make SafeDelete faster for large blocks; TSDB block repair case #1056
Changes from all commits
f0dfe27
f09d3f5
230bc7b
3f0d6cd
ed78bde
7fcf793
2f179ff
9ca0c05
fd572aa
24173d3
0d39253
5a7fd5d
bfd44f1
d1274ef
0a01898
bf62c5c
e296f8f
768ec93
c46ff1c
b625a95
380672b
5599139
e6ad6a0
649df4d
9e26cfd
31f7dd6
f6f3443
37d495a
b852c34
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -495,12 +495,18 @@ func Repair(logger log.Logger, dir string, id ulid.ULID, source metadata.SourceT | |
resmeta.Stats = tsdb.BlockStats{} // reset stats | ||
resmeta.Thanos.Source = source // update source | ||
|
||
if err := rewrite(indexr, chunkr, indexw, chunkw, &resmeta, ignoreChkFns); err != nil { | ||
if err := rewrite(logger, indexr, chunkr, indexw, chunkw, &resmeta, ignoreChkFns); err != nil { | ||
return resid, errors.Wrap(err, "rewrite block") | ||
} | ||
if err := metadata.Write(logger, resdir, &resmeta); err != nil { | ||
return resid, err | ||
} | ||
// TSDB may rewrite metadata in bdir. | ||
// TODO: This is not needed in newer TSDB code. See | ||
// https://github.com/prometheus/tsdb/pull/637 | ||
if err := metadata.Write(logger, bdir, meta); err != nil { | ||
return resid, err | ||
} | ||
return resid, nil | ||
} | ||
|
||
|
@@ -595,6 +601,7 @@ type seriesRepair struct { | |
// rewrite writes all data from the readers back into the writers while cleaning | ||
// up mis-ordered and duplicated chunks. | ||
func rewrite( | ||
logger log.Logger, | ||
indexr tsdb.IndexReader, chunkr tsdb.ChunkReader, | ||
indexw tsdb.IndexWriter, chunkw tsdb.ChunkWriter, | ||
meta *metadata.Meta, | ||
|
@@ -665,8 +672,22 @@ func rewrite( | |
return labels.Compare(series[i].lset, series[j].lset) < 0 | ||
}) | ||
|
||
lastSet := labels.Labels{} | ||
// Build a new TSDB block. | ||
for _, s := range series { | ||
// The TSDB library will throw an error if we add a series with | ||
// identical labels as the last series. This means that we have | ||
// discovered a duplicate time series in the old block. We drop | ||
// all duplicate series preserving the first one. | ||
// TODO: Add metric to count dropped series if repair becomes a daemon | ||
// rather than a batch job. | ||
if labels.Compare(lastSet, s.lset) == 0 { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The series are sorted based on their labelSet. What happens if two labelSets are identical? For one they will be consecutive in the slice. So if the labelSet we have now is the same as the previous labelSet we have a duplicate series which the TSDB code will produce an error on. So we just drop that series. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Let's add a comment on the implied behavior here. I believe you when you say you've seen this behavior, but it worries me, as something went terribly wrong wherever these blocks were produced. We should at least document that the heuristic here is that on collision we keep the first series found and drop consecutive ones. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done. Thank you! There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sorry one more thing. As this shouldn't generally happen, let's add a metric so we can monitor when it does happen. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The repair process batch job only which makes metrics less useful. I had some similar comments in #964 Perhaps I should leave a comment if that were to change? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes let's in that case put a TODO in there There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. TODO added. |
||
level.Warn(logger).Log("msg", | ||
"dropping duplicate series in tsdb block found", | ||
"labelset", s.lset.String(), | ||
) | ||
continue | ||
} | ||
if err := chunkw.WriteChunks(s.chks...); err != nil { | ||
return errors.Wrap(err, "write chunks") | ||
} | ||
|
@@ -691,6 +712,7 @@ func rewrite( | |
} | ||
postings.Add(i, s.lset) | ||
i++ | ||
lastSet = s.lset | ||
} | ||
|
||
s := make([]string, 0, 256) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove double spaces in this comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Old habits die hard...fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can I ask where is this habit from? Sorry for my ignorance - super curious (:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lol... Search google for "double space after period" you'll learn more than you bargained for. Like, my age. :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL never heard of this either. But actually the rule could apply here since code formatting is using monospaced fonts mostly :)