A new stat for bytes read off the disk #1702
Conversation
A few comments.
Thanks Sreekanth, I've included your suggestions.
minor...
index/scorch/merge.go
Outdated
@@ -426,6 +435,16 @@ type segmentMerge struct {
	notifyCh chan *mergeTaskIntroStatus
}

func cumulateBytesRead(sbs []segment.Segment) uint64 {
	rv := uint64(0)
var rv uint64?
done.
index/scorch/scorch.go
Outdated
	for itemsDeQueued < numUpdates {
		result := <-resultChan
		resultSize := result.Size()
		atomic.AddUint64(&s.iStats.analysisBytesAdded, uint64(resultSize))
		totalAnalysisSize += resultSize
		analysisResults[itemsDeQueued] = result
		itemsDeQueued++
		result.VisitFields(func(f index.Field) {
			atomic.AddUint64(&s.stats.TotIndexedAnalysisBytes,
Better stat wording:
TotBytesIndexedAfterAnalysis?
TotBytesReadDuringQueryTime or TotBytesReadAtQueryTime,
so that both stats share the same prefix.
done.
index/scorch/snapshot_index.go
Outdated
@@ -146,10 +148,18 @@ func (i *IndexSnapshot) newIndexSnapshotFieldDict(field string,
	results := make(chan *asynchSegmentResult)
	for index, segment := range i.segment {
		go func(index int, segment *SegmentSnapshot) {
			var prevBytesRead uint64
			if seg, ok := segment.segment.(diskStatsReporter); ok {
The same interface check is repeated at 152 and 159.
Can't we reduce that to one?
Yeah, done. My bad.
index/scorch/snapshot_index.go
Outdated
				postings.BytesRead()-prevBytesReadPL)
		}

		if itr, ok := rv.iterators[i].(diskStatsReporter); ok &&
Avoid repeated interface checks for the same vars.
There is a re-initialisation of the var rv.iterators[i], so I can't reuse the same variable over here, since the bytes read as part of the iterator initialisation won't be part of the old variable.
@@ -387,13 +387,23 @@ func (s *Scorch) Batch(batch *index.Batch) (err error) {
	analysisResults := make([]index.Document, int(numUpdates))
	var itemsDeQueued uint64
	var totalAnalysisSize int
	analysisBytes := func(tokMap index.TokenFrequencies) (rv uint64) {
Should we add a flag here to calculate this only when the flag is set?
This feels like something we don't always necessarily need to do - hopefully none of this calculation shows up in CPU profiles?
Not sure if this is enough - we have to record copies: stored values and doc values, and also term vectors. I don't see you doing this in your zap PR.
Just to close the thread: these two changes are going to be included in upcoming PRs.
@@ -352,6 +353,14 @@ func (s *Scorch) planMergeAtSnapshot(ctx context.Context,
		atomic.AddUint64(&s.stats.TotFileMergePlanTasksErr, 1)
		return err
	}

	switch segI := seg.(type) {
	case segment.DiskStatsReporter:
Per my latest conversation with Jon, we shouldn't be accounting for content written to disk by the merger. More conversation is needed here, I suppose.
That is the case here as well; we are only accounting for the bytes read from disk for the newly created merged segment. And this would anyway be needed/accounted for by the next query on this newly formed segment.
We need to carry forward the stats from the older segments that got merged into the newly formed segment, so that the accounting of stats is cumulative - hence the need for the SetBytesRead() API.
If we fail to consider the stats from the pre-merge segments, then the bytes accounting goes wrong.
I have posted an alternative thought in the zapx PR.
index/scorch/snapshot_index.go
Outdated
@@ -424,6 +435,11 @@ func (i *IndexSnapshot) Document(id string) (rv index.Document, err error) {
	segmentIndex, localDocNum := i.segmentIndexAndLocalDocNumFromGlobal(docNum)

	rvd := document.NewDocument(id)
	var prevBytesRead uint64
	seg, ok := i.segment[segmentIndex].segment.(segment.DiskStatsReporter)
Rename ok to, say, diskStatsAvailable, so that at line 472 below we know why things are ok.
index/scorch/snapshot_index.go
Outdated
	rv.dicts = make([]segment.TermDictionary, len(is.segment))
	for i, segment := range is.segment {
		var prevBytesRead uint64
		segP, ok := segment.segment.(diskStatsReporter)
Same here - rename ok to diskStatsAvailable.
index/scorch/snapshot_index.go
Outdated
@@ -661,10 +709,18 @@ func (i *IndexSnapshot) documentVisitFieldTermsOnSegment(
	}

	if ssvOk && ssv != nil && len(vFields) > 0 {
		var prevBytesRead uint64
		ssvp, ok := ssv.(segment.DiskStatsReporter)
		if ok {
ditto
index/scorch/snapshot_index_tfr.go
Outdated
@@ -76,6 +76,11 @@ func (i *IndexSnapshotTermFieldReader) Next(preAlloced *index.TermFieldDoc) (*in
	}
	// find the next hit
	for i.segmentOffset < len(i.iterators) {
		prevBytesRead := uint64(0)
		itr, ok := i.iterators[i.segmentOffset].(segment.DiskStatsReporter)
ditto