Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add in-memory storage support for adaptive sampling #3335

Merged
merged 10 commits into from
Oct 25, 2021

Conversation

srikanthccv
Copy link
Contributor

Signed-off-by: Srikanth Chekuri srikanth.chekuri92@gmail.com

Which problem is this PR solving?

Signed-off-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
@srikanthccv srikanthccv marked this pull request as ready for review October 21, 2021 05:28
@srikanthccv srikanthccv requested a review from a team as a code owner October 21, 2021 05:28
@srikanthccv srikanthccv requested a review from vprithvi October 21, 2021 05:28
Copy link
Member

@yurishkuro yurishkuro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this form the implementation has a memory leak where data will grow unbounded. Note these two things:

  • The storage only needs to keep N buckets of throughput because the processor uses them with exponential decay (there is a parameter in the processor that specifies how many). So we could bound the storage by that number.
  • Even though the final probabilities are appended to the storage, the processor only uses the latest record. The only reason we used to store the previous ones is to have a history of calculations for debugging purposes (it's not even exposed via API, only by looking at the database directly). So the simplest implementation would be to keep exactly one record of probabilities in memory, 2nd simplest to keep a fixed-length round-robin buffer (but then there should be a way to inspect it since there's no database we can query, e.g. by exposing via expvar).

plugin/storage/memory/lock.go Outdated Show resolved Hide resolved
plugin/storage/memory/lock.go Outdated Show resolved Hide resolved
plugin/storage/memory/lock_test.go Outdated Show resolved Hide resolved
plugin/storage/memory/sampling.go Outdated Show resolved Hide resolved
plugin/storage/memory/sampling_test.go Outdated Show resolved Hide resolved
@srikanthccv
Copy link
Contributor Author

Thanks @yurishkuro for the review. I took the default unbounded number of traces as an inspiration but given the processor implementation it makes sense to limit the number of entries to N. Regarding probabilities, I think we can go with the simplest approach first.

Co-authored-by: Yuri Shkuro <yurishkuro@users.noreply.github.com>
Signed-off-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
Signed-off-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
…nto srikanth/in-memory

Signed-off-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
plugin/storage/memory/lock_test.go Outdated Show resolved Hide resolved
plugin/storage/memory/lock_test.go Outdated Show resolved Hide resolved
plugin/storage/memory/sampling.go Outdated Show resolved Hide resolved
plugin/storage/memory/sampling.go Outdated Show resolved Hide resolved
plugin/storage/memory/sampling.go Outdated Show resolved Hide resolved
plugin/storage/memory/sampling.go Outdated Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Oct 25, 2021

Codecov Report

Merging #3335 (fab68ef) into master (b53d901) will increase coverage by 0.03%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3335      +/-   ##
==========================================
+ Coverage   96.01%   96.04%   +0.03%     
==========================================
  Files         259      261       +2     
  Lines       15422    15464      +42     
==========================================
+ Hits        14807    14852      +45     
+ Misses        523      520       -3     
  Partials       92       92              
Impacted Files Coverage Δ
plugin/sampling/strategystore/adaptive/factory.go 100.00% <100.00%> (ø)
plugin/storage/cassandra/factory.go 97.08% <100.00%> (ø)
plugin/storage/memory/factory.go 100.00% <100.00%> (ø)
plugin/storage/memory/lock.go 100.00% <100.00%> (ø)
plugin/storage/memory/sampling.go 100.00% <100.00%> (ø)
cmd/collector/app/server/zipkin.go 68.29% <0.00%> (-2.44%) ⬇️
cmd/query/app/static_handler.go 97.60% <0.00%> (-1.20%) ⬇️
plugin/storage/integration/integration.go 79.28% <0.00%> (+0.39%) ⬆️
plugin/storage/badger/spanstore/reader.go 96.21% <0.00%> (+0.70%) ⬆️
pkg/config/tlscfg/cert_watcher.go 94.73% <0.00%> (+2.10%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b53d901...fab68ef. Read the comment docs.

Signed-off-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
maxBuckets int
}

type Throughput struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: these don't look like they need to be public

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please suggest some name. I had a hard time differentiating b/w throughputs and lowercase only made it more difficult.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

storedThroughput?

Signed-off-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
Signed-off-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
@yurishkuro yurishkuro enabled auto-merge (squash) October 25, 2021 12:42
@yurishkuro yurishkuro merged commit e9bf6ed into jaegertracing:master Oct 25, 2021
@yurishkuro
Copy link
Member

Thanks! There's probably a place in the docs we can tweak to mention this new backend.

@srikanthccv
Copy link
Contributor Author

It is the docs https://www.jaegertracing.io/docs/1.27/sampling/#adaptive-sampling that led me to the issue. Do you want me to send a PR updating the docs now itself or once the remaining backends are also supported? I am planning to send a follow up PR for es (and badger later).

@yurishkuro
Copy link
Member

Hey, it's up to you. We're approaching the next release (in a week), I am not sure if there's enough time to implement other (real) backends. Maybe makes sense to just send a small PR for the next-release docs mentioning memory option.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants