Support Tail Based Sampling Processor From OTEL Collector Extension #5878

Merged
yurishkuro merged 30 commits into jaegertracing:main from the tail-based-sampling branch on Aug 31, 2024

Conversation

@mahadzaryab1 mahadzaryab1 commented Aug 22, 2024

Which problem is this PR solving?

Description of the changes

  • Added the tail-based sampling processor extension from OTEL to Jaeger.
  • Added a Docker Compose setup to demonstrate usage of the tail-based sampling processor extension in Jaeger.
  • Added an end-to-end integration test to verify that the new processor works as expected.
  • Added a README to the Docker Compose setup describing the setup and usage of the new processor.
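
For readers unfamiliar with the idea, here is a minimal sketch of what a tail-based sampling decision looks like. It is not the OTEL processor's implementation: the `Span` type, the `keepTrace` policy (keep a trace if any span comes from a selected service), and the service names are all hypothetical and chosen only for illustration. The point it shows is that the keep/drop decision is made per trace, after its spans have been collected, rather than per span at the head of the request.

```go
package main

import "fmt"

// Span is a simplified stand-in for a trace span (hypothetical type,
// not the OTEL data model).
type Span struct {
	TraceID string
	Service string
}

// keepTrace is a hypothetical policy: keep the whole trace if any of
// its spans comes from one of the selected services.
func keepTrace(spans []Span, selected map[string]bool) bool {
	for _, s := range spans {
		if selected[s.Service] {
			return true
		}
	}
	return false
}

// tailSample groups spans by trace ID and applies the policy once per
// trace, after all spans have been seen. It returns the kept traces.
func tailSample(spans []Span, selected map[string]bool) map[string][]Span {
	byTrace := make(map[string][]Span)
	for _, s := range spans {
		byTrace[s.TraceID] = append(byTrace[s.TraceID], s)
	}
	kept := make(map[string][]Span)
	for id, group := range byTrace {
		if keepTrace(group, selected) {
			kept[id] = group
		}
	}
	return kept
}

func main() {
	spans := []Span{
		{TraceID: "t1", Service: "frontend"},
		{TraceID: "t1", Service: "checkout"},
		{TraceID: "t2", Service: "frontend"},
	}
	kept := tailSample(spans, map[string]bool{"checkout": true})
	fmt.Println(len(kept), len(kept["t1"])) // prints: 1 2
}
```

Trace `t1` is kept in full, including its `frontend` span, because one of its spans matched; `t2` is dropped entirely.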

How was this change tested?

  • An end-to-end integration test was added and runs in CI.

Checklist

codecov bot commented Aug 23, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.82%. Comparing base (9a30dfc) to head (c50ded3).
Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5878      +/-   ##
==========================================
- Coverage   96.83%   96.82%   -0.02%     
==========================================
  Files         342      342              
  Lines       16524    16525       +1     
==========================================
- Hits        16001    16000       -1     
- Misses        337      339       +2     
  Partials      186      186              
| Flag | Coverage Δ |
|---|---|
| badger_v1 | 8.05% <ø> (ø) |
| badger_v2 | 1.82% <ø> (ø) |
| cassandra-3.x-v1 | 16.62% <ø> (ø) |
| cassandra-3.x-v2 | 1.75% <ø> (ø) |
| cassandra-4.x-v1 | 16.62% <ø> (ø) |
| cassandra-4.x-v2 | 1.75% <ø> (ø) |
| elasticsearch-6.x-v1 | 18.79% <ø> (+0.01%) ⬆️ |
| elasticsearch-7.x-v1 | 18.85% <ø> (+0.01%) ⬆️ |
| elasticsearch-8.x-v1 | 19.03% <ø> (ø) |
| elasticsearch-8.x-v2 | 1.82% <ø> (ø) |
| grpc_v1 | 9.49% <ø> (ø) |
| grpc_v2 | 7.16% <ø> (ø) |
| kafka-v1 | 9.74% <ø> (ø) |
| kafka-v2 | 1.82% <ø> (ø) |
| memory_v2 | 1.82% <ø> (ø) |
| opensearch-1.x-v1 | 18.90% <ø> (+0.01%) ⬆️ |
| opensearch-2.x-v1 | 18.90% <ø> (ø) |
| opensearch-2.x-v2 | 1.82% <ø> (ø) |
| tailsampling-processor | 0.46% <ø> (?) |
| unittests | 95.30% <100.00%> (-0.02%) ⬇️ |

@yurishkuro
Member

I tested the binary sizes:

  • without this change: 75 MB
  • with this change: 114 MB
  • with only the tail sampler (without the load balancer): 78 MB

I would suggest we include only the tail sampler. That's the component that's most useful in the final collector, which needs to be Jaeger. The upstream load-balancing collectors can just be OTEL Collectors.
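
To make the size trade-off concrete, the sketch below computes the growth over the 75 MB baseline for the two variants. The numbers are taken from the comment above; the `overhead` helper and the variant labels are illustrative names, not anything from the Jaeger build.

```go
package main

import "fmt"

// overhead returns the absolute and relative growth of a binary
// compared to a baseline size (both in the same unit).
func overhead(base, size float64) (delta, pct float64) {
	delta = size - base
	return delta, delta / base * 100
}

func main() {
	const base = 75.0 // MB, size without the change
	variants := []struct {
		name string
		size float64 // MB
	}{
		{"tail sampler + load balancer", 114},
		{"tail sampler only", 78},
	}
	for _, v := range variants {
		d, p := overhead(base, v.size)
		fmt.Printf("%-30s +%.0f MB (+%.0f%%)\n", v.name, d, p)
	}
}
```

Including both components grows the binary by 39 MB (about 52%), while the tail sampler alone adds 3 MB (about 4%).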

@mahadzaryab1 mahadzaryab1 force-pushed the tail-based-sampling branch 3 times, most recently from 84a9b93 to 929c409 Compare August 24, 2024 23:28
@mahadzaryab1 mahadzaryab1 force-pushed the tail-based-sampling branch 2 times, most recently from 01c9743 to e15af38 Compare August 30, 2024 23:47
@@ -170,6 +170,10 @@ index-cleaner-integration-test: docker-images-elastic
index-rollover-integration-test: docker-images-elastic
	$(MAKE) storage-integration-test COVEROUT=cover-index-rollover.out

.PHONY: tail-sampling-integration-test
tail-sampling-integration-test:
	SAMPLING=tail $(MAKE) jaeger-v2-storage-integration-test
Member

This runs go test, but when do you start the Docker Compose environment?

All other e2e tests have a driver script that orchestrates all components of the test, e.g. scripts/es-integration-test.sh

Collaborator Author

I'm not using the docker-compose environment for my test. Calling e2eInitialize is enough to start the Jaeger collector. You can simply run this test by calling make tail-sampling-integration-test. Let me know if you want to change any of this setup though.
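
The Makefile target above passes `SAMPLING=tail` to the shared integration-test target. How the real test consumes that variable is not shown in this thread, so the following is only a sketch of one plausible gating pattern; `tailSamplingRequested` is a hypothetical helper, and injecting `getenv` is a design choice made here to keep the check testable without touching the process environment.

```go
package main

import (
	"fmt"
	"os"
)

// tailSamplingRequested reports whether the SAMPLING variable selects
// the tail-sampling test. The variable name comes from the Makefile
// target; the way the real test reads it is an assumption.
func tailSamplingRequested(getenv func(string) string) bool {
	return getenv("SAMPLING") == "tail"
}

func main() {
	if !tailSamplingRequested(os.Getenv) {
		fmt.Println("skipping: SAMPLING is not 'tail'")
		return
	}
	fmt.Println("running tail-sampling integration test")
}
```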

Member

Fair enough. However, it means that the new Docker Compose file will begin to rot since it's not being exercised by the CI, something we tried to avoid (e.g. see the e2e SPM test). So it would be good to actually combine Docker Compose with the e2e test.

Collaborator Author

@mahadzaryab1 mahadzaryab1 Aug 31, 2024

Ah, I see! What would you want this to look like? The current docker-compose setup generates load using tracegen, which ideally we wouldn't want in the integration test, where we can generate the traces manually. And the existing setup in the E2E tests does some nice things for us, like flushing the storage between tests.

Member

Well, that's why I was asking from the start what your plan would be. The way you are using the e2e_integration framework is very lightweight, and I could easily see an alternative setup where everything is just orchestrated from a shell script:

  • run docker-compose with one config
    • maybe don't include tracegen in compose, run it manually
  • do a curl against query service to retrieve service names as JSON (trivial to write)
  • shut down docker-compose (to clear the storage) and run again with different config

If you are interested in pursuing this approach, I would suggest still merging this PR first so that we already have something in place. Can you finish the README?
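
The "retrieve service names as JSON" step in the list above can be sketched as follows. This is an illustration under stated assumptions, not the follow-up implementation: the `{"data": [...]}` response envelope is an assumption about the query service's JSON output, and the service names `tracegen-01` and `tracegen-02` are made up. The sketch only covers parsing and checking the body; fetching it (with curl or an HTTP client) is left out so the example stays self-contained.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// servicesResponse models only the field this sketch needs. The
// {"data": [...]} envelope is an assumption about the response shape.
type servicesResponse struct {
	Data []string `json:"data"`
}

// parseServices extracts service names from a JSON response body.
func parseServices(body []byte) ([]string, error) {
	var resp servicesResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decode services response: %w", err)
	}
	return resp.Data, nil
}

// hasService reports whether want appears among the returned names.
func hasService(services []string, want string) bool {
	for _, s := range services {
		if s == want {
			return true
		}
	}
	return false
}

func main() {
	// Hypothetical body, as it might look after sampling kept one
	// service and dropped another.
	body := []byte(`{"data":["tracegen-01"]}`)
	services, err := parseServices(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("kept tracegen-01:", hasService(services, "tracegen-01"))
	fmt.Println("kept tracegen-02:", hasService(services, "tracegen-02"))
}
```

A check like this could run once per docker-compose configuration, asserting which services survive sampling under each config.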

Collaborator Author

@mahadzaryab1 mahadzaryab1 Aug 31, 2024

@yurishkuro That sounds good to me and I can pursue that approach in a follow-up PR. And yes, working on the README now. Will push it up soon.

@yurishkuro
Member

yurishkuro commented Aug 31, 2024

make sure to do git pull, I pushed updates to go.mod to resolve conflicts

@mahadzaryab1
Collaborator Author

> make sure to do git pull, I pushed updates to go.mod to resolve conflicts

Thank you so much for doing this for me!

@yurishkuro yurishkuro added changelog:exprimental Change to an experimental part of the code v2 labels Aug 31, 2024
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
@mahadzaryab1 mahadzaryab1 changed the title [WIP] feat: support tail based sampling from otel collector [WIP] feat: support tail based sampling from otel collector extension Aug 31, 2024
@mahadzaryab1 mahadzaryab1 changed the title [WIP] feat: support tail based sampling from otel collector extension feat(processor): support tail based sampling processor from otel collector extension Aug 31, 2024
@mahadzaryab1 mahadzaryab1 changed the title feat(processor): support tail based sampling processor from otel collector extension Support Tail Based Sampling Processor From Otel Collector Extension Aug 31, 2024
@mahadzaryab1 mahadzaryab1 marked this pull request as ready for review August 31, 2024 16:47
@mahadzaryab1 mahadzaryab1 requested a review from a team as a code owner August 31, 2024 16:47
@dosubot dosubot bot added area/otel docker Pull requests that update Docker code enhancement labels Aug 31, 2024
@mahadzaryab1
Collaborator Author

mahadzaryab1 commented Aug 31, 2024

@yurishkuro the README and the rest of the PR are ready for review now.

Member

@yurishkuro yurishkuro left a comment

LGTM, thanks!

@mahadzaryab1
Collaborator Author

@yurishkuro - there looks to be a failing test in the CI. Is it a flaky test? I don't believe it's related to my changes.

@mahadzaryab1 mahadzaryab1 changed the title Support Tail Based Sampling Processor From Otel Collector Extension Support Tail Based Sampling Processor From OTEL Collector Extension Aug 31, 2024
@yurishkuro yurishkuro merged commit 8ad6ed0 into jaegertracing:main Aug 31, 2024
50 checks passed
@yurishkuro
Member

🎉 🎉 🎉

@mahadzaryab1 mahadzaryab1 deleted the tail-based-sampling branch August 31, 2024 22:42
codeboten referenced this pull request in open-telemetry/opentelemetry-collector-contrib Sep 25, 2024
…35259)

This PR contains the following updates:

| Package | Change | Age | Adoption | Passing | Confidence |
|---|---|---|---|---|---|
| [github.com/jaegertracing/jaeger](https://redirect.github.com/jaegertracing/jaeger) | `v1.60.0` -> `v1.61.0` | [![age](https://developer.mend.io/api/mc/badges/age/go/github.com%2fjaegertracing%2fjaeger/v1.61.0?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![adoption](https://developer.mend.io/api/mc/badges/adoption/go/github.com%2fjaegertracing%2fjaeger/v1.61.0?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![passing](https://developer.mend.io/api/mc/badges/compatibility/go/github.com%2fjaegertracing%2fjaeger/v1.60.0/v1.61.0?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![confidence](https://developer.mend.io/api/mc/badges/confidence/go/github.com%2fjaegertracing%2fjaeger/v1.60.0/v1.61.0?slim=true)](https://docs.renovatebot.com/merge-confidence/) |

---

> [!WARNING]
> Some dependencies could not be looked up. Check the Dependency
Dashboard for more information.

---

### Release Notes

<details>
<summary>jaegertracing/jaeger
(github.com/jaegertracing/jaeger)</summary>

### [`v1.61.0`](https://redirect.github.com/jaegertracing/jaeger/releases/tag/v1.61.0): / v2.0.0-rc1

[Compare Source](https://redirect.github.com/jaegertracing/jaeger/compare/v1.60.0...v1.61.0)

##### Backend Changes

This release contains an official pre-release candidate of Jaeger v2, as
binary and Docker image `jaeger`.

##### ⛔ Breaking Changes

- Remove support for cassandra 3.x and add cassandra 5.x
([@mahadzaryab1](https://redirect.github.com/mahadzaryab1) in
[#5962](https://redirect.github.com/jaegertracing/jaeger/pull/5962))

##### 🐞 Bug fixes, Minor Improvements

- Fix: the 'tagtype' in es jaeger-span mapping tags.properties should be
'type' ([@chinaran](https://redirect.github.com/chinaran) in
[#5980](https://redirect.github.com/jaegertracing/jaeger/pull/5980))
- Add readme for adaptive sampling
([@yurishkuro](https://redirect.github.com/yurishkuro) in
[#5955](https://redirect.github.com/jaegertracing/jaeger/pull/5955))
- \[adaptive sampling] clean-up after previous refactoring
([@yurishkuro](https://redirect.github.com/yurishkuro) in
[#5954](https://redirect.github.com/jaegertracing/jaeger/pull/5954))
- \[adaptive processor] remove redundant function
([@yurishkuro](https://redirect.github.com/yurishkuro) in
[#5953](https://redirect.github.com/jaegertracing/jaeger/pull/5953))
- \[jaeger-v2] consolidate options and namespaceconfig for badger
storage
([@mahadzaryab1](https://redirect.github.com/mahadzaryab1) in
[#5937](https://redirect.github.com/jaegertracing/jaeger/pull/5937))
- Remove unused "namespace" field from badger config
([@yurishkuro](https://redirect.github.com/yurishkuro) in
[#5929](https://redirect.github.com/jaegertracing/jaeger/pull/5929))
- Simplify bundling of ui assets
([@mahadzaryab1](https://redirect.github.com/mahadzaryab1) in
[#5917](https://redirect.github.com/jaegertracing/jaeger/pull/5917))
- Clean up grpc storage config
([@yurishkuro](https://redirect.github.com/yurishkuro) in
[#5877](https://redirect.github.com/jaegertracing/jaeger/pull/5877))
- Add script to replace apache headers with spdx
([@thecaffeinedev](https://redirect.github.com/thecaffeinedev) in
[#5808](https://redirect.github.com/jaegertracing/jaeger/pull/5808))
- Add copyright/license headers to script files
([@Zen-cronic](https://redirect.github.com/Zen-cronic) in
[#5829](https://redirect.github.com/jaegertracing/jaeger/pull/5829))
- Clearer output from lint scripts
([@yurishkuro](https://redirect.github.com/yurishkuro) in
[#5820](https://redirect.github.com/jaegertracing/jaeger/pull/5820))

##### 🚧 Experimental Features

- \[jaeger-v2] add validation and comments to badger storage config
([@mahadzaryab1](https://redirect.github.com/mahadzaryab1) in
[#5927](https://redirect.github.com/jaegertracing/jaeger/pull/5927))
- \[jaeger-v2] add validation and comments to memory storage config
([@mahadzaryab1](https://redirect.github.com/mahadzaryab1) in
[#5925](https://redirect.github.com/jaegertracing/jaeger/pull/5925))
- Support tail based sampling processor from otel collector extension
([@mahadzaryab1](https://redirect.github.com/mahadzaryab1) in
[#5878](https://redirect.github.com/jaegertracing/jaeger/pull/5878))
- \[v2] configure health check extension for all configs
([@Wise-Wizard](https://redirect.github.com/Wise-Wizard) in
[#5861](https://redirect.github.com/jaegertracing/jaeger/pull/5861))
- \[v2] add legacy formats into e2e kafka integration tests
([@joeyyy09](https://redirect.github.com/joeyyy09) in
[#5824](https://redirect.github.com/jaegertracing/jaeger/pull/5824))
- \[v2] configure healthcheck extension
([@Wise-Wizard](https://redirect.github.com/Wise-Wizard) in
[#5831](https://redirect.github.com/jaegertracing/jaeger/pull/5831))
- Added \_total suffix to otel counter metrics.
([@Wise-Wizard](https://redirect.github.com/Wise-Wizard) in
[#5810](https://redirect.github.com/jaegertracing/jaeger/pull/5810))

##### 👷 CI Improvements

- Release v2 cleanup 3
([@yurishkuro](https://redirect.github.com/yurishkuro) in
[#5984](https://redirect.github.com/jaegertracing/jaeger/pull/5984))
- Replace loopvar linter
([@anishbista60](https://redirect.github.com/anishbista60) in
[#5976](https://redirect.github.com/jaegertracing/jaeger/pull/5976))
- Stop using v1 and v1.x tags for docker images
([@yurishkuro](https://redirect.github.com/yurishkuro) in
[#5956](https://redirect.github.com/jaegertracing/jaeger/pull/5956))
- V2 repease prep
([@yurishkuro](https://redirect.github.com/yurishkuro) in
[#5932](https://redirect.github.com/jaegertracing/jaeger/pull/5932))
- Normalize build-binaries targets
([@yurishkuro](https://redirect.github.com/yurishkuro) in
[#5924](https://redirect.github.com/jaegertracing/jaeger/pull/5924))
- Fix integration test log dumping for storage backends
([@mahadzaryab1](https://redirect.github.com/mahadzaryab1) in
[#5915](https://redirect.github.com/jaegertracing/jaeger/pull/5915))
- Add jaeger-v2 binary as new release artifact
([@renovate-bot](https://redirect.github.com/renovate-bot) in
[#5893](https://redirect.github.com/jaegertracing/jaeger/pull/5893))
- \[ci] add support for v2 tags during build
([@yurishkuro](https://redirect.github.com/yurishkuro) in
[#5890](https://redirect.github.com/jaegertracing/jaeger/pull/5890))
- Add hardcoded db password and username to cassandra integration test
([@Ali-Alnosairi](https://redirect.github.com/Ali-Alnosairi) in
[#5805](https://redirect.github.com/jaegertracing/jaeger/pull/5805))
- Define contents permissions on "dependabot validate" workflow
([@mmorel-35](https://redirect.github.com/mmorel-35) in
[#5874](https://redirect.github.com/jaegertracing/jaeger/pull/5874))
- \[fix] print kafka logs on test failure
([@joeyyy09](https://redirect.github.com/joeyyy09) in
[#5873](https://redirect.github.com/jaegertracing/jaeger/pull/5873))
- Pin github actions dependencies
([@harshitasao](https://redirect.github.com/harshitasao) in
[#5860](https://redirect.github.com/jaegertracing/jaeger/pull/5860))
- Add go.mod for docker debug image
([@hellspawn679](https://redirect.github.com/hellspawn679) in
[#5852](https://redirect.github.com/jaegertracing/jaeger/pull/5852))
- Enable lint rule: redefines-builtin-id
([@ZXYxc](https://redirect.github.com/ZXYxc) in
[#5791](https://redirect.github.com/jaegertracing/jaeger/pull/5791))
- Require manual go version updates for patch versions
([@wasup-yash](https://redirect.github.com/wasup-yash) in
[#5848](https://redirect.github.com/jaegertracing/jaeger/pull/5848))
- Clean up obselete 'version' tag from docker-compose files
([@vvs-personalstash](https://redirect.github.com/vvs-personalstash)
in
[#5826](https://redirect.github.com/jaegertracing/jaeger/pull/5826))
- Update expected codecov flags count to 19
([@yurishkuro](https://redirect.github.com/yurishkuro) in
[#5811](https://redirect.github.com/jaegertracing/jaeger/pull/5811))

##### 📊 UI Changes

Dependencies upgrades only.

##### 👏👏👏 New Contributors

- [@Nabil-Salah](https://redirect.github.com/Nabil-Salah) made
their first contribution in
[https://github.com/jaegertracing/jaeger/pull/5806](https://redirect.github.com/jaegertracing/jaeger/pull/5806)
-
[@vvs-personalstash](https://redirect.github.com/vvs-personalstash)
made their first contribution in
[https://github.com/jaegertracing/jaeger/pull/5826](https://redirect.github.com/jaegertracing/jaeger/pull/5826)
- [@Zen-cronic](https://redirect.github.com/Zen-cronic) made
their first contribution in
[https://github.com/jaegertracing/jaeger/pull/5821](https://redirect.github.com/jaegertracing/jaeger/pull/5821)
- [@thecaffeinedev](https://redirect.github.com/thecaffeinedev)
made their first contribution in
[https://github.com/jaegertracing/jaeger/pull/5808](https://redirect.github.com/jaegertracing/jaeger/pull/5808)
- [@wasup-yash](https://redirect.github.com/wasup-yash) made
their first contribution in
[https://github.com/jaegertracing/jaeger/pull/5848](https://redirect.github.com/jaegertracing/jaeger/pull/5848)
- [@ZXYxc](https://redirect.github.com/ZXYxc) made their first
contribution in
[https://github.com/jaegertracing/jaeger/pull/5791](https://redirect.github.com/jaegertracing/jaeger/pull/5791)
- [@harshitasao](https://redirect.github.com/harshitasao) made
their first contribution in
[https://github.com/jaegertracing/jaeger/pull/5860](https://redirect.github.com/jaegertracing/jaeger/pull/5860)
- [@Ali-Alnosairi](https://redirect.github.com/Ali-Alnosairi)
made their first contribution in
[https://github.com/jaegertracing/jaeger/pull/5805](https://redirect.github.com/jaegertracing/jaeger/pull/5805)
- [@chinaran](https://redirect.github.com/chinaran) made their
first contribution in
[https://github.com/jaegertracing/jaeger/pull/5891](https://redirect.github.com/jaegertracing/jaeger/pull/5891)
- [@mahadzaryab1](https://redirect.github.com/mahadzaryab1) made
their first contribution in
[https://github.com/jaegertracing/jaeger/pull/5878](https://redirect.github.com/jaegertracing/jaeger/pull/5878)

</details>

---------

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: opentelemetrybot <107717825+opentelemetrybot@users.noreply.github.com>
Labels
area/otel changelog:exprimental Change to an experimental part of the code docker Pull requests that update Docker code enhancement v2
Development

Successfully merging this pull request may close these issues.

Support tail-based sampling from OTEL Collector
2 participants