Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[zpage] initial implementation #595

Merged
merged 14 commits into from
Aug 4, 2021
Merged

Conversation

TommyCpp
Copy link
Contributor

@TommyCpp TommyCpp commented Jul 13, 2021

related #269.

Adding a tracez implementation to debug spans.

It consists of three parts: a ZPageSpanProcessor to install with the TracerProvider; a SpanAggregator to accumulate the sample spans; a channel to send query requests to SpanAggregator.

We probably can extract the protobuf related transform into a separate crate(opentelemetry-proto maybe?) to be shared between opentelemetry-otlp and opentelemetry-zpage.

Another limitation is we cannot record the status of the running spans as it requires we keep an immutable reference to the span. I don't think it's possible without introducing a lock inside the span. #602

Note to reviewers:

  • The transform.rs contains duplicate code from opentelemetry-otlp to transform between protobuf generated types and internal types.

@codecov
Copy link

codecov bot commented Jul 13, 2021

Codecov Report

Merging #595 (4ebc795) into main (93b5a8f) will not change coverage.
The diff coverage is 100.0%.

Impacted file tree graph

@@          Coverage Diff          @@
##            main    #595   +/-   ##
=====================================
  Coverage   54.7%   54.7%           
=====================================
  Files        101     101           
  Lines       8943    8943           
=====================================
  Hits        4897    4897           
  Misses      4046    4046           
Impacted Files Coverage Δ
opentelemetry/src/trace/span_context.rs 89.4% <100.0%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 93b5a8f...4ebc795. Read the comment docs.

@TommyCpp TommyCpp marked this pull request as ready for review July 25, 2021 01:39
@TommyCpp TommyCpp requested a review from a team July 25, 2021 01:39
Copy link
Member

@jtescher jtescher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a great start! Should we add a note about the experimental status in the readme? seems to be suggested in https://github.com/open-telemetry/opentelemetry-specification/tree/main/experimental

TommyCpp added 12 commits July 26, 2021 20:45
* Added proto.

* We use a dedicated SpanAggregator to collect information from ZPagesProcessor so that we could collect information from multiple span processor in the future should we needed to do so.

* Always copy SpanData information and send it to aggregator. It may seems unnecessary to refresh the information on running span whenever a span is started. But consider that if the span is generating too fast. Some span may fail to notify the aggregator they already ended. So to make sure no span is stuck as running span example. We refresh the running span example whenever there is a new span started.
We aggregate the information for running spans, error spans and spans in different latencies bucket into SpanStats to better manage them.
…ries.

1. We use counts to trace the total number of spans. It's different from the len because len function tracks the number of sampled spans.

2. Added examples on how to query tracez results.
@TommyCpp TommyCpp merged commit 0f864fe into open-telemetry:main Aug 4, 2021
@jtescher jtescher mentioned this pull request Jul 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants