Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Regression]: opentelemetry_sdk::TracerProvider::span_processors no longer public #2060

Closed
demurgos opened this issue Aug 28, 2024 · 6 comments · Fixed by #2119
Closed

[Regression]: opentelemetry_sdk::TracerProvider::span_processors no longer public #2060

demurgos opened this issue Aug 28, 2024 · 6 comments · Fixed by #2119
Labels
bug Something isn't working triage:todo Needs to be traiged.

Comments

@demurgos
Copy link
Contributor

What happened?

PR #1755 did some API clean-ups, but it was too aggressive and turned some useful APIs private. In particular opentelemetry_sdk::TracerProvider::span_processors was pub but is now only pub(crate).

I have an HTTP server that acts as an OTLP proxy. A tracer provider is configured with exporters. This tracers is then used either to directly write traces for the main server, or to forward traces received through POST /v1/traces/.
To forward already existing spans received through OTLP, you need access to the span_processors. This was previously exposed by opentelemetry_sdk::TracerProvider::span_processors in version 0.23.0 but since version 0.24.0 it is no longer exposed.

Please expose this method so it is possible to wrap a tracer provider into a higher level tracer provider adapter.

API Version

0.24.0

SDK Version

0.24.1

What Exporter(s) are you seeing the problem on?

OTLP

Relevant log output

error[E0624]: method `span_processors` is private
  --> crates/core/src/opentelemetry/mod.rs:28:43
   |
28 |       Self::Regular(provider) => provider.span_processors(),
   |                                           ^^^^^^^^^^^^^^^ private method
   |
  ::: /home/demurgos/.cargo/registry/src/index.crates.io-6f17d22bba15001f/opentelemetry_sdk-0.24.1/src/trace/provider.rs:92:5
   |
92 |     pub(crate) fn span_processors(&self) -> &[Box<dyn SpanProcessor>] {
   |     ----------------------------------------------------------------- private method defined here
@demurgos demurgos added bug Something isn't working triage:todo Needs to be traiged. labels Aug 28, 2024
@lalitb
Copy link
Member

lalitb commented Aug 28, 2024

opentelemetry_sdk::TracerProvider::span_processors() is not part of Otel specs, and also application is not supposed to directly invoke any methods from processor.

Though, the use-case is not very clear. So, the HTTP server receives the span-data as POST request, and it want to directly invoke SpanProcessor::on_start() and SpanProcessor::on_end() on that to further export to OTLP Collector ?

@demurgos
Copy link
Contributor Author

demurgos commented Aug 28, 2024

I'll clarify my use-case a bit more.

I work on a website for browser games. The architecture is such that each game has its own web server, and there is one extra server for platform features, such as user accounts or logs. The games are built such that they only talk to their DB or the platform server, never to any outside services. This helps with development as you can very easily run everything locally and don't have to deal with various integrations, they're all abstracted away.

For logs, we use OpenTelemetry as it has SDKs for many environments and allows us to use the same framework everywhere. It also has a solid model. The platform server is implemented in Rust and uses opentelemetry-rust.

When the platform starts, it reads its config and builds a tracer provider and configures it with the right exporters. In production we use the tonic GRPC exporter to send data to our observability service where you can then search and monitor things. But we also support configurations where traces are exported to stdout as JSON or human readable text, this is especially useful during local development. This tracer provider is used for all traces generated by the platform server and works great this way.

An extra feature of the platform server is that it implements the OTLP collector API. It means that game servers can send all their traces to the platform server: the platformer is the only place were exporters are actually configured. This is also used to implement authentication, rate limiting and ensure that the traces from each game have the proper attributes. This handler is the proxy that I'm talking about in my first message. At its core, all it does is:

  let span_processors: &[Box<dyn SpanProcessor>] = tracer_provider.span_processors();
  let spans: Vec<SpanData> = read_payload_and_do_checks();
  for span in spans {
    for processor in span_processors {
      processor.on_end(span.clone());
    }
  }

And finally here is a diagram summarizing how it works

graph LR;
    subgraph game
        gameCode --> gameTracer;
        gameTracer --> gameOtlpExporter;
    end
    subgraph platform
        pfCode --> pfTracer;
        pfTracer --> pfSpanProcessor;
        pfOtlpHandler --> pfSpanProcessor;
        pfSpanProcessor --> pfExporter;
    end
    subgraph telemetryBackend
        stdoutJson;
        stdoutHuman;
        grpcCollector;
    end
    
     gameOtlpExporter --> pfOtlpHandler;
        pfExporter --> stdoutJson;
        pfExporter --> stdoutHuman;
        pfExporter --> grpcCollector;
Loading

As a summary, I would like a way to use the tracer provider to export fully built SpanData values unchanged to its span processors/exporters. Going through the TracerProvider and Tracer traits adds a lot of noise and is more hacky than just passing the span data when you have them already. The span_processor method was perfect for this, and hiding it from consumers without an alternative means that it's now no longer possible to pass SpanData through, a deplorable loss in usability.

Regarding Otel Spec, my understanding is that they define a baseline recommended API but don't forbid you from exposing extra methods.

As a meta commentary, I would appreciate if the crate and libs remained modular and exposed utilities without forcing you to go through extra abstractions. For example the traceparent parsing and deserialization could be a pure method instead of also being entangled with retrieval from the propagator, but that's off-topic.

@lalitb
Copy link
Member

lalitb commented Aug 30, 2024

Thanks @demurgos for the details, it gives more clarity now.
@cijothomas @TommyCpp if you have comments regarding exposing the TracerProvider::span_processors() method, and letting applications explicitly call the processor.on_end(spandata) for the configured processors. We can also discuss this in community meeting.

@demurgos
Copy link
Contributor Author

demurgos commented Sep 15, 2024

I've been looking into it a bit more. I was able to solve my issue at a lower level. Instead of passing my exporters directly to the tracer provider builder using with_batch_exporter, I instead create the batch processor manually and place it in an Arc. It lets me clone the processor and pass one to the builder with with_span_processor and one to my proxy handler.

This feels cleaner overall, even if there's the extra Arc indirection. While looking into it I noticed an inconsistency between SimpleSpanProcessor and BatchSpanProcessor. It is possible to build a BatchSpanProcessor from the outside using the builder, but there is no way to build a SimpleSpanProcessor.

Based on this, my request would be to just expose SimpleSpanProcessor::new for consistency with BatchSpanProcessor. This should be a less controversial change that still enables my OTLP proxying use-case. Is it OK if I submit a PR for this?

demurgos added a commit to demurgos/opentelemetry-rust that referenced this issue Sep 16, 2024
This commit update the visibility of the method `SimpleSpanProcessor::new` from `pub(crate)` to `pub`. This enables consumers to create `SimpleSpanProcessor` values for use with `trace::provider::Builder::with_span_processor`.

In particular, this fixes a consistency issue: the `BatchSpanProcessor` struct may already be built thanks to its `pub` builder. With this change, both processors in this repo may be built manually.

Closes open-telemetry#2060
demurgos added a commit to demurgos/opentelemetry-rust that referenced this issue Sep 16, 2024
This commit update the visibility of the method `SimpleSpanProcessor::new` from `pub(crate)` to `pub`. This enables consumers to create `SimpleSpanProcessor` values for use with `trace::provider::Builder::with_span_processor`.

In particular, this fixes a consistency issue: the `BatchSpanProcessor` struct may already be built thanks to its `pub` builder. With this change, both processors in this repo may be built manually.

Closes open-telemetry#2060
@lalitb
Copy link
Member

lalitb commented Sep 16, 2024

I'm considering whether we should also remove the with_simple_processor and with_batch_processor API methods, as they are simply convenient wrappers around with_span_processor. It doesn't seem necessary to support multiple APIs for the same functionality. The same reasoning could apply for logs as well.

@demurgos
Copy link
Contributor Author

Agreed that these methods are not that useful. They could be removed, but it's a more involved change as it would also require updating every place where it's currently used.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working triage:todo Needs to be traiged.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants