Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: rack event baggage handling #965

Merged
merged 3 commits into from
May 8, 2024

Conversation

robertlaurin
Copy link
Contributor

@robertlaurin robertlaurin commented May 7, 2024

Fixes extracted baggage entries not being attached to the opentelemetry context.

@robertlaurin robertlaurin force-pushed the rack-event-baggage branch from 58ea3b6 to 0315130 Compare May 7, 2024 20:34
request.env[TOKENS_KEY] = register_current_span(span)
span_ctx = OpenTelemetry::Trace.context_with_span(span, parent_context: parent_context)
rack_ctx = OpenTelemetry::Instrumentation::Rack.context_with_span(span, parent_context: span_ctx)
request.env[TOKEN_KEY] = OpenTelemetry::Context.attach(rack_ctx)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to attach multiple context objects just a single context that captures:

  • The extracted context (this is where we get our baggage)
  • The current span context
  • The current rack span context

The *.context_with_span calls make use of the set_value context interface which always returns a new ctx object.

So instead of layering contexts, we're building a single one here that contains all the entries needed when we start a rack request.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL

@robertlaurin
Copy link
Contributor Author

robertlaurin commented May 7, 2024

To fix the untraced endpoint we'll need to update the utilities helper to allow for a non block structured way of setting the untraced ctx, as the key for the entry is currently private.

Once we have the right way of setting the untraced context then we can do something like the following

diff --git a/instrumentation/rack/lib/opentelemetry/instrumentation/rack/middlewares/event_handler.rb b/instrumentation/rack/lib/opentelemetry/instrumentation/rack/middlewares/event_handler.rb
index 4b283a4b..bc51077c 100644
--- a/instrumentation/rack/lib/opentelemetry/instrumentation/rack/middlewares/event_handler.rb
+++ b/instrumentation/rack/lib/opentelemetry/instrumentation/rack/middlewares/event_handler.rb
@@ -52,7 +52,12 @@ module OpenTelemetry
           # @param [Rack::Response] This is nil in practice
           # @return [void]
           def on_start(request, _)
-            return if untraced_request?(request.env)
+            if untraced_request?(request.env)
+              request.env[TOKEN_KEY] = OpenTelemetry::Context.attach(
+                OpenTelemetry::Common::Utilities.untraced_ctx
+              )
+              return
+            end
 
             parent_context = extract_remote_context(request)
             span = create_span(parent_context, request)

cc: @robbkidd

def detach_context(request)
return nil unless request.env[TOKEN_KEY]

OpenTelemetry::Trace.current_span.finish
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lack of symmetry (with on_start's use of Foo.context_with_span) threw me quite a bit here. Nothing to be done - just noting my confusion. Basically down to on_start managing contexts explicitly, whereas here we use the implicit "current context".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it would make sense to store a reference to the rack span in the rack request env so we can close it later, rather than relying on in the implicit current_span.

With the current approach we're vulnerable to a scenario where some misconfigured or buggy instrumentation opening a span, setting the context, and not closing it... right? It's probably not likely, but still plausible.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the Rack span is not the current span at this point then it means the context token is out of sync also and we will likely get a detach error.

If this tracked the token and the Context together as a tuple then it would fetch the Rack span from it instead and ensure it is closing the right span and the token/Context would be the same.

If something went wrong with a different implicit "current" context not being detached, it would still result in a detach error but at least the Rack span would be the one we are trying to finish.
e.g.

return nil unless request.env[RACK_OTEL_CONTEXT_KEY]

[token, rack_ctx] = request.env[RACK_OTEL_CONTEXT_KEY]

OpenTelemetry::Trace.current_span(rack_ctx).finish
OpenTelemetry::Context.detach(token)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that makes sense to me, see 0b666b9

request.env[TOKENS_KEY] = register_current_span(span)
span_ctx = OpenTelemetry::Trace.context_with_span(span, parent_context: parent_context)
rack_ctx = OpenTelemetry::Instrumentation::Rack.context_with_span(span, parent_context: span_ctx)
request.env[RACK_OTEL_CONTEXT_KEY] = [OpenTelemetry::Context.attach(rack_ctx), rack_ctx]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need both the token and the context here? 👀 Oh, because you need to fetch the span from the context to finish it, and use the token to detach. Can we either document the content of request.env[RACK_OTEL_CONTEXT_KEY] or rename the key to something that better describes the content?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naive question: do we need the context? Could we just use the span? I.e.

            rack_ctx = OpenTelemetry::Instrumentation::Rack.context_with_span(span, parent_context: parent_context)
            token = OpenTelemetry::Context.attach(rack_ctx)
            request.env[OTEL_TOKEN_AND_SPAN] = [token, span]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can just use the span dfcbfb2

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suggestion for using the context may be speculative generality based on the rest of the OTel API always wanting or requiring a context (usually defaulting to current).

No objections from me. If we need the context in the future we can change this.

Comment on lines 197 to 201
return nil unless request.env[RACK_OTEL_CONTEXT_KEY]

token, rack_ctx = request.env[RACK_OTEL_CONTEXT_KEY]
OpenTelemetry::Trace.current_span(rack_ctx).finish
OpenTelemetry::Context.detach(token)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we store the token and span instead:

Suggested change
return nil unless request.env[RACK_OTEL_CONTEXT_KEY]
token, rack_ctx = request.env[RACK_OTEL_CONTEXT_KEY]
OpenTelemetry::Trace.current_span(rack_ctx).finish
OpenTelemetry::Context.detach(token)
return nil unless request.env[OTEL_TOKEN_AND_SPAN]
token, span = request.env[OTEL_TOKEN_AND_SPAN]
span.finish
OpenTelemetry::Context.detach(token)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 dfcbfb2

Copy link
Contributor

@fbogsany fbogsany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 LGTM

@arielvalentin arielvalentin merged commit cdae50a into open-telemetry:main May 8, 2024
51 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants