Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: rack event baggage handling #965

Merged
merged 3 commits into from
May 8, 2024
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -42,7 +42,7 @@ module Middlewares
class EventHandler
include ::Rack::Events::Abstract

TOKENS_KEY = 'otel.context.tokens'
TOKEN_KEY = 'otel.context.token'
GOOD_HTTP_STATUSES = (100..499)

# Creates a server span for this current request using the incoming parent context
Expand All @@ -56,7 +56,9 @@ def on_start(request, _)

parent_context = extract_remote_context(request)
span = create_span(parent_context, request)
request.env[TOKENS_KEY] = register_current_span(span)
span_ctx = OpenTelemetry::Trace.context_with_span(span, parent_context: parent_context)
rack_ctx = OpenTelemetry::Instrumentation::Rack.context_with_span(span, parent_context: span_ctx)
request.env[TOKEN_KEY] = OpenTelemetry::Context.attach(rack_ctx)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to attach multiple context objects just a single context that captures:

  • The extracted context (this is where we get our baggage)
  • The current span context
  • The current rack span context

The *.context_with_span calls make use of the set_value context interface which always returns a new ctx object.

So instead of layering contexts, we're building a single one here that contains all the entries needed when we start a rack request.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL

rescue StandardError => e
OpenTelemetry.handle_error(exception: e)
end
Expand Down Expand Up @@ -108,7 +110,7 @@ def on_finish(request, response)
rescue StandardError => e
OpenTelemetry.handle_error(exception: e)
ensure
detach_contexts(request)
detach_context(request)
end

private
Expand Down Expand Up @@ -191,11 +193,11 @@ def request_span_attributes(env)
attributes
end

def detach_contexts(request)
request.env[TOKENS_KEY]&.reverse_each do |token|
OpenTelemetry::Context.detach(token)
OpenTelemetry::Trace.current_span.finish
end
def detach_context(request)
return nil unless request.env[TOKEN_KEY]

OpenTelemetry::Trace.current_span.finish
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lack of symmetry (with on_start's use of Foo.context_with_span) threw me quite a bit here. Nothing to be done - just noting my confusion. Basically down to on_start managing contexts explicitly, whereas here we use the implicit "current context".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it would make sense to store a reference to the rack span in the rack request env so we can close it later, rather than relying on in the implicit current_span.

With the current approach we're vulnerable to a scenario where some misconfigured or buggy instrumentation opening a span, setting the context, and not closing it... right? It's probably not likely, but still plausible.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the Rack span is not the current span at this point then it means the context token is out of sync also and we will likely get a detach error.

If this tracked the token and the Context together as a tuple then it would fetch the Rack span from it instead and ensure it is closing the right span and the token/Context would be the same.

If something went wrong with a different implicit "current" context not being detached, it would still result in a detach error but at least the Rack span would be the one we are trying to finish.
e.g.

return nil unless request.env[RACK_OTEL_CONTEXT_KEY]

[token, rack_ctx] = request.env[RACK_OTEL_CONTEXT_KEY]

OpenTelemetry::Trace.current_span(rack_ctx).finish
OpenTelemetry::Context.detach(token)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that makes sense to me, see 0b666b9

OpenTelemetry::Context.detach(request.env[TOKEN_KEY])
rescue StandardError => e
OpenTelemetry.handle_error(exception: e)
end
Expand Down Expand Up @@ -244,15 +246,6 @@ def config
OpenTelemetry::Instrumentation::Rack::Instrumentation.instance.config
end

def register_current_span(span)
ctx = OpenTelemetry::Trace.context_with_span(span)
rack_ctx = OpenTelemetry::Instrumentation::Rack.context_with_span(span, parent_context: ctx)

contexts = [ctx, rack_ctx]
contexts.compact!
contexts.map { |context| OpenTelemetry::Context.attach(context) }
end

def create_span(parent_context, request)
span = tracer.start_span(
create_request_span_name(request),
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -84,6 +84,25 @@
_(proxy_event).must_be_nil
end

describe 'when baggage is set' do
let(:headers) do
Hash(
'baggage' => 'foo=123'
)
end

let(:service) do
lambda do |_env|
_(OpenTelemetry::Baggage.raw_entries['foo'].value).must_equal('123')
[200, { 'Content-Type' => 'text/plain' }, response_body]
end
end

it 'sets baggage in the request context' do
_(rack_span.name).must_equal 'HTTP GET'
end
end

describe 'when a query is passed in' do
let(:uri) { '/endpoint?query=true' }

Expand Down