Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch from http.flavor to net.protocol.* in HTTP client instrumentat… #8131

Merged

Conversation

mateuszrzeszutek
Copy link
Member

…ions

Implements half of the open-telemetry/opentelemetry-specification#3272 spec PR; and half of the #4519 issue.

@mateuszrzeszutek mateuszrzeszutek requested a review from a team March 24, 2023 15:19
@mateuszrzeszutek mateuszrzeszutek force-pushed the net.protocol-client branch 2 times, most recently from dc91435 to 84f2c85 Compare March 27, 2023 06:47
String protocolName = getter.getProtocolName(request, response);
if (protocolName != null) {
internalSet(
attributes, NetAttributes.NET_PROTOCOL_NAME, protocolName.toLowerCase(Locale.ROOT));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you think of moving toLowerCase into the getProtocolName implementations that need it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was my initial idea, but I've decided to have the conversion here because of how this attribute is described in the spec:

Application layer protocol used. The value SHOULD be normalized to lowercase.

I'm thinking that any normalization should probably be done in the extractor; there's a similarity to the SQL sanitization, which also happens in the extractor.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Application layer protocol used. The value SHOULD be normalized to lowercase.

I think of the semantic convention definitions as applying to the getter methods, and I think it could become slightly more confusing if some parts of the semantic convention are handled in the extractor and some parts are handled in the getter methods.

I think the existing SQL sanitization is a bit ok, because in that case the getter method is getRawStatement() as opposed to getStatement()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think of the semantic convention definitions as applying to the getter methods, and I think it could become slightly more confusing if some parts of the semantic convention are handled in the extractor and some parts are handled in the getter methods.

That's a good point.
My idea about the extractors and semantic conventions was: put as much as possible into extractors, and make implementing the getters as easy as possible - someone who uses the instrumentation api to instrument their own library should not have to know the intricacies of our semantic conventions, the extractors are kinda "smart" and can massage the data a bit. We already do things like that with e.g. removing auth info from http.url, or the conditional on the net.sock.family (skip if default inet).

@Override
public String getProtocolName(HttpURLConnection connection, @Nullable Integer integer) {
// HttpURLConnection hardcodes the protocol name&version
return "http";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we not capture since this is the default? cc @lmolkova

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, great catch!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify: should we not capture http at all; not capture http 1.1; or something else?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my reading is that we should capture the version (1.1) if we know it, but not the protocol

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, why? The net.protocol.name attribute is (or will be) shared across different semantic conventions, it's not like it has a constant default value that we can just skip (like net.sock.family has inet). IMO knowing that version is 1.1 without knowing the name of protocol sends a somewhat unclear message - you can guess that it actually is http by looking at the other attributes though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. So we don't have a good way to enforce defaults for net attributes on higher-level instrumentations, right?
I.e. populating net.protocol.version = http on HTTP spans is pure redundancy, but then we'd have to have http-specific implementation of net attribute extractor

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I.e. populating net.protocol.version = http on HTTP spans is pure redundancy, but then we'd have to have http-specific implementation of net attribute extractor

I'm actually thinking about having the HTTP getter extend the net getter (just as http semconv kind of "extends" the net semconv); but even in that case I wouldn't say that it's 100% redundant. There are some HTTP clients that support QUIC or SPDY - and while I think that almost nobody actually uses these protocols, there is still a possibility to use them, and the instrumentation should be prepared to reflect that.

default String getProtocolVersion(REQUEST request, @Nullable RESPONSE response) {
return null;
}
String getProtocolVersion(REQUEST request, @Nullable RESPONSE response);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lmolkova do you have any thought how common the protocol name/version will be, and whether we should have default methods for these (similar to the sock attributes below), or whether it's good to force implementers to think about / implement these?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW sort of related: I've been thinking of making getTransport() optional, since there are few instrumentations that actually know the transport protocol (instead of just hardcoding TCP_IP and hoping it's correct)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it makes sense for protocol information to be left unset by default - it does not seem that essential and instrumentations that care can override.

wrt transport - I think we removed it from HTTP spec completely. HTTP, gRPC, and most of the other protocols would always have TCP there, maybe we should just add a default value to the spec?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HTTP/3 uses UDP. gRPC offers an in-process transport for which the hardcoded IP_TCP is probably wrong.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll make the transport optional in a separate PR -- and the protocol attributes in this PR, since it kinda reverts some of my changes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HTTP/3 uses UDP. gRPC offers an in-process transport for which the hardcoded IP_TCP is probably wrong.

Agree, I think the default on the instrumentation side should be null. Spec can have language (for the backends) to assume transport from protocol name/version. E.g. HTTP/3 is UDP, there is no need to duplicate it on telemetry.

For less popular protocols, it might make sense to set both protocol name/version AND transport, but their specs can require transport if needed.

mateuszrzeszutek pushed a commit that referenced this pull request Apr 3, 2023
I applied [this
comment](#8131 (comment))
to the whole codebase and removed some `Optional`s that were used in the
hot path
trask added a commit that referenced this pull request Apr 3, 2023
Based on #8190 and
#8131 (comment)

---------

Co-authored-by: Mateusz Rzeszutek <mrzeszutek@splunk.com>
Copy link
Member

@trask trask left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mateuszrzeszutek thanks! can you open an issue(s) for potentially open questions?

let's merge if you think you have time for server side before the release next week, otherwise let's hold this PR until after the release

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants