Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not recommend URL path as acceptable span name #270

Closed
yurishkuro opened this issue Sep 29, 2019 · 6 comments · Fixed by #416
Closed

Do not recommend URL path as acceptable span name #270

yurishkuro opened this issue Sep 29, 2019 · 6 comments · Fixed by #416
Milestone

Comments

@yurishkuro
Copy link
Member

https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/data-semantic-conventions.md says

Given an RFC 3986 compliant URI of the form scheme:[//authority]path[?query][#fragment], the span name of the span SHOULD be set to to the URI path value.

I think this is a very bad recommendation. In many REST APIs it is customary to encode entity IDs in the path, such as /api/customer/23476253654/address/237464. Using this as a span name creates very high cardinality set and defeats the purpose of span name identifying the type of operation in a service rather than a specific instance of the operation.

Proposal: change the spec per the following:

For server endpoints the OpenTracing spec recommended using the "route", which could look like /api/customer/{custId}/address/{addrId} (literally). This gives a stable, immutable string identifying the endpoint/operation.

For client spans, it is usually difficult for instrumentation to have access to similarly stable string, because all you often get is a URL. At the same time, the name of the client span is usually far less important, since there's always a corresponding well-defined server-side name. So using HTTP method could be enough as client span name. Some instrumentation libraries allow configuring span name extractor, where business logic can be added to the application to match the URL to a stable string, or perhaps pull some extra data from the request or headers.

@carlosalberto
Copy link
Contributor

This sounds good to me. I remember seeing that in instrumentation in the past, so it's better to clarify that ;)

@bogdandrutu bogdandrutu added this to the Alpha v0.3 milestone Sep 30, 2019
@tedsuo
Copy link
Contributor

tedsuo commented Oct 4, 2019

Should we also recommend that the HTTP method be used, when the route is not available? That is common in OpenTracing.

@yurishkuro
Copy link
Member Author

Sure

@Oberon00
Copy link
Member

Oberon00 commented Oct 7, 2019

I can think of the following possible names for client and server spans (in absence of route information):

  • Just the method, e.g. GET (I think that's what @yurishkuro suggests)
  • Using <method> <hostname>, e.g. GET localhost:5000
  • At the client side: Using the format string if known (e.g. http://localhost:5000/articles/%d/update; it's debatable if only the URI path part of the template string should be used).

Regarding:

At the same time, the name of the client span is usually far less important, since there's always a corresponding well-defined server-side name.

I don't fully agree with that statement, as there will be external requests in most applications, e.g. to some payment provider or Google API, etc. where you won't get any corresponding server-side Span.

@yurishkuro
Copy link
Member Author

I don't fully agree with that statement, as there will be external requests in most applications

Fair enough, but my point is that an instrumentation in a lower-level client library would never know how to generate the span name correctly. In opentracing-contrib modules for HTTP clients there are configuration options that allow the user to register "span name setter", which could be "just take the URI path" if the server is not encoding entity IDs in the path, or it could be something completely custom. But the default configuration is never "take the URI path", it's simply "take http method", as the safest approach (method + server host name is also not recommended as a default, because the latter very well may be high cardinality).

@jmacd
Copy link
Contributor

jmacd commented Jan 22, 2020

This will close soon with #416.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants