Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal for modeling nested CLIENT instrumentation #1822

Closed
trask opened this issue Dec 3, 2020 · 7 comments
Closed

Proposal for modeling nested CLIENT instrumentation #1822

trask opened this issue Dec 3, 2020 · 7 comments
Labels
enhancement New feature or request

Comments

@trask
Copy link
Member

trask commented Dec 3, 2020

Since CLIENT spans cannot have nested CLIENT spans, we suppress the nested http client instrumentation, which seems reasonable, as user is interacting with the higher level library, and is probably most interested in seeing things at that level.

But sometimes the nested instrumentation can provide interesting telemetry.

For example, Elasticsearch CLIENT spans are modeled as database spans, but makes HTTP calls under the covers using an instrumented http client, and it may be interesting to see the lower level HTTP telemetry.

I'm not sure if Elasticsearch CLIENT spans are always 1-1 with HTTP calls, but it would be reasonable for some database CLIENT spans to make multiple HTTP calls under the covers.

As we discussed in SIG meeting yesterday, if we want to capture the lower level telemetry, a nice option would be to use events to represent the lower level HTTP calls.

@trask trask added the enhancement New feature or request label Dec 3, 2020
@iNikem
Copy link
Contributor

iNikem commented Dec 3, 2020

Should we formalise these events on spec level?

@johnbley
Copy link
Member

johnbley commented Dec 4, 2020

This "layering" or "nesting" views of protocols occurs quite a bit and will occur more and more as we add more instrumentation. A few thoughts that will probably only serve to confuse matters more:

  • Depth: It might occur more than two layers deep: consider what happens if we add Socket instrumentation
    to time the connect sequence. Database API -> Http library -> Socket connection

  • Necessity: Each layer has interesting observability information to add. Consider questions like "did this
    high-level API call result in a new network connection or did it use one from a pool?" or "did this operation connect
    to the /v2 endpoint or the /v1?". We need data from all layers to answer these sorts of questions.

  • Naming: Typically the best (most meaningful / best semantics / appropriate cardinality) names will come from the
    highest level, though allowing multiple "client" spans (not necessarily kind=client) allows each level to express names
    as it sees them and UI/backends can coalesce/combine as they see fit. Flattening to a single span would complicate matters.

  • Propagation and parentage: are very tricky. In the database api -> http library -> socket chain, only the http
    instrumentation might know how to propagate in a way that will work. In a message queue API -> http transport chain injecting into the message metadata is critical since that is what will be persisted/carried through the queueing system, but an "instantaneous" view of "why did this queue insert take so long?" cares about the specific HTTP interaction for the post, and so we might also want to propagate context on the http headers. What parentSpanId is propagated? A notion of "propagation family/protocol" should be developed so that it is possible to encode rules such as "if another http instrumentation has already injected on this request, do not inject again" while allowing different propagation layers/protocols to coexist.

  • Data structures: I recommend we pursue some form of combining/allowing all layers of instrumentation to co-exist
    and report their data. "Flattening" lower spans into the "first"/highest client span seems like a reasonable step, but it might get crowded and difficult - see "naming" above but also consider how to flatten attributes (colliding keys) and events (ditto). Additionally, it creates a "this instrumentation thinks it is creating a span but is actually writing to a view that materializes in another span" source of complexity in the codebase. If allowing nestedkind=client spans is not an option, then perhaps (a) have newer/lower client spans "reach up" and modify the kind of parent spans so only the lowest one is kind=client or (b) Delay marking kind until a span is ending, at which point it can know if another client span was created below it, or (c) allow kind=INTERNAL under kind=CLIENT?

  • Modeling: Consider again the message queue -> http chain and that semantically these are in fact two distinct operation - the first is PRODUCER and the second is CLIENT. In a previous product we allowed distinct layering and the equivalent of multiple kind=CLIENT stacked on top of each other. This led to some difficulties and weird recursive matching algorithms for linking views/models of data (e.g., client has instrumentation for SOAP, HTTP, and TCP layers, but server receiving the request only has HTTP), but it was manageable and the data was clear and complete on an agent level.

Finally, I'll say that any form of modeling/representation that isn't "let's pretend this lower level of visibility never happened" would be an improvement, and so I'll support any movement in that direction.

@jkwatson
Copy link
Contributor

I don't think that events are going to solve this problem completely, as I think @johnbley is also suggesting. Events will absolutely be the appropriate way to tackle some issues (like looking up connections from a connection pool), but not everything.

I think we'll need to tackle this at the spec level to truly solve the issue.

Options I see:

  1. Allow INTERNAL spans below CLIENT
  2. Allow sub-spans to update the kind of their parent (kind is not currently writable after a span has been started)
  3. Introduce a new span kind that can live below CLIENT

@trask
Copy link
Member Author

trask commented May 15, 2021

Linking in a good and relevant comment from @agoallikmaa #2923 (comment)

@jkwatson
Copy link
Contributor

Note, the spec has now been updated to allow nested CLIENT instrumentation, so this issue probably needs some re-thought.

@iNikem
Copy link
Contributor

iNikem commented Sep 20, 2021

@open-telemetry/java-approvers I propose to close this issue as solved by https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/docs/suppressing-instrumentation.md#enable-instrumentation-suppression-by-type

@trask
Copy link
Member Author

trask commented Sep 21, 2021

👍 additional proposals/discussions can be opened as new issues

@trask trask closed this as completed Sep 21, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

5 participants