A Span exhibiting properties of multiple SpanKinds #413

Open
eyakimov-bbg opened this issue Jan 20, 2020 · 8 comments
Labels
area:api Cross language API specification issue area:semantic-conventions Related to semantic conventions release:after-ga Not required before GA release, and not going to work on before GA spec:trace Related to the specification/trace directory

Comments

@eyakimov-bbg

According to this part of the specification (Tracing API - SpanKind), I understand that a Span has exactly one SpanKind from the set Client, Server, Producer, Consumer, Internal.

However, I have a fairly common scenario where a single Span exhibits the characteristics of two kinds.

Consider the following scenario:

A server (say gRPC) handles a request. During this request it pushes a message to a message queue, which is picked up by a downstream service. The main server then closes the request, as it no longer cares about what happens next.

In this scenario, the operation exhibits characteristics of both SERVER and PRODUCER.
One COULD model it today as a Server span with a nested Producer span, but this requires creating a second span:

server:
Span(kind=Server)
  Span(kind=Producer)
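
For illustration, the two-span model above could be sketched in Python roughly as follows. This is a toy model and not the OpenTelemetry API: the `Span` class, its `start_child` method, and the span names are hypothetical and exist only to show the parent/child shape.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class SpanKind(Enum):
    # The five kinds named in the specification.
    CLIENT = "client"
    SERVER = "server"
    PRODUCER = "producer"
    CONSUMER = "consumer"
    INTERNAL = "internal"


@dataclass(eq=False)
class Span:
    # Hypothetical stand-in for a tracing span: one name, exactly one kind.
    name: str
    kind: SpanKind
    parent: Optional["Span"] = None
    children: List["Span"] = field(default_factory=list)

    def start_child(self, name: str, kind: SpanKind) -> "Span":
        # Create a nested span and register it under this one.
        child = Span(name=name, kind=kind, parent=self)
        self.children.append(child)
        return child


# The server framework opens the SERVER span for the incoming request.
server_span = Span("handle_request", SpanKind.SERVER)
# Application code opens a nested PRODUCER span when it enqueues the message.
producer_span = server_span.start_child("enqueue_message", SpanKind.PRODUCER)
```

Each span keeps a single kind here; the cost is the extra `Span` object for the enqueue step.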

Currently we use a single Span for this (our tracing system has no such 'kind' property today). Ideally I would like to keep capturing it in a single Span, as it's a single logical "unit of work", albeit one that exhibits multiple characteristics.

Perhaps modelling this property as a set of "roles" (kinds) of a Span could be an alternative way of capturing this. The example would become:

server:
Span(roles=[Server, Producer])

(We could of course just pluralise kind -> kinds.)

An additional requirement would be that such a property can be set AFTER Span creation, because the server framework which creates the span would not be aware that the application code will also end up exhibiting Producer characteristics.
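
A minimal sketch of what the "roles" idea might look like, assuming a span that carries a mutable set of kinds. Nothing here exists in OpenTelemetry: `RoleSpan` and `add_role` are hypothetical names used only to illustrate the proposal.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Set


class SpanKind(Enum):
    CLIENT = "client"
    SERVER = "server"
    PRODUCER = "producer"
    CONSUMER = "consumer"
    INTERNAL = "internal"


@dataclass
class RoleSpan:
    # Hypothetical span that carries a set of roles instead of a single kind.
    name: str
    roles: Set[SpanKind] = field(default_factory=set)

    def add_role(self, role: SpanKind) -> None:
        # Roles can be added after creation, e.g. by application code.
        self.roles.add(role)


# The server framework creates the span and only knows about the SERVER role.
span = RoleSpan("handle_request", {SpanKind.SERVER})
# Later, application code publishes a message and records the extra role.
span.add_role(SpanKind.PRODUCER)
```

After the second call, `span.roles` holds both `SpanKind.SERVER` and `SpanKind.PRODUCER`.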

@Oberon00
Member

If you look at the semantic conventions, you will see that squeezing this into a single span gets difficult. E.g. for handling an incoming HTTP request in your server, you would set net.peer.ip to the IP of the client that sent the request. For pushing the message to the queue, you would set net.peer.ip to the IP of the messaging system's node you connect to. Also, if you used Client instead of Producer, it would be impossible to detect whether the HTTP attributes are from an incoming HTTP request (server) or an outgoing HTTP request (client).
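
The net.peer.ip clash can be shown with plain dictionaries standing in for span attributes. This is only a sketch: the IP addresses are made-up documentation addresses and the dictionaries are not a real span API.

```python
# Hypothetical addresses, for illustration only.
CLIENT_IP = "203.0.113.7"    # peer that sent the incoming HTTP request
BROKER_IP = "198.51.100.20"  # messaging node the server connects to

# One span: both operations write the same attribute key.
single_span = {}
single_span["net.peer.ip"] = CLIENT_IP  # set while handling the request
single_span["net.peer.ip"] = BROKER_IP  # set while publishing; replaces the client IP

# Two spans: each operation keeps its own peer.
server_span = {"kind": "SERVER", "net.peer.ip": CLIENT_IP}
producer_span = {"kind": "PRODUCER", "net.peer.ip": BROKER_IP}
```

With one span, only the broker address survives; with two spans, both peers remain visible.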

There is a slightly related issue #51 "Span.Kind.LOAD_BALANCER (PROXY/SIDECAR)" that proposes a new SpanKind for things that just reroute messages. But in this case, I would suggest either:

  1. Stick to the OpenTelemetry concepts and use two spans for two (nested) operations.
  2. Ignore semantic conventions altogether, just use any SpanKind and ignore it on your back end.

@eyakimov-bbg
Author

Great point. As we don't use the conventions today, we haven't encountered the name-clashing issue yet. Perhaps a more obvious example of this would be:

Span(kind=Server)
  Span(kind=Producer)
  Span(kind=Producer)

If one treated this as a single span, naming conflicts would be unavoidable when following the semantic conventions.

Perhaps it would instead be appropriate to mention this recommendation in both the specification and the semantic conventions documents: specifically, to encourage creating separate Client/Producer spans wherever a single Span would otherwise cover both operations.

@aphelionz

aphelionz commented Mar 25, 2020

Another thought related to this is a SpanKind inside of a peer-to-peer system, where it's a hybrid of CLIENT + SERVER, and possibly PRODUCER + CONSUMER as well. A SpanKind of PEER or P2P would probably be sufficient.

Edit: markdown formatting

@dyladan
Member

dyladan commented Mar 25, 2020

Even in a peer-to-peer system, for each RPC there is one peer acting as a client and one or more peers acting as a server.

@aphelionz

aphelionz commented Mar 25, 2020

One example we have in OrbitDB is the "heads exchange": when one peer connects with another, they exchange their latest database OPLOG heads in the form of IPFS hashes. I'd like to keep all of this at least under the same parent span so I can see the full lifecycle of a database in the full p2p system. (I'm just using INTERNAL for now.)

I admit that assigning client or peer to one feels arbitrary, but I suppose I could do it by time? It's also slightly hard to tell which peer initiated the connection since they're both actively swarming. I'm not necessarily disagreeing, more just looking for how to get my head around this. I tend to think of p2p connections as bidirectional.

BTW looking forward to your talk at Observe2020 in a few weeks. A shame we can't meet up in person 👍

Edit: I also don't want to hijack this thread so I'm happy to open a different issue or whatever works

@Oberon00
Member

I'd say: If you initiate the connection, you are the client, otherwise you are the server. If you do both, you have nested spans.
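
As a rough sketch, that rule could be written as a small helper. The function name and its boolean parameters are hypothetical; it returns one entry per span, and how those spans nest is left to the instrumentation.

```python
from typing import List


def kinds_for_peer(initiated: bool, accepted: bool) -> List[str]:
    """Span kinds one peer would record, one span per role it played."""
    kinds = []
    if accepted:
        kinds.append("SERVER")  # the other peer opened a connection to us
    if initiated:
        kinds.append("CLIENT")  # we opened a connection to the other peer
    return kinds
```

A peer that only dials out gets a single CLIENT span, a peer that only accepts gets a single SERVER span, and a peer that does both ends up with two spans rather than one hybrid span.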

@bogdandrutu bogdandrutu added the spec:trace Related to the specification/trace directory label Jun 12, 2020
@bogdandrutu bogdandrutu added area:api Cross language API specification issue area:semantic-conventions Related to semantic conventions labels Jun 26, 2020
@carlosalberto carlosalberto added the release:after-ga Not required before GA release, and not going to work on before GA label Jul 2, 2020
@gabrieljones

I have several microservices that consume from one Kafka topic and then publish to a different Kafka topic in the same span. I am running into the clash with semantic conventions. In my case each span has a tiny but not insignificant internal accounting cost, so modeling this as a parent and child span effectively doubles my tracing costs.

I am thinking of adding the following tags

  • messaging.source_system
  • messaging.destination_system
  • messaging.source correlates with messaging.destination
  • messaging.source_kind correlates with messaging.destination_kind
  • messaging.source_* for all other messaging tags
  • messaging.destination_* for all other messaging tags

Reference: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/messaging.md#messaging-attributes
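
A sketch of how that renaming could work, assuming the rules listed above. The helper `rename_for_side` and the topic names are hypothetical, and the resulting keys are this custom scheme, not part of the semantic conventions.

```python
from typing import Dict

PREFIX = "messaging."


def rename_for_side(attrs: Dict[str, str], side: str) -> Dict[str, str]:
    """Rewrite messaging.* keys for one side ("source" or "destination")."""
    if side not in ("source", "destination"):
        raise ValueError(f"unknown side: {side!r}")
    renamed = {}
    for key, value in attrs.items():
        if not key.startswith(PREFIX):
            renamed[key] = value  # leave non-messaging attributes untouched
            continue
        suffix = key[len(PREFIX):]
        if suffix == "destination":
            # messaging.destination -> messaging.source / messaging.destination
            new_key = PREFIX + side
        elif suffix.startswith("destination_"):
            # messaging.destination_kind -> messaging.<side>_kind
            new_key = PREFIX + side + suffix[len("destination"):]
        else:
            # every other tag, e.g. messaging.system -> messaging.<side>_system
            new_key = f"{PREFIX}{side}_{suffix}"
        renamed[new_key] = value
    return renamed


# Hypothetical attributes for the consume side and the publish side.
consumed = {
    "messaging.system": "kafka",
    "messaging.destination": "orders",
    "messaging.destination_kind": "topic",
}
published = {
    "messaging.system": "kafka",
    "messaging.destination": "invoices",
    "messaging.destination_kind": "topic",
}

# Both sides now fit on one span without key collisions.
merged = {
    **rename_for_side(consumed, "source"),
    **rename_for_side(published, "destination"),
}
```

The merged dictionary carries six distinct keys, so both topics survive on a single span, at the price of attribute names that standard tooling will not recognise.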

@Oberon00
Member

Oberon00 commented Jul 7, 2021

> that consume from one kafka topic then publish to a different kafka topic in the same span

If you want to use the OpenTelemetry semantic conventions you need to use two separate (probably nested) spans.
