Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BREAKING: Replace net.peer.*/net.host.* with client.*/server.* (and source.*/destination.*) #3402

Merged
merged 30 commits into from
May 8, 2023

Conversation

lmolkova
Copy link
Contributor

@lmolkova lmolkova commented Apr 15, 2023

Fixes #3199

Switches from peer and host to client and server terminology and renames:

  • net.peer.name and net.host.name -> server.address
  • net.peer.port and net.host.port -> server.port,
  • net.sock.peer.addr to
    • -> server.socket.address on client side
    • -> client.socket.address on server side,
  • net.sock.peer.port to
    • -> server.socket.port on client side
    • -> client.socket.port on server side,
  • net.sock.peer.name -> server.socket.domain,
  • net.sock.host.addr -> server.socket.address,
  • net.sock.host.port -> server.socket.port
  • http.client_ip -> client.address

Also adds source.* and destination.* ECS attributes to address network scenarios where client/server terminology is not applicable.

image

Summary of benefits

These describe the same things whether observed from one side or the other, while host and peer describe opposite things depending on which side they are observed from. This makes them more generally useful on logs, where the concept of SpanKind is not inherent in the data model.

net.peer.* may or may not be low cardinality depend on which side we are on, while things are clearer with server.* and client.*.

host and peer have been a source of confusion. While updating the RPC metrics spec to use client.* and server.*, we even found a bug due to our own confusion over host and peer.

@Oberon00
Copy link
Member

Oberon00 commented Apr 17, 2023

I think rather than just renaming, this is a complete change of concepts behind representing the related information. Which is a bit of a shame, since quite a bit of brain power went into the current net.* conventions by now.
Do we think the ECS conventions are better here? Or should we align in the other direction?

@lmolkova
Copy link
Contributor Author

lmolkova commented Apr 17, 2023

I think rather than just renaming, this is a complete change of concepts behind representing the related information. Which is a bit of a shame, since quite a bit of brain power went into the current net.* conventions by now. Do we think the ECS conventions are better here? Or should we align in the other direction?

great questions.

The good things about the change are:

  • we align with server/client terminology with ECS, which is a great win for both communities
  • queries are the same for client and server-sides: users/backends can use server.* to filter by the specific endpoint for both sides of communication.
    • essentially it's easier to understand what the server and client are and not get overwhelmed by peer/host that mean different things on different sides
  • we have fewer attributes: net.sock.peer.name can go away which only makes sense on the client side and never(?) on the server side.

The con is obviously that it will break everyone and that's why we're trying to come up with a transition plan #3404 (which currently suggest major version bump for instrumentations) plus months of support for old instrumentations)

@trask
Copy link
Member

trask commented Apr 17, 2023

essentially it's easier to understand what the server and client are and not get overwhelmed by peer/host that mean different things on different sides

as a layperson in networking terminology, I found peer/host to be very confusing when adding these attributes to the Java HTTP instrumentation. I think using client/server terminology is a big win.

@yurishkuro
Copy link
Member

we align with server/client terminology with ECS, which is a great win for both communities

it's a breaking change for OTEL, so hardly a win in itself.

queries are the same for client and server-sides: users/backends can use server.* to filter by the specific endpoint for both sides of communication. essentially it's easier to understand what the server and client are and not get overwhelmed by peer/host that mean different things on different sides

Not really following this, can you give an example?

To me, the client/server nomenclature is a simplification that does not always make sense. E.g. there's no such terminology in the TCP stack because it only deals with packets passing, not the purpose of the packets or the roles of the actors. And even when we rise to the application level, there are many scenarios where there are equal players (peers) communicating, e.g. nodes in Paxos or any gossip protocol, actors exchanging messages in systems like Akka, etc. How would one decide who is a client or a server in those scenarios?

schemas/1.21.0 Outdated Show resolved Hide resolved
schemas/1.21.0 Outdated Show resolved Hide resolved
@trask
Copy link
Member

trask commented Apr 18, 2023

there are many scenarios where there are equal players (peers) communicating, e.g. nodes in Paxos or any gossip protocol, actors exchanging messages in systems like Akka, etc. How would one decide who is a client or a server in those scenarios?

@AlexanderWert how would you model this in ECS? (is this the purpose of source.* and destination.*)?

specification/trace/sdk_exporters/zipkin.md Outdated Show resolved Hide resolved
semantic_conventions/trace/client.yaml Outdated Show resolved Hide resolved
@AlexanderWert
Copy link
Member

there are many scenarios where there are equal players (peers) communicating, e.g. nodes in Paxos or any gossip protocol, actors exchanging messages in systems like Akka, etc. How would one decide who is a client or a server in those scenarios?

@AlexanderWert how would you model this in ECS? (is this the purpose of source.* and destination.*)?

@trask @yurishkuro Yes, exactly!

In ECS client.*/server.* describes a connection level relationship, while source.*/destination.* describes the relationship on a network packet level. So, in the scenarios mentioned above (Akka, TCP, etc.) where no client-server relationship exists source.*/destination:* is the way to model the relationship on a packet / data level.

@trask
Copy link
Member

trask commented Apr 20, 2023

@lmolkova I'm wondering if source.* and destination.* should be included in this PR as well since they are needed to replace some of the current net.* use-cases outside of HTTP?

@lmolkova lmolkova requested a review from a team April 20, 2023 20:48
@jsuereth
Copy link
Contributor

jsuereth commented May 2, 2023

peer-to-peer communication (even if it's over TCP and could technically use client./server., in order to emphasize the symmetric nature and lack of distinct client/server roles)

I'm not a huge fan of this as I think it blurs the line in a way I had hoped to avoid. If this were phrases in a fashion like:

peer-to-peer communication (even if it's over TCP, but the "user-facing" surface of the protocol / API does not expose clear notions of client and server)

I'd prefer it more.

Specifically, I want to avoid ambiguity in semantic convention attribute choice.

Generally, really like your response, can you make sure that shows up in the PR itself? I think your response, at a minimum, belongs in the Markdown.

@Oberon00
Copy link
Member

Oberon00 commented May 2, 2023

For P2P source/destination uses, how would you find out which is the remote and which the local?

@trask
Copy link
Member

trask commented May 2, 2023

For P2P source/destination uses, how would you find out which is the remote and which the local?

I'm not sure, but one thought would be something like p2p.operation (e.g. "send" / "receive"), similar to messaging.operation

@Oberon00
Copy link
Member

Oberon00 commented May 2, 2023

I was thinking the same. We end up with an extra attribute for the position of the current operation in the protocol interaction, regardless of whether we use net/peer or source/destination. With net/peer we would need to store the server/client role distinction for where it exists and nothing in p2p cases. With source/destination+client/server we will need an attribute to indicate which namespace indicates the local/remote end.

(My previous comment was only about p2p, but I realized, actually the same applies to client/server, at least on non-span telemetry)

@trask trask changed the title BREAKING: Replace net.peer.*/net.host.* with client.*/server.* (and source.*/destination.* for P2P) BREAKING: Replace net.peer.*/net.host.* with client.*/server.* (and source.*/destination.*) May 2, 2023
@lmolkova
Copy link
Contributor Author

lmolkova commented May 2, 2023

I was thinking the same. We end up with an extra attribute for the position of the current operation in the protocol interaction, regardless of whether we use net/peer or source/destination

We can definitely add one (or keep using messaging.operation) once we get to gRPC streaming or similar. At the same time, it's not necessarily important to know a direction when tracing a packet.

E.g. I want to count all packets from source A to destination B, then assuming we add both source and destination to an event (or metric, mind the cardinality), knowing the direction (or which side recorded it) is not quite necessary.

@chameleon82
Copy link

If i have client and server tags, how I can differentiate signal come from client or from server, i.e. in or out direction?

@AlexanderWert
Copy link
Member

If i have client and server tags, how I can differentiate signal come from client or from server, i.e. in or out direction?

In the case of tracing signals (i.e. spans) the SpanKind (e.g. SERVER , CLIENT, PRODUCER, CONSUMER) specifies the direction.

In the case of metrics, the semantics of the metric itself imply whether it's server or client. For example, http.server.duration or http.client.duration clearly imply whether it's server or client.

Similar situation with logs. Here, the instrumentation scope and the resource attributes provide the context what kind of logs are being collected.

@Oberon00 summarized it very well:

... We end up with an extra attribute for the position of the current operation in the protocol interaction, regardless of whether we use net/peer or source/destination. ...

@Oberon00
Copy link
Member

Oberon00 commented May 8, 2023

Similar situation with logs. Here, the instrumentation scope and the resource attributes provide the context what kind of logs are being collected.

@AlexanderWert I don't think that's the case for logs. If you have a service that logs a line with client.address set, it might be the address of a client connecting to the service. Or it might be one of the service's own addresses that is used to connect with another service. That's where you need some attribute (that yet needs to be defined) on the log line to indicate whether the operation the log is about is one where the current service is acting as client or server.

When using source/destination instead of client/server, this applies to spans & metrics as well since source/destination may flip around for the same connection.

@Oberon00
Copy link
Member

Oberon00 commented May 9, 2023

Since this was merged without solving the problem, I created a follow-up issue: open-telemetry/semantic-conventions#62

@Luoc888
Copy link

Luoc888 commented May 20, 2023

Fixes #3199

Switches from peer and host to client and server terminology and renames:

  • net.peer.name and net.host.name -> server.address
  • net.peer.port and net.host.port -> server.port,
  • net.sock.peer.addr to
    • -> server.socket.address on client side
    • -> client.socket.address on server side,
  • net.sock.peer.port to
    • -> server.socket.port on client side
    • -> client.socket.port on server side,
  • net.sock.peer.name -> server.socket.domain,
  • net.sock.host.addr -> server.socket.address,
  • net.sock.host.port -> server.socket.port
  • http.client_ip -> client.address

Also adds source.* and destination.* ECS attributes to address network scenarios where client/server terminology is not applicable.

image

Summary of benefits

These describe the same things whether observed from one side or the other, while host and peer describe opposite things depending on which side they are observed from. This makes them more generally useful on logs, where the concept of SpanKind is not inherent in the data model.

net.peer.* may or may not be low cardinality depend on which side we are on, while things are clearer with server.* and client.*.

host and peer have been a source of confusion. While updating the RPC metrics - [ ] spec to use client.* and server.* , we even found a bug due to our own confusion over host and peer.

marcalff added a commit to marcalff/opentelemetry-cpp that referenced this pull request Jul 23, 2023
Rename http.client_ip -> client.address
carlosalberto pushed a commit that referenced this pull request Jan 9, 2024
Fixes
#3793

## Changes

Update OpenTelemetry to Zipkin Transformation to handle attributes from
older semantic conventions in a backwards compatible way so that
following attributes are (again) handled:
- `net.peer.name`
- `net.host.name`
- `net.sock.peer.addr` & `net.sock.peer.port`
- `server.socket.domain`
- `server.socket.address` & `server.socket.port`

The backwards compatibility of a stable OpenTelemetry to Zipkin
Transformation was broken by:
-
#3402
-
#3713
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Rename net peer/host attributes to align with ECS