Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider impact of ECS attribute alignment on the HTTP semantic convention stability effort #3163

Closed
trask opened this issue Feb 1, 2023 · 18 comments
Assignees
Labels
semconv:HTTP spec:trace Related to the specification/trace directory

Comments

@trask
Copy link
Member

trask commented Feb 1, 2023

What are you trying to achieve?

Make sure we are considering the impact of open-telemetry/oteps#222 prior to marking the HTTP semantic conventions stable.

Relates to #2626 and #2489

@trask trask added spec:trace Related to the specification/trace directory semconv:HTTP labels Feb 1, 2023
@trask trask assigned trask and unassigned jmacd Feb 1, 2023
@trask trask changed the title Consider impact of ECS alignment to HTTP semantic convention stability effort Consider impact of ECS attribute alignment on the HTTP semantic convention stability effort Feb 2, 2023
@trask
Copy link
Member Author

trask commented Feb 2, 2023

Here's a few things to consider if we want to align attributes with ECS before marking HTTP semantic conventions stable.

Rename url-ish things from http.* to url.*

This kinda makes sense to me regardless of ECS alignment, to increase reusability of attribute names in OpenTelemetry.

Though the http.target change would be a big breaking change for users (with no schema translation support).

Rename remaining http.* attributes to match ECS equivalents

Rename net.peer.* and net.host.* things to source.* and destination.*

An alternative would be to keep net.peer.* and net.host.* (instead of adopting source.* and destination.*), but rename to better align, e.g.

  • net.peer.name -> net.peer.address
  • net.host.name -> net.host.address
  • net.sock.peer.addr -> net.peer.nat.address
  • net.sock.peer.port -> net.peer.nat.port
  • net.sock.host.addr -> net.host.nat.address
  • net.sock.host.port -> net.host.nat.port

These net.* changes require much more investigation as I have glossed over many details and nuances.

@AlexanderWert
Copy link
Member

AlexanderWert commented Feb 2, 2023

@trask Thanks for opening this discussion thread!

To answer your question from the ECS OTEP:

it would be really helpful to get input from this group whether the changes I've outlined are the kind of thing you were imagining would take place if this OTEP is accepted.

I think these are exactly the discussion we need with open-telemetry/oteps#222. And I appreciate that some things are more straightforward and others less, or not at all (as they would introduce breaking changes).

Some thoughts on your examples:

Rename url-ish things from http.* to url.*

This kinda makes sense to me regardless of ECS alignment, to increase reusability of attribute names in OpenTelemetry.

👍

Rename remaining http.* attributes to match ECS equivalents

👍 Makes sense to me!

For the attributes that do not have an equivalent in ECS, I think ECS could adopt those fields (e.g. http.route, http.request.header.*, http.response.header.*, http.resend_count).

What about additional attributes in corresponding namespaces that ECS defines but that are not available in OpenTelemetry? Those could be a 'low-hanging fruit' to enrich OTel semantic conventions without any transformations or breaking changes. Examples:

  • http.request.body.content (For example for the use case of capturing an HTTP request body with a span.)
  • http.request.id (To reflect request IDs other than the tracing context. For example the x-amz-request-id in AWS, client-request-id on Azure, etc.)
  • http.request.mime_type
  • http.response.*
  • user_agent.*

These net.* changes require much more investigation as I have glossed over many details and nuances.

I think the most fundamental question here is the *.peer.* concept vs. explicit namespaces as it is in ECS (server.* / client.*, source.* / destination.*). With the peer concept the semantics of the attributes change depending on where they are collected. For example net.sock.peer.addr on a client-side span means that it's the address of the server, but, the same attribute on a server-side span implies it's the address of the client. In ECS this is handled explicitly with client.address vs. server.address (or source.address vs. destination.address). The advantage of making it explicit is that attribute names can be reused as is on events that are not necessarily a span with a SpanKind (that, as described before, is required to derive the semantics of the *.peer.* fields). For example, attributes server.address and client.address for access logs of a web server have a clear meaning while net.host.name and net.peer.name on an access log would require some additional attributes that would characterize the type of the log event to provide a meaning to those fields.

  • net.sock.peer.name ->
  • net.sock.family ->

@trask
Copy link
Member Author

trask commented Feb 2, 2023

thx @AlexanderWert

as they would introduce breaking changes

we haven't declared any semantic conventions stable yet, so while we would love to avoid breaking changes, we do have that opportunity (for a limited time)

For the attributes that do not have an equivalent in ECS, I think ECS could adopt those fields (e.g. http.route, http.request.header.*, http.response.header.*, http.resend_count).

👍

What about additional attributes in corresponding namespaces that ECS defines but that are not available in OpenTelemetry? Those could be a 'low-hanging fruit' to enrich OTel semantic conventions without any transformations or breaking changes.

in the short-term I'd like to focus on getting the existing OTel HTTP semantic conventions to stability, but I agree if we go down this route we should have a separate(?) discussion on whether we proactively bring over more ECS attributes.

I think the most fundamental question here is the *.peer.* concept vs. explicit namespaces as it is in ECS (server.* / client.*, source.* / destination.*).

thx for the explanation behind ECS choice here

👍 I've updated my comment above to include these

@reyang
Copy link
Member

reyang commented Feb 2, 2023

@jsuereth FYI

@lmolkova
Copy link
Contributor

lmolkova commented Feb 2, 2023

Thanks @trask for putting it together. Overall the direction makes sense to me, but I have comments on individual attributes:

  • http.target is currently available on server only, where url.fragment is not available

  • http.version - we also need version for AMQP and other protocols, can we come up with network.version attribute and introduce it in both schemas?

  • http.request.body.bytes - we have similar attributes in grpc and messaging. Should we unify? also there was an attempt to say we only measure in bytes and remove bytes from names - discussion Give guideline of not using unit names in attributes (#1053) #1307.

  • http.resend_count (no equivalent?) - since retries aren't http-specific, there is an opportunity to generalize it too

  • destination.nat.ip and source.nat.ip are no great for UNIX domain socket communication or esoteric things like bluetooth, so I'd prefer address version of these

  • spec also vaguely mentions net.transport -> network.transport

  • nat in network attribute names would not make sense for UNIX domain sockets and other non-IP communication

Clarifications/cleanups:
net.peer.name -> destination.address only
net.peer.port -> destination.port only
net.host.name-> source.address
net.host.port -> source.port
and so on

@trask
Copy link
Member Author

trask commented Feb 3, 2023

Taking another try at the net.* attributes, I'm still not clear on some things on both sides (OTel and ECS).

HTTP client attributes:

  • net.peer.name -> server.address
  • net.peer.port -> server.port
  • net.sock.peer.name -> this is http proxy address? can it be other things? (@lmolkova)
  • net.sock.peer.addr -> this http proxy ip? or could also be resolved ip for net.peer.name? can it be other things? (@lmolkova)
  • net.sock.peer.port -> this is http proxy port? can it be other things? (@lmolkova)
  • net.socket.family -> network.type

HTTP server attributes:

@trask
Copy link
Member Author

trask commented Feb 3, 2023

cc @Oberon00

@trask
Copy link
Member Author

trask commented Feb 3, 2023

  • http.version - we also need version for AMQP and other protocols, can we come up with network.version attribute and introduce it in both schemas?

this makes sense to me. @AlexanderWert wdyt from ECS perspective?

  • http.request.body.bytes - we have similar attributes in grpc and messaging. Should we unify?
  • http.resend_count (no equivalent?) - since retries aren't http-specific, there is an opportunity to generalize it too

@AlexanderWert any thoughts about generalizing these fields? any idea what field set they would live in from ECS perspective if they were generalized?

@reyang
Copy link
Member

reyang commented Feb 3, 2023

  • http.request.body.bytes - we have similar attributes in grpc and messaging. Should we unify?

I suggest using the existing ECS one:

  1. It is more practical to use an established one than inventing a new version (more from the timeline and risk control perspectives)
  2. "body" "payload" "content" - different protocols have different concepts, terms, etc. I remember there is even messaging protocol with three parts - envelope headers, message metadata and message body. Be specific about "HTTP request body bytes" and "the content size in a messaging protocol which covers both metadata and body" seems to bring lots of clarity. Also, I think there is less intention to aggregate/sort different sizes across HTTP and messaging.

@trask trask moved this to Can be addressed after stability in Spec: HTTP Semantic Conventions Feb 3, 2023
@trask trask moved this from Can be addressed after stability to Blocker for stability in Spec: HTTP Semantic Conventions Feb 3, 2023
@lmolkova
Copy link
Contributor

lmolkova commented Feb 3, 2023

net.peer.name -> server.address
net.peer.port -> server.port

My understanding that net.peer.sock matches server namespace ("For TCP events, the server is the receiver of the initial SYN packet(s) of the TCP connection." - if there is a proxy, it would be it)

net.peer.name and port match destination namespace attributes?

But I'm not sure I understand the difference between destination and source
"Destination fields capture details about the receiver of a network exchange/packet."

@trask
Copy link
Member Author

trask commented Feb 3, 2023

@lmolkova and I aligned on what we think the mappings are for network attributes.

We definitely need confirmation from @AlexanderWert that our usage of source.* and destination.* below are correct from ECS perspective.

HTTP client attributes:

HTTP server attributes:

Consolidating open questions from above into one place:

  1. http.version - we also need version for AMQP and other protocols, can we come up with network.version attribute and introduce it in both schemas?

  2. should we generalize http.request.body.bytes and http.resend_count beyond just http?

    this is still interesting, but not critical

  3. if we decide to go with http.request.body.bytes, should we add unit in the name of other OTel semantic attributes which have so far avoided this (Give guideline of not using unit names in attributes (#1053) #1307)

  4. it looks like ECS recommends copying attributes from client./server. to source./destination., probably so that more general queries can be run across source./destination. fields. we would prefer not to duplicate telemetry on the SDK/Instrumentation side, and leave that for ingestion pipelines if needed.

@AlexanderWert
Copy link
Member

AlexanderWert commented Feb 6, 2023

Thanks @trask for consolidating this.

Some clarification on the server.*/ client.* vs. source.*/destination.* fieldsets from ECS:

source.*/destination.* fields are meant for network packet-based perspective of information. For example, when there is a client A that opens up a connection to a server B, for that connection A will stay all the time the client, while B is the server.

Now let's have a look at the network packet level: When A sends the request Req to B, then A is the source and B is the destination from the perspective of the network packet Req. However, when B responds to A with a response Resp, then the roles change on the packet level, so B becomes the source and A the destination from the perspective of the response Resp (while A remains the client and B remains the server in that connection).

So, for the above purposes I think it's more appropriate to use client.*/server.* fields instead of source.*/destination.*.

HTTP client attributes:

So based on that I'd propose the following:

HTTP server attributes:

@lmolkova
Copy link
Contributor

lmolkova commented Feb 6, 2023

thank you for the clarification, @AlexanderWert !

So my understanding that the simple case would look like this:

image

and case with proxy would be something like this:

image

and UNIX socket would be

image

I still have some questions regarding nat:

  • on figure 1 (simple case). Why server.**nat**.ip ? why not server.ip ? -> server.ip
  • on figure 2 (proxy). server.sock.address looks out of place around server.nat namespace
    - on figure 3 (UNIX). Assuming it's another socket address, e.g. bluetooth address, the port would be server.nat.port ? - can do later, no scenario yet

I would prefer to add a new server and client namespaces for socket-based communication with:
- server.sock.address - socket address
- server.sock.port - port - later

@trask
Copy link
Member Author

trask commented Feb 8, 2023

Summarizing the discussion with @AlexanderWert and @lmolkova around network attributes. Let me know if I missed anything.

HTTP client attributes:

  • net.peer.name -> server.address
  • net.peer.port -> server.port
  • net.sock.peer.name
    • when it represents a proxy -> server.nat.address (as a new field in ECS)
    • (is there any other case?)
  • net.sock.peer.addr
    • when it represents the ip address of the server.address -> server.ip
    • when it represents a proxy -> server.nat.ip
    • when it represents a unix socket -> server.socket.address (as a new field in ECS)
    • (is there any other case?)
  • net.sock.peer.port
  • net.socket.family -> network.type

HTTP server attributes:

  • net.host.name -> server.address
  • net.host.port -> server.port
  • net.sock.host.addr -> server.socket.ip (as a new field in ECS)
    • when it represents a unix socket -> server.socket.address (as a new field in ECS)
  • net.sock.host.port -> server.socket.port (as a new field in ECS)
  • net.sock.peer.name -> (remove from OTel, not needed)
  • net.sock.peer.addr -> client.ip
    • when it represents a unix socket -> client.socket.address (as a new field in ECS)
  • net.sock.peer.port -> client.port
  • net.socket.family -> network.type

@trask
Copy link
Member Author

trask commented Feb 9, 2023

related to #3192, here's the mapping for exception attributes:

@trask trask moved this from Blocker for stability (ECS) to Blocker for stability in Spec: HTTP Semantic Conventions Apr 4, 2023
@trask trask closed this as completed Apr 5, 2023
@github-project-automation github-project-automation bot moved this from Blocker for stability to Done in Spec: HTTP Semantic Conventions Apr 5, 2023
@Oberon00
Copy link
Member

Oberon00 commented Apr 6, 2023

Wouldn't it be good to keep this open as overall tracking issue?

@trask
Copy link
Member Author

trask commented Apr 6, 2023

is there any advice / general practice for when to use tracking issues vs using projects to track? e.g. we are also tracking these issues in the HTTP and general semantic conventions stability projects:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
semconv:HTTP spec:trace Related to the specification/trace directory
Projects
Development

No branches or pull requests

6 participants