Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should HTTP server.address and server.port prioritize Host header over virtual host name? #388

Closed
trask opened this issue Oct 12, 2023 · 6 comments
Assignees

Comments

@trask
Copy link
Member

trask commented Oct 12, 2023

Since one of the motivations for these attributes on the server side:

  • server.address
  • server.port
  • url.path
  • url.query
  • url.scheme

is to reconstruct the original URL, does it make sense to prioritize capturing server.address and server.port from the Host header instead of the current state of prioritizing the virtual host name?

(and since we have more clarity on these being logical attributes now, and since we can introduce more lower-level attributes to capture virtual host name / port in the future as needed)

here's the current definition of server.address (as captured on the server side):

- ref: server.address
brief: >
Name of the local HTTP server that received the request.
note: |
Determined by using the first of the following that applies
- The [primary server name](/docs/http/http-spans.md#http-server-definitions) of the matched virtual host. MUST only
include host identifier.
- Host identifier of the [request target](https://www.rfc-editor.org/rfc/rfc9110.html#target.resource)
if it's sent in absolute-form.
- Host identifier of the `Host` header

@Oberon00
Copy link
Member

Oberon00 commented Oct 12, 2023

IMHO: No, the Host header should not be prioritized. Unfortunately, it often contains bogus content, and it can be captured separately in http.request.header.host (at least in HTTP/1, in HTTP/2 it might become http.request.header.:authority? Probably the header spec needs an update for HTTP/2 pseudo-headers)

Doing a bit of PR archeology, it seems it was @lmolkova who defined this priority order in open-telemetry/opentelemetry-specification#2469 and declared that to be the order to find the "best known server name". I think it is a very reasonable priority order.

@lmolkova
Copy link
Contributor

Agree with @Oberon00. Reverse proxies don't always preserve the original host header (and custom configuration can put anything there).

In my mental model, since the "primary server name" is essentially user-configurable, this is the way for users to force server.address to be something they want.

Something we should probably do:

  • change "The primary server name of the matched virtual host" to something like "Explicit user configuration (if provided) or a primary server name"
  • add "Original host used by the client such as host component of Forwarded header, X-Forwarded-Host or similar" to come 3rd before Host header

WDYT?

@Oberon00
Copy link
Member

Oberon00 commented Oct 14, 2023

I agree that "virtual host" verbiage could/should probably be removed and replaced by something simpler.

One thing that just occured to me: server.address is also used as metric attribute for HTTP server metric. A bad Host header could easily cause a cardinality explosion there. So that is a no-go also for that reason I believe. https://opentelemetry.io/docs/specs/otel/metrics/semantic_conventions/http-metrics/#metric-httpserverduration

Given that, "or a primary server name" might need an additional hint that it must still be something that needs to be low cardinality.

@trask
Copy link
Member Author

trask commented Oct 15, 2023

A bad Host header could easily cause a cardinality explosion there.

this discussion led me to a similar worry, let me know if you think #401 is sufficient

@trask
Copy link
Member Author

trask commented Oct 16, 2023

Explicit user configuration (if provided) or a primary server name

are you thinking instrumentation configuration? do we need to mention that? or is this important enough to include something like "instrumentation SHOULD provide configuration"

moving this question to #414

@trask
Copy link
Member Author

trask commented Oct 16, 2023

Closing this issue, I think the original question has been answered, and there are follow-up issues and PRs for the other discussion items:

@trask trask closed this as completed Oct 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

No branches or pull requests

4 participants