Introduce host.id attribute to traces #4368

Open
alex-fedotyev opened this issue Oct 30, 2020 · 11 comments

@alex-fedotyev

Work on metrics and logs identified a problem with using host.name for correlation when ingesting data from cloud environments, as they don't provide a proper host name.
The proposed solution is to introduce a host.id field which is "calculated": it is equal to host.name for on-premises environments, and to cloud.instance.id for cloud environments.
Original issue and spreadsheet with the breakdown.

This seems to align well with OTel spec, as they are using cloud instance_id as the host.id.

The proposal for APM is to calculate host.id dynamically: derive it from cloud metadata when that is present, and use host.name otherwise.
We would leverage this when integrating products together, i.e. linking from Infra to APM and vice versa.
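
The calculation proposed above could be sketched roughly as follows. This is only an illustration, not the APM Server implementation: the function name `derive_host_id`, the flat dictionary of dotted field names, and the sample values are all assumptions made for the example.

```python
from typing import Mapping, Optional


def derive_host_id(fields: Mapping[str, str]) -> Optional[str]:
    """Return a host.id for an event described by flat, dotted field names.

    Hypothetical helper: prefers cloud.instance.id when cloud metadata is
    present and falls back to host.name otherwise.
    """
    instance_id = fields.get("cloud.instance.id")
    if instance_id:
        return instance_id
    # No cloud metadata: treat the event as coming from an on-premises host.
    return fields.get("host.name") or None


# Made-up sample events, for illustration only.
cloud_event = {"cloud.instance.id": "i-0123456789abcdef0", "host.name": "ip-10-0-0-12"}
onprem_event = {"host.name": "db01.example.internal"}

print(derive_host_id(cloud_event))
print(derive_host_id(onprem_event))
```

Returning `None` when neither field is set leaves it to the caller to decide whether to omit host.id or reject the event.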

We would also need to recognize host.id when ingesting data from OTel.

CC: @graphaelli @felixbarny

@alex-fedotyev
Author

Pinging: @kaiyan-sheng @exekias @sorantis
I just realized that OTel spec suggests using cloud.instance.id while we suggest using cloud.instance.name.

Are those fields the same? Or would it make more sense to align around cloud.instance.id?

@axw
Member

axw commented Oct 30, 2020

The OTel spec is vague in the non-cloud case though. In that case what is the unique ID? Is it /etc/machine-id or is it FQDN...?

@exekias

exekias commented Oct 30, 2020

I just realized that OTel spec suggests using cloud.instance.id while we suggest using cloud.instance.name.

We are suggesting cloud.instance.id too, see https://github.com/elastic/observability-dev/pull/1137/files#diff-c5a9ab0ff94fc3963d0bb04177a5a800457970a01608274951e8a6a0b0023057R40

The OTel spec is vague in the non-cloud case though. In that case what is the unique ID? Is it /etc/machine-id or is it FQDN...?

I would say FQDN works better. The machine-id can only be retrieved from inside the machine, so while it is guaranteed to be unique, it's not very useful for correlation (specifically, for correlating events that come from monitoring the machine from outside vs. inside).
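
To make the two candidates concrete, here is a small sketch of how each could be read from inside a Linux host. It assumes the machine ID lives at `/etc/machine-id`, as mentioned above; the helper names `read_machine_id` and `candidate_host_ids` are hypothetical. Only the FQDN is something an outside observer could also know, which is the point about correlation.

```python
import socket
from pathlib import Path
from typing import Dict, Optional


def read_machine_id(path: str = "/etc/machine-id") -> Optional[str]:
    """Return the machine ID from the given file, or None if it is unreadable.

    The file is only reachable from inside the machine, so a monitor
    running elsewhere has no way to obtain this value.
    """
    try:
        return Path(path).read_text().strip() or None
    except OSError:
        return None


def candidate_host_ids() -> Dict[str, Optional[str]]:
    """Collect both identifiers discussed in the thread (hypothetical helper)."""
    return {
        # socket.getfqdn() asks the resolver for the fully qualified name.
        "fqdn": socket.getfqdn(),
        "machine_id": read_machine_id(),
    }


print(candidate_host_ids())
```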

@alex-fedotyev
Author

The OTel spec is vague in the non-cloud case though. In that case what is the unique ID? Is it /etc/machine-id or is it FQDN...?

@cyrille-leclerc - any chance you know how OTel defines host.id in non-cloud environments?

@kaiyan-sheng
Contributor

I just realized that OTel spec suggests using cloud.instance.id while we suggest using cloud.instance.name.

Are those fields the same? Or would it make more sense to align around cloud.instance.id?

Yes, we are also using cloud.instance.id. The problem with cloud.instance.name is that it is not a required field for some cloud providers. For example, in AWS EC2 the instance name is optional and is defined by the Name tag.

@cyrille-leclerc
Contributor

@axw my understanding is that the only host information we collect in OpenTelemetry traces is host.id, and only when there is network communication, by mapping the OTel net.* namespace.

I collected the documents of the transaction and all the spans of a trace. Unfortunately, everything runs on my local MacBook without Docker, which makes it harder to understand the usage of the host.hostname, host.ip... attributes, as everything is localhost/127.0.0.1.
See https://gist.github.com/cyrille-leclerc/e5b4a1fb214f83cc9e7819953ebbd3e3
I only found 2 occurrences of host on span documents, both on the connection spans.

@axw Could we have omitted to map other Otel host attributes?

I looked at https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/master/exporter/elasticexporter/internal/translator/elastic/traces.go but I didn't find any hint.

@axw
Member

axw commented Nov 5, 2020

@axw Could we have omitted to map other Otel host attributes?

Yes; what is there is not comprehensive. We will need to add support for translating host.id, among others.
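
As a rough illustration of what translating these attributes might involve, the sketch below copies selected OpenTelemetry resource attributes into a nested host section. It is not the exporter's actual code: the mapping table, the function name `translate_host_attributes`, and the output document shape are assumptions made for the example.

```python
from typing import Any, Dict, Mapping, Tuple

# Hypothetical mapping from OpenTelemetry resource attribute keys to
# (section, field) pairs in the output document.
HOST_ATTRIBUTE_MAP: Dict[str, Tuple[str, str]] = {
    "host.id": ("host", "id"),
    "host.name": ("host", "name"),
}


def translate_host_attributes(resource: Mapping[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Copy the known host attributes into a nested document, ignoring the rest."""
    doc: Dict[str, Dict[str, Any]] = {}
    for key, (section, field) in HOST_ATTRIBUTE_MAP.items():
        if key in resource:
            doc.setdefault(section, {})[field] = resource[key]
    return doc


# Made-up resource attributes, for illustration only.
resource = {"host.id": "i-0123456789abcdef0", "service.name": "checkout"}
print(translate_host_attributes(resource))
```

Keeping the mapping in a table means supporting another attribute is a one-line change.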

@cyrille-leclerc
Contributor

@alex-fedotyev OpenTelemetry host.id is NOT defined by the OpenTelemetry collector outside of cloud deployments. I only found enrichment of host.id on AWS and GCP so far.


Research notes

@axw
Member

axw commented Mar 15, 2021

#4955 will add host.id for OpenTelemetry data.

We still need some conclusion on what to do for our agents. We could just set it to cloud instance ID for now, when it's set.

I would say FQDN works better, machine-id can only be retrieved from inside the machine, so while it guarantees to be unique, it's not very useful for correlation (specifically to correlate events coming from monitoring the machine from outside vs inside).

@exekias does Beats already do this? I just took a quick look and it appears to be using go-sysinfo's "HostInfo.UniqueID", which is populated using machine-id.

@exekias

exekias commented Mar 16, 2021

Not yet. Right now Beats reports host.id as the machine ID, so we will need to make a breaking change, or introduce the change directly in the agent. @kaiyan-sheng I think you had an issue to discuss this?

@kaiyan-sheng
Contributor

Sorry I just saw this message 🤕 Yes here is the issue: elastic/beats#22739
