Azure metadata "compute-name" is lost when resource-detector host.name attribute is overwritten by other detectors #12779

Closed
dloucasfx opened this issue Jul 29, 2022 · 5 comments
Labels: bug, closed as inactive, priority:needed, Stale

Comments

@dloucasfx
Contributor

dloucasfx commented Jul 29, 2022

Describe the bug

The Azure resource detector stores the compute name in the host.name attribute (Ref).
The host.name attribute is not unique to the Azure resource detector and can be overwritten by other detectors for valid reasons. For instance, a customer may want the FQDN as host.name (see the repro steps below), whereas the Azure detector sets host.name to the compute name, which is a different value.

This is all well and good until your code wants to use the original value set by the Azure detector; at that point, the compute name fetched from Azure metadata is lost.

In our case, our code expects the compute name when building the Azure Unique ID dimension, so it reads the host.name attribute (Ref). Because the user has configured their resourcedetection processor to report host.name in FQDN format, we end up with the wrong Azure Unique ID.

Steps to reproduce
Use the following resourcedetection processor in your pipeline to get host.name in FQDN format:

resourcedetection:
    detectors: [system, azure, env]
    override: true
    system:
      hostname_sources: [lookup]   
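
To illustrate why the Azure value disappears with this detector order, here is a minimal Python sketch. It assumes that when several detectors set the same key, the value from the detector listed first is kept; that is my reading of the processor's behavior, not something stated in this issue. `merge_detected` and the sample attribute maps are hypothetical and only mirror the values from this report.

```python
from typing import Dict, List


def merge_detected(results: List[Dict[str, str]]) -> Dict[str, str]:
    """Merge per-detector attributes, keeping the first value seen for each key.

    Assumption: earlier detectors take precedence over later ones.
    """
    merged: Dict[str, str] = {}
    for attrs in results:
        for key, value in attrs.items():
            # setdefault only writes the key if no earlier detector set it.
            merged.setdefault(key, value)
    return merged


# Hypothetical detector outputs, in the order [system, azure].
system_attrs = {"host.name": "test-az-es.az.splunk.com"}
azure_attrs = {"host.name": "test-az-es", "cloud.provider": "azure"}

merged = merge_detected([system_attrs, azure_attrs])
print(merged["host.name"])  # the FQDN; the Azure compute name is gone
```

After the merge there is no attribute left that still carries the compute name `test-az-es`.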

What did you expect to see?
host.name is set correctly to the FQDN test-az-es.az.splunk.com, and I expect azure_resource_id to equal the following (note the compute name at the end of the string):

zzzzzz-xxxx-ccc-9xxad8-xxx/east-us-2/microsoft.compute/virtualmachines/test-az-es

What did you see instead?
azure_resource_id is set to:

zzzzzz-xxxx-ccc-9xxad8-xxx/east-us-2/microsoft.compute/virtualmachines/test-az-es.az.splunk.com
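
The difference between the two IDs can be reproduced with a small Python sketch. `build_azure_resource_id` is a hypothetical helper that only mirrors the shape of the strings shown in this issue; it is not the actual code that builds the dimension.

```python
def build_azure_resource_id(prefix: str, segment: str, vm_name: str) -> str:
    """Join the parts in the same shape as the IDs shown in this issue.

    Hypothetical helper: the real implementation may differ.
    """
    return "/".join([prefix, segment, "microsoft.compute/virtualmachines", vm_name])


PREFIX = "zzzzzz-xxxx-ccc-9xxad8-xxx"  # redacted value copied from the issue
SEGMENT = "east-us-2"                  # second path segment copied from the issue

# ID built from the compute name reported by Azure metadata (expected).
expected_id = build_azure_resource_id(PREFIX, SEGMENT, "test-az-es")

# ID built from host.name after it was replaced by the FQDN (observed).
observed_id = build_azure_resource_id(PREFIX, SEGMENT, "test-az-es.az.splunk.com")

print(expected_id)
print(observed_id)
```

Only the last path segment changes, which is enough to make the ID point at a resource name that does not exist in Azure.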

What version did you use?
Version: latest

What config did you use?

resourcedetection:
    detectors: [system, azure, env]
    override: true
    system:
      hostname_sources: [lookup]   

Suggested Solution and other thoughts

Introduce cloud-specific attributes, for example cloud.host.name and cloud.host.id (or azure.host.name and azure.host.id if we only want to fix Azure), to persist the data parsed from the cloud metadata while keeping the flexibility for users to overwrite host.name with the FQDN if they wish.
Note that this also applies to other cloud providers such as AWS and GCP, but at the moment this edge case won't affect them, as I am not aware of any code using host.name the way the Azure code does.
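
As a rough sketch of how a consumer could use such an attribute, the Python below prefers a dedicated key and falls back to host.name. The key name `azure.host.name` is only one of the candidates proposed above and does not exist today; `resolve_compute_name` is a hypothetical helper.

```python
from typing import Mapping, Optional


def resolve_compute_name(
    attrs: Mapping[str, str],
    dedicated_key: str = "azure.host.name",  # proposed key, not an existing convention
) -> Optional[str]:
    """Return the compute name from the dedicated key, else fall back to host.name."""
    value = attrs.get(dedicated_key)
    if value:
        return value
    # Fallback keeps the current behavior for data without the new attribute.
    return attrs.get("host.name")


# host.name carries the FQDN, the dedicated key still holds the compute name.
with_dedicated = {
    "host.name": "test-az-es.az.splunk.com",
    "azure.host.name": "test-az-es",
}
# Data produced without the proposed attribute.
without_dedicated = {"host.name": "test-az-es"}

print(resolve_compute_name(with_dedicated))
print(resolve_compute_name(without_dedicated))
```

With this approach the user can keep the FQDN in host.name, and the ID builder still sees the compute name.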

@dloucasfx dloucasfx added the bug label Jul 29, 2022
@codeboten codeboten changed the title Azure metadata "compute-name" is lost when resource-detector hotst.name attribute is overwritten by other detectors Azure metadata "compute-name" is lost when resource-detector host.name attribute is overwritten by other detectors Jul 29, 2022
@codeboten
Contributor

@mx-psi can you take a look as the owner of processor/resourcedetectionprocessor/internal/azure?

@codeboten codeboten added the priority:needed label Jul 29, 2022
@dloucasfx
Contributor Author

dloucasfx commented Jul 29, 2022

Also note that the Azure compute name is not always the same as the VM hostname.
They are the same only when you create the Azure VM without setting a hostname, either at creation or afterwards.

Here is an example where the hostname was changed after the VM was created; notice that the two values differ: SQLVMTEST vs. SQLVM.

PS C:\Users\dlouca> hostname
SQLVMTEST
PS C:\Users\dlouca> Invoke-RestMethod -Headers @{"Metadata"="true"} -Method GET -Uri "http://169.254.169.254/metadata/instance?api-version=2021-02-01" | ConvertTo-Json -Depth 64
{
    "compute":  {
                    "azEnvironment":  "AzurePublicCloud",
                    "customData":  "",
                    "evictionPolicy":  "Deallocate",
                    "isHostCompatibilityLayerVm":  "false",
                    "licenseType":  "",
                    "location":  "eastus",
                    "name":  "SQLVM",
                    "offer":  "sql2017-ws2019",
                    "osProfile":  {
                                      "adminUsername":  "dlouca",
                                      "computerName":  "SQLVM",
                                      "disablePasswordAuthentication":  ""
                                  },
                    "osType":  "Windows",
                    "placementGroupId":  "",
                    "plan":  {
                                 "name":  "",
                                 "product":  "",
                                 "publisher":  ""
                             },

Although OTel resources are well defined by semantic conventions, and one could argue that we don't need to introduce other attributes to persist the Azure metadata, this example shows that the name of a compute resource is not always the same as the hostname. Based on this, instead of my initial suggestion of cloud.host.name/azure.host.name, maybe introduce an attribute named azure.compute.name.
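
The comparison from the example above can be sketched in Python. The JSON here is a trimmed copy of the relevant fields from the metadata response shown earlier; `names_from_metadata` and `hostname_matches_compute_name` are hypothetical helpers, and the case-insensitive comparison is an assumption of this sketch.

```python
import json
from typing import Dict

# Trimmed copy of the relevant fields from the metadata response above.
SAMPLE_METADATA = """
{
  "compute": {
    "name": "SQLVM",
    "osProfile": {"computerName": "SQLVM"}
  }
}
"""


def names_from_metadata(raw: str) -> Dict[str, str]:
    """Extract the compute name and the OS profile computer name."""
    compute = json.loads(raw)["compute"]
    return {
        "compute_name": compute["name"],
        "computer_name": compute.get("osProfile", {}).get("computerName", ""),
    }


def hostname_matches_compute_name(os_hostname: str, raw: str) -> bool:
    """Compare the OS hostname with the compute name (case-insensitive by assumption)."""
    return os_hostname.lower() == names_from_metadata(raw)["compute_name"].lower()


names = names_from_metadata(SAMPLE_METADATA)
matches = hostname_matches_compute_name("SQLVMTEST", SAMPLE_METADATA)
print(names)
print(matches)
```

For the VM in the example, the metadata fields both say SQLVM while the OS reports SQLVMTEST, so neither host.name source can stand in for the compute name.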

@mx-psi
Member

mx-psi commented Aug 3, 2022

IMO the problem here is that the semantics of host.name are not entirely clear and there are cases like this in which there are several possible valid values of host.name. I believe the value currently used is within what the spec allows, but I see that this can cause problems. It also seems like it would be good to discuss this at the specification level first, so that other OpenTelemetry components adopt a more specific convention about this.

@dloucasfx could you open an issue/PR over at https://github.com/open-telemetry/opentelemetry-specification to discuss adding cloud.host.name/azure.host.name/azure.compute.name as a semantic convention?

@github-actions
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

@github-actions github-actions bot added the Stale label Nov 10, 2022
@github-actions
Contributor

This issue has been closed as inactive because it has been stale for 120 days with no activity.

@github-actions github-actions bot closed this as not planned May 26, 2023