Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

selective attribute appending for resourcedetectionprocessor #8242

Closed
davidmirza408 opened this issue Mar 1, 2022 · 17 comments
Closed

selective attribute appending for resourcedetectionprocessor #8242

davidmirza408 opened this issue Mar 1, 2022 · 17 comments
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@davidmirza408
Copy link
Contributor

davidmirza408 commented Mar 1, 2022

ResourceDetectionProcessor adds several attributes to identify the host machine. There are scenarios where all of the attributes are not desired.

I would like to propose that we add a config setting to allow the user to specify that only a subset of the available attributes be appended.

For example, the AWS EC2 resource detection processor adds the following attributes by default:

  • cloud.provider ("aws")
  • cloud.platform ("aws_ec2")
  • cloud.account.id
  • cloud.region
  • cloud.availability_zone
  • host.id
  • host.image.id
  • host.name
  • host.type

My proposal would be to add an "attributes" config setting that allows the user to provide a list of attributes that they would like to have appended. If no "attributes" are specified then we would default to the current behavior of adding all attributes. Continuing with the AWS example, if a user only wanted "host.id" and "host.name" the config would be as follows:

processors:
  resourcedetection:
    detectors: [ec2]
    timeout: 2s
    override: false
    attributes: [host.name, host.id]
@jpkrohling
Copy link
Member

@jpkrohling
Copy link
Member

@jrcamp, are you still a code owner of this component?

@pmm-sumo
Copy link
Contributor

pmm-sumo commented Mar 2, 2022

BTW, as a workaround one could use resource processor and delete unwanted attributes

@mx-psi
Copy link
Member

mx-psi commented Mar 2, 2022

What are the benefits of having this on the resourcedetection processor vs. using a resource processor (or the transform processor in the future)?

@pmm-sumo
Copy link
Contributor

pmm-sumo commented Mar 2, 2022

What are the benefits of having this on the resourcedetection processor vs. using a resource processor (or the transform processor in the future)?

I think reduced memory pressure and some cpu cycles, though I believe the difference is not going to be anywhere substantial. Also, a simpler configuration

@mx-psi
Copy link
Member

mx-psi commented Mar 2, 2022

I think reduced memory pressure and some cpu cycles, though I believe the difference is not going to be anywhere substantial

I agree, at least for the detectors I am familiar with the calls that we do to the different cloud provider APIs are not going to change (we either get all atribute values or we get no attribute values), so any performance improvements would be in avoiding to add an attribute and then remove it (which seems like a negligible thing for most users).

Also, a simpler configuration

That's fair, I personally would favor leaving this functionality to the transform/resource processor but it's true that the configuration is slightly more complex.

@davidmirza408
Copy link
Contributor Author

What are the benefits of having this on the resourcedetection processor vs. using a resource processor (or the transform processor in the future)?

I think reduced memory pressure and some cpu cycles, though I believe the difference is not going to be anywhere substantial. Also, a simpler configuration

The above mentioned benefits are all valid. One more minor benefit is that a configuration that specifies the desired attributes would not be surprised with new attributes in future releases.

@mx-psi
Copy link
Member

mx-psi commented Mar 3, 2022

One more minor benefit is that a configuration that specifies the desired attributes would not be surprised with new attributes in future releases.

That's a good point. I think you have convinced me that this is worth having :)

@mx-psi mx-psi added enhancement New feature or request good first issue Good for newcomers labels Mar 8, 2022
@TylerHelmuth
Copy link
Member

TylerHelmuth commented Mar 14, 2022

What would happen if someone added a string in the attributes section that doesn't make sense for the detector type? For example, if the detector is only system but attributes includes cloud.region?

Also, if the configuration includes multiple detectors, would the attributes section apply to all detectors or should you be able to configure the attribute values for each detector?

@TylerHelmuth
Copy link
Member

@mx-psi @jpkrohling I'd like to take a shot at this issue as my first contribution.

@mx-psi
Copy link
Member

mx-psi commented Mar 15, 2022

What would happen if someone added a string in the attributes section that doesn't make sense for the detector type? For example, if the detector is only system but attributes includes cloud.region?

You could check this on the Validate call of the detector, but it's going to take some work to do it: the detectors don't expose right now what attributes they add/not. To me, it's fine to not validate this for now.

Also, if the configuration includes multiple detectors, would the attributes section apply to all detectors or should you be able to configure the attribute values for each detector?

From the example, I would say the original intention is for this to apply to all detectors.

@TylerHelmuth
Copy link
Member

To me, it's fine to not validate this for now.

So in my cloud.region example, would the fact that cloud.region exist in the list be ignored in the filter process and what would the result be? If cloud.region is the only attribute configured, should the result be no attributes get added, effectively configuring the processor into uselessness, or should we go back to using the full default list?

And if the list includes both a valid and invalid, such as host.id and cloud.region, should the attributes be filtered to only host.id ignoring the invalid cloud.region, or do the presence of an invalid attribute mean that we should revert to the full default list?

@Aneurysm9
Copy link
Member

I would view the list of attributes as an allowlist. The user is indicating that the will accept only these attributes and that all others should be rejected. Alternately, it could be made more flexible by having inclusion/exclusion lists:

processors:
  resourcedetection:
    detectors: [ec2]
    timeout: 2s
    override: false
    attributes: 
      include: [host.name, host.id]
      exclude: [cloud.provider]

The exclusion here would be redundant since inclusion should take precedence and any attributes not explicitly included should be excluded, but I included both for illustration.

@TylerHelmuth
Copy link
Member

Ok, treating it as an allowlist seems simple enough. I will move forward with that approach. I do like the idea of validating the allowlist to help stop customers from hurting themselves by listing invalid attributes, but maybe thats a future enhancement.

@TylerHelmuth
Copy link
Member

Should have a PR for this tomorrow.

@TylerHelmuth
Copy link
Member

@open-telemetry/collector-contrib-approvers I think this issue and be closed.

@djaglowski
Copy link
Member

Closed by #8547

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

7 participants