Add ability to mute all errors (mainly due to access rights) coming from process scraper of the hostmetricsreceiver #34981

Open
wants to merge 1 commit into
base: main
from

Conversation

khalillilahk

Description:
We are currently encountering an issue with the process scraper in the hostmetricsreceiver, caused primarily by access-rights restrictions on certain processes, such as system processes. This results in a large number of verbose error logs. Most of them come from the process.open_file_descriptors metric, but other metrics produce errors as well.

To solve this issue, we added a flag, mute_process_all_errors, that mutes errors coming from the process scraper metrics, as these errors are predominantly associated with processes that we should not be monitoring anyway.

Link to tracking Issue: #20435

Testing: Added unit tests

Documentation:

Errors:

  • Permission denied errors:
go.opentelemetry.io/collector/receiver@v0.90.1/scraperhelper/scrapercontroller.go:176
2024-09-02T17:24:10.341+0200    error	scraping metrics        {"kind": "receiver", "name": "hostmetrics/linux/localhost", "data_type": "metrics", "error": "error reading open file descriptor count for process \"systemd\" (pid 1): open /proc/1/fd: permission denied;

  • File not found errors:
go.opentelemetry.io/collector/receiver@v0.90.1/scraperhelper/scrapercontroller.go:176
2024-09-02T17:25:38.688+0200    error   scraperhelper/scrapercontroller.go:200  Error scraping metrics  {"kind": "receiver", "name": "hostmetrics/process", "data_type": "metrics", "error": "error reading cpu times for process \"java\" (pid 466650): open /proc/466650/stat: no such file or directory; error reading memory info for process \"java\" (pid 466650): open /proc/466650/statm: no such file or directory; error reading thread info for process \"java\" (pid 466650): open /proc/466650/status: no such file or directory; error reading cpu times for process \"java\" (pid 474774): open /proc/474774/stat: no such file or directory; error reading memory info for process \"java\" (pid 474774): open /proc/474774/statm: no such file or directory; error reading thread info for process \"java\" (pid 474774): open /proc/474774/status: no such file or directory; error reading cpu times for process \"java\" (pid 481780): open /proc/481780/stat: no such file or directory; error reading memory info for process \"java\" (pid 481780): open /proc/481780/statm: no such file or directory; error reading thread info for process \"java\" (pid 481780): open /proc/481780/status: no such file or directory", "scraper": "process"}

Config:

receivers:
  hostmetrics/process:
    collection_interval: ${PROCESSES_COLLECTION_INTERVAL}s
    scrapers:
      process:
        mute_process_name_error: true
        mute_process_exe_error: true
        mute_process_io_error: true
        mute_process_user_error: true
        resource_attributes:
          # disable non_used default attributes
          process.command:
            enabled: false
          process.command_line:
            enabled: false
          process.executable.path:
            enabled: false
          process.owner:
            enabled: false
          process.parent_pid:
            enabled: false
        metrics:
          # disable non-used default metrics
          process.cpu.time:
            enabled: false
          process.memory.virtual:
            enabled: false
          # enable used optional metrics
          process.cpu.utilization:
            enabled: true
          process.open_file_descriptors:
            enabled: true
          process.threads:
            enabled: true
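
For illustration, here is a minimal sketch of how the flag proposed in this PR might be enabled. The flag name mute_process_all_errors comes from the description above; its placement directly under the process scraper, at the same level as the existing mute_process_*_error options, is an assumption rather than something confirmed in this thread.

```yaml
receivers:
  hostmetrics/process:
    scrapers:
      process:
        # Assumption: the proposed flag is set at the same level as the
        # existing mute_process_name_error / mute_process_exe_error options.
        mute_process_all_errors: true
```

If the flag behaves as described, it would presumably make the four individual mute_process_*_error settings in the config above redundant, though that interaction is not stated in this thread.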

@khalillilahk khalillilahk requested a review from a team September 3, 2024 14:29

linux-foundation-easycla bot commented Sep 3, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

@khalillilahk khalillilahk changed the title Add ability to mute errors (mainly due to access rights) coming from process scraper of the hostmetricsreceiver Add ability to mute all errors (mainly due to access rights) coming from process scraper of the hostmetricsreceiver Sep 4, 2024
@khalillilahk
Author

@crobert-1 will this PR be merged into the upcoming release?

@crobert-1
Member

@crobert-1 will this PR be merged into the upcoming release?

I don't know for sure. We're waiting on a project maintainer to merge and optionally a component code owner to review.

It does look like there's some ongoing discussion around the path forward with errors from the process scraper (#34988), so it would be good to hear from code owners if this is an acceptable approach.

@braydonk
Contributor

braydonk commented Sep 6, 2024

For all intents and purposes #34988 is pretty much sure to happen. We'll still have to iron out details. In the meantime, I am fine with bringing this in, and it can be part of the deprecated batch along with all the other mutes once that work is done.

@crobert-1
Member

Thanks for the review and input, @braydonk!

@crobert-1 crobert-1 added the ready to merge Code review completed; ready to merge by maintainers label Sep 6, 2024
@khalillilahk
Author

@dmitryax can this PR be merged, please?

@SamerJ
Contributor

SamerJ commented Sep 18, 2024

Hello All,

This PR also applies to our use case.
Today, the error logs we get are too verbose and are expected.

Is there any plan to merge this PR?

Thanks in advance,

@crobert-1
Member

Is there any plan to merge this PR?

I've added the ready to merge label, which lets the project maintainers know that this PR is ready. They will merge at their earliest convenience 👍

Labels
ready to merge Code review completed; ready to merge by maintainers receiver/hostmetrics

5 participants