Add ability to mute all errors (mainly due to access rights) coming from process scraper of the hostmetricsreceiver #34981
@crobert-1 will this PR be merged into the upcoming release?
I don't know for sure. We're waiting on a project maintainer to merge and optionally a component code owner to review. It does look like there's some ongoing discussion around the path forward with errors from the process scraper (#34988), so it would be good to hear from code owners if this is an acceptable approach.
For all intents and purposes #34988 is pretty much sure to happen. We'll still have to iron out details. In the meantime, I am fine with bringing this in, and it can be part of the deprecated batch along with all the other mutes once that work is done.
Thanks for the review and input, @braydonk!
@dmitryax can this PR be merged please?
Hello All, This PR also applies to our use case. Is there any plan to merge this PR? Thanks in Advance,
I've added the
@dmitryax any updates on this one please?
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #34981      +/-   ##
==========================================
+ Coverage   81.27%   81.55%   +0.27%
==========================================
  Files        2112     2143      +31
  Lines      165912   176766   +10854
==========================================
+ Hits       134846   144156    +9310
- Misses      25940    27305    +1365
- Partials     5126     5305     +179
…rom process scraper of the hostmetricsreceiver (open-telemetry#34981)

**Description:**
We are currently encountering an issue with the `process` scraper in the `hostmetricsreceiver`, primarily due to access rights restrictions for certain processes, such as system processes. This results in a large number of verbose error logs. Most of them come from the `process.open_file_descriptors` metric, but other metrics produce errors as well.

In order to solve this issue, we added a flag, `mute_process_all_errors`, that mutes errors coming from the process scraper metrics, as these errors are predominantly associated with processes that we should not be monitoring anyway.

**Link to tracking Issue:** open-telemetry#20435

**Testing:** Added unit tests

**Documentation:**

**Errors**:

- Permission denied errors:
```
go.opentelemetry.io/collector/receiver@v0.90.1/scraperhelper/scrapercontroller.go:176
2024-09-02T17:24:10.341+0200 error scraping metrics {"kind": "receiver", "name": "hostmetrics/linux/localhost", "data_type": "metrics", "error": "error reading open file descriptor count for process \"systemd\" (pid 1): open /proc/1/fd: permission denied;
```
- File not found errors:
```
go.opentelemetry.io/collector/receiver@v0.90.1/scraperhelper/scrapercontroller.go:176
2024-09-02T17:25:38.688+0200 error scraperhelper/scrapercontroller.go:200 Error scraping metrics {"kind": "receiver", "name": "hostmetrics/process", "data_type": "metrics", "error": "error reading cpu times for process \"java\" (pid 466650): open /proc/466650/stat: no such file or directory; error reading memory info for process \"java\" (pid 466650): open /proc/466650/statm: no such file or directory; error reading thread info for process \"java\" (pid 466650): open /proc/466650/status: no such file or directory; error reading cpu times for process \"java\" (pid 474774): open /proc/474774/stat: no such file or directory; error reading memory info for process \"java\" (pid 474774): open /proc/474774/statm: no such file or directory; error reading thread info for process \"java\" (pid 474774): open /proc/474774/status: no such file or directory; error reading cpu times for process \"java\" (pid 481780): open /proc/481780/stat: no such file or directory; error reading memory info for process \"java\" (pid 481780): open /proc/481780/statm: no such file or directory; error reading thread info for process \"java\" (pid 481780): open /proc/481780/status: no such file or directory", "scraper": "process"}
```

**Config**:
```
receivers:
  hostmetrics/process:
    collection_interval: ${PROCESSES_COLLECTION_INTERVAL}s
    scrapers:
      process:
        mute_process_name_error: true
        mute_process_exe_error: true
        mute_process_io_error: true
        mute_process_user_error: true
        resource_attributes:
          # disable non_used default attributes
          process.command:
            enabled: false
          process.command_line:
            enabled: false
          process.executable.path:
            enabled: false
          process.owner:
            enabled: false
          process.parent_pid:
            enabled: false
        metrics:
          # disable non-used default metrics
          process.cpu.time:
            enabled: false
          process.memory.virtual:
            enabled: false
          # enable used optional metrics
          process.cpu.utilization:
            enabled: true
          process.open_file_descriptors:
            enabled: true
          process.threads:
            enabled: true
```
Description:
We are currently encountering an issue with the `process` scraper in the `hostmetricsreceiver`, primarily due to access rights restrictions for certain processes, such as system processes. This results in a large number of verbose error logs. Most of them come from the `process.open_file_descriptors` metric, but other metrics produce errors as well.
In order to solve this issue, we added a flag, `mute_process_all_errors`, that mutes errors coming from the process scraper metrics, as these errors are predominantly associated with processes that we should not be monitoring anyway.
Link to tracking Issue: #20435
Testing: Added unit tests
Documentation:
Errors:
Config:
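For readers who want to try the new flag, here is a minimal sketch of how it might be enabled. It assumes `mute_process_all_errors` sits directly under the `process` scraper, next to the existing `mute_process_*_error` options; the receiver name `hostmetrics/process` and the 30s interval are only illustrative.

```yaml
receivers:
  hostmetrics/process:
    # Illustrative interval; pick whatever suits your deployment.
    collection_interval: 30s
    scrapers:
      process:
        # Assumed placement: alongside the other mute_* options of the
        # process scraper. Mutes errors coming from the process scraper
        # metrics (for example, permission denied on /proc/<pid>/fd).
        mute_process_all_errors: true
        metrics:
          process.open_file_descriptors:
            enabled: true
```

With this set, the per-error flags such as `mute_process_name_error` are presumably no longer needed, but check the receiver's README for the exact semantics before relying on that.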