Add ability to mute all errors (mainly due to access rights) coming from process scraper of the hostmetricsreceiver #34981

Open
wants to merge 1 commit into
base: main
27 changes: 27 additions & 0 deletions .chloggen/hostmetricsreceiver-mute-all-errors.yaml
@@ -0,0 +1,27 @@
+# Use this changelog template to create an entry for release notes.
+
+# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
+change_type: enhancement
+
+# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver)
+component: hostmetricsreceiver
+
+# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
+note: Add ability to mute all errors (mainly due to access rights) coming from process scraper of the hostmetricsreceiver
+
+# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
+issues: [20435]
+
+# (Optional) One or more lines of additional information to render under the primary note.
+# These lines will be padded with 2 spaces and then inserted directly into the document.
+# Use pipe (|) for multiline entries.
+subtext:
+
+# If your change doesn't affect end users or the exported elements of any package,
+# you should instead start your pull request title with [chore] or use the "Skip Changelog" label.
+# Optional: The change log or logs in which this entry should be included.
+# e.g. '[user]' or '[user, api]'
+# Include 'user' if the change is relevant to end users.
+# Include 'api' if there is a change to a library API.
+# Default: '[user]'
+change_logs: []
13 changes: 7 additions & 6 deletions receiver/hostmetricsreceiver/README.md
@@ -114,6 +114,7 @@ process:
   <include|exclude>:
     names: [ <process name>, ... ]
     match_type: <strict|regexp>
+  mute_process_all_errors: <true|false>
   mute_process_name_error: <true|false>
   mute_process_exe_error: <true|false>
   mute_process_io_error: <true|false>
@@ -123,12 +124,12 @@ process:
 ```

 The following settings are optional:
-
-- `mute_process_name_error` (default: false): mute the error encountered when trying to read a process name the collector does not have permission to read
-- `mute_process_io_error` (default: false): mute the error encountered when trying to read IO metrics of a process the collector does not have permission to read
-- `mute_process_cgroup_error` (default: false): mute the error encountered when trying to read the cgroup of a process the collector does not have permission to read
-- `mute_process_exe_error` (default: false): mute the error encountered when trying to read the executable path of a process the collector does not have permission to read (Linux only)
-- `mute_process_user_error` (default: false): mute the error encountered when trying to read a uid which doesn't exist on the system, eg. is owned by a user that only exists in a container.
+- `mute_process_all_errors` (default: false): mute all the errors encountered when trying to read metrics of a process. When this flag is enabled, there is no need to activate any other error suppression flags.
+- `mute_process_name_error` (default: false): mute the error encountered when trying to read a process name the collector does not have permission to read. This flag is ignored when `mute_process_all_errors` is set to true as all errors are muted.
+- `mute_process_io_error` (default: false): mute the error encountered when trying to read IO metrics of a process the collector does not have permission to read. This flag is ignored when `mute_process_all_errors` is set to true as all errors are muted.
+- `mute_process_cgroup_error` (default: false): mute the error encountered when trying to read the cgroup of a process the collector does not have permission to read. This flag is ignored when `mute_process_all_errors` is set to true as all errors are muted.
+- `mute_process_exe_error` (default: false): mute the error encountered when trying to read the executable path of a process the collector does not have permission to read (Linux only). This flag is ignored when `mute_process_all_errors` is set to true as all errors are muted.
+- `mute_process_user_error` (default: false): mute the error encountered when trying to read a uid which doesn't exist on the system, eg. is owned by a user that only exists in a container. This flag is ignored when `mute_process_all_errors` is set to true as all errors are muted.

 ## Advanced Configuration

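The precedence described in the README settings (the catch-all flag wins and the individual flags are then ignored) can be sketched in Go as follows. This is only an illustration of the documented semantics, not the receiver's code: `errorKind`, `muteConfig`, and `isMuted` are hypothetical names introduced here, and the scraper itself does not classify errors this way.

```go
package main

import "fmt"

// errorKind labels the category of a process-scraper error.
// Hypothetical type for illustration; the receiver does not define it.
type errorKind int

const (
	nameError errorKind = iota
	exeError
	ioError
	cgroupError
	userError
	otherError // any error that has no dedicated mute flag
)

// muteConfig is a hypothetical stand-in for the mute_process_* settings.
type muteConfig struct {
	All, Name, Exe, IO, Cgroup, User bool
}

// isMuted reports whether an error of the given kind is suppressed.
// The catch-all flag is checked first, so the individual flags are
// ignored whenever it is set.
func isMuted(c muteConfig, k errorKind) bool {
	if c.All {
		return true
	}
	switch k {
	case nameError:
		return c.Name
	case exeError:
		return c.Exe
	case ioError:
		return c.IO
	case cgroupError:
		return c.Cgroup
	case userError:
		return c.User
	default:
		return false
	}
}

func main() {
	onlyIO := muteConfig{IO: true}
	all := muteConfig{All: true}

	fmt.Println(isMuted(onlyIO, ioError), isMuted(onlyIO, nameError)) // true false
	fmt.Println(isMuted(all, nameError), isMuted(all, otherError))    // true true
}
```

Note that with the catch-all flag set, errors that have no dedicated flag are suppressed too, which is the main behavioral difference from enabling every individual flag.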
@@ -22,29 +22,38 @@ type Config struct {
 	Include MatchConfig `mapstructure:"include"`
 	Exclude MatchConfig `mapstructure:"exclude"`

+	// MuteProcessAllErrors is a flag that will mute all the errors encountered when trying to read metrics of a process.
+	// When this flag is enabled, there is no need to activate any other error suppression flags.
+	MuteProcessAllErrors bool `mapstructure:"mute_process_all_errors,omitempty"`
+
 	// MuteProcessNameError is a flag that will mute the error encountered when trying to read a process name the
 	// collector does not have permission to read.
 	// See https://github.com/open-telemetry/opentelemetry-collector/issues/3004 for more information.
+	// This flag is ignored when MuteProcessAllErrors is set to true as all errors are muted.
 	MuteProcessNameError bool `mapstructure:"mute_process_name_error,omitempty"`

 	// MuteProcessIOError is a flag that will mute the error encountered when trying to read IO metrics of a process
 	// the collector does not have permission to read.
+	// This flag is ignored when MuteProcessAllErrors is set to true as all errors are muted.
 	MuteProcessIOError bool `mapstructure:"mute_process_io_error,omitempty"`

 	// MuteProcessCgroupError is a flag that will mute the error encountered when trying to read the cgroup of a process
 	// the collector does not have permission to read.
+	// This flag is ignored when MuteProcessAllErrors is set to true as all errors are muted.
 	MuteProcessCgroupError bool `mapstructure:"mute_process_cgroup_error,omitempty"`

 	// MuteProcessExeError is a flag that will mute the error encountered when trying to read the executable path of a process
-	// the collector does not have permission to read (Linux)
+	// the collector does not have permission to read (Linux).
+	// This flag is ignored when MuteProcessAllErrors is set to true as all errors are muted.
 	MuteProcessExeError bool `mapstructure:"mute_process_exe_error,omitempty"`

 	// MuteProcessUserError is a flag that will mute the error encountered when trying to read uid which
-	// doesn't exist on the system, eg. is owned by user existing in container only
+	// doesn't exist on the system, eg. is owned by user existing in container only.
+	// This flag is ignored when MuteProcessAllErrors is set to true as all errors are muted.
 	MuteProcessUserError bool `mapstructure:"mute_process_user_error,omitempty"`

 	// ScrapeProcessDelay is used to indicate the minimum amount of time a process must be running
-	// before metrics are scraped for it. The default value is 0 seconds (0s)
+	// before metrics are scraped for it. The default value is 0 seconds (0s).
 	ScrapeProcessDelay time.Duration `mapstructure:"scrape_process_delay"`
 }

@@ -187,6 +187,10 @@ func (s *scraper) scrape(ctx context.Context) (pmetric.Metrics, error) {
 		}
 	}

+	if s.config.MuteProcessAllErrors {
+		return s.mb.Emit(), nil
+	}
+
 	return s.mb.Emit(), errs.Combine()
 }

@@ -1003,6 +1003,7 @@ func TestScrapeMetrics_MuteErrorFlags(t *testing.T) {
 		muteProcessExeError  bool
 		muteProcessIOError   bool
 		muteProcessUserError bool
+		muteProcessAllErrors bool
 		skipProcessNameError bool
 		omitConfigField      bool
 		expectedError        string
@@ -1093,6 +1094,30 @@ func TestScrapeMetrics_MuteErrorFlags(t *testing.T) {
 				return 4
 			}(),
 		},
+		{
+			name:                 "All Process Errors Muted",
+			muteProcessNameError: false,
+			muteProcessExeError:  false,
+			muteProcessIOError:   false,
+			muteProcessUserError: false,
+			muteProcessAllErrors: true,
+			expectedCount:        0,
+		},
+		{
+			name:                 "Process User Error Enabled and All Process Errors Muted",
+			muteProcessUserError: false,
+			skipProcessNameError: true,
+			muteProcessExeError:  true,
+			muteProcessNameError: true,
+			muteProcessAllErrors: true,
+			expectedCount: func() int {
+				if runtime.GOOS == "darwin" {
+					// disk.io is not collected on darwin
+					return 3
+				}
+				return 4
+			}(),
+		},
 	}

 	for _, test := range testCases {
@@ -1106,6 +1131,7 @@ func TestScrapeMetrics_MuteErrorFlags(t *testing.T) {
 				config.MuteProcessExeError = test.muteProcessExeError
 				config.MuteProcessIOError = test.muteProcessIOError
 				config.MuteProcessUserError = test.muteProcessUserError
+				config.MuteProcessAllErrors = test.muteProcessAllErrors
 			}
 			scraper, err := newProcessScraper(receivertest.NewNopSettings(), config)
 			require.NoError(t, err, "Failed to create process scraper: %v", err)
@@ -1135,7 +1161,7 @@ func TestScrapeMetrics_MuteErrorFlags(t *testing.T) {

 			assert.Equal(t, test.expectedCount, md.MetricCount())

-			if config.MuteProcessNameError && config.MuteProcessExeError && config.MuteProcessUserError {
+			if (config.MuteProcessNameError && config.MuteProcessExeError && config.MuteProcessUserError) || config.MuteProcessAllErrors {
 				assert.NoError(t, err)
 			} else {
 				assert.EqualError(t, err, test.expectedError)
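The updated assertion condition in the test can be read as a small predicate: no error is expected either when the name, exe, and user flags are all set, or when the catch-all flag is set. A minimal sketch of that predicate, with a hypothetical `flags` struct and `expectNoError` helper standing in for the real config and test body:

```go
package main

import "fmt"

// flags is a hypothetical stand-in for the mute fields of the scraper
// config that the test condition inspects.
type flags struct {
	name, exe, user, all bool
}

// expectNoError mirrors the condition used in the updated test:
// (name && exe && user) || all.
func expectNoError(f flags) bool {
	return (f.name && f.exe && f.user) || f.all
}

func main() {
	cases := []struct {
		label string
		f     flags
	}{
		{"nothing muted", flags{}},
		{"name, exe, user muted", flags{name: true, exe: true, user: true}},
		{"only catch-all muted", flags{all: true}},
		{"exe muted only", flags{exe: true}},
	}
	for _, c := range cases {
		fmt.Printf("%s: expect no error = %v\n", c.label, expectNoError(c.f))
	}
}
```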