Periodic system/process metricset crashes under Windows #12826

Closed
adriansr opened this issue Jul 9, 2019 · 6 comments · Fixed by #12833

adriansr (Contributor) commented Jul 9, 2019

  • Version: 7.2.0
  • Operating System: Windows
  • Discuss Forum URL: -
  • Steps to Reproduce:

Users report a periodic crash in Metricbeat under Windows, when the system/process metricset has been running for some hours. After enabling logging.files.redirect_stderr a crash similar to this appears in the logs:

signal arrived during external code execution

syscall.Syscall(0x7ffb7d55ce00, 0x2, 0xc000ffff80, 0xc000e30f74, 0x0, 0x0, 0x0, 0x0)
	/usr/local/go/src/runtime/syscall_windows.go:184 +0xea
syscall.CommandLineToArgv(0xc000ffff80, 0xc000e30f74, 0x80, 0x0, 0x0)
	/usr/local/go/src/syscall/zsyscall_windows.go:930 +0x80
github.com/elastic/beats/vendor/github.com/elastic/gosigar/sys/windows.ByteSliceToStringSlice(0xc000ffff80, 0x80, 0x80, 0x0, 0x0, 0x0, 0x0, 0x0)
	/go/src/github.com/elastic/beats/vendor/github.com/elastic/gosigar/sys/windows/syscall_windows.go:525 +0x75
github.com/elastic/beats/vendor/github.com/elastic/gosigar.(*ProcArgs).Get(0xc000e31240, 0x170, 0x0, 0x0)
	/go/src/github.com/elastic/beats/vendor/github.com/elastic/gosigar/sigar_windows.go:376 +0x1f1
github.com/elastic/beats/libbeat/metric/system/process.(*Process).getDetails(0xc000a9b040, 0xc000e313b0, 0xb, 0xc0008a5401)
	/go/src/github.com/elastic/beats/libbeat/metric/system/process/process.go:138 +0x63c
github.com/elastic/beats/libbeat/metric/system/process.(*Stats).getSingleProcess(0xc000964510, 0x170, 0xc00025de60, 0x8)
	/go/src/github.com/elastic/beats/libbeat/metric/system/process/process.go:484 +0x240
github.com/elastic/beats/libbeat/metric/system/process.(*Stats).Get(0xc000964510, 0xc000c22390, 0x0, 0xc0007d7710, 0x0, 0x0)
	/go/src/github.com/elastic/beats/libbeat/metric/system/process/process.go:425 +0x113
github.com/elastic/beats/metricbeat/module/system/process.(*MetricSet).Fetch(0xc000980000, 0x9fd0a80, 0xc0008a5170)
	/go/src/github.com/elastic/beats/metricbeat/module/system/process/process.go:102 +0x51
github.com/elastic/beats/metricbeat/mb/module.(*metricSetWrapper).fetch(0xc000954e60, 0x254d1a0, 0xc0008a5170)
	/go/src/github.com/elastic/beats/metricbeat/mb/module/wrapper.go:238 +0x2b7
github.com/elastic/beats/metricbeat/mb/module.(*metricSetWrapper).startPeriodicFetching(0xc000954e60, 0x254d1a0, 0xc0008a5170)
	/go/src/github.com/elastic/beats/metricbeat/mb/module/wrapper.go:219 +0x121
github.com/elastic/beats/metricbeat/mb/module.(*metricSetWrapper).run(0xc000954e60, 0xc000a1c060, 0xc000a06180)
	/go/src/github.com/elastic/beats/metricbeat/mb/module/wrapper.go:196 +0x676
github.com/elastic/beats/metricbeat/mb/module.(*Wrapper).Start.func1(0xc000a04050, 0xc000a1c060, 0xc000a06180, 0xc000954e60)
	/go/src/github.com/elastic/beats/metricbeat/mb/module/wrapper.go:137 +0x27e
created by github.com/elastic/beats/metricbeat/mb/module.(*Wrapper).Start
	/go/src/github.com/elastic/beats/metricbeat/mb/module/wrapper.go:125 +0x147
adriansr added the bug and Metricbeat labels Jul 9, 2019
adriansr self-assigned this Jul 9, 2019
adriansr added a commit that referenced this issue Jul 9, 2019
This updates gosigar to v0.10.4 and go-sysinfo to v1.0.2.

Both releases fix a similar bug under Windows when fetching the command-line of a running process:
The offending code expected the command-line strings read from a target process to contain a null character as a terminator. However, this is not always true, and sometimes a terminator needs to be added. Most of the time the missing terminator wasn't an issue due to the runtime allocating extra space for the string, but in some extreme cases it caused a crash.

This bug manifested in:

  • Metricbeat's system/process metricset.

The affected code is also used by:

  • Auditbeat's system/process.
  • Packetbeat's process monitor (disabled by default).
  • The add_process_metadata processor.
  • Beats monitoring.
  • libbeat/cmd/instance/beat.go

Fixes #12826
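
The failure mode described in the commit message can be sketched in portable Go. This is only an illustration under stated assumptions, not the actual gosigar or go-sysinfo change: the helper names `ensureTerminated` and `decodeUntilNUL` are hypothetical, and the buffer is assumed to hold little-endian UTF-16 as read from the target process. A consumer that scans for a NUL code unit (as a Windows API that takes a NUL-terminated wide string would) finds none in the unterminated buffer; appending a terminator first bounds the scan.

```go
package main

import (
	"fmt"
	"unicode/utf16"
)

// ensureTerminated (hypothetical helper) returns a buffer of whole UTF-16
// code units that ends with a NUL code unit (two zero bytes). It copies the
// input when a terminator has to be added, so the caller's slice is untouched.
func ensureTerminated(buf []byte) []byte {
	// Drop a trailing odd byte so only complete 16-bit code units remain.
	buf = buf[:len(buf)&^1]
	n := len(buf)
	if n >= 2 && buf[n-2] == 0 && buf[n-1] == 0 {
		return buf // already terminated
	}
	out := make([]byte, n, n+2)
	copy(out, buf)
	return append(out, 0, 0)
}

// decodeUntilNUL (hypothetical helper) decodes little-endian UTF-16 up to the
// first NUL code unit. The boolean reports whether a terminator was found
// inside the buffer; native code without bounds checks would keep reading
// past the end in the "false" case.
func decodeUntilNUL(buf []byte) (string, bool) {
	units := make([]uint16, 0, len(buf)/2)
	for i := 0; i+1 < len(buf); i += 2 {
		u := uint16(buf[i]) | uint16(buf[i+1])<<8
		if u == 0 {
			return string(utf16.Decode(units)), true
		}
		units = append(units, u)
	}
	return string(utf16.Decode(units)), false
}

func main() {
	// "a.exe" encoded as UTF-16LE without a terminator.
	raw := []byte{'a', 0, '.', 0, 'e', 0, 'x', 0, 'e', 0}

	_, ok := decodeUntilNUL(raw)
	fmt.Println(ok) // prints "false"

	s, ok := decodeUntilNUL(ensureTerminated(raw))
	fmt.Println(s, ok) // prints "a.exe true"
}
```

In Go the scan stops at the slice boundary, so the unterminated case is only reported rather than crashing. Native code has no such bound, which is consistent with the commit message's observation that the bug stayed hidden while extra allocated space followed the string.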
adriansr added a commit to adriansr/beats that referenced this issue Jul 9, 2019
…12833)

(cherry picked from commit 2b6763d)
adriansr added a commit to adriansr/beats that referenced this issue Jul 9, 2019
…12833)

(cherry picked from commit 2b6763d)
adriansr added a commit that referenced this issue Jul 9, 2019
…12837)

(cherry picked from commit 2b6763d)
adriansr added a commit that referenced this issue Jul 9, 2019
…ocess information (#12835)

(cherry picked from commit 2b6763d)

cpiment commented Jul 12, 2019

@adriansr thanks for the patch for 7.2. Is there any way we can download the Windows binaries with this patch merged, or is building the binaries the only way?

adriansr (Contributor, Author) commented

@cpiment until 7.2.1 is out (next Tuesday?), the only way will be to provide them with patched binaries. Are they experiencing crashes in beats other than Metricbeat?


cpiment commented Jul 12, 2019

We are using heartbeat, winlogbeat and metricbeat (all 7.2.0), but metricbeat is the only one that crashes.


detro commented Jul 16, 2019

Hi @adriansr, from what I have read, I think this fix is landing today, right? How long before we can pull the image? We are encountering this issue on almost all our K8S nodes and we need this fix to ensure Filebeat can cope with malformed Docker logs.

We are also looking into how we got malformed Docker logs, and it looks like it is because we had a forced restart of the Docker daemon (unrelated).

adriansr (Contributor, Author) commented

@detro you'll have to wait until 7.2.1 is released or build your own 7.2.1-snapshot (it is not complicated at all). Are you having this same issue? It's a Windows-only crash and has only affected Metricbeat so far. Please share if you're experiencing otherwise.


detro commented Jul 16, 2019

Sorry for the mistake: I meant to comment on this one #12268 - I think in digging into the git history and trying to figure out when the issue was fixed, I took a wrong turn and then commented in the wrong place. This is really about skipping malformed docker log lines. Apologies for the confusion.

In regards to building a snapshot, is there a pointer to a Dockerfile I could use for this?

leweafan pushed a commit to leweafan/beats that referenced this issue Apr 28, 2023
…12833) (elastic#12837)

(cherry picked from commit 3374b3a)
leweafan pushed a commit to leweafan/beats that referenced this issue Apr 28, 2023
…hing process information (elastic#12835)

(cherry picked from commit 3374b3a)