Periodic system/process metricset crashes under Windows #12826
Comments
This updates gosigar to v0.10.4 and go-sysinfo to v1.0.2. Both releases fix a similar bug under Windows when fetching the command-line of a running process:

The offending code expected the command-line strings read from a target process to contain a null character as a terminator. However, this is not always true, and sometimes a terminator needs to be added. Most of the time the missing terminator wasn't an issue due to the runtime allocating extra space for the string, but in some extreme cases it caused a crash.

This bug manifested in:
- Metricbeat's system/process metricset.

It is also used by:
- Auditbeat's system/process.
- Packetbeat's process monitor (disabled by default).
- The add_process_metadata processor.
- Beats monitoring (libbeat/cmd/instance/beat.go).

Fixes #12826
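To illustrate the class of bug described above, here is a minimal, portable sketch of decoding a UTF-16LE buffer that may or may not end in a null terminator. It is not the actual gosigar or go-sysinfo code; the names `utf16BytesToString` and `encodeUTF16LE` are hypothetical, and the buffer stands in for bytes read from another process. The idea is to bound the scan by the buffer length rather than trusting that a terminator exists.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"unicode/utf16"
)

// utf16BytesToString decodes a UTF-16LE byte buffer into a Go string.
// It stops at the first NUL code unit if one is present; otherwise it
// decodes the whole buffer. The loop is bounded by len(buf), so it never
// reads past the end looking for a terminator. (Hypothetical helper.)
func utf16BytesToString(buf []byte) string {
	n := len(buf) / 2 // ignore a trailing odd byte, if any
	units := make([]uint16, 0, n)
	for i := 0; i < n; i++ {
		u := binary.LittleEndian.Uint16(buf[2*i:])
		if u == 0 {
			break // terminator found
		}
		units = append(units, u)
	}
	return string(utf16.Decode(units))
}

// encodeUTF16LE builds a test buffer, optionally appending a NUL
// terminator. (Hypothetical helper, used only for the demonstration.)
func encodeUTF16LE(s string, terminate bool) []byte {
	units := utf16.Encode([]rune(s))
	if terminate {
		units = append(units, 0)
	}
	buf := make([]byte, 2*len(units))
	for i, u := range units {
		binary.LittleEndian.PutUint16(buf[2*i:], u)
	}
	return buf
}

func main() {
	cmdline := `C:\app.exe --flag`
	// Both calls yield the same string, with or without a terminator.
	fmt.Printf("%q\n", utf16BytesToString(encodeUTF16LE(cmdline, true)))
	fmt.Printf("%q\n", utf16BytesToString(encodeUTF16LE(cmdline, false)))
}
```

The design choice here is to treat the terminator as optional and the buffer length as authoritative, which avoids relying on whatever memory happens to follow the string.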
…12833) Fixes elastic#12826 (cherry picked from commit 2b6763d)
…12837) Fixes #12826 (cherry picked from commit 2b6763d)
…ocess information (#12835) Fixes #12826 (cherry picked from commit 2b6763d)
@adriansr thanks for the patch for 7.2. Is there any way we can download the Windows binaries with this patch merged? Or is building the binaries the only way?
@cpiment until 7.2.1 is out (next Tuesday?), the only way will be to provide them with patched binaries. Are they experiencing crashes in Beats other than Metricbeat?
We are using Heartbeat, Winlogbeat and Metricbeat (all 7.2.0), but Metricbeat is the only one that crashes.
Hi @adriansr, from what I have read, I think this fix is landing today, right? How long before we can pull the image? We are encountering this issue on almost all our K8s nodes and we need this fix to ensure Filebeat can cope with malformed Docker logs. We are also looking into how we got malformed Docker logs, and it looks like it's because we had a forced restart of the Docker daemon (unrelated).
@detro you'll have to wait until 7.2.1 is released or build your own 7.2.1-snapshot (it's not complicated at all). Are you having this same issue? It's a Windows-only crash and has only affected Metricbeat so far. Please share if you're experiencing otherwise.
Sorry for the mistake: I meant to comment on #12268. While digging into the git history to figure out when the issue was fixed, I took a wrong turn and commented in the wrong place. My question is really about skipping malformed Docker log lines. Apologies for the confusion. Regarding building a snapshot, is there a pointer to a Dockerfile I could use for this?
…12833) (elastic#12837) Fixes elastic#12826 (cherry picked from commit 3374b3a)
…hing process information (elastic#12835) Fixes elastic#12826 (cherry picked from commit 3374b3a)
Users report a periodic crash in Metricbeat under Windows when the system/process metricset has been running for some hours. After enabling logging.files.redirect_stderr, a crash similar to this appears in the logs: