Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Telegraf 1.21.0 - segmentation violation when using logparser #10286

Closed
hluaces opened this issue Dec 16, 2021 · 11 comments · Fixed by #10288 or #10296
Closed

Telegraf 1.21.0 - segmentation violation when using logparser #10286

hluaces opened this issue Dec 16, 2021 · 11 comments · Fixed by #10288 or #10296
Labels
area/tail bug unexpected problem or unintended behavior regression something that used to work, but is now broken

Comments

@hluaces
Copy link

hluaces commented Dec 16, 2021

Relevent telegraf.conf

[global_tags]

[agent]
  interval = "15s"
  round_interval = true
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  collection_jitter = "0s"
  flush_interval = "10s"
  flush_jitter = "0s"
  precision = ""
  debug = false
  quiet = false
  logfile = ""
  hostname = ""
  omit_hostname = false

System info

CentOS 7, Telegraf 1.21.0 (git: HEAD efa6522)

Docker

The Docker file is here:

https://github.com/hluaces/telegraf-bug-1.21.0

Please see the README.md for instructions on reproducing the error.

Steps to reproduce

See the linked Dockerfile for instructions on how to reproduce.

Apparently the tail input seems to be failing when parsing logs. This did not happen in the 1.20 branches.

Expected behavior

Logs should be parsed without unexpected failures.

Actual behavior

Telegraf is failing with an error.

Additional info

This is the error output:

2021-12-16T10:11:13Z I! Starting Telegraf 1.21.0
2021-12-16T10:11:13Z I! Loaded inputs: tail
2021-12-16T10:11:13Z I! Loaded aggregators:
2021-12-16T10:11:13Z I! Loaded processors:
2021-12-16T10:11:13Z I! W! Outputs are not used in testing mode!
2021-12-16T10:11:13Z I! Tags enabled: host=30c097b3ff33
2021-12-16T10:11:13Z D! [agent] Initializing plugins
2021-12-16T10:11:13Z D! [agent] Starting service inputs
2021-12-16T10:11:13Z D! [inputs.tail] Tail added for "/usr/local/cpanel/logs/spamd_error_log"
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x7c8b5f]

goroutine 72 [running]:
github.com/influxdata/telegraf/plugins/parsers/grok.(*Parser).ParseLine(0xc0004821e0, {0xc0000bcee0, 0x68})
    /go/src/github.com/influxdata/telegraf/plugins/parsers/grok/parser.go:213 +0x2ff
github.com/influxdata/telegraf/plugins/parsers/grok.(*Parser).Parse(0x44ab40, {0xc0000bce70, 0x68, 0x70})
    /go/src/github.com/influxdata/telegraf/plugins/parsers/grok/parser.go:386 +0x15c
github.com/influxdata/telegraf/plugins/inputs/tail.parseLine({0x577f6b0, 0xc0004821e0}, {0xc0000bd1f0, 0x0})
    /go/src/github.com/influxdata/telegraf/plugins/inputs/tail/tail.go:303 +0x10f
github.com/influxdata/telegraf/plugins/inputs/tail.(*Tail).receiver(0xc0001af600, {0x577f6b0, 0xc0004821e0}, 0xc0007266e0)
    /go/src/github.com/influxdata/telegraf/plugins/inputs/tail/tail.go:371 +0x4f4
github.com/influxdata/telegraf/plugins/inputs/tail.(*Tail).tailNewFiles.func2()
    /go/src/github.com/influxdata/telegraf/plugins/inputs/tail/tail.go:275 +0x85
created by github.com/influxdata/telegraf/plugins/inputs/tail.(*Tail).tailNewFiles
    /go/src/github.com/influxdata/telegraf/plugins/inputs/tail/tail.go:273 +0x7af
@hluaces hluaces added the bug unexpected problem or unintended behavior label Dec 16, 2021
@ohopp
Copy link

ohopp commented Dec 16, 2021

The same here unter Debian Bullseye:

2021-12-16T10:49:37.481624+01:00 test.example.org telegraf[883492]: panic: runtime error: invalid memory address or nil pointer dereference
2021-12-16T10:49:37.481708+01:00 test.example.org telegraf[883492]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x7c8b5f]

Reverting back to 1.20 works fine.
No 'tail' is involved here, but there is a hint for it in the syslog:
2021-12-16T10:49:37.334625+01:00 test.example.com telegraf[883492]: 2021-12-16T09:49:37Z W! [inputs.logparser] The logparser plugin is deprecated; please use the 'tail' input with the 'grok' data_format

@srebhan
Copy link
Member

srebhan commented Dec 16, 2021

Hey @hluaces and @ohopp , can you test the artifacts built in #10288?

@hluaces
Copy link
Author

hluaces commented Dec 16, 2021

Hey @hluaces and @ohopp , can you test the artifacts built in #10288?

Seems to be working for me on the Dockerfile I've provided, LGTM

@srebhan
Copy link
Member

srebhan commented Dec 16, 2021

@hluaces can you please also add this as a short note to the PR?

@srebhan srebhan added the regression something that used to work, but is now broken label Dec 16, 2021
@ohopp
Copy link

ohopp commented Dec 17, 2021

Hi, even with Version 1.21.1 from last night it fails under Debian Bullseye:

2021-12-17T09:42:12.553882+01:00 localhost.example.org telegraf[711430]: 2021-12-17T08:42:12Z I! Starting Telegraf 1.21.1
2021-12-17T09:42:12.557013+01:00 localhost.example.org telegraf[711430]: 2021-12-17T08:42:12Z W! DeprecationWarning: Plugin "inputs.logparser" deprecated since version 1.15.0 and will be removed in 2.0.0: use 'inputs.tail' with 'grok' data format instead
2021-12-17T09:42:12.557121+01:00 localhost.example.org telegraf[711430]: 2021-12-17T08:42:12Z I! Loaded inputs: cpu disk diskio exec kernel logparser mem net netstat openldap processes swap system unbound
2021-12-17T09:42:12.557151+01:00 localhost.example.org telegraf[711430]: 2021-12-17T08:42:12Z I! Loaded aggregators:
2021-12-17T09:42:12.557177+01:00 localhost.example.org telegraf[711430]: 2021-12-17T08:42:12Z I! Loaded processors:
2021-12-17T09:42:12.557214+01:00 localhost.example.org telegraf[711430]: 2021-12-17T08:42:12Z I! Loaded outputs: influxdb
2021-12-17T09:42:12.557242+01:00 localhost.example.org telegraf[711430]: 2021-12-17T08:42:12Z I! Tags enabled: host=scc-mailout-kit-03 task=mailout
2021-12-17T09:42:12.557267+01:00 localhost.example.org telegraf[711430]: 2021-12-17T08:42:12Z W! Deprecated inputs: 1 and 0 options
2021-12-17T09:42:12.557293+01:00 localhost.example.org telegraf[711430]: 2021-12-17T08:42:12Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"scc-mailout-kit-03", Flush Interval:10s
2021-12-17T09:42:12.557331+01:00 localhost.example.org telegraf[711430]: 2021-12-17T08:42:12Z W! [inputs.logparser] The logparser plugin is deprecated; please use the 'tail' input with the 'grok' data_format
2021-12-17T09:42:13.073932+01:00 localhost.example.org telegraf[711430]: panic: runtime error: invalid memory address or nil pointer dereference
2021-12-17T09:42:13.074013+01:00 localhost.example.org telegraf[711430]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x7c8b5f]
2021-12-17T09:42:13.074049+01:00 localhost.example.org telegraf[711430]: goroutine 66 [running]:
2021-12-17T09:42:13.074075+01:00 localhost.example.org telegraf[711430]: github.com/influxdata/telegraf/plugins/parsers/grok.(*Parser).ParseLine(0xc00030ce10, {0xc00107a000, 0x2e})
2021-12-17T09:42:13.074113+01:00 localhost.example.org telegraf[711430]: #011/go/src/github.com/influxdata/telegraf/plugins/parsers/grok/parser.go:213 +0x2ff
2021-12-17T09:42:13.074137+01:00 localhost.example.org telegraf[711430]: github.com/influxdata/telegraf/plugins/inputs/logparser.(*LogParserPlugin).parser(0xc000144900)
2021-12-17T09:42:13.074188+01:00 localhost.example.org telegraf[711430]: #011/go/src/github.com/influxdata/telegraf/plugins/inputs/logparser/logparser.go:314 +0x16b
2021-12-17T09:42:13.074214+01:00 localhost.example.org telegraf[711430]: created by github.com/influxdata/telegraf/plugins/inputs/logparser.(*LogParserPlugin).Start
2021-12-17T09:42:13.074240+01:00 localhost.example.org telegraf[711430]: #011/go/src/github.com/influxdata/telegraf/plugins/inputs/logparser/logparser.go:189 +0x38f
2021-12-17T09:42:13.087604+01:00 localhost.example.org systemd[1]: telegraf.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
2021-12-17T09:42:13.087704+01:00 localhost.example.org systemd[1]: telegraf.service: Failed with result 'exit-code'.
2021-12-17T09:42:13.245367+01:00 localhost.example.org systemd[1]: telegraf.service: Scheduled restart job, restart counter is at 8.
2021-12-17T09:42:13.245504+01:00 localhost.example.org systemd[1]: Stopped The plugin-driven server agent for reporting metrics into InfluxDB.
2021-12-17T09:42:13.245568+01:00 localhost.example.org systemd[1]: telegraf.service: Start request repeated too quickly.
2021-12-17T09:42:13.245618+01:00 localhost.example.org systemd[1]: telegraf.service: Failed with result 'exit-code'.
2021-12-17T09:42:13.245664+01:00 localhost.example.org systemd[1]: Failed to start The plugin-driven server agent for reporting metrics into InfluxDB.

@srebhan
Copy link
Member

srebhan commented Dec 17, 2021

@ohopp sorry for missing this one. Please check the artifacts on PR #10296 built by CI (once it finished). As a side note: logparser is deprecated and will be removed in the future (see the warning). To be future proof (and also as a workaround) please use inputs.tail with the grok dataformat.
Anyway, please let me know if #10296 fixes the issue for you!

@jhass
Copy link

jhass commented Dec 17, 2021

Seeing something similar in the tail plugin I think

telegraf[19432]: panic: runtime error: invalid memory address or nil pointer dereference
telegraf[19432]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x55e7c626c67f]
telegraf[19432]: goroutine 69 [running]:
telegraf[19432]: github.com/influxdata/telegraf/plugins/parsers/grok.(*Parser).ParseLine(0xc000116000, {0xc000dcf360, 0xa0})
telegraf[19432]:         github.com/influxdata/telegraf/plugins/parsers/grok/parser.go:213 +0x2ff
telegraf[19432]: github.com/influxdata/telegraf/plugins/parsers/grok.(*Parser).Parse(0xc0008445c0, {0xc000dcf2c0, 0xa0, 0xa0})
telegraf[19432]:         github.com/influxdata/telegraf/plugins/parsers/grok/parser.go:386 +0x15c
telegraf[19432]: github.com/influxdata/telegraf/plugins/inputs/tail.parseLine({0x55e7cd5f5408, 0xc000116000}, {0xc0006306e0, 0x0})
telegraf[19432]:         github.com/influxdata/telegraf/plugins/inputs/tail/tail.go:303 +0x10f
telegraf[19432]: github.com/influxdata/telegraf/plugins/inputs/tail.(*Tail).receiver(0xc000d7a300, {0x55e7cd5f5408, 0xc000116000}, 0xc000ce84d0)
telegraf[19432]:         github.com/influxdata/telegraf/plugins/inputs/tail/tail.go:371 +0x4f4
telegraf[19432]: github.com/influxdata/telegraf/plugins/inputs/tail.(*Tail).tailNewFiles.func2()
telegraf[19432]:         github.com/influxdata/telegraf/plugins/inputs/tail/tail.go:275 +0x85
telegraf[19432]: created by github.com/influxdata/telegraf/plugins/inputs/tail.(*Tail).tailNewFiles
telegraf[19432]:         github.com/influxdata/telegraf/plugins/inputs/tail/tail.go:273 +0x7af

@powersj
Copy link
Contributor

powersj commented Dec 18, 2021

@jhass can you confirm you are running v1.21.1. We quickly rolled a patch release yesterday and it included a fix for tail, file, and directory_monitor

@jhass
Copy link

jhass commented Dec 18, 2021

Ah indeed, the package I'm using didn't update yet. Sorry for the noise!

@srebhan
Copy link
Member

srebhan commented Dec 20, 2021

@hluaces can you please confirm that the mentioned PR fixes your issue!?

@hluaces
Copy link
Author

hluaces commented Dec 20, 2021

@hluaces can you please confirm that the mentioned PR fixes your issue!?

#10296 isn't crashing for me so it seems to behave correctly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/tail bug unexpected problem or unintended behavior regression something that used to work, but is now broken
Projects
None yet
5 participants