Filebeat (5.0.0-Alpha4 and 5.0.0-Alpha5-snapshot) and filter bug #2178

Closed
remotesyssupport opened this issue Aug 5, 2016 · 3 comments

@remotesyssupport

Initially discussed on: https://discuss.elastic.co/t/filebeat-5-0-0alpha4-and-filter/57287

Our present setup is:

Filebeat (5.0.0-Alpha4) -> Kafka -> Logstash -> Elasticsearch

We are looking into filtering (dropping) unneeded log events at the source, in Filebeat.

A typical log line (from Celery) that we would like to drop looks like this:

{"relativeCreated": 8381439.963102341, "process": 6651, "@timestamp": "2016-08-04T20:41:52.197Z", "args": {"exc": "Retry in 60s", "id": "57d51895-aab5-4662-b458-1b068305836f", "name": "XXXXXXXXXX"}, "module": "job", "funcName": "on_retry", "message": "Task XXXXXXXXXX[57d51895-aab5-4662-b458-1b068305836f] retry: Retry in 60s", "name": "celery.worker.job", "thread": 139742007183168, "created": 1470343312.197371, "threadName": "MainThread", "msecs": 197.3710060119629, "filename": "job.py", "levelno": 20, "processName": "MainProcess", "source_host": "worker-XXXXXXXXXX", "pathname": "XXXXXXXXXX/venv/local/lib/python2.7/site-packages/celery/worker/job.py", "lineno": 415, "@version": 1, "levelname": "INFO"}

The filter in filebeat.yml (in reduced form) is:

### Filters
filters:
  - drop_event:
      contains:
        message: "Retry"
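
The intent of this filter can be sketched in Go. This is only an illustration of the expected behavior, not Filebeat code: the `event` type and the `shouldDrop` function are hypothetical names, the sample lines are shortened and partly made up, and the sketch assumes that `contains` is a case-sensitive substring match on the named field.

```go
package main

import (
	"fmt"
	"strings"
)

// event is a simplified, hypothetical stand-in for a Filebeat event.
type event map[string]string

// shouldDrop reports whether the named field exists and contains substr.
func shouldDrop(e event, field, substr string) bool {
	value, ok := e[field]
	return ok && strings.Contains(value, substr)
}

func main() {
	// Shortened, made-up sample lines; the first resembles the retry log above.
	events := []event{
		{"message": `{"funcName": "on_retry", "message": "Task XXXXXXXXXX retry: Retry in 60s"}`},
		{"message": `{"funcName": "on_success", "message": "Task XXXXXXXXXX succeeded"}`},
	}
	for _, e := range events {
		if shouldDrop(e, "message", "Retry") {
			continue // dropped at the source, never published
		}
		fmt.Println("publish:", e["message"])
	}
}
```

With this behavior only the second sample line would be published, which is the outcome the configuration above is meant to produce.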

The filebeat log in debug shows:

2016-08-04T20:42:08Z DBG  filters: drop_event, condition=contains: map[message:Retry]

2016-08-04T20:42:13Z WARN unexpected type *string in contains condition as it accepts only strings.

2016-08-04T20:42:13Z DBG  Publish: {
  "@timestamp": "2016-08-04T20:42:08.510Z",
  "beat": {
    "hostname": "worker-XXXXXXXXXX",
    "name": "worker-XXXXXXXXXX"
  },
  "input_type": "log",
  "message": "{\"relativeCreated\": 8381439.963102341, \"process\": 6651, \"@timestamp\": \"2016-08-04T20:41:52.197Z\", \"args\": {\"exc\": \"Retry in 60s\", \"id\": \"57d51895-aab5-4662-b458-1b068305836f\", \"name\": \"XXXXXXXXXX\"}, \"module\": \"job\", \"funcName\": \"on_retry\", \"message\": \"Task XXXXXXXXXX[57d51895-aab5-4662-b458-1b068305836f] retry: Retry in 60s\", \"name\": \"celery.worker.job\", \"thread\": 139742007183168, \"created\": 1470343312.197371, \"threadName\": \"MainThread\", \"msecs\": 197.3710060119629, \"filename\": \"job.py\", \"levelno\": 20, \"processName\": \"MainProcess\", \"source_host\": \"worker-XXXXXXXXXX\", \"pathname\": \"XXXXXXXXXX/venv/local/lib/python2.7/site-packages/celery/worker/job.py\", \"lineno\": 415, \"@version\": 1, \"levelname\": \"INFO\"}",
  "offset": 24647014,
  "role": "worker",
  "source": "XXXXXXXXXX/logs/celery_supervisor.log",
  "type": "workerlog"
}

I have tried various combinations of the "contains" condition and found that either

  • the event is published when it should have been dropped,
    OR
  • all events/log lines are dropped, even those that do not match the condition.

As suggested by @andrewkroh in the Elastic discussion, I repeated the test with 5.0.0-Alpha5 (snapshot build) and the suggested configuration changes:

processors:
  - drop_event:
      when:
        contains:
          message: "Retry"

But the net result was the same: log lines containing the string "Retry" were still published.

From the debug log:

2016-08-04T23:23:03Z INFO Home path: [/usr/share/filebeat] Config path: [/etc/filebeat] Data path: [/var/lib/filebeat] Logs path: [/var/log/filebeat]
2016-08-04T23:23:03Z INFO Setup Beat: filebeat; Version: 5.0.0-alpha5
2016-08-04T23:23:03Z DBG  New condition contains: map[message:Retry]
2016-08-04T23:23:03Z DBG  Processors: drop_event, condition=contains: map[message:Retry]

2016-08-04T23:23:03Z WARN unexpected type *string in contains condition as it accepts only strings.
2016-08-04T23:23:03Z DBG  Publish: {
  "@timestamp": "2016-08-04T23:23:03.297Z",
  "beat": {
    "hostname": "worker-XXXXXXXXXXX",
    "name": "worker-XXXXXXXXXXX"
  },
  "input_type": "log",
  "message": "{\"relativeCreated\": 17917357.42020607, \"process\": 6651, \"@timestamp\": \"2016-08-04T23:20:48.114Z\", \"args\": {\"exc\": \"Retry in 60s\", \"id\": \"48bd61be-1b94-415f-8f4f-94ed1c0a463b\", \"name\": \"XXXXXXXXXXX\"}, \"module\": \"job\", \"funcName\": \"on_retry\", \"message\": \"Task XXXXXXXXXXX[48bd61be-1b94-415f-8f4f-94ed1c0a463b] retry: Retry in 60s\", \"name\": \"celery.worker.job\", \"thread\": 139742007183168, \"created\": 1470352848.114828, \"threadName\": \"MainThread\", \"msecs\": 114.82810974121094, \"filename\": \"job.py\", \"levelno\": 20, \"processName\": \"MainProcess\", \"source_host\": \"worker-XXXXXXXXXXX\", \"pathname\": \"XXXXXXXXXXX/venv/local/lib/python2.7/site-packages/celery/worker/job.py\", \"lineno\": 415, \"@version\": 1, \"levelname\": \"INFO\"}",
  "offset": 98172086,
  "role": "worker",
  "source": "XXXXXXXXXXX/logs/celery_supervisor.log",
  "type": "workerlog"
}
@spacewander
Contributor

spacewander commented Aug 5, 2016

unexpected type *string in contains condition as it accepts only strings.

Could this be a configuration-related bug?
https://github.com/elastic/beats/blob/master/libbeat/processors/condition.go#L138

@andrewkroh
Member

This bug affects the regex, contains, and equals conditions when used with Filebeat's message field.

andrewkroh added a commit to andrewkroh/beats that referenced this issue Aug 9, 2016
When using the regex, contains, or equals conditions with the `message` field in Filebeat, a warning would occur and no processor would be applied. The warning message was:

    WARN unexpected type *string in contains condition as it accepts only strings.

This occurred because Filebeat was passing the message field as a *string (string pointer). The processor code only expected to receive string values.
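
The mismatch described above can be reproduced in a few lines of Go. This is a minimal sketch, not the libbeat code: `containsStrict` is a hypothetical function that, like the check described here, only handles plain `string` values, so a `*string` falls through to the default branch.

```go
package main

import (
	"fmt"
	"strings"
)

// containsStrict is a hypothetical stand-in for a condition check that
// only understands plain string values.
func containsStrict(value interface{}, substr string) (bool, error) {
	switch v := value.(type) {
	case string:
		return strings.Contains(v, substr), nil
	default:
		// A *string lands here, even though it points at a valid string.
		return false, fmt.Errorf("unexpected type %T in contains condition", value)
	}
}

func main() {
	msg := "Task retry: Retry in 60s"

	// Passing the string value itself: the condition matches.
	ok, err := containsStrict(msg, "Retry")
	fmt.Println(ok, err)

	// Passing a pointer to the same string: no match, and an error is reported.
	ok, err = containsStrict(&msg, "Retry")
	fmt.Println(ok, err)
}
```

The second call never inspects the text at all, which matches the observed behavior: the condition is skipped and the event is published.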

This PR contains three changes:

- Enhance the processor code to accept *string and string.
- Make filebeat pass the message field as a string rather than *string.
- Modify a test case to work against the message field rather than the source field.

Fixes elastic#2178
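
The first change listed in the commit message can be sketched as follows. This illustrates the idea under assumptions and is not the code from the PR: `extractString` and `containsLenient` are hypothetical helpers, and treating a nil pointer as "no match" is a choice made for this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// extractString is a hypothetical helper that unwraps either a string or a
// non-nil *string. The second return value reports whether it succeeded.
func extractString(value interface{}) (string, bool) {
	switch v := value.(type) {
	case string:
		return v, true
	case *string:
		if v == nil {
			return "", false
		}
		return *v, true
	default:
		return "", false
	}
}

// containsLenient applies a substring check to string and *string values.
func containsLenient(value interface{}, substr string) bool {
	s, ok := extractString(value)
	return ok && strings.Contains(s, substr)
}

func main() {
	msg := "Task retry: Retry in 60s"
	fmt.Println(containsLenient(msg, "Retry"))  // plain string
	fmt.Println(containsLenient(&msg, "Retry")) // pointer to string
	fmt.Println(containsLenient(415, "Retry"))  // unsupported type
}
```

Accepting both forms in the condition code keeps the check working even if a caller still passes a pointer, which is presumably why the commit changes both the processor and Filebeat.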
@andrewkroh
Member

I opened PR #2209 to fix this in master. That PR will need to be merged to the 5.0 branch too.

ruflin pushed a commit that referenced this issue Aug 10, 2016
#2209)
