Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Filebeat] Check expand_event_list_from_field before checking content-type #16441

Merged
merged 5 commits into from
Feb 26, 2020
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.next.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -76,6 +76,7 @@ https://github.com/elastic/beats/compare/v7.0.0-alpha2...master[Check the HEAD d
- Prevent Elasticsearch from spewing log warnings about redundant wildcards when setting up ingest pipelines for the `elasticsearch` module. {issue}15840[15840] {pull}15900[15900]
- Fix mapping error for cloudtrail additionalEventData field {pull}16088[16088]
- Fix a connection error in httpjson input. {pull}16123[16123]
- Fix s3 input with cloudtrail fileset reading json file. {issue}16374[16374] {pull}16441[16441]
- Adding the var definitions in azure manifest files, fix for errors when executing command setup. {issue}16270[16270] {pull}16468[16468]
- Fix merging of fileset inputs to replace paths and append processors. {pull}16450{16450}

Expand Down
6 changes: 5 additions & 1 deletion x-pack/filebeat/docs/inputs/input-aws-s3.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -59,7 +59,11 @@ If the fileset using this input expects to receive multiple messages bundled
under a specific field then the config option expand_event_list_from_field value
can be assigned the name of the field. This setting will be able to split the
messages under the group value into separate events. For example, CloudTrail logs
are in JSON format and events are found under the JSON object "Records":
are in JSON format and events are found under the JSON object "Records".

Note: When `expand_event_list_from_field` parameter is given in the config, s3
input will assume the logs are in JSON format and decode them as JSON. Content
type will not be checked.

[float]
==== `api_timeout`
Expand Down
23 changes: 12 additions & 11 deletions x-pack/filebeat/input/s3/input.go
Original file line number Diff line number Diff line change
Expand Up @@ -427,17 +427,6 @@ func (p *s3Input) createEventsFromS3Info(svc s3iface.ClientAPI, info s3Info, s3C
defer resp.Body.Close()

reader := bufio.NewReader(resp.Body)
// Check content-type
if (resp.ContentType != nil && *resp.ContentType == "application/x-gzip") || strings.HasSuffix(info.key, ".gz") {
gzipReader, err := gzip.NewReader(resp.Body)
if err != nil {
err = errors.Wrap(err, "gzip.NewReader failed")
p.logger.Error(err)
return err
}
reader = bufio.NewReader(gzipReader)
gzipReader.Close()
}

// Decode JSON documents when expand_event_list_from_field is given in config
if p.config.ExpandEventListFromField != "" {
Expand All @@ -451,6 +440,18 @@ func (p *s3Input) createEventsFromS3Info(svc s3iface.ClientAPI, info s3Info, s3C
return nil
}

// Check content-type
if (resp.ContentType != nil && *resp.ContentType == "application/x-gzip") || strings.HasSuffix(info.key, ".gz") {
gzipReader, err := gzip.NewReader(resp.Body)
if err != nil {
err = errors.Wrap(err, "gzip.NewReader failed")
p.logger.Error(err)
return err
}
reader = bufio.NewReader(gzipReader)
gzipReader.Close()
}

// handle s3 objects that are not json content-type
offset := 0
for {
Expand Down