-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Filebeat] Sort array fields in generated data #25320
[Filebeat] Sort array fields in generated data #25320
Conversation
27d2bcb
to
1c5a3ca
Compare
💚 Build Succeeded
Expand to view the summary
Build stats
Test stats 🧪
Trends 🧪💚 Flaky test reportTests succeeded. Expand to view the summary
Test stats 🧪
|
0a83fd1
to
b8b7268
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this will make things more consistent.
@@ -194,6 +194,9 @@ def _test_expected_events(self, test_file, objects): | |||
for k, obj in enumerate(objects): | |||
objects[k] = self.flatten_object(obj, {}, "") | |||
clean_keys(objects[k]) | |||
for key in objects[k].keys(): | |||
if isinstance(objects[k][key], list) and len(objects[k][key]) > 0 and not isinstance(objects[k][key][0], dict): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little worried about the last check to see if the first element is a dict, if any of the items in the list can't be compared to one another with "<" we will get a runtime error. What about switching to:
if isinstance(objects[k][key], list):
objects[k][key].sort(key=str)
That will force string comparison for everything.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I never thought about forcing everything to a string. That's a really interesting idea. I'll try that tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Things work fine when I do modules individually
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Weird. I'm not seeing those errors when I run your PR locally.
anything in the build/system-tests/run/<failed test name>
directory? It looks like some documents got left in elasticsearch expected event.module=kafka but got redis
.
Might need to clean out docker images and re-run the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I'll schwack them and start clean
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still got the same error. This is what shows in the output.log
file under build/system-tests/run/<failed test name>
2021-05-04T21:36:37.226-0200 INFO instance/beat.go:651 Home path: [/go/src/github.com/elastic/beats/filebeat] Config path: [/go/src/github.com/elastic/beats/filebeat] Data path: [/go/src/github.com/elastic/beats/filebeat/data] Logs path: [/go/src/github.com/elastic/beats/filebeat/logs]
2021-05-04T21:36:37.237-0200 DEBUG [beat] instance/beat.go:704 Beat metadata path: /go/src/github.com/elastic/beats/filebeat/data/meta.json
2021-05-04T21:36:37.237-0200 INFO instance/beat.go:659 Beat ID: 32ac6545-bf91-437e-8aa3-a64e3359883a
2021-05-04T21:36:37.238-0200 INFO instance/beat.go:407 filebeat stopped.
2021-05-04T21:36:37.238-0200 ERROR instance/beat.go:963 Exiting: data path already locked by another beat. Please make sure that multiple beats are not sharing the same data path (path.data).
Exiting: data path already locked by another beat. Please make sure that multiple beats are not sharing the same data path (path.data).
/go/src/github.com/elastic/beats/filebeat/filebeat.test -systemTest -e -d * -once -c /go/src/github.com/elastic/beats/filebeat/build/system-tests/run/test_modules.Test.test_fileset_file_076_elasticsearch/filebeat.yml -E setup.ilm.enabled=false -modules=elasticsearch -M elasticsearch.*.enabled=false -M elasticsearch.gc.enabled=true -M elasticsearch.gc.var.input=file -M elasticsearch.gc.var.paths=[/go/src/github.com/elastic/beats/filebeat/module/elasticsearch/gc/test/test.log] -M *.*.input.close_eof=true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That looks like filebeat didn't exit before the next test of a module started.
I'm going to approve and see if we get the same errors with CI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, I still would like to have the generated data files updated so follow on PRs don't have to worry about the unrelated changes to the data files. So if you're able to run all the modules together and update the files, can you push those changes to the branch??
Pinging @elastic/security-external-integrations (Team:Security-External Integrations) |
This pull request is now in conflicts. Could you fix it? 🙏
|
23b9383
to
c41305c
Compare
This pull request is now in conflicts. Could you fix it? 🙏
|
c41305c
to
0334c60
Compare
0334c60
to
d5c9e65
Compare
d5c9e65
to
ffd2a18
Compare
@leehinman I was able to run all the filebeat modules successfully. Ready for CI tests. |
I ran it multiple times and no change in order of array fields. |
- sort arrays by using string representation of values - update generated data (cherry picked from commit 52f2265) # Conflicts: # x-pack/filebeat/module/azure/signinlogs/test/signinlogs.log-expected.json # x-pack/filebeat/module/cyberark/corepas/test/generated.log-expected.json # x-pack/filebeat/module/f5/bigipafm/test/generated.log-expected.json # x-pack/filebeat/module/fortinet/fortimanager/test/generated.log-expected.json # x-pack/filebeat/module/imperva/securesphere/test/generated.log-expected.json # x-pack/filebeat/module/iptables/log/test/geo.log-expected.json # x-pack/filebeat/module/iptables/log/test/icmp.log-expected.json # x-pack/filebeat/module/iptables/log/test/iptables.log-expected.json # x-pack/filebeat/module/iptables/log/test/ipv6.log-expected.json # x-pack/filebeat/module/iptables/log/test/ubiquiti.log-expected.json # x-pack/filebeat/module/panw/panos/test/global_protect.log-expected.json # x-pack/filebeat/module/panw/panos/test/pan_inc_other.log-expected.json # x-pack/filebeat/module/panw/panos/test/pan_inc_threat.log-expected.json # x-pack/filebeat/module/panw/panos/test/pan_inc_traffic.log-expected.json # x-pack/filebeat/module/panw/panos/test/userid.log-expected.json # x-pack/filebeat/module/snort/log/test/generated.log-expected.json # x-pack/filebeat/module/sonicwall/firewall/test/generated.log-expected.json # x-pack/filebeat/module/sophos/utm/test/generated.log-expected.json # x-pack/filebeat/module/tomcat/log/test/generated.log-expected.json
- sort arrays by using string representation of values - update generated data (cherry picked from commit 52f2265)
What does this PR do?
Currently the array fields in generated data don't have any sorting or consistent ordering. This PR sorts any field that is a list/array when the generated data files are created.
Why is it important?
Whenever data files are generated, there are a large number of changes in the commit due to array fields changing order of the items even when the contents didn't actually change. This makes it very difficult to actually find what changed, if anything.
Checklist
CHANGELOG.next.asciidoc
orCHANGELOG-developer.next.asciidoc
.Author's Checklist
How to test this PR locally
The generated data above should always come out the same and the list orders shouldn't change
Related issues
Use cases
Screenshots
Logs