Ensuring pytest catches pipeline errors #20997

Closed
P1llus opened this issue Sep 6, 2020 · 1 comment · Fixed by #20999
Labels: enhancement, Filebeat, flaky-test

P1llus commented Sep 6, 2020

It would be nice to know what you think about this one, @jsoriano.

With both pytest and nosetests, there is one small change I have been missing that I hope is easy to implement: having the test runner catch an HTTP 400 response returned by Elasticsearch when running test_modules or test_xpack_modules.

For each module that is tested, the pipeline and index pattern always have to be installed before the test files are ingested and compared with the golden files. However, even when pipeline installation fails, the test keeps retrying until it times out. This creates unnecessary wait times, especially in CI, where the timeout is sometimes increased.
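A minimal sketch of the fail-fast behaviour described above, not the actual test framework code. The names `wait_until_or_abort`, `PipelineLoadError`, and `find_fatal_error` are hypothetical and only illustrate the idea: keep polling for the expected condition, but stop immediately once a non-retryable error (such as the 400 from Elasticsearch) has been seen.

```python
import time


class PipelineLoadError(Exception):
    """Hypothetical exception: a non-retryable error was seen while waiting."""


def wait_until_or_abort(cond, find_fatal_error, max_timeout=10.0, poll_interval=0.1):
    """Poll cond() until it returns True.

    find_fatal_error() is an assumed callback that returns an error message
    (for example a log line containing "400 Bad Request") or None.
    If it returns a message, give up at once instead of waiting for the timeout.
    """
    deadline = time.monotonic() + max_timeout
    while True:
        fatal = find_fatal_error()
        if fatal:
            raise PipelineLoadError(fatal)
        if cond():
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "Timeout waiting for condition. Waited {} sec.".format(max_timeout)
            )
        time.sleep(poll_interval)
```

With a check like this, a broken pipeline would fail the affected module as soon as the error is detected, while the other modules would still be tested normally.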

Would it be possible to implement this check? If you don't have time, could you point me to the part that handles pipeline installation so I can create a PR for it?

Example output we want to catch early but don't:

cat build/system-tests/run/test_xpack_modules.XPackTest.test_fileset_file_2_azure/output.log | grep ERROR
2020-09-06T06:53:38.648-0200	ERROR	[publisher_pipeline_output]	pipeline/output.go:154	Failed to connect to backoff(elasticsearch(http://elasticsearch:9200)): Connection marked as failed because the onConnect callback failed: 1 error: Error loading pipeline for fileset azure/activitylogs: couldn't load pipeline: couldn't load json. Error: 400 Bad Request: {"error":{"root_cause":[{"type":"parse_exception","reason":"[type] required property is missing","property_name":"type","processor_type":"convert"}],"type":"parse_exception","reason":"[type] required property is missing","property_name":"type","processor_type":"convert"},"status":400}. Response body: {"error":{"root_cause":[{"type":"parse_exception","reason":"[type] required property is missing","property_name":"type","processor_type":"convert"}],"type":"parse_exception","reason":"[type] required property is missing","property_name":"type","processor_type":"convert"},"status":400}
2020-09-06T06:53:42.509-0200	ERROR	[publisher_pipeline_output]	pipeline/output.go:154	Failed to connect to backoff(elasticsearch(http://elasticsearch:9200)): Connection marked as failed because the onConnect callback failed: 1 error: Error loading pipeline for fileset azure/activitylogs: couldn't load pipeline: couldn't load json. Error: 400 Bad Request: {"error":{"root_cause":[{"type":"parse_exception","reason":"[type] required property is missing","property_name":"type","processor_type":"convert"}],"type":"parse_exception","reason":"[type] required property is missing","property_name":"type","processor_type":"convert"},"status":400}. Response body: {"error":{"root_cause":[{"type":"parse_exception","reason":"[type] required property is missing","property_name":"type","processor_type":"convert"}],"type":"parse_exception","reason":"[type] required property is missing","property_name":"type","processor_type":"convert"},"status":400}

The error above does not by itself cause any test to fail, which is odd, since without the pipeline everything else is guaranteed to fail. It should not stop the other modules from being tested, but it should stop testing that one module and fail it with a descriptive error message. Currently the output is:

(2 durations < 0.005s hidden.  Use -vv to show these durations.)
==================================================================== short test summary info ====================================================================
FAILED tests/system/test_xpack_modules.py::XPackTest::test_fileset_file_2_azure - Failed: Timeout >90.0s
FAILED tests/system/test_xpack_modules.py::XPackTest::test_fileset_file_3_azure - beat.beat.TimeoutError: Timeout waiting for 'cond' to be true. Waited 10 sec..
P1llus added the enhancement, Filebeat, flaky-test, and Team: Ingest labels Sep 6, 2020

jsoriano commented Sep 6, 2020

Hey @P1llus,

This would actually be a nice enhancement 🙂

These tests for modules are implemented here:

def run_on_file(self, module, fileset, test_file, cfgfile):

There is no separate process that installs the pipelines; Filebeat installs them when it needs them.

To improve this situation you would need to add something around the execution of Filebeat here:

subprocess.Popen(cmd,
                 env=local_env,
                 stdin=None,
                 stdout=output,
                 stderr=subprocess.STDOUT,
                 bufsize=0).wait()

I see two options:

  • After Filebeat is executed, and before doing anything with Elasticsearch, check whether there are any ERROR entries in the logs and, if there are, raise them.
  • Before executing Filebeat there, explicitly run filebeat setup with the same parameters. This will install the pipeline and fail if it is not correct.
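As a rough illustration of the first option, the sketch below scans a Filebeat log for ERROR entries about pipeline loading and raises on the first one. It is not the change that was merged. The function names, the `must_contain` filter, and the assumed log layout (timestamp, level, logger, source location, message, separated by whitespace) are assumptions taken from the sample log lines in this issue.

```python
import re

# Assumed layout, based on the sample output in this issue:
# <timestamp> ERROR [<logger>] <file:line> <message>
ERROR_LINE = re.compile(r"^\S+\s+ERROR\s")


def find_error_lines(log_text, must_contain="Error loading pipeline"):
    """Return the ERROR log lines that mention a pipeline loading failure."""
    return [
        line
        for line in log_text.splitlines()
        if ERROR_LINE.match(line) and must_contain in line
    ]


def assert_no_pipeline_errors(log_path):
    """Hypothetical helper: fail with a descriptive message if the log has pipeline errors."""
    with open(log_path, encoding="utf-8") as log_file:
        errors = find_error_lines(log_file.read())
    if errors:
        # Filebeat retries the connection, so the same error repeats; report it once.
        raise AssertionError(
            "Filebeat failed to load the ingest pipeline:\n" + errors[0]
        )
```

A helper like `assert_no_pipeline_errors` could be called right after the `subprocess.Popen(...).wait()` call shown above, with the path of the output log for that test run, so the module fails with the Elasticsearch error message instead of a timeout.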
