Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added benchmark output validation #645

Merged
merged 10 commits into from
Mar 17, 2022

Conversation

stvoutsin
Copy link
Collaborator

Modified test configurations to include md5 checksums for cell outputs. Does not include all cells. as some have output that changes per run (For example including current timestamp).
Upgrade version of aglais-testing that we use to 0.1.8 (which checks the output of the cell with the expected value)
Added notes on test deploy

Copy link
Collaborator

@Zarquan Zarquan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Can we add a test to demonstrate that the test fails when it should and showing what the fail output looks like.
Perhaps by making a copy of the quick config in /tmp and deliberately editing it to set the wrong checksum and then running the test to demonstrate a fail.

@stvoutsin
Copy link
Collaborator Author

I've added a bit more output to the testing suite, to include completion status (SUCCESS / SLOW / FAILED) as well as output validity of results (VALID/ INVALID). This might not be the clearest way to show the results, open to suggestions. However I've added some notes on testing this version with the valid checksums, as well as with modified checksums to check that we capture the missmatch.

@Zarquan Zarquan mentioned this pull request Mar 8, 2022
Comment on lines +254 to +256
'status': 'SLOW',
'msg': 'Expected/Actual output missmatch of cell #0! ',
'valid': 'FALSE'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what we need to avoid.

We need one value that indicates if the test passes or fails. If we are just checking the status, then SLOW suggests that the test passed but was slower than expected. Having to check a combination of both status and valid to determine if the test passed or not is a recipe for confusion.

Other suggestions aside (#649) the overall test result should be represented in a single value, so if the output validation fails, then status should be set to FAIL. Completing the test run but producing the wrong results should not be considered a success.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made some modifications to the benchmarker to match your suggestions at (#649) and some notes testing these changes

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@stvoutsin stvoutsin mentioned this pull request Mar 16, 2022
Copy link
Collaborator

@Zarquan Zarquan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good 👍

Comment on lines +254 to +256
'status': 'SLOW',
'msg': 'Expected/Actual output missmatch of cell #0! ',
'valid': 'FALSE'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@Zarquan Zarquan merged commit 973819e into wfau:master Mar 17, 2022
@stvoutsin stvoutsin deleted the feature/upgrade-testing branch July 4, 2022 17:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants