-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added benchmark output validation #645
Conversation
…n/aglais into feature/upgrade-testing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Can we add a test to demonstrate that the test fails when it should and showing what the fail output looks like.
Perhaps by making a copy of the quick config in /tmp and deliberately editing it to set the wrong checksum and then running the test to demonstrate a fail.
I've added a bit more output to the testing suite, to include completion status (SUCCESS / SLOW / FAILED) as well as output validity of results (VALID/ INVALID). This might not be the clearest way to show the results, open to suggestions. However I've added some notes on testing this version with the valid checksums, as well as with modified checksums to check that we capture the missmatch. |
'status': 'SLOW', | ||
'msg': 'Expected/Actual output missmatch of cell #0! ', | ||
'valid': 'FALSE' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what we need to avoid.
We need one value that indicates if the test passes or fails. If we are just checking the status
, then SLOW
suggests that the test passed but was slower than expected. Having to check a combination of both status
and valid
to determine if the test passed or not is a recipe for confusion.
Other suggestions aside (#649) the overall test result should be represented in a single value, so if the output validation fails, then status
should be set to FAIL
. Completing the test run but producing the wrong results should not be considered a success.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've made some modifications to the benchmarker to match your suggestions at (#649) and some notes testing these changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good 👍
'status': 'SLOW', | ||
'msg': 'Expected/Actual output missmatch of cell #0! ', | ||
'valid': 'FALSE' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Modified test configurations to include md5 checksums for cell outputs. Does not include all cells. as some have output that changes per run (For example including current timestamp).
Upgrade version of aglais-testing that we use to 0.1.8 (which checks the output of the cell with the expected value)
Added notes on test deploy