-
Notifications
You must be signed in to change notification settings - Fork 266
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Let measure-complexity
output the worst performing verification tasks by resource count
#5631
Let measure-complexity
output the worst performing verification tasks by resource count
#5631
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a very nice feature! There are a few places where I think slightly different terminology might be better, but I otherwise think it looks good.
CliCompilation cliCompilation, | ||
IObservable<CanVerifyResult> verificationResults) { | ||
|
||
PriorityQueue<VerificationTaskResult, int> worstPerformers = new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perfect place to use a priority queue!
} | ||
|
||
foreach (var performer in decreasingWorst) { | ||
await output.WriteLineAsync($"Verification task on line {performer.Task.Token.line} in file {performer.Task.Token.filename} consumed {performer.Result.ResourceCount} resources"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a really long message to parse. I'd prefer something like `file.dfy(9): <some description of context, maybe function/method/lemma name> used 4389 RU". If you leave it as-is, the term "resources" is kind of vague. I'd maybe use "RU" to match what the IDE does?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having the line number in there is a must. We will move towards --isolate-assertions
being the default, so having a function/method/lemma name won't say much.
Updated it to:
Starting verification of iteration 1/1 with seed 0
measure-complexity.dfy(5,18): Error: assertion might not hold
The most demanding 100 verification tasks consumed these resources:
measure-complexity.dfy(8,18): 9984
measure-complexity.dfy(7,18): 9065
measure-complexity.dfy(8,15): 8745
...
I'm not a fan of abbreviations in user interfaces. resources
is understandable for someone who has not read the manual, while RU
is not. resource units
as opposed to resources
does not give me any extra information. Alternatively we could call it complexity. Then it would say "The most complex 100 verification tasks had this complexity:", which I think would be more intuitive than resources. I'd be in favor of that change but then we should make it everywhere.
} | ||
|
||
private static readonly Option<uint> TopX = new("--top-x", () => 10U, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like there must be a better name for this. But I haven't been able to come up with one yet!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed it to --worst-amount
@@ -0,0 +1,30 @@ | |||
Starting verification of iteration 1/1 with seed 0 | |||
TestFiles/LitTests/LitTest/cli/measure-complexity.dfy(5,18): Error: assertion might not hold | |||
Verification task on line 8 in file measure-complexity.dfy consumed 9984 resources |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these lines will be fragile (and might cause the nightly to fail, since Z3 tends to have different resource use on different platforms). Could you move it from an .expect
file to CHECK:
directives in the source file, with wildcards for the actual numbers?
199bb77
to
d1354e7
Compare
Description
measure-complexity
output the worst performing verification tasks by resource count--top-x
tomeasure-complexity
to configure how many of the worst tasks are shown--iterations
from 10 to 1How has this been tested?
By submitting this pull request, I confirm that my contribution is made under the terms of the MIT license.