JIT: possibly revisit how we set assertion prop table size #10591

Open
AndyAyersMS opened this issue Jun 27, 2018 · 4 comments
Labels: area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI), enhancement (Product code improvement that does NOT require public API changes/additions)

AndyAyersMS commented Jun 27, 2018

Some data on how often we drop assertions (from investigations into #10461, via pmi over fx):

| Table Size | Num Methods | Avg Assertion Count | Methods with Dropped Assertions |
| --- | --- | --- | --- |
| 64 | 159493 | 4.6 | 222 |
| 128 | 1801 | 42.3 | 28 |
| 256 | 237 | 81.2 | 4 |

Overall this doesn't seem too bad -- only about 0.16% of methods lost assertions from table size limits. I unfortunately didn't track how many of the size 64 cases were ones where we deliberately limited the size for throughput reasons.

Table size is determined by looking at compILCodeSize, which does not take additions from inlining into account (among other things). So it seems we could perhaps find a better size predictor.
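
For concreteness, here's a minimal sketch of that kind of IL-size binning; the function name and thresholds are illustrative, not the actual JIT code:

```cpp
// Hypothetical size-bin heuristic: pick the assertion table size from the
// pre-inlining IL size, so code added by inlining never changes the choice --
// exactly the weakness noted above. Thresholds are made up for illustration.
unsigned EstimateAssertionTableSize(unsigned ilCodeSize)
{
    if (ilCodeSize <= 512)
    {
        return 64; // the bin that holds the vast majority of methods in the data above
    }
    if (ilCodeSize <= 4096)
    {
        return 128;
    }
    return 256; // effective cap
}
```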

Or perhaps just allow the table to grow?

Here's a breakdown of how many methods would fit into our existing size bins:

| Assertion Count | Methods |
| --- | --- |
| 0 - 64 | 161835 |
| 65 - 128 | 585 |
| 129 - 256 | 104 |
| more than 256 | 21 |

Max number of assertions seen was 615 (in a method that used a size 64 table), in Microsoft.CodeAnalysis.VisualBasic.VisualBasicCommandLineParser:Parse(ref,ref,ref,ref):ref:this. So 551 of these were dropped.

There is obviously some consideration of the overall cost of the subsequent dataflow analysis too, which is roughly proportional to the max assertion count * the number of blocks. So if we allow the max assertion count to scale up with the number of blocks we potentially see some quadratic behavior (see the worked numbers below). But this seems solvable too -- we generate all (or more) assertions initially, and then limit the ones we use only if we see bad scaling.
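
To make the scaling concern concrete, here are some rough numbers under that cost model (the figures and function are illustrative only):

```cpp
#include <cstdint>

// Rough cost model from above: dataflow work ~ maxAssertions * blockCount.
//
// With a fixed cap, cost stays linear in block count:
//     64 assertions * 1000 blocks =    64,000 units
// If the cap is allowed to scale with block count, cost goes quadratic:
//   1000 assertions * 1000 blocks = 1,000,000 units
std::uint64_t EstimatedDataflowCost(std::uint64_t maxAssertions, std::uint64_t blockCount)
{
    return maxAssertions * blockCount;
}
```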

This might also let us try to prioritize which assertions should survive, though I don't have any particular idea yet of how to formulate that (something like the value number equivalent of weighted ref count usage ...?).

Note that today we actually generate all assertions. Assertion gen doesn't early out once the table is full (though perhaps it should); it just stops recording assertions once the table fills up. So we are already paying the cost of finding all these assertions, and the table size itself is not a burden.

So a rough proposal would be: find and record all assertions, allowing the table to grow larger (with perhaps some size limit); set a budget based on blocks * assertions; and, if over budget, start trimming assertions somehow to get under budget.
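
A minimal sketch of what that record-then-trim shape could look like; all names, the priority metric, and the budget value are hypothetical, not the JIT's actual code:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical assertion record; 'priority' stands in for whatever survival
// metric we settle on (e.g. weighted ref counts of the value numbers involved).
struct Assertion
{
    unsigned priority;
    // ... assertion payload ...
};

// Record everything into a growable table first, then trim the lowest-priority
// entries until blockCount * table.size() fits the budget.
void TrimAssertionsToBudget(std::vector<Assertion>& table, std::uint64_t blockCount, std::uint64_t budget)
{
    // Highest priority first, so trimming from the tail drops the least useful entries.
    std::sort(table.begin(), table.end(),
              [](const Assertion& a, const Assertion& b) { return a.priority > b.priority; });

    while (!table.empty() && table.size() * blockCount > budget)
    {
        table.pop_back();
    }
}
```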

category:implementation
theme:assertion-prop
skill-level:intermediate
cost:medium
impact:small

mikedn commented Jun 28, 2018

Should the number of assertions that are actually used be looked at as well? I suppose it's quite possible that we generate more assertions than needed.

FWIW, I started a RangeCheck prototype that does not use existing assertions but rather extracts them on demand. I need to go back to it to see what it would take to bring it to a reasonable level of functionality to allow some measurements.

AndyAyersMS (author) commented
Started to look at how many assertions are used with some ad-hoc instrumentation [raw data]. Here's a summary plot. X axis is the number of assertions created; Y is the number used. Box-and-whisker marks show the distribution of Y for various X's. Red lines give the 0.50, 0.90, and 0.95 linear quantile estimates.

Upshot is that typically about 1 in 10 assertions gets used; 95%ile usage is about 1 in 5. There are some outliers, but overall for corelib the maximum number used was 22 (out of 161).

1551 of the 28649 methods used all their assertions, but none of those cases generated more than 5 assertions.

So it would appear most assertions are not useful (at least to assertion prop).

(summary plot: assertions used vs. assertions created, with box-and-whisker distributions and quantile lines)

erozenfeld commented
@dotnet/jit-contrib

kunalspathak commented
Some related work done in #73035 and #72700

BruceForstall removed the JitUntriaged (CLR JIT issues needing additional triage) label Jan 24, 2023