-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cache "global" trait resolutions #26326
Cache "global" trait resolutions #26326
Conversation
that are known to have been satisfied *somewhere*. This means that if one fn finds that `SomeType: Foo`, then every other fn can just consider that to hold. Unfortunately, there are some complications: 1. If `SomeType: Foo` includes dependent conditions, those conditions may trigger an error. This error will be repored in the first fn where `SomeType: Foo` is evaluated, but not in the other fns, which can lead to uneven error reporting (which is sometimes confusing). 2. This kind of caching can be unsound in the presence of unsatisfiable where clauses. For example, suppose that the first fn has a where-clause like `i32: Bar<u32>`, which in fact does not hold. This will "fool" trait resolution into thinking that `i32: Bar<u32>` holds. This is ok currently, because it means that the first fn can never be calle (since its where clauses cannot be satisfied), but if the first fn's successful resolution is cached, it can allow other fns to compile that should not. This problem is fixed in the next commit.
again, do it once and then just remember the expanded form. At the same time, filter globally nameable predicates out of the environment, since they can cause cache errors (and they are not necessary in any case).
different parts of the crate, so modify the test cases that were relying on that to test distinct types etc.
r? @pcwalton (rust_highfive has picked a reviewer for you, use r? to override) |
r? @nrc |
cc @metajack |
// Bar<u32>. But the code doesn't check that this could never be | ||
// satisfied. | ||
require::<i32, u32>(); | ||
//~^ ERROR the trait `Bar<u32>` is not implemented for the type `i32` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a "breaking change" by any means?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kind of. This function does compile uselessly in stable/nightly.
trans creates tons of these useless nested obligations - I think the primary benefits may be Empirically, dealing with Maybe |
Anyway, rustc perf numbers:
This is a 2%-3% improvement on rustc. It is indeed mostly a trans improvement (improves translation from 89.6s to 79.3s, but type-checking from 62.6s to 59s) - probably because of workload differences between rustc and trans. Still nice. I guess the mostly-uniformity of profiling info across rustc isn't worth much - I really need to profile other crates too. |
@@ -3018,11 +3066,12 @@ impl FlagComputation { | |||
} | |||
|
|||
&TyClosure(_, substs) => { | |||
self.add_flags(TypeFlags::HAS_TY_CLOSURE); | |||
self.add_flags(TypeFlags::HAS_LOCAL_NAMES); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks to be a mistake - at least the removal.
r=me with the rebasing errors fixed |
@arielb1 time numbers for compiling script (in dev mode) on my machine: After:
Before:
Regarding error messages, I discussed it with @pcwalton, and we felt like it was worth landing this sooner and dealing with error messages later -- though I left a note for what I think is the best fix. Doesn't seem too hard, I could do some local experiments. |
@arielb1 based on the profiling I did, the benefits were not just in trans, but they were primarily due to Sized and Copy bounds. However, the numbers I see now look quite different than what I was seeing then, so I ought to re-run the profiles (I think this is due to how I compiled rustc, but I'm not sure). |
I feel like a better way of doing this is to do each global predicate in a fresh "global" fulfillcx, and merge duplicates if fulfillment is successful (I don't think it is worth to keep track of errors). I don't like the idea of polluting a global cache with potentially-bogus results - it seems to me like it would easily cause mysterious ICE-s or other error cascades. |
@arielb1 well, it doesn't really matter so much what fulfill cx we do it in, the problem is more that we have to track when all dependent resolutions are complete. I do agree there is the potential for ICEs here. |
This does indeed seem to produce 13-14% improvements in both typeck and trans time when assertions are enabled (I guess these @nikomatsakis |
@bors: r+ |
📌 Commit 957935a has been approved by |
…, r=pcwalton When we successfully resolve a trait reference with no type/lifetime parameters, like `i32: Foo` or `Box<u32>: Sized`, this is in fact globally true. This patch adds a simple global to the tcx to cache such cases. The main advantage of this is really about caching things like `Box<Vec<Foo>>: Sized`. It also points to the need to revamp our caching infrastructure -- the current caches make selection cost cheaper, but we still wind up paying a high cost in the confirmation process, and in particular unrolling out dependent obligations. Moreover, we should probably do caching more uniformly and with a key that takes the where-clauses into account. But that's for later. For me, this shows up as a reasonably nice win (20%) on Servo's script crate (when built in dev mode). This is not as big as my initial measurements suggested, I think because I was building my rustc with more debugging enabled at the time. I've not yet done follow-up profiling and so forth to see where the new hot spots are. Bootstrap times seem to be largely unaffected. cc @pcwalton This is technically a [breaking-change] in that functions with unsatisfiable where-clauses may now yield errors where before they may have been accepted. Even before, these functions could never have been *called* by actual code. In the future, such functions will probably become illegal altogether, but in this commit they are still accepted, so long as they do not rely on the unsatisfiable where-clauses. As before, the functions still cannot be called in any case.
Regressed by rust-lang#26326
When we successfully resolve a trait reference with no type/lifetime parameters, like
i32: Foo
orBox<u32>: Sized
, this is in fact globally true. This patch adds a simple global to the tcx to cache such cases. The main advantage of this is really about caching things likeBox<Vec<Foo>>: Sized
. It also points to the need to revamp our caching infrastructure -- the current caches make selection cost cheaper, but we still wind up paying a high cost in the confirmation process, and in particular unrolling out dependent obligations. Moreover, we should probably do caching more uniformly and with a key that takes the where-clauses into account. But that's for later.For me, this shows up as a reasonably nice win (20%) on Servo's script crate (when built in dev mode). This is not as big as my initial measurements suggested, I think because I was building my rustc with more debugging enabled at the time. I've not yet done follow-up profiling and so forth to see where the new hot spots are. Bootstrap times seem to be largely unaffected.
cc @pcwalton
This is technically a [breaking-change] in that functions with unsatisfiable where-clauses may now yield errors where before they may have been accepted. Even before, these functions could never have been called by actual code. In the future, such functions will probably become illegal altogether, but in this commit they are still accepted, so long as they do not rely on the unsatisfiable where-clauses. As before, the functions still cannot be called in any case.