First draft at constructing a dependency graph #30393
Conversation
This consists of:

- a definition of dependency nodes (`DepNode`)
- the dependency graph itself, in `librustc_data_structures`. This includes support for pushing/popping current nodes and so forth.
- a `DepTrackingMap` that is basically API-compatible with existing maps, but which tracks reads/writes
- a replacement for `visit_all_items` that also creates appropriate nodes and reads/writes
- eventually, though not in this PR, we will instrument the HIR map to add reads/writes, and so touching the `krate` directly would be something of a no-no
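For illustration, here is a minimal sketch of what such a read/write-tracking map can look like. This is not the PR's actual code: the real `DepTrackingMap` hooks into the compiler's dependency graph rather than a local log, and the real `DepNode` has a variant per kind of tracked data.

```rust
use std::collections::HashMap;

// Simplified stand-in for the PR's `DepNode`.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
enum DepNode {
    TypeScheme(u32), // keyed by a def-id-like index
}

// In the spirit of `DepTrackingMap`: API-compatible with a plain map,
// except every lookup/insert also logs a read/write that would be
// charged to the currently executing task.
struct DepTrackingMap<V> {
    data: HashMap<DepNode, V>,
    log: Vec<(&'static str, DepNode)>, // stand-in for real graph edges
}

impl<V> DepTrackingMap<V> {
    fn new() -> Self {
        DepTrackingMap { data: HashMap::new(), log: Vec::new() }
    }

    fn get(&mut self, key: DepNode) -> Option<&V> {
        self.log.push(("read", key)); // current task reads `key`
        self.data.get(&key)
    }

    fn insert(&mut self, key: DepNode, value: V) {
        self.log.push(("write", key)); // current task writes `key`
        self.data.insert(key, value);
    }
}

fn main() {
    let mut map = DepTrackingMap::new();
    map.insert(DepNode::TypeScheme(0), "fn() -> i32");
    let _ = map.get(DepNode::TypeScheme(0));
    println!("{:?}", map.log); // [("write", TypeScheme(0)), ("read", TypeScheme(0))]
}
```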
Nice
Hmm, my measurements suggest more overhead than I expected. Compiling trans goes from 85sec to 89sec (a 5% increase). I am going to do some tweaks to see what the effect is. In working through this, I have been wrestling with a few emotions :)
But also:
So overall I am basically debating about whether it'd be easier/better to "derive" dependencies from first principles by walking the HIR/MIR. But when I tried to draw up a list of things to be concerned about, I confess the list started to seem quite long. I'm going to try and draw it up more thoroughly now, since I want to make a comprehensive set of tests anyhow, which may inform my ultimate opinion.
Actually, I think I was being misled by the old "shifting measurement target" phenomenon. If I measure libsyntax, I don't see the same perf regression. I do see a shift of ~1%, though.

Update: note that librustc_trans has changed and includes some amount of new code, whereas libsyntax has not.
```diff
@@ -840,13 +841,12 @@ fn check_adjustments<'a, 'tcx>(v: &mut CheckCrateVisitor<'a, 'tcx>, e: &hir::Exp
 }

 pub fn check_crate(tcx: &ty::ctxt) {
-    tcx.map.krate().visit_all_items(&mut CheckCrateVisitor {
+    tcx.visit_all_items_in_krate(DepNode::CheckConst, &mut CheckCrateVisitor {
```
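To illustrate the shape of this change, here is a toy sketch of what a per-item-task visit wrapper can look like. The signatures are hypothetical and heavily simplified (the real method lives on the `tcx` and works with HIR items); the point is that each item is visited inside its own task, so reads/writes during the visit are charged to that item's node.

```rust
#[derive(Clone, Copy, Debug)]
enum DepNode {
    CheckConst(u32), // one task per item, keyed by a def-id-like index
}

trait ItemVisitor {
    fn visit_item(&mut self, item_id: u32);
}

// Hypothetical stand-in for `visit_all_items_in_krate`: same shape as
// the old `visit_all_items`, plus a `DepNode` constructor for the task.
fn visit_all_items_in_krate<V: ItemVisitor>(
    items: &[u32],
    mk_node: fn(u32) -> DepNode,
    visitor: &mut V,
) {
    for &item_id in items {
        let task = mk_node(item_id);
        println!("push task {:?}", task); // dep_graph.in_task(...) in the PR
        visitor.visit_item(item_id);
        println!("pop task {:?}", task); // task ends when its guard drops
    }
}

struct CheckCrateVisitor;

impl ItemVisitor for CheckCrateVisitor {
    fn visit_item(&mut self, item_id: u32) {
        println!("const-checking item {}", item_id);
    }
}

fn main() {
    visit_all_items_in_krate(&[0, 1, 2], DepNode::CheckConst, &mut CheckCrateVisitor);
}
```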
Very nice how simple adapting existing passes seems to be :)
☔ The latest upstream changes (presumably #30410) made this pull request unmergeable. Please resolve the merge conflicts.
```rust
// Computing the type scheme of an item is a discrete task:
let item_def_id = ccx.tcx.map.local_def_id(it.id);
let _task = ccx.tcx.dep_graph.in_task(DepNode::TypeScheme(item_def_id));
ccx.tcx.dep_graph.read(DepNode::Hir(item_def_id)); // we have access to `it`
```
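The `_task` binding above is doing real work: `in_task` evidently returns a guard that keeps the task current until it is dropped. A minimal sketch of that RAII pattern, with stand-in types (the actual rustc API may differ):

```rust
use std::cell::RefCell;

#[derive(Clone, Copy, Debug)]
enum DepNode {
    TypeScheme(u32),
}

struct DepGraph {
    task_stack: RefCell<Vec<DepNode>>,
}

// Guard returned by `in_task`; popping happens automatically on drop.
struct DepTask<'g> {
    graph: &'g DepGraph,
}

impl DepGraph {
    fn in_task(&self, node: DepNode) -> DepTask<'_> {
        self.task_stack.borrow_mut().push(node);
        DepTask { graph: self }
    }

    fn read(&self, node: DepNode) {
        // The real graph records an edge against the current task;
        // here we only print which task would receive the edge.
        println!("{:?} read by {:?}", node, self.task_stack.borrow().last());
    }
}

impl<'g> Drop for DepTask<'g> {
    fn drop(&mut self) {
        self.graph.task_stack.borrow_mut().pop();
    }
}

fn main() {
    let graph = DepGraph { task_stack: RefCell::new(Vec::new()) };
    let _task = graph.in_task(DepNode::TypeScheme(0));
    graph.read(DepNode::TypeScheme(1)); // attributed to TypeScheme(0)
} // `_task` dropped here; TypeScheme(0) popped
```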
This could be moved to `compute_type_scheme_of_item()` below, since we don't need a dependency from `TypeScheme` to `Tcache`, only the other way round (i.e. creating the `TypeScheme` does not really access the `TCache`). Right now we get an unnecessary circular dependency.
> This could be moved to `compute_type_scheme_of_item()` below, since we don't need a dependency from `TypeScheme` to `Tcache`, only the other way round (i.e. creating the `TypeScheme` does not really access the `TCache`). Right now we get an unnecessary circular dependency.
Yes, true. I debated about doing that. The circular dependency is harmless, but I guess it would indeed be better to push the new task creation down, if for no other reason than that other hypothetical callers of `compute_type_scheme_of_item` (none exist at present) would get the benefit of a new task.
This is actually one of the more subtle points of this work. Introducing a new node is a useful optimization for the dependency graph when you have lazy computation like this. If you don't do it, you wind up with artificial edges. For example, imagine both `foo` and `bar` reference `baz`, and we check `foo` first. Then we will add an edge recording that `foo` writes to `TYPE(baz)`. When we check `bar` (and `baz`), we will get a read of `TYPE(baz)` only, because it is already computed. This means that if `foo` must be recomputed, we will also recompute both `bar` and `baz`, even though that may not be necessary.
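A toy simulation of that scenario, under the assumed model that whichever task is current when a value is first computed receives the write edge (the names `foo`/`bar`/`baz` and the trace format are illustrative only):

```rust
use std::collections::HashMap;

// Demand TYPE(key) on behalf of `task`: on a cache miss the demanding
// task gets the write edge; on a hit it gets only a read edge.
fn demand(
    task: &str,
    key: &'static str,
    cache: &mut HashMap<&'static str, ()>,
    edges: &mut Vec<String>,
) {
    if !cache.contains_key(key) {
        cache.insert(key, ());
        edges.push(format!("{} writes TYPE({})", task, key));
    }
    edges.push(format!("{} reads TYPE({})", task, key));
}

fn main() {
    let mut cache = HashMap::new();
    let mut edges = Vec::new();
    demand("foo", "baz", &mut cache, &mut edges); // first demand: foo writes
    demand("bar", "baz", &mut cache, &mut edges); // cached: bar only reads
    // The artificial edge: a change to `foo` now dirties TYPE(baz) and,
    // transitively, forces `bar` to be rechecked as well.
    for e in &edges {
        println!("{}", e);
    }
}
```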
So, one other way to structure this would be to not have separate `TypeScheme` and `TCache` nodes at all, because essentially they describe the same thing. If the computation of `foo(Bar)`'s `TypeScheme` requires the `TypeScheme` of `Bar`, it would go to the `tcache` and try to look it up. If it's not there, push the `TypeScheme(Bar)` task (but without automatically adding an edge from task to subtask), compute and add it to the map, and then accessing the map would introduce a read-edge from `TypeScheme(foo(Bar))` to `TypeScheme(Bar)`. That would avoid adding artificial edges and lead to a cleaner, more stable dependency graph (in this case at least). More stable because the graph would not depend on (re-)execution order.

Correct me if I'm wrong.
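A sketch of that lookup-or-compute scheme, with hypothetical names and a toy string-keyed graph. The two key points it tries to capture: the nested computation runs in its own task with no automatic parent edge, and the map access itself is what adds the read edge.

```rust
use std::collections::HashMap;

struct DepGraph {
    task_stack: Vec<String>,
    edges: Vec<(String, String)>, // (reader, read): reader depends on read
}

impl DepGraph {
    fn read(&mut self, node: String) {
        if let Some(current) = self.task_stack.last() {
            self.edges.push((current.clone(), node));
        }
    }
}

// Look up a type scheme; on a miss, compute it inside a fresh task that
// is deliberately *not* linked to the parent task, then record only the
// read edge implied by the map access.
fn type_scheme(key: &str, tcache: &mut HashMap<String, String>, graph: &mut DepGraph) -> String {
    if !tcache.contains_key(key) {
        graph.task_stack.push(format!("TypeScheme({})", key)); // no parent edge
        let scheme = compute_scheme(key, tcache, graph);
        graph.task_stack.pop();
        tcache.insert(key.to_string(), scheme);
    }
    graph.read(format!("TypeScheme({})", key)); // the lookup adds the edge
    tcache[key].clone()
}

fn compute_scheme(key: &str, tcache: &mut HashMap<String, String>, graph: &mut DepGraph) -> String {
    if key == "foo(Bar)" {
        // Computing foo(Bar)'s scheme demands Bar's scheme; the nested
        // lookup yields exactly one edge:
        // TypeScheme(foo(Bar)) -> TypeScheme(Bar).
        let bar = type_scheme("Bar", tcache, graph);
        format!("fn({}) -> ()", bar)
    } else {
        "Bar".to_string()
    }
}

fn main() {
    let mut tcache = HashMap::new();
    let mut graph = DepGraph { task_stack: Vec::new(), edges: Vec::new() };
    type_scheme("foo(Bar)", &mut tcache, &mut graph);
    println!("{:?}", graph.edges); // [("TypeScheme(foo(Bar))", "TypeScheme(Bar)")]
}
```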
OK, I've looked through the PR and my overall impressions are:
I'd also be interested in hearing more about the hybrid approach (mentioned by @nmatsakis on IRC) between calculating dependencies explicitly and doing it via instrumentation.
I basically agree with everything you wrote. Let me give some off-the-cuff thoughts:
It's a bit unclear how important this is. Certainly introducing the notion of "read-only" nodes that don't produce outputs will cover most of the existing "global" cases (e.g., lints). Though lints are also a good case study in that some amount of thought should be put into when to "re-run" the lint pass -- but I think it's the kind of thing where it's not worth making things too fine-grained and incremental.
Yes. So what I had hoped to do in this respect was to beef up that test suite. Basically I plan to systematically try to uncover the kinds of changes and what kinds of rebuilds should be triggered, and lay those out as test cases. I am toying around with the best way to organize all of this. In any case, this seems to be one of the core trade-offs: if we use this approach, we will wind up with a less explicit notion of what the graph is, but on the other hand we don't have to think quite as actively about it. You just have to identify "shared state" and make sure to use a `DepTrackingMap` for it.

I'm now sort of debating around a kind of hybrid scheme, where we scrape some of the dependencies by walking the (fully typed) HIR (or maybe MIR), and generate the rest dynamically. This has the downside that we must actively think about what our dependencies are or things will get out of sync -- precisely what I was trying to avoid -- but the upside that the dep graph will probably be cleaner. I think we'll find that in some cases, though, it's too hard to make something precise and we wind up with approximations: e.g., we'll probably have a de-facto edge from every function to the set of traits (and names in general) that are in scope, even if those traits didn't wind up being important for method resolution or something. I have to experiment a bit to see what that means.
Certainly the test cases can help here, though one could imagine
Yeah, no reason. I'll change it.
I think given the uncertainty, maybe the best way to proceed is:
Agreed.
Yes, I agree. Incrementalizing those can always be done later, but I would only attempt to do so if there's promise of real performance wins (which I doubt).
In theory, if compiler-internal data structures are set up the right way, there is no reason for things to go wrong, so to speak. It helps to think in terms of pieces of data depending on other pieces of data, and to have dependency nodes for those pieces of data (and not so much for the tasks producing them). Although that might mean restructuring some of the code that naturally was written without incremental compilation in mind.

EDIT: Although the "In theory" at the beginning of the previous paragraph should give away that this is all conjecture on my part.
So I've been pushing on this branch a bit more. I think I'm going to close this PR for now and re-open it in a day or so. Given the minimally invasive nature, I don't expect rebasing to be too painful. Anyway, I also hope to write up a kind of "readme" that explains the general ideas a bit more clearly.
This PR starts the work of constructing a dependency graph. It only handles two tables right now, but it builds up all the infrastructure, including a unit-testing system. Instrumenting more tables should be pretty straightforward.
I've not done performance measurements; I'll try to do that. I am thinking of tweaking the `DepGraphState` implementation in various ways to make it lighter weight in any case. It's not clear that reusing `Graph` is buying us much here, given the limited API.

r? @michaelwoerister
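One lighter-weight shape that last point hints at (purely illustrative, not the PR's code): with such a limited API, a flat deduplicated edge set plus a reachability walk may be all the state the graph needs, rather than a general-purpose `Graph` with node indices and adjacency lists.

```rust
use std::collections::HashSet;

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
enum DepNode {
    Hir(u32),
    TypeScheme(u32),
}

// Minimal state: a set of (source, target) pairs, where a change to
// `source` invalidates `target`.
#[derive(Default)]
struct DepGraphState {
    edges: HashSet<(DepNode, DepNode)>,
}

impl DepGraphState {
    fn add_edge(&mut self, from: DepNode, to: DepNode) {
        self.edges.insert((from, to));
    }

    // Everything transitively reachable from a dirty node is also dirty.
    fn nodes_dirtied_by(&self, start: DepNode) -> HashSet<DepNode> {
        let mut dirty = HashSet::new();
        let mut stack = vec![start];
        while let Some(node) = stack.pop() {
            if dirty.insert(node) {
                for &(from, to) in &self.edges {
                    if from == node {
                        stack.push(to);
                    }
                }
            }
        }
        dirty
    }
}

fn main() {
    let mut state = DepGraphState::default();
    state.add_edge(DepNode::Hir(0), DepNode::TypeScheme(0));
    // Editing the item's HIR dirties its type scheme as well:
    println!("{:?}", state.nodes_dirtied_by(DepNode::Hir(0)));
}
```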