
First draft at constructing a dependency graph #30393

Closed

Conversation

nikomatsakis
Contributor

This PR starts the work of constructing a dependency graph. It only handles two tables right now, but it builds up all the infrastructure, including a unit-testing system. Instrumenting more tables should be pretty straightforward.

I've not done performance measurements yet, but I'll try to do that. I am thinking of tweaking the DepGraphState implementation in various ways to make it lighter-weight in any case. It's not clear that reusing Graph is buying us much here, given the limited API.

r? @michaelwoerister

`librustc_data_structures`.  This includes support for pushing/popping
current nodes and so forth.

This consists of:

- a definition of dependency nodes (`DepNode`)
- a `DepTrackingMap` that is basically API-compatible with existing
  maps, but which tracks reads/writes
- a replacement for `visit_all_items` that also creates appropriate
  nodes and reads/writes
  - eventually, though not in this PR, we will instrument the HIR map
    to add reads/writes, and so touching the `krate` directly would
    be something of a no-no

routines so they work over more than one kind of map

@michaelwoerister
Member

Nice :)
I'll take a thorough look at this tomorrow.

@nikomatsakis nikomatsakis force-pushed the cross-item-dependencies branch from 94f52b5 to e05b8c6 on December 15, 2015 at 17:53

@nikomatsakis
Contributor Author

Hmm, my measurements suggest more overhead than I expected. Compiling trans goes from 85sec to 89sec (a 5% increase). I am going to do some tweaks to see what the effect is.

In working through this, I have been wrestling with a few emotions :)

  1. A pleasant surprise at how elegantly this instrumentation works out.
  2. Happiness that, by default, this approach errs on the side of overly conservative results unless we intentionally tweak things (so long as we remember to use DepTrackingMaps and so forth).

But also:

  1. A wariness of overhead. After all, I've only instrumented a fraction of the tables, for example! (But I've also made basically zero effort to optimize.)
  2. I haven't decided on the best way to handle IVar yet (though it doesn't seem super hard).
  3. A concern that this system may even make people overconfident, since it handles much, but not ALL, of the tracking automatically.

So overall I am basically debating about whether it'd be easier/better to "derive" dependencies from first principles by walking the HIR/MIR. But when I tried to draw up a list of things to be concerned about, I confess the list started to seem quite long. I'm going to try and draw it up more thoroughly now, since I want to make a comprehensive set of tests anyhow, which may inform my ultimate opinion.

@nikomatsakis
Contributor Author

Actually, I think I was being misled by the old "shifting measurement target" phenomenon. If I measure libsyntax, I don't see the same perf regression, though I do see a shift of ~1%.

Update: note that librustc_trans has changed, and includes some amount of new code, whereas libsyntax has not.

@@ -840,13 +841,12 @@ fn check_adjustments<'a, 'tcx>(v: &mut CheckCrateVisitor<'a, 'tcx>, e: &hir::Exp
 }

 pub fn check_crate(tcx: &ty::ctxt) {
-    tcx.map.krate().visit_all_items(&mut CheckCrateVisitor {
+    tcx.visit_all_items_in_krate(DepNode::CheckConst, &mut CheckCrateVisitor {
Member

Very nice how simple adapting existing passes seems to be :)

@bors
Contributor

bors commented Dec 16, 2015

☔ The latest upstream changes (presumably #30410) made this pull request unmergeable. Please resolve the merge conflicts.

// Computing the type scheme of an item is a discrete task:
let item_def_id = ccx.tcx.map.local_def_id(it.id);
let _task = ccx.tcx.dep_graph.in_task(DepNode::TypeScheme(item_def_id));
ccx.tcx.dep_graph.read(DepNode::Hir(item_def_id)); // we have access to `it`
Member

This could be moved to compute_type_scheme_of_item() below, since we don't need a dependency from TypeScheme to Tcache, only the other way round (i.e. creating the TypeScheme does not really access the TCache). Right now we get an unnecessary circular dependency.

Contributor Author

@michaelwoerister

This could be moved to compute_type_scheme_of_item() below, since we don't need a dependency from TypeScheme to Tcache, only the other way round (i.e. creating the TypeScheme does not really access the TCache). Right now we get an unnecessary circular dependency.

Yes, true. I debated about doing that. The circular dependency is harmless, but I guess it would indeed be better to push the new task creation down, if for no other reason than that other hypothetical callers of compute_type_scheme_of_item (none exist at present) would get the benefit of a new task.

This is actually one of the more subtle points of this work. Introducing a new node is a useful optimization for the dependency graph when you have lazy computation like this. If you don't do it, you wind up with artificial edges -- e.g., imagine both foo and bar reference baz, and we check foo first. Then we will add an edge that foo writes to TYPE(baz). When we check bar (and baz), we will get a read of TYPE(baz) only, because it is already computed. This means that if foo must be recomputed, we will also recompute both bar and baz, even though that may not be necessary.

Member

So, one other way to structure this would be to not have separate TypeScheme and TCache nodes at all, because essentially they describe the same thing. If the computation of foo(Bar)'s TypeScheme requires the TypeScheme of Bar, it would go to the tcache and try to look it up. If it's not there, push the TypeScheme(Bar) task (but without automatically adding an edge from task to subtask), compute and add it to the map, and then accessing the map would introduce a read-edge from TypeScheme(foo(Bar)) to TypeScheme(Bar). That would avoid adding artificial edges and lead to a cleaner, more stable dependency graph (in this case at least). More stable because the graph would not depend on (re-)execution order.
Correct me if I'm wrong.

@michaelwoerister
Member

OK, I've looked through the PR and my overall impressions are:

  1. It's really nice how the dependency tracking happens really almost completely transparently.
  2. The transparency of the approach is also a bit of a downside: It's easy to introduce unnecessary dependency edges by inadvertently accessing data that does not strictly need to be accessed.
  3. It's very important to connect all relevant pieces of data to the tracking infrastructure. Otherwise it's easy to miss introducing edges.
  4. It's also important to keep the number of 'global' nodes small, unless they are leaf nodes. Although that is less of an issue than I thought initially, as long as the global nodes (like Coherence) don't depend on more than they need to.
  5. It's easy to lose track of what is actually going on, i.e. what the dependency graph actually looks like. We should try to make debugging this as easy as possible, for example by having a test suite that lets us easily pinpoint where possible regressions have been introduced.
  6. As @nikomatsakis suggested on IRC, having a special 'read-only' task in addition to 'ignore' -- for things like save-analysis that are not supposed to introduce new nodes to the dep-graph -- is a good idea for catching errors.
  7. More information about the performance implications would indeed be interesting.
  8. Maybe it would be good to be able to specify runtime-checked conditions like "nodes of type x must never depend on nodes of type y" to guard against regressions, although I don't have a concrete example yet (maybe things from external crates).

I'd also be interested in hearing more about the hybrid approach (mentioned by @nikomatsakis on IRC) between calculating dependencies explicitly and doing it via instrumentation.

@nikomatsakis
Contributor Author

@michaelwoerister

I basically agree with everything you wrote. Let me give some off-the-cuff thoughts:

It's also important to keep the number of 'global' nodes small, unless they are leaf nodes. Although that is less of an issue than I thought initially, as long as the global nodes (like Coherence) don't depend on more than they need to.

It's a bit unclear how important this is. Certainly introducing the notion of "read-only" nodes, that don't produce outputs, will cover most of the existing "global" cases (e.g., lints). Though lints are also a good case study in that some amount of thought should be put into when to "re-run" the lint pass -- but I think it's the kind of thing where it's not worth making things too fine-grained and incremental.

It's easy to lose track of what is actually going on, i.e. what the dependency graph actually looks like. We should try to make debugging this as easy as possible, for example by having a test suite that lets us easily pinpoint where possible regressions have been introduced.

Yes. So what I had hoped to do in this respect was to beef up that test suite. Basically I plan to systematically try to uncover the kinds of changes and what kinds of rebuilds should be triggered and lay those out as test cases. I am toying around with the best way to organize all of this.

In any case, this seems to be one of the core trade-offs: if we use this approach, we will wind up with a less explicit notion of what the graph is, but on the other hand we don't have to think quite as actively about it. You just have to identify "shared state" and make sure to use a DepTrackingMap or other similar tool. But I think then another consequence will be that we will get less incremental support in some cases unless we dive in and do some special treatment: one example I've been thinking over is trait selection, where I think we'll probably ultimately want some amount of custom logic.

I'm now sort of debating around a kind of hybrid scheme, where we scrape some of the dependencies by walking the (fully typed) HIR (or maybe MIR), and generate the rest dynamically. This has the downside that we must actively think about what our dependencies are or things will get out of sync -- precisely what I was trying to avoid -- but the upside that the dep graph will probably be cleaner. I think we'll find that in some cases though it's too hard to make something precise and we wind up with approximations: e.g., we'll probably have a de-facto edge from every function to the set of traits (and names in general) that are in scope, even if those traits didn't wind up being important for method resolution or something. I have to experiment a bit to see what that means.

Maybe it would be good to be able specify runtime-checked conditions like "nodes of type x must never depend on nodes of type y" to guard against regressions. Although I don't have a concrete example yet (maybe things from external crates)

Certainly the test cases can help here, though one could imagine debug_assert-like things that trigger over ALL inputs, and not just the ones identified in the test suites.

@nikomatsakis
Contributor Author

@jonas-schievink

Why not default to a file in the current working directory instead?

Yeah, no reason. I'll change it.

@nikomatsakis
Contributor Author

I think given the uncertainty, maybe the best way to proceed is:

  1. Add in a -Z incr-comp flag and do not create the dependency graph unless it is passed.
  2. Land the patch as is (which should introduce basically zero overhead, though I will check).
  3. Continue iterating locally :)

@michaelwoerister
Member

I think given the uncertainty, maybe the best way to proceed is: ...

Agreed.

@michaelwoerister
Member

It's a bit unclear how imp't this is. Certainly introducing the notion of "read-only" nodes, that don't produce outputs, will cover most of the existing "global" cases (e.g., lints). Though lints are also a good case study in that some amount of thought should be put into when to "re-run" the lint pass -- but I think it's the kind of thing where it's not worth making things too fine-grained and incremental.

Yes, I agree. Incrementalizing those can always be done later, but I would only attempt to do so if there's promise of real performance wins (which I doubt).

I think we'll find that in some cases though it's too hard to make something precise and we wind up with approximations: e.g., we'll probably have a de-facto edge from every function to the set of traits (and names in general) that are in scope, even if those traits didn't wind up being important for method resolution or something. I have to experiment a bit to see what that means.

In theory, if compiler-internal data structures are set up the right way, there is no reason for things to go wrong, so to speak. It helps to think in terms of pieces of data depending on other pieces of data, and to have dependency nodes for those pieces of data (and not so much for the tasks producing them). Although that might mean restructuring some of the code that was naturally written without incremental compilation in mind.

EDIT: Although the "In theory" at the beginning of the previous paragraph should give away that this is all conjecture on my part :)

@nikomatsakis
Contributor Author

So I've been pushing on this branch a bit more. I think I'm going to close this PR for now and re-open it in a day or so. Given the minimally invasive nature, I don't expect rebasing to be too painful.

Anyway, I also hope to write up a kind of "readme" that explains the general ideas a bit more clearly.
