Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Further --changed optimization #5579

Merged

Conversation

stuhood
Copy link
Member

@stuhood stuhood commented Mar 9, 2018

Problem

A few more performance issues were discovered in ./pants --changed=.. list due to the removal of the v1 engine:

  • In order to calculate direct or transitive dependents, ChangeCalculator needs to compute the dependents graph. But it was doing this using HydratedTargets (ie, with sources globs and other fields expanded).
  • After the ChangeCalculator was used to compute the new set of TargetRoots, we were converting the matched Address objects back into Specs, triggering Engine executions for multiple SingleAddresses are redundant #4533.
  • When locating candidate owning targets for some sources, SourceMapper was using HydratedTargets, which are (implicitly/unnecessarily) transitive.
  • When matching source files against candidate targets, SourceMapper was using un-batched regex-based spec matching calls.

Solution

  • Add and switch to computing HydratedStructs in ChangeCalculator, which do not require expanding sources globs.
  • Rename HydratedTargets to TransitiveHydratedTargets, and add a non-transitive rule to compute HydratedTargets. Use this in SourceMapper.
  • Preserve Address objects in ChangedTargetRoots, which allows for a single deduped TransitiveHydratedTarget walk to warm and populate the graph.
  • Add and used batched forms of the filespec matching methods to avoid re-parsing/compiling regexes. These matches should be ported to rust at some point.

Result

Transitive runs that touch more of the graph take time closer to the ./pants list :: time, as expected.

Before:

$ time ./pants --changed-diffspec=22ca0604b1c6ce8de019214b821e922aac66b026^..22ca0604b1c6ce8de019214b821e922aac66b026 --changed-include-dependees=transitive list | wc -l

     562

real	0m15.945s
user	0m15.180s
sys	0m1.731s

After:

$ time ./pants --changed-diffspec=22ca0604b1c6ce8de019214b821e922aac66b026^..22ca0604b1c6ce8de019214b821e922aac66b026 --changed-include-dependees=transitive list | wc -l

     563

real	0m5.402s
user	0m4.580s
sys	0m1.319s

Larger runs (more files/targets) see an even more significant speedup: on the order of 5-10x.

@stuhood stuhood added this to the 1.5.x milestone Mar 9, 2018
@stuhood
Copy link
Member Author

stuhood commented Mar 9, 2018

I also explored actually implementing a solution to #4533... might want to resume and land that one instead, since it dovetails more naturally with #4769 (which is proposing to make Spec objects first class).

# For dependee finding, we need to parse all build files to collect all structs. But we
# don't need to fully hydrate targets (ie, expand their source globs).
adaptor_iter = (t
for targets in self._scheduler.product_request(HydratedStructs,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't followed this one... What's a HydratedStruct? What does it actually contain? Is this the equivalent of a Payload in the Task world? If so, maybe we can call it a Payload? Or a PayloadWithUnresolvedGlobs?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was about to answer inline, but then added more information below the line on #4535 (comment) instead. Short answer: about 80% of this will need to change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool :) That's a very useful summary, thanks! Can you paste it into a comment in code somewhere (maybe src/python/pants/engine/struct.py) and reference it from the places that each of these types is defined, so that people can work out what's going on by reading the code, rather than having to know which tickets explain the code?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll split the baby and leave TODOs referring to that ticket.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can live with that if it's short-term, but I definitely value being able to develop while offline and reading comments in code is useful :)

@illicitonion
Copy link
Contributor

Also, why does this need a cherrypick?

It feels like random perf improvements should just be rolled up into the next release they're going to be in? Otherwise I'm not sure I understand the branching model here...

@stuhood
Copy link
Member Author

stuhood commented Mar 9, 2018

It feels like random perf improvements should just be rolled up into the next release they're going to be in? Otherwise I'm not sure I understand the branching model here...

@illicitonion : Marking it needs-cherrypick is just a way to indicate that we will (eventually) want it in the 1.5.x stable branch. It doesn't declare which release it should land with: ie, it doesn't necessarily block 1.5.0... that's up to the release manager.

The branching strategy is described here: https://www.pantsbuild.org/release_strategy.html, but it puts scare quotes around what is "sufficient" for a release (should maybe clarify that it's the release manager's call).

Copy link
Member

@kwlzn kwlzn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

@@ -17,6 +17,7 @@
from pants.build_graph.address_lookup_error import AddressLookupError
from pants.build_graph.injectables_mixin import InjectablesMixin
from pants.build_graph.target import Target
from pants.init.target_roots import ChangedTargetRoots, LiteralTargetRoots
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not used?

graph = _HydratedTargetDependentGraph.from_iterable(target_types_from_symbol_table(self._symbol_table),
product_iter)
graph = _DependentGraph.from_iterable(target_types_from_symbol_table(self._symbol_table),
adaptor_iter)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: indent

@stuhood stuhood force-pushed the stuhood/further-changed-optimization branch 4 times, most recently from b0c2139 to f719dd6 Compare March 14, 2018 00:12
Copy link
Member

@kwlzn kwlzn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks! one comment.


@staticmethod
def is_declaring_file(address, file_path):
return any_declaring_file(address, [file_path])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any_is_declaring_file ?

@stuhood stuhood force-pushed the stuhood/further-changed-optimization branch from c65a57d to 17c2dba Compare March 14, 2018 04:52
@stuhood stuhood force-pushed the stuhood/further-changed-optimization branch from 17c2dba to c46d659 Compare March 14, 2018 06:12
@stuhood
Copy link
Member Author

stuhood commented Mar 14, 2018

Also, why does this need a cherrypick?

@illicitonion : So, now this really needs a cherry pick in order to make sure people can have a good experience on 1.5.x without v1 around anymore. It doesn't need to block 1.5.0 unless @wisechengyi would like it to, but it should probably go into 1.5.1 at the very least.

@wisechengyi
Copy link
Contributor

wisechengyi commented Mar 14, 2018

Since the coursier change will need to go in 1.5.1, I will add this one together for 1.5.1. This way 1.5.0 can get out this Friday, then we can immediately iterate on 1.5.1 changes afterwards.

Copy link
Contributor

@illicitonion illicitonion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As these are separate changes, it would be nice to land this as a separate commits, rather than squashed into one

# For dependee finding, we need to parse all build files to collect all structs. But we
# don't need to fully hydrate targets (ie, expand their source globs).
adaptor_iter = (t
for targets in self._scheduler.product_request(HydratedStructs,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can live with that if it's short-term, but I definitely value being able to develop while offline and reading comments in code is useful :)

@stuhood stuhood merged commit c981c39 into pantsbuild:master Mar 14, 2018
@stuhood stuhood deleted the stuhood/further-changed-optimization branch March 14, 2018 17:17
wisechengyi pushed a commit that referenced this pull request Mar 14, 2018
### Problem

A few more performance issues were discovered in `./pants --changed=.. list` due to the removal of the v1 engine:

* In order to calculate `direct` or `transitive` dependents, `ChangeCalculator` needs to compute the dependents graph. But it was doing this using `HydratedTargets` (ie, with sources globs and other fields expanded).
* After the `ChangeCalculator` was used to compute the new set of `TargetRoots`, we were converting the matched `Address` objects back into `Spec`s, triggering #4533.
* When locating candidate owning targets for some sources, `SourceMapper` was using `HydratedTargets`, which are (implicitly/unnecessarily) transitive.
* When matching source files against candidate targets, `SourceMapper` was using un-batched regex-based spec matching calls.

### Solution

* Add and switch to computing `HydratedStructs` in `ChangeCalculator`, which do not require expanding sources globs.
* Rename `HydratedTargets` to `TransitiveHydratedTargets`, and add a non-transitive rule to compute `HydratedTargets`. Use this in `SourceMapper`.
* Preserve `Address` objects in `ChangedTargetRoots`, which allows for a single deduped `TransitiveHydratedTarget` walk to warm and populate the graph.
* Add and used batched forms of the `filespec` matching methods to avoid re-parsing/compiling regexes. These matches should be ported to rust at some point. 

### Result

Transitive runs that touch more of the graph take time closer to the `./pants list ::` time, as expected.
 
Before:
```
$ time ./pants --changed-diffspec=22ca0604b1c6ce8de019214b821e922aac66b026^..22ca060 --changed-include-dependees=transitive list | wc -l

     562

real	0m15.945s
user	0m15.180s
sys	0m1.731s
```
After:
```
$ time ./pants --changed-diffspec=22ca0604b1c6ce8de019214b821e922aac66b026^..22ca060 --changed-include-dependees=transitive list | wc -l

     563

real	0m5.402s
user	0m4.580s
sys	0m1.319s
```

Larger runs (more files/targets) see an even more significant speedup: on the order of 5-10x.
stuhood pushed a commit that referenced this pull request Mar 28, 2018
### Problem

#5579 broke detection of deleted targets.

### Solution

As described in #382 (mega classic!), it might be possible to more deeply integrate change detection into the v2 engine itself to compute a delta on the graph. But for now we defer a deeper solution and simply ensure that we fail for deleted targets by transitively expanding targets. Adds a test to cover the behaviour.

### Result

Due to fully hydrating targets, this represents a linear performance regression from #5579: the runtime of `--changed-include-dependees=transitive` for a small set of roots used to be slightly lower than the runtime for `./pants list ::`, because the operation that occurred "on the entire graph" was computing `HydratedStructs`, rather than computing `TransitiveHydratedTargets`.

The impact for exactly that step is constant, and fairly high:
```
# before:
  DEBUG] Root Select(Collection(dependencies=(DescendantAddresses(directory=u''),)), =Collection.of(Struct)) completed.
  DEBUG] computed 1 nodes in 1.709688 seconds. there are 8858 total nodes.

# after:
  DEBUG] Root Select(Collection(dependencies=(DescendantAddresses(directory=u''),)), =TransitiveHydratedTargets) completed.
  DEBUG] computed 1 nodes in 2.989497 seconds. there are 15916 total nodes.
```
... but the impact on overall runtime is dependent on the count of targets that are transitively affected, because for all affected targets, we're going to need to compute transitive roots anyway. So for the example change from #5579 which affects 567 targets:
```
time ./pants --changed-diffspec=22ca0604b1c6ce8de019214b821e922aac66b026^..22ca060 --changed-include-dependees=transitive list | wc -l
# before:
  real	0m4.877s
  user	0m4.081s
  sys	0m1.068s

# after
  real	0m5.294s
  user	0m4.487s
  sys	0m1.142s
```
For a change impacting only 14 targets the difference is slightly more pronounced:
```
$ time ./pants --changed-diffspec=f35e1e6fb1cdf45fcb5080cfe567bdbae8060125^..f35e1e6 --changed-include-dependees=transitive list | wc -l
# before:
  real	0m4.279s
  user	0m3.376s
  sys	0m1.011s

# after:
  real	0m4.954s
  user	0m4.284s
  sys	0m1.120s
```
wisechengyi pushed a commit to wisechengyi/pants that referenced this pull request Mar 31, 2018
As described in pantsbuild#382 (mega classic!), it might be possible to more deeply integrate change detection into the v2 engine itself to compute a delta on the graph. But for now we defer a deeper solution and simply ensure that we fail for deleted targets by transitively expanding targets. Adds a test to cover the behaviour.

Due to fully hydrating targets, this represents a linear performance regression from pantsbuild#5579: the runtime of `--changed-include-dependees=transitive` for a small set of roots used to be slightly lower than the runtime for `./pants list ::`, because the operation that occurred "on the entire graph" was computing `HydratedStructs`, rather than computing `TransitiveHydratedTargets`.

The impact for exactly that step is constant, and fairly high:
```
  DEBUG] Root Select(Collection(dependencies=(DescendantAddresses(directory=u''),)), =Collection.of(Struct)) completed.
  DEBUG] computed 1 nodes in 1.709688 seconds. there are 8858 total nodes.

  DEBUG] Root Select(Collection(dependencies=(DescendantAddresses(directory=u''),)), =TransitiveHydratedTargets) completed.
  DEBUG] computed 1 nodes in 2.989497 seconds. there are 15916 total nodes.
```
... but the impact on overall runtime is dependent on the count of targets that are transitively affected, because for all affected targets, we're going to need to compute transitive roots anyway. So for the example change from pantsbuild#5579 which affects 567 targets:
```
time ./pants --changed-diffspec=22ca0604b1c6ce8de019214b821e922aac66b026^..22ca060 --changed-include-dependees=transitive list | wc -l
  real	0m4.877s
  user	0m4.081s
  sys	0m1.068s

  real	0m5.294s
  user	0m4.487s
  sys	0m1.142s
```
For a change impacting only 14 targets the difference is slightly more pronounced:
```
$ time ./pants --changed-diffspec=f35e1e6fb1cdf45fcb5080cfe567bdbae8060125^..f35e1e6 --changed-include-dependees=transitive list | wc -l
  real	0m4.279s
  user	0m3.376s
  sys	0m1.011s

  real	0m4.954s
  user	0m4.284s
  sys	0m1.120s
```
stuhood pushed a commit that referenced this pull request Nov 20, 2018
### Problem

#5579 added an optimization to avoid hydrating targets, of which a significant portion was later rolled back by #5636. #5636 reasoned that it was necessary to fully construct the graph in order to detect when targets have been deleted.

The solution used in #5636 though, was heavy-handed. By requesting `TransitiveHydratedTargets`, we fully constructed the graph of dependencies and then later _also_ constructed the dependent graph (in `DependentGraph`). Moreover, fully hydrating the targets meant unnecessarily fully expanding their source globs.

### Solution

Create a `find_owners` rule that consumes an `OwnersRequest` to compute the owning targets of some sources (and, optionally, their transitive dependents). Moving this logic into an `@rule` allows us to directly consume the symbol_table_contraint, without a wrapping collection (as in #5579).

Additionally, switch back to requesting target adaptor subclasses while building the `DependentGraph`. The graph-validation that use to occur via requesting `TransitiveHydratedTargets` now occurs as an explicit step during construction of a `DependentGraph`. A TODO there explains an existing weakness of change-detection in the engine.

### Result

```
./pants --changed-diffspec='49cc12e5cccc6164a2c4aa83c23300ad7254836f^..49cc12e' --changed-include-dependees=direct --glob-expansion-failure=ignore list
```
is 30 percent faster in the pantsbuild pants repo, primarily due to not expanding source globs.
stuhood pushed a commit that referenced this pull request Nov 20, 2018
The solution used in #5636 though, was heavy-handed. By requesting `TransitiveHydratedTargets`, we fully constructed the graph of dependencies and then later _also_ constructed the dependent graph (in `DependentGraph`). Moreover, fully hydrating the targets meant unnecessarily fully expanding their source globs.

Create a `find_owners` rule that consumes an `OwnersRequest` to compute the owning targets of some sources (and, optionally, their transitive dependents). Moving this logic into an `@rule` allows us to directly consume the symbol_table_contraint, without a wrapping collection (as in #5579).

Additionally, switch back to requesting target adaptor subclasses while building the `DependentGraph`. The graph-validation that use to occur via requesting `TransitiveHydratedTargets` now occurs as an explicit step during construction of a `DependentGraph`. A TODO there explains an existing weakness of change-detection in the engine.

```
./pants --changed-diffspec='49cc12e5cccc6164a2c4aa83c23300ad7254836f^..49cc12e' --changed-include-dependees=direct --glob-expansion-failure=ignore list
```
is 30 percent faster in the pantsbuild pants repo, primarily due to not expanding source globs.
wisechengyi pushed a commit that referenced this pull request Nov 20, 2018
### Problem

#5579 added an optimization to avoid hydrating targets, of which a significant portion was later rolled back by #5636. #5636 reasoned that it was necessary to fully construct the graph in order to detect when targets have been deleted.

The solution used in #5636 though, was heavy-handed. By requesting `TransitiveHydratedTargets`, we fully constructed the graph of dependencies and then later _also_ constructed the dependent graph (in `DependentGraph`). Moreover, fully hydrating the targets meant unnecessarily fully expanding their source globs.

### Solution

Create a `find_owners` rule that consumes an `OwnersRequest` to compute the owning targets of some sources (and, optionally, their transitive dependents). Moving this logic into an `@rule` allows us to directly consume the symbol_table_contraint, without a wrapping collection (as in #5579).

Additionally, switch back to requesting target adaptor subclasses while building the `DependentGraph`. The graph-validation that use to occur via requesting `TransitiveHydratedTargets` now occurs as an explicit step during construction of a `DependentGraph`. A TODO there explains an existing weakness of change-detection in the engine.

### Result

```
./pants --changed-diffspec='49cc12e5cccc6164a2c4aa83c23300ad7254836f^..49cc12e' --changed-include-dependees=direct --glob-expansion-failure=ignore list
```
is 30 percent faster in the pantsbuild pants repo, primarily due to not expanding source globs.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants