Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix edge case causing modules to be incorrectly omitted from delta updates #819

Closed
wants to merge 2 commits into from

Conversation

motiz88
Copy link
Contributor

@motiz88 motiz88 commented May 13, 2022

Summary:
Fixes a bug in DeltaCalculator / traverseDependencies that was causing modules to be incorrectly omitted from the bundle while other modules were still depending on them.

The bug

Specifically, the following new test cases would fail without the fixes in this diff:

Test: "all in one delta, adding the live edge after the dead edge"

In the test, we perform the following sequence of mutations on an initial graph, and then compute the delta between the initial and final states of the graph:

  1. Create the file /quux.
  2. Add a dependency between /foo and /quux, marking /foo as modified.
  3. Add a dependency between /bundle and /quux, marking /bundle as modified.
  4. Remove the pre-existing dependency between /bundle and /foo.

metro-delta-bundler-bug-animation

With /bundle as the entry point, this should leave us with a graph consisting of /bundle and /quux. Before this diff, traverseDependencies would omit /quux because it incorrectly concluded that /quux must have been present in the initial graph.

NOTE: In the test code, the terms "live edge" and "dead edge" refer to the fact that the edge between /bundle and /quux is "live" i.e. present in the final graph, while the edge between /foo and /quux is "dead" i.e. removed from the final graph (because /foo itself is pruned).

Test: "more than two state transitions in one delta calculation"

Perform the following sequence of operations (from the same initial graph as the other test):

  1. Create the file /quux.
  2. Process the filesystem change event, i.e. compute an empty delta (since /quux is not reachable).
  3. Add a dependency between /bar and /quux, marking /bar as modified.
  4. Remove the pre-existing dependency between /foo and /bar, marking /foo as modified.
  5. Add a dependency between /bundle and /quux, marking /bundle as modified.
  6. Compute another delta representing the mutations since step 2.

metro-delta-bundler-bug-2-animation

As with the other test, before the fix, /quux would be incorrectly omitted from the final delta.

Impact

The bug would typically manifest as various runtime errors after a large filesystem update (e.g. rebasing) due to modules being missing from the module registry on the client. In our example, we would not send down the code for /quux, so we can expect evaluating /bundle to fail when it tries to invoke the equivalent of require('/quux').

The fix

The root of the problem is that we record deletions and additions on the same module during traversal, and then try to disambiguate them in a postprocessing pass using incomplete information. Instead, we now track the correct delta state throughout the algorithm, with addition and deletion being mutually exclusive.

Note that an earlier version of this diff contained the wrong solution:

traverseDependencies now expects the unfiltered set of paths that changed on disk, so it can determine which modules are new to the graph.

The second test above shows why this would be an incorrect heuristic: The filesystem change affecting /quux can occur in a separate traverseDependencies run, so we can't use that to disambiguate dependencies that appear to be both deleted+added. The right fix is to track the state correctly to begin with.

Differential Revision: D36371424

@facebook-github-bot facebook-github-bot added CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. fb-exported labels May 13, 2022
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D36371424

@motiz88
Copy link
Contributor Author

motiz88 commented May 13, 2022

cc @cpojer, @raejin who have hacked on traverseDependencies in the past and might find this interesting.

Copy link
Contributor

@cpojer cpojer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes sense to me but I worked on this code mostly before the DeltaBundler was a thing.

How did you find this?

cc @rafeca

Copy link
Contributor

@raejin raejin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, love the tests

// Generate the initial dependency graph.
/*
Generate the initial dependency graph:
┌─────────┐ ┌──────┐ ┌──────┐
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice graphs!

*/
Actions.addDependency('/bundle', '/quux');

/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This tests look very similar to the ones I illustrated here -- #500

expect(
getPaths(await traverseDependencies([...files], graph, options)),
).toEqual({
added: new Set(['/quux']),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing the PR didn't fix the appropriate delta calculations

@motiz88
Copy link
Contributor Author

motiz88 commented May 13, 2022

Thanks for taking a look @cpojer @raejin! Unfortunately I think I've found a couple of holes in my approach, so I'm trying something new and will update this PR next week. I hope you won't mind taking another look at that point, as I am still wrapping my head around this code. 😅

I now believe the fundamental problem is that we're losing information in delta as we traverse the graph: when a module is marked both "deleted" and "added", we don't know which came first. Trying to guess this from the unfiltered path set, like in my first attempt here, just moves the race condition elsewhere (I have a test that shows this). I think we can do a better job of maintaining the added and deleted sets during traversal (including the "added + deleted = noop" logic) so they're mutually exclusive by construction, and we don't need the sketchy postprocessing to return a precise delta.

One complication is the use of delta.deleted.has() in a couple of places in #506. I don't think those call sites really care about the modules' status relative to the base graph ( = "deleted" and "never existed" should be equivalent), so I am replacing them with !graph.dependencies.has(). I wonder if there was any particular reason we wanted to specifically use the delta state there, though.

@motiz88
Copy link
Contributor Author

motiz88 commented May 13, 2022

@cpojer

How did you find this?

One of the teams at Meta reported that updating to particular source control revision with Metro running resulted in a corrupted bundle ( = runtime errors complaining about modules not being found, or evaluating to undefined). It reproed reliably enough that I was able to attach a debugger and write a corresponding failing test.

motiz88 added 2 commits May 16, 2022 04:19
Differential Revision: D36393090

fbshipit-source-id: 4be48cd19f138cfd74bbad696bc1de1293e4684b
…a updates (facebook#819)

Summary:
Pull Request resolved: facebook#819

Fixes a bug in `DeltaCalculator` / `traverseDependencies` that was causing modules to be incorrectly omitted from the bundle while other modules were still depending on them.

## The bug

Specifically, the following new test cases would fail without the fixes in this diff:

### Test: "all in one delta, adding the live edge after the dead edge"

In the test, we perform the following sequence of mutations on an initial graph, and then compute the delta between the initial and final states of the graph:

1. Create the file `/quux`.
2. Add a dependency between `/foo` and `/quux`, marking `/foo` as modified.
3. Add a dependency between `/bundle` and `/quux`, marking `/bundle` as modified.
4. Remove the pre-existing dependency between `/bundle` and `/foo`.

{F731851675}

With `/bundle` as the entry point, this should leave us with a graph consisting of `/bundle` and `/quux`. Before this diff, `traverseDependencies` would omit `/quux` because it incorrectly concluded that `/quux` must have been present in the initial graph.

NOTE: In the test code, the terms "live edge" and "dead edge" refer to the fact that the edge between `/bundle` and `/quux` is "live" i.e. present in the final graph, while the edge between `/foo` and `/quux` is "dead" i.e. removed from the final graph (because `/foo` itself is pruned).

### Test: "more than two state transitions in one delta calculation"

Perform the following sequence of operations (from the same initial graph as the other test), and then compute the delta between the initial and final states of the graph:

1. Create the file `/quux`.
2. **Process the filesystem change event**, i.e. compute an empty delta (since `/quux` is not reachable).
3. Add a dependency between `/bar` and `/quux`, marking `/bar` as modified.
4. Remove the pre-existing dependency between `/foo` and `/bar`, marking `/foo` as modified.
3. Add a dependency between `/bundle` and `/quux`, marking `/bundle` as modified.

{F732573059}

As with the other test, before the fix, `/quux` would be incorrectly omitted from the final delta.

### Impact

The bug would typically manifest as various runtime errors after a large filesystem update (e.g. rebasing) due to modules being missing from the module registry on the client. In our example, we would not send down the code for `/quux`, so we can expect evaluating `/bundle` to fail when it tries to invoke the equivalent of `require('/quux')`.

## The fix

The root of the problem is that we record deletions and additions on the same module during traversal, and then try to disambiguate them in a postprocessing pass using incomplete information. Instead, we now track the correct delta state throughout the algorithm, with addition and deletion being mutually exclusive.

Note that an earlier version of this diff contained the wrong solution:

> `traverseDependencies` now expects the *unfiltered* set of paths that changed on disk, **so it can determine which modules are new to the graph**.

The second test above shows why this would be an incorrect heuristic: The filesystem change affecting `/quux` can occur in a separate `traverseDependencies` run, so we can't use that to disambiguate dependencies that appear to be both deleted+added. The right fix is to track the state correctly to begin with.

Differential Revision: D36371424

fbshipit-source-id: 5984d1a776ab6f4ee5bcb54833d50562ef93b91b
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D36371424

@motiz88 motiz88 force-pushed the export-D36371424 branch from 2117bf3 to 00c5d56 Compare May 16, 2022 11:20
@motiz88
Copy link
Contributor Author

motiz88 commented May 16, 2022

Updated! I'm much happier with this solution.

I'm also thinking of implementing a different approach to #506, essentially doing a separate GC pass instead of nested traversals. (That way we could even make the GC optional, e.g. skip it while processing updates for HMR but run it for resets/full bundles)

@motiz88
Copy link
Contributor Author

motiz88 commented May 16, 2022

See microsoft/rnx-kit#1514 re: thoughts on rewriting the GC algorithm.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. fb-exported
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants