Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Take RemoteStore offline during user change #3193

Merged
merged 4 commits into from
Jun 22, 2020

Conversation

schmidt-sebastian
Copy link
Contributor

@schmidt-sebastian schmidt-sebastian commented Jun 9, 2020

This changes the user change to be much more explicit:

  • At first, we take RemoteStore offline
  • Then SyncEngine applies the user change (with recovery)
  • Only when the user change in SyncEngine succeeds do we take the network back online

Replaces #3192

Fixes #3179

@google-oss-bot
Copy link
Contributor

google-oss-bot commented Jun 9, 2020

Binary Size Report

Affected SDKs

  • @firebase/firestore

    Type Base (35bc0a2) Head (90e850f) Diff
    browser 252 kB 250 kB -1.91 kB (-0.8%)
    esm2017 195 kB 194 kB -297 B (-0.2%)
    main 495 kB 493 kB -1.96 kB (-0.4%)
    module 249 kB 248 kB -1.91 kB (-0.8%)
    react-native 195 kB 195 kB -277 B (-0.1%)
  • @firebase/firestore/lite

    Type Base (35bc0a2) Head (90e850f) Diff
    main 151 kB 140 kB -11.0 kB (-7.3%)
  • @firebase/firestore/memory

    Type Base (35bc0a2) Head (90e850f) Diff
    browser 192 kB 187 kB -5.04 kB (-2.6%)
    esm2017 149 kB 145 kB -3.39 kB (-2.3%)
    main 371 kB 363 kB -8.32 kB (-2.2%)
    module 190 kB 185 kB -5.04 kB (-2.7%)
    react-native 149 kB 146 kB -3.37 kB (-2.3%)
  • @firebase/testing

    Type Base (35bc0a2) Head (90e850f) Diff
    main 6.67 kB 6.34 kB -331 B (-5.0%)
  • firebase

    Type Base (35bc0a2) Head (90e850f) Diff
    firebase-firestore.js 290 kB 287 kB -2.37 kB (-0.8%)
    firebase-firestore.memory.js 231 kB 226 kB -5.38 kB (-2.3%)
    firebase.js 822 kB 819 kB -2.37 kB (-0.3%)

Test Logs

logDebug(LOG_TAG, 'RemoteStore restarting streams for new credential');
await this.restartNetwork();

this.networkEnabled = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes me uneasy. Previously networkEnabled was only used to signal user intent (around the enableNetwork and disableNetwork calls). Reusing this here means that a public call to enableNetwork can disrupt this logic. I don't think this is desirable, but perhaps that's something you intended?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This matches the code in restartNetwork() and is needed to silence the asserts that would trigger if the network stream closes without an error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that this is needed to silence assertions in the current implementation, but the larger point I'm after here is that this potentially doesn't actually work.

Consider this sequence of events:

  • There's a credential change
  • IndexedDB fails during syncEngine.handleUserChange, leading that block to be retried later
  • The user calls enableNetwork, disabling the guard rails in this implementation
  • The user performs a listen which may successfully start a stream with the new identity before we've completed the transition within the sync engine.

Or as an alternative to the last step, the device encounters a network transition that prompts restartNetwork to run, again breaking the assumption in here that we won't allow the network to restart until the credential change transition has completed.

What I'm after is that by mixing the intent of the networkEnabled signal between what the user asked for (enableNetwork) and what's possible (IndexedDB is failing, so the streams should be shut down), we open ourselves up to having to reason about these circumstances.

To be clear: the use of networkEnabled in restartNetwork is also suspicious, but it's less immediately problematic because the interval where it's being manipulated is so short. While we're in the background and IndexedDB is failing it could be hours before the sync engine succeeds, which suggests we need a better way to handle this. When this was the only abuse of the user signal this seemed like an ~OK hack. Now that we've got multiple kinds of things potentially interacting with it, the negative effect has multiplied.

So, unless I'm grossly misunderstanding what's happening here, I think I'm proposing we:

  • only track user intent in networkEnabled
  • track network-transition-induced disablement separately from networkEnabled
  • track identity-change-induced disablement separately from both
  • only allow network access when all conditions are independently satisfied.
  • adjust the assertions to account for the fact that there can be multiple reasons why we might disable network access, not just that the user requested it.

This way these events can happen all at once or in any order and we won't accidentally enable the network during an event ordering we didn't anticipate.

Another possible alternative, rather than tracking each condition independently, is to track a cumulative failure counter. I think you still need to track networkEnabled separately because user intent has to be idempotent, but you could make each of the three cases (IndexedDB failed, restart in progress, credential change in progress) increment a count on the initial condition and decrement once it's resolved. Only once all network-preventing actions have completed would we be able to start the stream.

I haven't fully thought this through, but this also seems to help with the case where multiple events of the same type happen before the first event is resolved. For example: if multiple credential change events happen before the syncEngine recovers, we won't start to enable the network after the first completes because the second in-flight credential change prevents it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the long insightful comment! I am convinced that re-using this flag was not the right idea. While we could go with a counter, I have a feeling that it will go out of sync at some point, and we will not be able to figure out why. In the end, I used something similar to what the realtime database does:

(but instead of strings I used enum constants)

As for having multiple user changes, since they are schedule via the retryable operations, only one user change can be active at a time.

@schmidt-sebastian schmidt-sebastian force-pushed the mrschmidt/credentialchange branch 2 times, most recently from 3b34952 to a1faa32 Compare June 19, 2020 04:13
Copy link
Contributor

@wilhuff wilhuff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a nit

@@ -860,10 +860,12 @@ export class SyncEngine implements RemoteSyncer {
);
}

async handleCredentialChange(user: User): Promise<void> {
async handleUserChange(user: User): Promise<void> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are still plenty of other references to handleCredentialChange and CredentialChangeListener. It seems preferable to keep these consistent, no? That way you can search for credentialchange and find all the places we do something about this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I renamed the RemoteStore and SyncEngine method to handleCredentialChange. LocalStore and SharedClientState still use handleUserChange since SyncEngine "swallows" no-op user changes.

@wilhuff wilhuff assigned schmidt-sebastian and unassigned wilhuff Jun 19, 2020
Copy link
Contributor

@wilhuff wilhuff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@schmidt-sebastian schmidt-sebastian merged commit 58e1b6f into master Jun 22, 2020
@schmidt-sebastian schmidt-sebastian deleted the mrschmidt/credentialchange branch June 22, 2020 22:21
@firebase firebase locked and limited conversation to collaborators Jul 23, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Firestore appears offline after fetching auth token failed: getToken aborted due to token change error
3 participants