-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(@aws-amplify/datastore): mutation hub event drops during reconnect #11132
fix(@aws-amplify/datastore): mutation hub event drops during reconnect #11132
Conversation
…races with reconnect
Codecov Report
📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more @@ Coverage Diff @@
## main #11132 +/- ##
=======================================
Coverage 82.36% 82.36%
=======================================
Files 204 204
Lines 19505 19507 +2
Branches 4213 4213
=======================================
+ Hits 16065 16067 +2
Misses 3154 3154
Partials 286 286 see 2 files with indirect coverage changes 📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a few questions
@@ -256,7 +267,7 @@ class MutationProcessor { | |||
hasMore = (await this.outbox.peek(storage)) !== undefined; | |||
}); | |||
|
|||
this.observer.next!({ | |||
this.observer?.next?.({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's entirely possible I'm being dense, but according to the PR description: "An undefined
observer signals to the mutation processor not to immediately resume on mutation, but instead wait for an observer (the sync engine) to register itself and re-start() the mutation processor." Wouldn't this mean that resume doesn't happen if the observer is undefined
? And if so, why is it possible that the observer is possibly undefined
, here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wrestled with this little part of the diff for longer than I care to admit, actually. TypeScript isn't convinced that this.observer
always exists here, or that .next
exists on the observer. So, with that said, I'm actually not 100% sure PR eliminates all cases where a mutation event goes missing. It does provide an incremental improvement to one edge I identified while trying to add tests.
Maybe it makes more sense to revert to bangs to at least get errors logged if someone hits that edge. I think we can shake out more edges after #11131, which will give us a little more control to stress the protocol.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, though we check that the observer isn't undefined
at the beginning of resume
(i.e. the status of isReady
), it's possible that between there and here, the observer can now become undefined
. I think I'm just most concerned about silent failures, and think that error logs would be better. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, there are 100% still some potential edges here. This PR only addresses a semi-related edge wherein reconnect fundamentally pits mutation processing and sync engine restart against each other. I want to be cautious about changing the failure mode at this line. Today, it will be silent if the observer has unsubscribed, AFAIK. This line will happily call .next()
on a closed subscriber, if I'm not mistaken and silently deliver that message into the ether.
I could be wrong. Happy to be shown I'm wrong. I just want to avoid entangling the purpose of this PR with the lesser-known edges here. This is especially true if we don't have any repros or reports of those edges, and when the right solution there is really unclear to me.
I dunno. Happy to chat on Monday about this if needed!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line will happily call .next() on a closed subscriber, if I'm not mistaken and silently deliver that message into the ether.
Just following up on this, I've tested my recollection on this and it looks correct. Completed and/or unsubscribed observers simply cause next()
to be a no-op. No error message will be logged. It will just a silent failure. I'll create a backlog item for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think TypeScript is just confused that this.isReady
actually asserts that this.observer
is defined. Even if you place this.isReady
directly before this.observer.next
you will still get the same error.
if (this.isReady()) {
// still thinks that this.observer could be undefined
this.observer.next ...
You could use assertion predicates to give TypeScript enough information to know it would be defined in this code path. But it looks like it isn't well supported on private class properties and would require some hacks.
https://www.typescriptlang.org/docs/handbook/release-notes/typescript-3-7.html#assertion-functions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's still a potential issue that could occur if an unsubscribe occurs while a request is in flight. But, it's an existing and less likely race, outside the scope of the more predictable and easy to repro race this PR addresses.
In any case, there's a story in the backlog and on the grooming list to revisit this at some point.
* receive all of the expected Hub events. | ||
*/ | ||
test.skip('survives online -> offline -> save/online race', async () => { | ||
test('survives online -> offline -> save/online race', async () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thoughts on also adding tests for update
and delete
immediately before reconnect?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I have a compelling reason not to! I'll add them in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
I only have one question (non-blocking)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
* feat(cognito): remove required domain param when create CookieStorage * test: enable integ test * test(cognito): update unit test * Revert "test: enable integ test" This reverts commit e3db965. * feat(cognito): make cookiestorage constructor parameter optional * chore: Docs updates (aws-amplify#11082) * chore(release): Publish [ci skip] - @aws-amplify/analytics@6.0.19 - @aws-amplify/api-graphql@3.1.7 - @aws-amplify/api-rest@3.0.19 - @aws-amplify/api@5.0.19 - @aws-amplify/auth@5.1.13 - aws-amplify@5.0.19 - @aws-amplify/cache@5.0.19 - @aws-amplify/core@5.1.2 - @aws-amplify/datastore-storage-adapter@2.0.19 - @aws-amplify/datastore@4.1.1 - @aws-amplify/geo@2.0.19 - @aws-amplify/interactions@5.0.19 - @aws-amplify/notifications@1.0.19 - @aws-amplify/predictions@5.0.19 - @aws-amplify/pubsub@5.1.2 - @aws-amplify/pushnotification@5.0.19 - @aws-amplify/storage@5.1.9 - @aws-amplify/xr@4.0.19 * chore(release): update version.ts [ci skip] * refactor(datastore): storage adapters (aws-amplify#11073) * chore: Docs updates (aws-amplify#11091) * chore: Tweaked `main` pre-id. (aws-amplify#11093) * chore(deps): bump activesupport from 6.1.7.2 to 7.0.4.3 in /docs Bumps [activesupport](https://github.com/rails/rails) from 6.1.7.2 to 7.0.4.3. - [Release notes](https://github.com/rails/rails/releases) - [Changelog](https://github.com/rails/rails/blob/v7.0.4.3/activesupport/CHANGELOG.md) - [Commits](rails/rails@v6.1.7.2...v7.0.4.3) --- updated-dependencies: - dependency-name: activesupport dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> * chore(release): Publish [ci skip] - @aws-amplify/analytics@6.0.20 - @aws-amplify/api-graphql@3.1.8 - @aws-amplify/api-rest@3.0.20 - @aws-amplify/api@5.0.20 - @aws-amplify/auth@5.1.14 - aws-amplify@5.0.20 - @aws-amplify/cache@5.0.20 - @aws-amplify/core@5.1.3 - @aws-amplify/datastore-storage-adapter@2.0.20 - @aws-amplify/datastore@4.1.2 - @aws-amplify/geo@2.0.20 - @aws-amplify/interactions@5.0.20 - @aws-amplify/notifications@1.0.20 - @aws-amplify/predictions@5.0.20 - @aws-amplify/pubsub@5.1.3 - @aws-amplify/pushnotification@5.0.20 - @aws-amplify/storage@5.1.10 - @aws-amplify/xr@4.0.20 * chore(release): update version.ts [ci skip] * chore(@aws-amplify/datastore): initial decomposition of test helpers (aws-amplify#11096) * moved test helpers to folder * first pass, rough decomposition of test helpers file * added readme for helpers * test helpers readme typo * Update packages/datastore/__tests__/helpers/README.md Co-authored-by: Ivan Artemiev <29709626+iartemiev@users.noreply.github.com> --------- Co-authored-by: Ivan Artemiev <29709626+iartemiev@users.noreply.github.com> * fix: add warning message when trying to use SQLite with CPK enabled (aws-amplify#11027) * fix(data): update SQLite utils to use 'instr' instead of 'like' when constructing WHERE conditions from predicates (aws-amplify#11108) * chore: Docs updates (aws-amplify#11111) * docs(cognito): excluding subdomains when domain cookie attr is omitted * chore(release): Publish [ci skip] - @aws-amplify/analytics@6.0.21 - @aws-amplify/api-graphql@3.1.9 - @aws-amplify/api-rest@3.0.21 - @aws-amplify/api@5.0.21 - @aws-amplify/auth@5.1.15 - aws-amplify@5.0.21 - @aws-amplify/cache@5.0.21 - @aws-amplify/core@5.1.4 - @aws-amplify/datastore-storage-adapter@2.0.21 - @aws-amplify/datastore@4.1.3 - @aws-amplify/geo@2.0.21 - @aws-amplify/interactions@5.0.21 - @aws-amplify/notifications@1.0.21 - @aws-amplify/predictions@5.0.21 - @aws-amplify/pubsub@5.1.4 - @aws-amplify/pushnotification@5.0.21 - @aws-amplify/storage@5.1.11 - @aws-amplify/xr@4.0.21 * chore(release): update version.ts [ci skip] * test: increase coverage for delete connection by model field (aws-amplify#11099) * Update config.yml Removes `Usage Question` link that sends users to an old 404 page for discussions * fix(datastore): stale observeQuery snapshot with sort param (aws-amplify#11119) * chore(notifications): Integrate refactored Android common utils * chore: Docs updates * Update config.yml Switched usage question to point at discord rather than deleting it * chore(release): Publish [ci skip] - amazon-cognito-identity-js@6.2.0 - @aws-amplify/analytics@6.0.22 - @aws-amplify/api-graphql@3.1.10 - @aws-amplify/api-rest@3.0.22 - @aws-amplify/api@5.0.22 - @aws-amplify/auth@5.2.0 - aws-amplify@5.0.22 - @aws-amplify/cache@5.0.22 - @aws-amplify/core@5.1.5 - @aws-amplify/datastore-storage-adapter@2.0.22 - @aws-amplify/datastore@4.1.4 - @aws-amplify/geo@2.0.22 - @aws-amplify/interactions@5.0.22 - @aws-amplify/notifications@1.0.22 - @aws-amplify/predictions@5.0.22 - @aws-amplify/pubsub@5.1.5 - @aws-amplify/pushnotification@5.0.22 - @aws-amplify/storage@5.1.12 - @aws-amplify/xr@4.0.22 * chore(release): update version.ts [ci skip] * fix(@aws-amplify/datastore): mutation hub event drops during reconnect (aws-amplify#11132) * fix(@aws-amplify/datastore): fixes mutation hub event dropped during races with reconnect * add explicit observer completion, expanded test coverage * fix(@aws-amplify/datastore): adds serialization for empty predicates (aws-amplify#11133) adds serialization for empty predicates * test(notifications): Add native Android unit tests (aws-amplify#11139) * test(notifications): Add native Android unit tests * Disable android tests from being automatically run in CI --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: Allan Zheng <zheallan@amazon.com> Co-authored-by: aws-amplify-bot <aws@amazon.com> Co-authored-by: Ivan Artemiev <29709626+iartemiev@users.noreply.github.com> Co-authored-by: Jim Blanchard <jim.l.blanchard@gmail.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Jon Wire <iambipedal@gmail.com> Co-authored-by: Dane Pilcher <dppilche@amazon.com> Co-authored-by: David McAfee <mcafd@amazon.com> Co-authored-by: Olya Balashova <42189299+helgabalashova@users.noreply.github.com> Co-authored-by: Bannon Tanner <bannonta@amazon.com> Co-authored-by: helgabalashova <helga.stolyarova@gmail.com> Co-authored-by: Aaron S <94858815+stocaaro@users.noreply.github.com>
Description of changes
outboxMutationProcessed
Hub events were being dropped for mutations that were being raced against reconnect activities. A fringe case, but during reconnect, this could lead to customer code waiting foroutboxMutationProcessed
events that would never arrive.This PR fixes the issue by properly removing the observer from the mutation processor during an unsubscribe or
stop()
. Anundefined
observer signals to the mutation processor not to immediately resume on mutation, but instead wait for an observer (the sync engine) to register itself and re-start()
the mutation processor.This PR also incidentally tamps down on
console.warn
noise in the scope ofconnectivityHandler.test.ts
.Issue #, if available
Description of how you validated changes
Un-skipped the test that originally caught the error. Made it pass.
Checklist
yarn test
passesBy submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.