Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug Fix: Synchronization Issues in MergeManyChangeSet operators #808

Merged

Conversation

dwcullop
Copy link
Member

@dwcullop dwcullop commented Dec 21, 2023

Description

  • Fixes issue where the resulting ChangeSet Observable from MergeManyChangeSets contains more items than expected when a parent is removed around the time that an item from its child container is removed in a multi-threaded situation
  • Cleans up code for each implementation so they are in sync for easier future maintenance
  • MergeChangeSets operators are not affected because they have to deal with the removal of parent items

Scenario

(This is just a theory, but it explains why the current implementation is experiencing the issue described.)

Thread A removes the parent item from the source collection (cache/list) while Thread B removes an item from that same item's child collection.

This creates a race condition that results in that child item never being removed from the downstream changeset.

Sequence:

  1. Thread A removes an item from the parent list. Inside of the lock, MergeMany starts to unsubscribe from changes to that item's child container.
  2. Thread B removes an item to that same child's list. The change is behind the same lock, so it is blocked.
  3. Thread A continues, completes the unsubscribe, and releases the lock (which activates Thread B), and then proceeds to the OnItemRemoved handler.
  4. Thread B continues with the remove message is already in flight. It acquires the lock and removes the item from the operator's cached version of the child container. Since MergeMany has released its subscription to changes from this container, the removal does not get reflected downstream. The lock is then released.
  5. Thread A continues with its OnItemRemoved handler but has to wait on the lock (which was just grabbed by Thread B). Once acquired, the parent item removal process is triggered, which emits a changeset that removes all the child items (using the locally cached copy of that parent item's container, but that copy no longer has the item that was just removed by Thread B).

Expected Result:

When a parent item is removed from the source changeset, the downstream changeset should see a changeset that removes all of the children added from that parent.

Actual Result:

Because the item was removed between the unsubscribe from the MergeMany and the OnItemRemoved handler, both chances to emit the Remove change were missed and so it is never observed downstream which causes the result changeset to contain an extra item.

Solution Details

The MergeMany and OnItemRemoved need to be atomic together (not separately), so that another thread can't slip between them and cause a remove change to be missed. This can be achieved by adjusting the placement of Synchronize calls so that those operations are handled together.

@dwcullop dwcullop self-assigned this Dec 21, 2023
@dwcullop dwcullop added the bug label Dec 21, 2023
@dwcullop dwcullop force-pushed the bugfix/mergemanychangeset-synchronization branch from 8321076 to 2b2c774 Compare December 22, 2023 00:21
@ChrisPulman
Copy link
Member

Somehow there's a difference of 103 failing the tests now.

@dwcullop
Copy link
Member Author

Somehow there's a difference of 103 failing the tests now.

Yeah, the order of operations changed which broke some things. I've updated things to use the original order of operations and now all of the tests are passing.

@ChrisPulman
Copy link
Member

Its failed on the stress test again 🤯

@dwcullop dwcullop enabled auto-merge (squash) December 22, 2023 20:39
@dwcullop
Copy link
Member Author

Its failed on the stress test again 🤯

Yeah, but I can't repro it. If I have to, I'll disable that test before I go on vacation and fix it when I get back. This PR is still an improvement over not having these changes.

@ChrisPulman
Copy link
Member

Same stress test issue, 1 missing 😕

@dwcullop dwcullop merged commit 42d0bda into reactivemarbles:main Dec 22, 2023
1 check passed
@ChrisPulman
Copy link
Member

Successful 👏👏 we can come back to fix the test

@dwcullop dwcullop deleted the bugfix/mergemanychangeset-synchronization branch December 22, 2023 22:46
Copy link

github-actions bot commented Jan 6, 2024

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 6, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants