Refactor duplicate Nats-Msg-Id tracking #6174

Merged
merged 1 commit into from
Nov 28, 2024

neilalexander
Member

The current duplicate tracking has two problems. First, the `ddmap` capacity is potentially never reclaimed if it grows suddenly. Second, we have to constantly monitor the capacity of `ddarr` and copy it to stop the backing array from growing indefinitely.

This PR makes the following changes:

  • `ddmap` now uses the stree, which gives efficient in-place deduplication of message IDs without an ever-growing backing capacity;
  • `ddarr` and `ddindex` are gone, replaced with linked-list references plus `ddnext` and `ddlast` tracking, so we no longer need to watch the capacity and reallocate;
  • `ddtmr` is now only scheduled when there is actually something to do.

Also added a unit test that verifies the deduplication behaviour as well as the linked-list stitching.

Signed-off-by: Neil Twigg <neil@nats.io>


Member

@derekcollison derekcollison left a comment

LGTM

@neilalexander neilalexander marked this pull request as ready for review November 26, 2024 12:15
@neilalexander neilalexander requested a review from a team as a code owner November 26, 2024 12:15
@neilalexander neilalexander force-pushed the neil/dedupe branch 2 times, most recently from 146c96c to 7c25f10 Compare November 28, 2024 11:02
@derekcollison
Member

Want me to take a look?

@neilalexander
Member Author

Yes please. I did look at `container/heap` as an option, but it still requires a backing array, so it has the same backing-capacity problem as the current `ddarr`. There's also `container/list`, but it isn't generic (although we could copy it in from the stdlib and make it generic if you think that's a better path).

@derekcollison derekcollison self-requested a review November 28, 2024 17:48
Member

@derekcollison derekcollison left a comment

LGTM

* Swap out `ddmap` for stree (better message ID deduplication, consistent lookup times)
* Remove `ddarr` and `ddindex`, replace with linked list tracking
* Add unit test to prove the correct behaviour

Signed-off-by: Neil Twigg <neil@nats.io>
@derekcollison derekcollison merged commit ed49214 into main Nov 28, 2024
5 checks passed
@derekcollison derekcollison deleted the neil/dedupe branch November 28, 2024 18:42
neilalexander added a commit that referenced this pull request Dec 2, 2024
Includes:

- #6187
- #6174
- #6189
- #6192

Signed-off-by: Neil Twigg <neil@nats.io>