-
Notifications
You must be signed in to change notification settings - Fork 78
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improved version of UniqueList: OrderedSet #76
Conversation
A sorted set is optimal for this type of functionality because we can use fewer, more-performant Redis calls.
In order to ease the transition from UniqueList being backed by a Redis list, we can fallback to the legacy implementation for read operations. For write operations we first migrate to a sorted set, then retry.
lib/kredis/types/unique_list.rb
Outdated
legacy_elements = legacy_list.elements | ||
legacy_list.del | ||
append(legacy_elements) | ||
end |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
@@ -0,0 +1,29 @@ | |||
# You'd normally call this a set, but Redis already has another data type for that | |||
class Kredis::Types::UniqueListLegacy < Kredis::Types::List |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed the existing UniqueList
to UniqueListLegacy
.
lib/kredis/types/unique_list.rb
Outdated
end | ||
|
||
def current_nanoseconds(negative:) | ||
"%10.9f" % (negative ? -Time.now.to_f : Time.now.to_f) |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
lib/kredis/types/unique_list.rb
Outdated
end | ||
|
||
multi do | ||
zadd(elements_with_scores) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider LT / GT options for ordering safety, avoiding problems with clock skew.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not possible due to Redis server compatibility target being version 4.
lib/kredis/types/unique_list.rb
Outdated
end | ||
|
||
def basis_score_components | ||
[ redis_time + Time.now.utc.to_i, Time.now.utc.nsec ] |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
Use the Redis server time as a basis. Then add the current process time. Then add to that the index of the inserted elements as a microsecond component.
f7af6e0
to
f28ff6d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice approach!
I'd introduce this as a distinct data type rather than migrating in place. Consider e.g. ordered_set
which reflects the zset usage and scoring. Then, separately, introduce a pathway to auto-migrate from unique lists to ordered sets.
It'd be interesting to consider a migration abstraction for other types, too. Could be a proxy wrapper with declared fallback vs migrate behavior, e.g. migrates from: :OldType, to: NewType, fallback: [ methods ], migrate: [ methods ]
.
lib/kredis/types/unique_list.rb
Outdated
end | ||
|
||
def process_start_time | ||
@process_start_time ||= redis.time.join(".").to_f - process_uptime |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this may be used in replicated and clustered setups. Flagging just in case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you envisage any problems if this were used in such a setup?
lib/kredis/types/unique_list.rb
Outdated
end | ||
|
||
def process_uptime | ||
Process.clock_gettime(Process::CLOCK_MONOTONIC) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about our spread of platform support, but this is limited to POSIX (so, probably not JRuby).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Process.clock_gettime is OK in JRuby 9 +
https://blog.dnsimple.com/2018/03/elapsed-time-with-ruby-the-right-way/
lib/kredis/types/unique_list.rb
Outdated
end | ||
|
||
def trim(from_beginning:) | ||
return unless limit&.positive? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are negative/zero limits supported? If not, would fail earlier, e.g. in a #limit=
method, and only check for nil
limit here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea!
lib/kredis/types/unique_list.rb
Outdated
legacy_elements = legacy_list.elements | ||
legacy_list.rename([ legacy_list.key, "_backup" ].join) | ||
append(legacy_elements) | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to be within an atomic MULTI…EXEC?
Like a UniqueList, but backed by a Redis sorted set rather than a list
* main: Bump version for 1.2.0 Use pipeline for migration too Ensure to pass the pipeline to super Use block parameter to pipeline in Redis#multi (rails#68) Note tested / supported versions of Redis in the Readme (rails#79) Return counter value after incrementing/decrementing Enum Bang setter (rails#82) Add reset to Cycle after_change callbacks Add example in Readme.md Support configuring custom keys via method invocation Use bundle add instead (rails#81)
@jeremy thanks for all the feedback. This is much simpler now. I like the idea of splitting out the migration into a follow-up PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lewispb Great approach! I've just left a note for your consideration
lib/kredis/types/ordered_set.rb
Outdated
return if elements.empty? | ||
|
||
elements_with_scores = types_to_strings(elements, typed).map.with_index do |element, index| | ||
score = generate_base_score(negative: prepending) + index / 100000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you elaborate what index / 10000
is for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Redis sorted sets use a double 64-bit floating point number to represent the score. In all the architectures we support, this is represented as an IEEE 754 floating point number
I was trying to store the maximum unique precision possible in Redis, to store scores (times). By dividing a large integer (seconds) by 100000 we get a float. e.g:
>> Time.now.to_i
=> 1653898149
>> Time.now.to_f / 100000
=> 16538.98132595902
>> 3.times { p Time.now.to_f / 100000 }
16538.98262287975
16538.982622880398
16538.98262288052
Even though we lose some precision, each iteration produces a higher score than the previous one, perfect for our use case of appending.
def generate_base_score(negative:) | ||
current_time = process_start_time + process_uptime | ||
|
||
negative ? -current_time : current_time | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clever approach 👍!
Just a note for your consideration:
What would happen in case of skewed Redis clocks / concurrent writes? Is possible to face the scenario where appending 2 arrays would lead to mixing their elements?
# In process A
ordered_set.append([1, 2, 3])
# In process B
ordered_set.append([4, 5, 6])
# Is this possible?
ordered_set.elements
# => [1, 2, 4, 3, 5, 6]
Maybe a less clever approach that relies getting the max/min (with an optimistic lock) would be more resilient?
WATCH zset
# get max
element = ZRANGE zset -1 -1
# compute scores max +1 for each element
MULTI
# add elements
ZADD zset elements
# trim
EXEC
neration
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking great.
Let's consider how to guide folks to the improved data structure. Perhaps remove the unique list example from the README, or move to the bottom and mark as deprecated? Could do this in a separate PR, since it also begs "well, how do I migrate?"
* main: (21 commits) Bump version for 1.4.0 Update nokogiri for compatibility Revert "Improved version of UniqueList: OrderedSet (rails#76)" (rails#111) Add `last` to lists (rails#97) Improved version of UniqueList: OrderedSet (rails#76) Return Time objects instead of deprecated DateTime (rails#106) Fix possible deserialization of untrusted data Typecast return of Set#take (rails#105) Declare Active Model dependency (rails#107) Address LogSubscriber deprecation (rails#98) Account for time zones in DateTime serializations (rails#102) Add sample to set (rails#100) Bump version for 1.3.0 Allow Redis 5.x Add ltrim to lists Coalesce "current pipeline or redis" into the redis method itself Pefer a thread_mattr_accessor over a thread local variable Delete list of keys in batch (rails#90) Use a thread-local variable for pipeline Revert "Use block parameter to pipeline in Redis#multi (rails#68)" ...
…tialize * origin/main: (22 commits) Add kredis_ordered_set for OrderedSet usage in models Add a development console Bump version for 1.5.0 Fix ordered set prepend bug (rails#115) Unique list with sorted set (rails#114) Eliminating Ruby Warnings (rails#112) CI against Redis 7, Ruby 3.1, and Ruby 3.2 (rails#113) Bump version for 1.4.0 Update nokogiri for compatibility Revert "Improved version of UniqueList: OrderedSet (rails#76)" (rails#111) Add `last` to lists (rails#97) Improved version of UniqueList: OrderedSet (rails#76) Return Time objects instead of deprecated DateTime (rails#106) Fix possible deserialization of untrusted data Typecast return of Set#take (rails#105) Declare Active Model dependency (rails#107) Address LogSubscriber deprecation (rails#98) Account for time zones in DateTime serializations (rails#102) Add sample to set (rails#100) Bump version for 1.3.0 ...
Closes #74
Using a benchmark script, we can see the improvement in performance by using a Sorted Set in some scenarios:
Initially, we can introduce this as a new datatype. But to follow on, a migration mechanism will be introduced so that we can replace unique_list.