Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make KEYS to be an exact match if there is no pattern #792

Merged
merged 1 commit into from
Aug 27, 2024

Conversation

enjoy-binbin
Copy link
Member

@enjoy-binbin enjoy-binbin commented Jul 16, 2024

Although KEYS is a dangerous command and we recommend people
to avoid using it, some people who are not familiar with it
still using it, and even use KEYS with no pattern at all.

Once KEYS is using with no pattern, we can convert it to an
exact match to avoid iterating over all data.

This PR got revert in #964.

Although KEYS is a dangerous command and we recommend people
to avoid using it, some people who are not familiar with it
still using it, and even use KEYS with no pattern at all.

Once KEYS is using with no pattern, we can convert it to an
exact match to avoid iterating over all data.

Signed-off-by: Binbin <binloveplay1314@qq.com>
Copy link

codecov bot commented Jul 16, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 70.28%. Comparing base (1a8bd04) to head (330fe53).
Report is 89 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable     #792      +/-   ##
============================================
+ Coverage     70.25%   70.28%   +0.03%     
============================================
  Files           112      112              
  Lines         60590    60607      +17     
============================================
+ Hits          42567    42600      +33     
+ Misses        18023    18007      -16     
Files Coverage Δ
src/db.c 89.06% <100.00%> (+0.67%) ⬆️

... and 13 files with indirect coverage changes

@lipzhu
Copy link
Contributor

lipzhu commented Jul 16, 2024

Maybe we can extend this more common, for something like KEYS h[ae]llo , we can translate it to exactly match as hallo and hello. redis/redis#13079

@enjoy-binbin
Copy link
Member Author

Maybe we can extend this more common, for something like KEYS h[ae]llo , we can translate it to exactly match as hallo and hello

it seems like a good idea in some engine, but i think it adds too much code, and it doesn't seem worth doing so much pre-optimization in here.

@enjoy-binbin enjoy-binbin merged commit d66a06e into valkey-io:unstable Aug 27, 2024
20 checks passed
@enjoy-binbin enjoy-binbin deleted the match_keys branch August 27, 2024 04:04
@enjoy-binbin enjoy-binbin added the release-notes This issue should get a line item in the release notes label Aug 27, 2024
@soloestoy
Copy link
Member

We discussed this issue when implementing the one-dict-per-slot feature, and we considered mapping the pattern for commands like SCAN to slots using hashtags. However, when it comes to determining the key for the pattern, I used the behavior of the EXISTS command as an example of what to avoid redis/redis#12536 (comment).

We implemented this feature on our platform before (as early as version 2.8), and it resulted in very undesirable outcomes. Specifically, the time complexity became ambiguous, fluctuating between O(1) and O(n), which is a significant difference.

For instance, if a user initially used a non-pattern-based pattern like KEYS abc, they would quickly get results. However, when they used a pattern like xyz*, the overall result would be very poor, causing instances to get stuck due to slow queries and even leading to system crashes in the production environment. This was a real-life scenario.

I believe the right thing to do is to persuade users to use EXISTS instead of KEYS, to do the right thing the right way, rather than accommodating this incorrect usage. We should revert this "optimization" @valkey-io/core-team

@zuiderkwast zuiderkwast added the major-decision-pending Major decision pending by TSC team label Aug 28, 2024
@enjoy-binbin
Copy link
Member Author

enjoy-binbin commented Aug 28, 2024

ok, i did not read redis/redis#12536 at that time so i missed this context, sorry about that, and sorry i am not realizing it

We implemented this feature on our platform before (as early as version 2.8), and it resulted in very undesirable outcomes. Specifically, the time complexity became ambiguous, fluctuating between O(1) and O(n), which is a significant difference.

so you guys revert it at the end? i can see it will cause trouble, a guy execute very fast in here, and then get blocked when executing somewhere else. But i think in this case, it is ok if we document it (or we can even not to document it), O(N) degenerates to O(1) which is OK in my opinion, the worst case is O(N) and we did not deny it, and if he get blocked at the end, it is also normal since it is a wrong case.

to do the right thing the right way

i agree with this actually, i hope we can do it. But there are sill a lot of user doing this in the wrong way, they did not bother to do the right thing and everytime they are blaming the "service", all we can do i think is at least try our best to migrate the impact with a small cost. But i do agree with this, i am ok with both (revert it or document it or just leave it alone), lets seeks a vote and then i can process it.

And speaking of this, i would also like ask everyone to discuss this one: #866 (comment). What can or what should we do to migrate this. If we agree with something, i think we can fit it into 8.0

@soloestoy
Copy link
Member

When I develop a new feature or optimization, I often remind myself that Redis and Valkey are products designed for online services.

As an engineer at a cloud provider, with my experience facing various different scenarios, I have found that online services and offline services are quite different. The characteristic of online services is that they require a high level of determinism, both in terms of execution results and performance, because online services can change at any time due to user actions at the front end, while the deployed code for online services remains fixed.

Therefore, uncertainty can potentially bring disaster to the business. Take the KEYS command as an example - in my view, if a general optimization can reduce the time complexity from O(n) to O(log n), it would be very meaningful. However, if the time complexity fluctuates between O(n) and O(1), this kind of uncertainty is very dangerous.

@enjoy-binbin enjoy-binbin removed the release-notes This issue should get a line item in the release notes label Aug 29, 2024
enjoy-binbin added a commit to enjoy-binbin/valkey that referenced this pull request Aug 29, 2024
In valkey-io#792, the time complexity became ambiguous, fluctuating between
O(1) and O(n), which is a significant difference. And we agree uncertainty
can potentially bring disaster to the business, the right thing to do is
to persuade users to use EXISTS instead of KEYS in this case, to do the
right thing the right way, rather than accommodating this incorrect usage.

This reverts commit d66a06e.
This reverts valkey-io#792.

Signed-off-by: Binbin <binloveplay1314@qq.com>
enjoy-binbin added a commit that referenced this pull request Aug 29, 2024
In #792, the time complexity became ambiguous, fluctuating between
O(1) and O(n), which is a significant difference. And we agree uncertainty
can potentially bring disaster to the business, the right thing to do is
to persuade users to use EXISTS instead of KEYS in this case, to do the
right thing the right way, rather than accommodating this incorrect usage.

This reverts commit d66a06e.
This reverts #792.

Signed-off-by: Binbin <binloveplay1314@qq.com>
madolson pushed a commit that referenced this pull request Sep 2, 2024
Although KEYS is a dangerous command and we recommend people
to avoid using it, some people who are not familiar with it
still using it, and even use KEYS with no pattern at all.

Once KEYS is using with no pattern, we can convert it to an
exact match to avoid iterating over all data.

Signed-off-by: Binbin <binloveplay1314@qq.com>
madolson pushed a commit that referenced this pull request Sep 2, 2024
In #792, the time complexity became ambiguous, fluctuating between
O(1) and O(n), which is a significant difference. And we agree uncertainty
can potentially bring disaster to the business, the right thing to do is
to persuade users to use EXISTS instead of KEYS in this case, to do the
right thing the right way, rather than accommodating this incorrect usage.

This reverts commit d66a06e.
This reverts #792.

Signed-off-by: Binbin <binloveplay1314@qq.com>
madolson pushed a commit that referenced this pull request Sep 3, 2024
Although KEYS is a dangerous command and we recommend people
to avoid using it, some people who are not familiar with it
still using it, and even use KEYS with no pattern at all.

Once KEYS is using with no pattern, we can convert it to an
exact match to avoid iterating over all data.

Signed-off-by: Binbin <binloveplay1314@qq.com>
madolson pushed a commit that referenced this pull request Sep 3, 2024
In #792, the time complexity became ambiguous, fluctuating between
O(1) and O(n), which is a significant difference. And we agree uncertainty
can potentially bring disaster to the business, the right thing to do is
to persuade users to use EXISTS instead of KEYS in this case, to do the
right thing the right way, rather than accommodating this incorrect usage.

This reverts commit d66a06e.
This reverts #792.

Signed-off-by: Binbin <binloveplay1314@qq.com>
PingXie pushed a commit to PingXie/valkey that referenced this pull request Sep 14, 2024
Although KEYS is a dangerous command and we recommend people
to avoid using it, some people who are not familiar with it
still using it, and even use KEYS with no pattern at all.

Once KEYS is using with no pattern, we can convert it to an
exact match to avoid iterating over all data.

Signed-off-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: Ping Xie <pingxie@google.com>
PingXie pushed a commit to PingXie/valkey that referenced this pull request Sep 14, 2024
…io#964)

In valkey-io#792, the time complexity became ambiguous, fluctuating between
O(1) and O(n), which is a significant difference. And we agree uncertainty
can potentially bring disaster to the business, the right thing to do is
to persuade users to use EXISTS instead of KEYS in this case, to do the
right thing the right way, rather than accommodating this incorrect usage.

This reverts commit d66a06e.
This reverts valkey-io#792.

Signed-off-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: Ping Xie <pingxie@google.com>
PingXie pushed a commit to PingXie/valkey that referenced this pull request Sep 14, 2024
Although KEYS is a dangerous command and we recommend people
to avoid using it, some people who are not familiar with it
still using it, and even use KEYS with no pattern at all.

Once KEYS is using with no pattern, we can convert it to an
exact match to avoid iterating over all data.

Signed-off-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: Ping Xie <pingxie@google.com>
PingXie pushed a commit to PingXie/valkey that referenced this pull request Sep 14, 2024
…io#964)

In valkey-io#792, the time complexity became ambiguous, fluctuating between
O(1) and O(n), which is a significant difference. And we agree uncertainty
can potentially bring disaster to the business, the right thing to do is
to persuade users to use EXISTS instead of KEYS in this case, to do the
right thing the right way, rather than accommodating this incorrect usage.

This reverts commit d66a06e.
This reverts valkey-io#792.

Signed-off-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: Ping Xie <pingxie@google.com>
PingXie pushed a commit to PingXie/valkey that referenced this pull request Sep 14, 2024
Although KEYS is a dangerous command and we recommend people
to avoid using it, some people who are not familiar with it
still using it, and even use KEYS with no pattern at all.

Once KEYS is using with no pattern, we can convert it to an
exact match to avoid iterating over all data.

Signed-off-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: Ping Xie <pingxie@google.com>
PingXie pushed a commit to PingXie/valkey that referenced this pull request Sep 14, 2024
…io#964)

In valkey-io#792, the time complexity became ambiguous, fluctuating between
O(1) and O(n), which is a significant difference. And we agree uncertainty
can potentially bring disaster to the business, the right thing to do is
to persuade users to use EXISTS instead of KEYS in this case, to do the
right thing the right way, rather than accommodating this incorrect usage.

This reverts commit d66a06e.
This reverts valkey-io#792.

Signed-off-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: Ping Xie <pingxie@google.com>
PingXie pushed a commit to PingXie/valkey that referenced this pull request Sep 14, 2024
Although KEYS is a dangerous command and we recommend people
to avoid using it, some people who are not familiar with it
still using it, and even use KEYS with no pattern at all.

Once KEYS is using with no pattern, we can convert it to an
exact match to avoid iterating over all data.

Signed-off-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: Ping Xie <pingxie@google.com>
PingXie pushed a commit to PingXie/valkey that referenced this pull request Sep 14, 2024
…io#964)

In valkey-io#792, the time complexity became ambiguous, fluctuating between
O(1) and O(n), which is a significant difference. And we agree uncertainty
can potentially bring disaster to the business, the right thing to do is
to persuade users to use EXISTS instead of KEYS in this case, to do the
right thing the right way, rather than accommodating this incorrect usage.

This reverts commit d66a06e.
This reverts valkey-io#792.

Signed-off-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: Ping Xie <pingxie@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
major-decision-pending Major decision pending by TSC team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants