-
Notifications
You must be signed in to change notification settings - Fork 183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: read-write splitting #402
Conversation
Codecov ReportAttention:
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## main #402 +/- ##
==========================================
- Coverage 97.52% 97.51% -0.02%
==========================================
Files 76 76
Lines 30868 30999 +131
==========================================
+ Hits 30105 30229 +124
- Misses 639 644 +5
- Partials 124 126 +2 ☔ View full report in Codecov by Sentry. |
cluster.go
Outdated
var p conn | ||
isSendToReplicas := c.opt.SendToReplicas(cmd) | ||
if isSendToReplicas { | ||
p = c.rslots[rand.Intn(int(cmds.InitSlot))] | ||
} else { | ||
p = c.pslots[rand.Intn(int(cmds.InitSlot))] | ||
} | ||
|
||
if p == nil { | ||
return nil, 0 | ||
} | ||
|
||
count.m[p]++ | ||
connIndexes[i] = p |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part looks incorrect and inconsistent with the previous implementation where we only do
init = true
continue
because we would like to defer slot selection when we have commands with no slot in DoMulti().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example, when pipelining the MULTI
, SET k v
, EXIPRE k 1000
, EXEC
through DoMulti, we would choose the slot by SET k v
and EXIPRE k 1000
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is test cases for this:
t.Run("DoMulti Single Slot Read Operation And Write Operation + Init Slot", func(t *testing.T) {
c1 := client.B().Get().Key("K1{a}").Build()
c2 := client.B().Set().Key("K2{a}").Value("V2{a}").Build()
c3 := client.B().Info().Build()
resps := client.DoMulti(context.Background(), c1, c2, c3)
if v, err := resps[0].ToString(); err != nil || v != "GET K1{a}" {
t.Fatalf("unexpected response %v %v", v, err)
}
if v, err := resps[1].ToString(); err != nil || v != "SET K2{a} V2{a}" {
t.Fatalf("unexpected response %v %v", v, err)
}
if v, err := resps[2].ToString(); err != nil || v != "INFO" {
t.Fatalf("unexpected response %v %v", v, err)
}
})
Above case, "INFO" command is ignored. so i add connection to init slot command
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get what you mean by providing the test case.
I mean, previously we intentionally did not select connections for cmd.Slot() == cmds.InitSlot
. Therefore, I expect you would also write the same code of
if cmd.Slot() == cmds.InitSlot {
init = true
continue
}
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is possible if multiple connections is selected and there is "InitSlot" command, then "InitSlot" command is ignored.
For exmaple,
"GET K1{a}" command uses connection "A", "SET K2{a} V2{a}" command uses connection "B", and there is "InitSlot" command likes "INFO" together. original code ignore "INFO" because none of connections is assigned.
So if you want to changed to original code, above ignoring command is what you intended?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see the problem. But, we should only pick one connection if "InitSlot" commands are involved, otherwise commands like MULTI
and EXEC
will fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. How about this?
eac17d5
When "InitSlot" command exist among multiple commands, use one of primary node only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. How about this? eac17d5 When "InitSlot" command exist among multiple commands, use one of primary node only.
Hi @proost,
Sure. Using one of the primary nodes only sounds good to me.
However, the eac17d5 makes the _pickMulti becomes too complex, IMHO. How about we refactor it a little bit to first detect if there are "InitSlot" commands? If yes, we then just go to the original logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure! I changed it more simple
cluster.go
Outdated
isSendToReplicas := false | ||
if c.opt.SendToReplicas != nil { | ||
isSendToReplicas = true | ||
for _, cmd := range multi { | ||
if cmd.Slot() == cmds.InitSlot { | ||
continue | ||
} | ||
|
||
isSendToReplicas = isSendToReplicas && c.opt.SendToReplicas(cmd) | ||
if !isSendToReplicas { | ||
break | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we don't need this part. All the works have been done in the _pickMulti.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part also changed it in here. eac17d5
cluster.go
Outdated
var p conn | ||
isSendToReplicas := c.opt.SendToReplicas(cmd) | ||
if isSendToReplicas { | ||
p = c.rslots[rand.Intn(int(cmds.InitSlot))] | ||
} else { | ||
p = c.pslots[rand.Intn(int(cmds.InitSlot))] | ||
} | ||
|
||
if p == nil { | ||
return nil, 0 | ||
} | ||
|
||
count.m[p]++ | ||
connIndexes[i] = p |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get what you mean by providing the test case.
I mean, previously we intentionally did not select connections for cmd.Slot() == cmds.InitSlot
. Therefore, I expect you would also write the same code of
if cmd.Slot() == cmds.InitSlot {
init = true
continue
}
here.
cluster.go
Outdated
p = c.slots[slot] | ||
switch { | ||
case slot == cmds.InitSlot: | ||
p = c.pslots[rand.Intn(int(cmds.InitSlot))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just noticed that we should not choose a random connection by choosing a random slot because it is possible that the chosen slot is not covered by the user's cluster. We should choose randomly from known connections instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I missed that point. how about this? use pconns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @proost,
Your pconns reminds me that the original implementation is wrong. The original implementation is possible to choose a replica connection for “InitSlot” commands.
However, I wonder if there is a simpler solution than introducing pconns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I agree that introducing "pconns" makes code more complex. but when choose primary node connection randomly among available primary node connections, pconn is simple & efficient solution.
How about replace "conns" with "pconns" and "rconns" like i did to replace "slots" to "pslots" and "rslots"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the pconns
and rconns
proposal.
But, after reviewing the access pattern of the conns
field, I think we still need this unified map. If we replace it with pconns
and rconns
, many places will become much more complex, for example:
Lines 387 to 390 in 4532630
func (c *clusterClient) redirectOrNew(addr string, prev conn) (p conn) { | |
c.mu.RLock() | |
p = c.conns[addr] | |
c.mu.RUnlock() |
Here, we must check both fields to find an existing connection.
Another example is here:
Lines 222 to 232 in 4532630
var removes []conn | |
c.mu.RLock() | |
for addr, cc := range c.conns { | |
if _, ok := conns[addr]; ok { | |
conns[addr] = cc | |
} else { | |
removes = append(removes, cc) | |
} | |
} | |
c.mu.RUnlock() |
where we reuse living connections and find which connections to remove after refreshing the cluster information. If we split the
conns
into two fields, this logic will become far more complex, especially when handling living connections switching their roles from one to the other.
I think the simplest solution to this problem could be extending the conns
field to be a map[string]struct{conn, bool}
where the new bool flag indicates it is a primary connection or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. I changed it 3139c13
cluster.go
Outdated
// NOTE: clusterConnection and conn must be initialized at the same time | ||
type clusterConnection struct { | ||
conn conn | ||
isPrimaryNodeConn bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isPrimaryNodeConn bool | |
replica bool |
I think it is better to consider a conn not a replica connection by default whereas the current isPrimaryNodeConn treats a conn not a primary connection by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also changed it too. 3bb4fb5
cluster.go
Outdated
@@ -226,31 +253,54 @@ func (c *clusterClient) _refresh() (err error) { | |||
if _, ok := conns[addr]; ok { | |||
conns[addr] = cc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The role of a connection can change overtime, therefore we should not override the whole field from the old map, we should keep the new role instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, i missed that point. d115a78 Thank you!
cluster.go
Outdated
func newClusterClient(opt *ClientOption, connFn connFn) (client *clusterClient, err error) { | ||
client = &clusterClient{ | ||
cmd: cmds.NewBuilder(cmds.InitSlot), | ||
opt: opt, | ||
connFn: connFn, | ||
conns: make(map[string]conn), | ||
conns: make(map[string]*clusterConnection), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
conns: make(map[string]*clusterConnection), | |
conns: make(map[string]connrole), |
clusterConnection is too long for me. And I think it is better to use struct directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
3bb4fb5 OK. I changed it.
Hi @proost, Thank you for your hard work! It looks good to me. |
previous discussion: #400
add read-write splitting when cluster mode