feat: read-write splitting #402

proost · 2023-11-09T13:38:07Z

previous discussion: #400

Add read-write splitting in cluster mode.

codecov-commenter · 2023-11-09T13:52:40Z

Codecov Report

Attention: 6 lines in your changes are missing coverage. Please review.

Comparison is base (98aa541) 97.52% compared to head (d115a78) 97.51%.
Report is 10 commits behind head on main.

Files	Patch %	Lines
cluster.go	97.10%	4 Missing and 2 partials ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #402      +/-   ##
==========================================
- Coverage   97.52%   97.51%   -0.02%     
==========================================
  Files          76       76              
  Lines       30868    30999     +131     
==========================================
+ Hits        30105    30229     +124     
- Misses        639      644       +5     
- Partials      124      126       +2

rueian · 2023-11-09T15:32:14Z

cluster.go

+				var p conn
+				isSendToReplicas := c.opt.SendToReplicas(cmd)
+				if isSendToReplicas {
+					p = c.rslots[rand.Intn(int(cmds.InitSlot))]
+				} else {
+					p = c.pslots[rand.Intn(int(cmds.InitSlot))]
+				}
+
+				if p == nil {
+					return nil, 0
+				}
+
+				count.m[p]++
+				connIndexes[i] = p


This part looks incorrect and inconsistent with the previous implementation, where we only do

	init = true
	continue

because we would like to defer slot selection when we have commands with no slot in DoMulti().

For example, when pipelining MULTI, SET k v, EXPIRE k 1000, and EXEC through DoMulti, we would choose the slot from SET k v and EXPIRE k 1000.
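A minimal sketch of the deferred slot selection described in this comment, not the actual rueidis code. The `command` type, the `pickSlot` helper, and the slot value 42 are hypothetical, and `initSlot` is assumed to be 16384, which is consistent with `rand.Intn(int(cmds.InitSlot))` being used to draw a slot in the diff above.

```go
package main

import "fmt"

// initSlot is a stand-in for cmds.InitSlot (assumed value, see the note above).
const initSlot = uint16(1 << 14)

// command is a hypothetical, simplified view of a built command.
type command struct {
	name string
	slot uint16 // initSlot means the command has no key and therefore no slot
}

// pickSlot returns the single slot shared by all keyed commands.
// Keyless commands are skipped, so they follow whatever the keyed ones select.
// ok is false when the keyed commands span more than one slot.
func pickSlot(multi []command) (slot uint16, ok bool) {
	slot = initSlot
	for _, cmd := range multi {
		if cmd.slot == initSlot {
			continue // defer the decision for MULTI, EXEC, INFO, ...
		}
		if slot == initSlot {
			slot = cmd.slot
		} else if slot != cmd.slot {
			return 0, false
		}
	}
	return slot, true
}

func main() {
	tx := []command{
		{name: "MULTI", slot: initSlot},
		{name: "SET k v", slot: 42}, // 42 is an arbitrary illustrative slot
		{name: "EXPIRE k 1000", slot: 42},
		{name: "EXEC", slot: initSlot},
	}
	slot, ok := pickSlot(tx)
	fmt.Println(slot, ok) // prints "42 true"
}
```

If every command is keyless, the sketch returns `initSlot`, leaving the caller free to pick any suitable connection.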

There is a test case for this:

	t.Run("DoMulti Single Slot Read Operation And Write Operation + Init Slot", func(t *testing.T) {
		c1 := client.B().Get().Key("K1{a}").Build()
		c2 := client.B().Set().Key("K2{a}").Value("V2{a}").Build()
		c3 := client.B().Info().Build()
		resps := client.DoMulti(context.Background(), c1, c2, c3)
		if v, err := resps[0].ToString(); err != nil || v != "GET K1{a}" {
			t.Fatalf("unexpected response %v %v", v, err)
		}
		if v, err := resps[1].ToString(); err != nil || v != "SET K2{a} V2{a}" {
			t.Fatalf("unexpected response %v %v", v, err)
		}
		if v, err := resps[2].ToString(); err != nil || v != "INFO" {
			t.Fatalf("unexpected response %v %v", v, err)
		}
	})

In the case above, the "INFO" command is ignored, so I assign a connection to the InitSlot command.

I don't get what you mean by providing the test case.

I mean, previously we intentionally did not select connections for cmd.Slot() == cmds.InitSlot. Therefore, I expect you would also write the same code of

	if cmd.Slot() == cmds.InitSlot {
		init = true
		continue
	}

here.

If multiple connections are selected and there is an "InitSlot" command, the "InitSlot" command may be ignored.

For example, suppose the "GET K1{a}" command uses connection "A", the "SET K2{a} V2{a}" command uses connection "B", and an "InitSlot" command such as "INFO" is sent together with them. The original code ignores "INFO" because no connection is assigned to it.

So if you want me to change this back to the original code, is ignoring that command what you intended?

I see the problem. But, we should only pick one connection if "InitSlot" commands are involved, otherwise commands like MULTI and EXEC will fail.

OK. How about this?
eac17d5
When an "InitSlot" command exists among multiple commands, only one of the primary nodes is used.

Hi @proost,

Sure. Using one of the primary nodes only sounds good to me.

However, eac17d5 makes _pickMulti too complex, IMHO. How about we refactor it a little to first detect whether there are "InitSlot" commands? If there are, we just fall back to the original logic.
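One way to read this suggestion, sketched under assumptions rather than taken from the PR: scan the batch once for keyless commands, and only consult the replica predicate when none are present. The `command` type, `hasInitSlot`, and `route` are hypothetical names, and `initSlot` is an assumed stand-in for `cmds.InitSlot`.

```go
package main

import "fmt"

// initSlot is an assumed stand-in for cmds.InitSlot.
const initSlot = uint16(1 << 14)

// command is a hypothetical, simplified view of a built command.
type command struct {
	name     string
	slot     uint16
	readOnly bool
}

// hasInitSlot reports whether any command in the batch is keyless.
func hasInitSlot(multi []command) bool {
	for _, cmd := range multi {
		if cmd.slot == initSlot {
			return true
		}
	}
	return false
}

// route returns, per command, whether it may be sent to a replica.
// If the batch contains a keyless command (MULTI, EXEC, INFO, ...), nothing
// is sent to replicas, so the whole batch can stay on one primary connection.
func route(multi []command, sendToReplicas func(command) bool) []bool {
	toReplica := make([]bool, len(multi))
	if sendToReplicas == nil || hasInitSlot(multi) {
		return toReplica // all false: original single-connection logic applies
	}
	for i, cmd := range multi {
		toReplica[i] = sendToReplicas(cmd)
	}
	return toReplica
}

func main() {
	readOnly := func(c command) bool { return c.readOnly }
	get := command{name: "GET K1{a}", slot: 5, readOnly: true}
	set := command{name: "SET K2{a} V2{a}", slot: 5}
	info := command{name: "INFO", slot: initSlot}

	fmt.Println(route([]command{get, set, info}, readOnly)) // prints "[false false false]"
	fmt.Println(route([]command{get, set}, readOnly))       // prints "[true false]"
}
```

Doing the detection up front keeps the per-command branch free of special cases, which seems to be the simplification being asked for here.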

1178a7c

Sure! I made it simpler.

rueian · 2023-11-09T15:36:22Z

cluster.go

+	isSendToReplicas := false
+	if c.opt.SendToReplicas != nil {
+		isSendToReplicas = true
+		for _, cmd := range multi {
+			if cmd.Slot() == cmds.InitSlot {
+				continue
+			}
+
+			isSendToReplicas = isSendToReplicas && c.opt.SendToReplicas(cmd)
+			if !isSendToReplicas {
+				break
+			}
+		}
+	}


I think we don't need this part. All the work has already been done in _pickMulti.

This part was also changed in eac17d5.

rueian · 2023-11-16T12:35:17Z

cluster.go

-		p = c.slots[slot]
+		switch {
+		case slot == cmds.InitSlot:
+			p = c.pslots[rand.Intn(int(cmds.InitSlot))]


I just noticed that we should not choose a random connection by choosing a random slot because it is possible that the chosen slot is not covered by the user's cluster. We should choose randomly from known connections instead.
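To illustrate the concern, here is a small sketch contrasting the two approaches. It is not the PR's implementation: the `conn` type, both helper functions, and the partially covered slot table are made up for the example.

```go
package main

import (
	"fmt"
	"math/rand"
)

// conn is a hypothetical stand-in for a node connection.
type conn struct{ addr string }

// pickByRandomSlot mimics the approach under review: draw a slot and return
// whatever serves it. It returns nil when the drawn slot is not covered.
func pickByRandomSlot(slots []*conn) *conn {
	return slots[rand.Intn(len(slots))]
}

// pickFromKnown chooses uniformly among the connections that actually exist.
// It returns nil only when there are no connections at all.
func pickFromKnown(conns map[string]*conn) *conn {
	if len(conns) == 0 {
		return nil
	}
	n := rand.Intn(len(conns))
	for _, c := range conns {
		if n == 0 {
			return c
		}
		n--
	}
	return nil // unreachable: n is always smaller than len(conns)
}

func main() {
	a := &conn{addr: "10.0.0.1:6379"}
	conns := map[string]*conn{a.addr: a}

	// A 16384-entry slot table in which only the first 100 slots are covered.
	slots := make([]*conn, 16384)
	for i := 0; i < 100; i++ {
		slots[i] = a
	}

	misses := 0
	for i := 0; i < 1000; i++ {
		if pickByRandomSlot(slots) == nil {
			misses++
		}
	}
	fmt.Println("nil picks via random slot:", misses)
	fmt.Println("pick from known conns:", pickFromKnown(conns).addr)
}
```

With a sparsely covered table, most random-slot draws land on an uncovered slot, while drawing from the known connections cannot miss as long as at least one connection exists.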

1178a7c

Oh, I missed that point. How about this? It uses pconns.

Hi @proost,

Your pconns reminds me that the original implementation is wrong: it could choose a replica connection for "InitSlot" commands.

However, I wonder if there is a simpler solution than introducing pconns.

Yes, I agree that introducing "pconns" makes the code more complex. But for choosing a primary node connection randomly among the available primary node connections, pconns is a simple and efficient solution.
How about replacing "conns" with "pconns" and "rconns", like I did when replacing "slots" with "pslots" and "rslots"?

Thank you for the pconns and rconns proposal.

But, after reviewing the access pattern of the conns field, I think we still need this unified map. If we replace it with pconns and rconns, many places will become much more complex, for example:

rueidis/cluster.go

Lines 387 to 390 in 4532630

	func (c *clusterClient) redirectOrNew(addr string, prev conn) (p conn) {
		c.mu.RLock()
		p = c.conns[addr]
		c.mu.RUnlock()

Here, we must check both fields to find an existing connection.

Another example is here:

rueidis/cluster.go

Lines 222 to 232 in 4532630

	var removes []conn

	c.mu.RLock()
	for addr, cc := range c.conns {
		if _, ok := conns[addr]; ok {
			conns[addr] = cc
		} else {
			removes = append(removes, cc)
		}
	}
	c.mu.RUnlock()

where we reuse live connections and find which connections to remove after refreshing the cluster information. If we split conns into two fields, this logic will become far more complex, especially when handling live connections that switch from one role to the other.

I think the simplest solution to this problem could be extending the conns field to be a map[string]struct{conn, bool}, where the new bool flag indicates whether it is a primary connection.
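A rough sketch of what such a unified map could look like, with hypothetical names throughout (`connEntry`, `client`, `lookup`, `primaries`, and the string-backed `fakeConn`); the real field layout in the PR may differ.

```go
package main

import (
	"fmt"
	"sort"
)

// conn is a hypothetical minimal connection interface.
type conn interface{ Addr() string }

// fakeConn is a test double that only remembers its address.
type fakeConn string

func (f fakeConn) Addr() string { return string(f) }

// connEntry pairs a connection with a role flag, as proposed above.
type connEntry struct {
	conn    conn
	primary bool
}

type client struct {
	conns map[string]connEntry
}

// lookup needs a single map access regardless of the node's role.
// A missing address yields the zero entry, so the result is nil.
func (c *client) lookup(addr string) conn {
	return c.conns[addr].conn
}

// primaries lists the addresses of primary connections, sorted for stable output.
func (c *client) primaries() []string {
	var out []string
	for addr, e := range c.conns {
		if e.primary {
			out = append(out, addr)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	c := &client{conns: map[string]connEntry{
		"10.0.0.1:6379": {conn: fakeConn("10.0.0.1:6379"), primary: true},
		"10.0.0.2:6379": {conn: fakeConn("10.0.0.2:6379"), primary: false},
		"10.0.0.3:6379": {conn: fakeConn("10.0.0.3:6379"), primary: true},
	}}
	fmt.Println(c.lookup("10.0.0.2:6379").Addr())
	fmt.Println(c.primaries())
	fmt.Println(c.lookup("10.0.0.9:6379") == nil)
}
```

The address lookup stays a one-field operation, and the role is only consulted where it matters, such as when picking a primary.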

Nice. I changed it in 3139c13.

rueian · 2023-11-20T12:03:51Z

cluster.go

+// NOTE: clusterConnection and conn must be initialized at the same time
+type clusterConnection struct {
+	conn              conn
+	isPrimaryNodeConn bool


Suggested change

-	isPrimaryNodeConn bool
+	replica           bool

I think it is better for a conn to be treated as a non-replica connection by default, whereas the current isPrimaryNodeConn treats a conn as a non-primary connection by default.
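The point rests on Go's zero values: an unset bool is false, so the field name decides what an uninitialized entry means. A tiny sketch with two hypothetical struct types:

```go
package main

import "fmt"

// withPrimaryFlag mirrors the naming before the suggestion.
type withPrimaryFlag struct {
	isPrimaryNodeConn bool
}

// withReplicaFlag mirrors the suggested naming.
type withReplicaFlag struct {
	replica bool
}

func main() {
	var a withPrimaryFlag // zero value: not marked as primary
	var b withReplicaFlag // zero value: not marked as replica

	fmt.Println(a.isPrimaryNodeConn) // prints "false": reads as a replica by default
	fmt.Println(b.replica)           // prints "false": reads as a primary by default
}
```

With the `replica` naming, code that forgets to set the flag ends up treating the connection as a primary rather than as a replica.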

I changed this too, in 3bb4fb5.

rueian · 2023-11-20T12:10:57Z

cluster.go

@@ -226,31 +253,54 @@ func (c *clusterClient) _refresh() (err error) {
 		if _, ok := conns[addr]; ok {
 			conns[addr] = cc


The role of a connection can change over time. Therefore, we should not overwrite the whole entry with the one from the old map; we should keep the new role instead.
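A minimal sketch of the merge being asked for, assuming an entry type with a `replica` flag. The `conn` and `connrole` definitions and the `merge` helper are illustrative only and simplified from whatever `_refresh` actually does.

```go
package main

import "fmt"

// conn is a hypothetical stand-in for a live connection.
type conn struct{ addr string }

// connrole pairs a connection with its role (field names are assumptions).
type connrole struct {
	conn    *conn
	replica bool
}

// merge reuses live connections from old for every address still present in
// fresh, keeps the role recorded in fresh, and returns the connections that
// are no longer part of the cluster and should be closed.
func merge(old, fresh map[string]connrole) (removes []*conn) {
	for addr, cc := range old {
		if fc, ok := fresh[addr]; ok {
			fc.conn = cc.conn // reuse the live connection
			fresh[addr] = fc  // but keep the role from the new topology
		} else {
			removes = append(removes, cc.conn)
		}
	}
	return removes
}

func main() {
	a, b, c := &conn{"a"}, &conn{"b"}, &conn{"c"}
	old := map[string]connrole{
		"a": {conn: a},                // was a primary
		"b": {conn: b, replica: true}, // was a replica
		"c": {conn: c},                // node that has since left
	}
	// After a failover, "a" and "b" have swapped roles and "c" is gone.
	fresh := map[string]connrole{
		"a": {conn: &conn{"a"}, replica: true},
		"b": {conn: &conn{"b"}},
	}
	removes := merge(old, fresh)
	fmt.Println(fresh["a"].conn == a, fresh["a"].replica) // prints "true true"
	fmt.Println(fresh["b"].conn == b, fresh["b"].replica) // prints "true false"
	fmt.Println(len(removes), removes[0].addr)            // prints "1 c"
}
```

Copying only the connection, rather than the whole old entry, is what lets a reused connection pick up its new role after a failover.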

Oh, I missed that point. Fixed in d115a78. Thank you!

rueian · 2023-11-20T12:12:48Z

cluster.go

 func newClusterClient(opt *ClientOption, connFn connFn) (client *clusterClient, err error) {
 	client = &clusterClient{
 		cmd:    cmds.NewBuilder(cmds.InitSlot),
 		opt:    opt,
 		connFn: connFn,
-		conns:  make(map[string]conn),
+		conns:  make(map[string]*clusterConnection),


Suggested change

-		conns:  make(map[string]*clusterConnection),
+		conns:  make(map[string]connrole),

clusterConnection is too long for me. I also think it is better to use the struct directly rather than a pointer to it.

OK. I changed it in 3bb4fb5.

rueian · 2023-11-21T12:01:41Z

Hi @proost,

Thank you for your hard work! It looks good to me.

feat: read-write splitting

48408b1

rueian reviewed Nov 9, 2023

refactor: make isSendReplicas methos

f7197a8

proost requested a review from rueian November 13, 2023 11:39

rueian reviewed Nov 13, 2023

rueian reviewed Nov 13, 2023

proost added 2 commits November 14, 2023 20:03

refactor: remove useless assigned variable

600ecb0

refactor: remove unused variable

e7db040

proost requested a review from rueian November 14, 2023 11:06

fix: handle transaction

eac17d5

rueian reviewed Nov 16, 2023

proost added 2 commits November 18, 2023 00:45

refactor: use pconns

1178a7c

refactor: change more code simpler

9cf7c83

proost requested a review from rueian November 17, 2023 15:51

refactor: define cluster connection

3139c13

rueian reviewed Nov 20, 2023

proost added 2 commits November 21, 2023 16:28

refactor: rename more shorter & use struct

3bb4fb5

refacdtor: assign new connrole

d115a78

proost requested a review from rueian November 21, 2023 07:30

rueian merged commit cb76d15 into redis:main Nov 21, 2023

rueian added the feature label Nov 21, 2023

proost deleted the feat-read-write-splitting branch November 22, 2023 05:07
