doc/rfc: v3 api supports for consistent range request
xiang90 committed Apr 18, 2015
1 parent 44182a0 commit 217825a
Showing 1 changed file with 15 additions and 1 deletion.
16 changes: 15 additions & 1 deletion Documentation/rfc/v3api.md
@@ -94,8 +94,9 @@ message ResponseHeader {
 optional string error = 1;
 optional uint64 cluster_id = 2;
 optional uint64 member_id = 3;
-// index of the store when the requested was processed.
+// index of the store when the request was applied.
 optional int64 index = 4;
+// term of raft when the request was applied.
 optional uint64 raft_term = 5;
 }
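
To illustrate how a client might consume these header fields, here is a minimal Go sketch of a staleness check built on the store index. The ResponseHeader struct below simply mirrors the message above; it is not a published client type.

```go
// Sketch only: a client-side staleness check built on the ResponseHeader
// fields above. The struct mirrors the proto message and is not taken from
// any released etcd client.
package main

import "fmt"

type ResponseHeader struct {
	ClusterID uint64
	MemberID  uint64
	Index     int64  // index of the store when the request was applied
	RaftTerm  uint64 // raft term when the request was applied
}

// staleChecker remembers the highest store index the client has observed,
// so a response from a lagging member can be detected and retried.
type staleChecker struct{ lastIndex int64 }

func (s *staleChecker) check(h ResponseHeader) error {
	if h.Index < s.lastIndex {
		return fmt.Errorf("stale response: index %d < last seen %d", h.Index, s.lastIndex)
	}
	s.lastIndex = h.Index
	return nil
}

func main() {
	c := &staleChecker{}
	fmt.Println(c.check(ResponseHeader{Index: 10, RaftTerm: 2})) // <nil>
	fmt.Println(c.check(ResponseHeader{Index: 8, RaftTerm: 2}))  // stale response
}
```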
@@ -105,12 +106,22 @@ message RangeRequest {
 // if the end_key is given, it gets the keys in range [key, end_key).
 optional bytes end_key = 2;
 // limit the number of keys returned.
+// if not all the keys are returned, a continue id will be
+// returned in the range response.
 optional int64 limit = 3;
+// continue the previous range request in a consistent
+// manner if a continue id is provided.
+optional int64 continue_id = 4;

xiang90 (Author) commented on Apr 18, 2015:

@smarterclayton After thinking about this for quite a while, I think this is probably the best approach to solving the consistent range query problem. I do not want to complicate the API, but more importantly I do not want to make it error-prone for users. My previous proposal was too tricky.

Let me know what you think.

smarterclayton (Contributor) commented on Apr 18, 2015:

Would the range query be lost on leader failover? Is the timeout a guide for the client based on the server's expected load/configuration? Is continue_timeout in seconds, raft indices, or something else?

smarterclayton (Contributor) commented on Apr 18, 2015:

Would tracking the continuation require resources on the server? Does the server have to limit the number of continuations (is it a vector for resource exhaustion attacks)?

Just trying to get an idea of what sort of solution this would be, i.e. is this leveraging delete tombstones, or an in-memory behavior?

xiang90 (Author) commented on Apr 18, 2015:

@smarterclayton

> Would the range query be lost on leader failover?

Yes, you would have to retry. But the failover rate should be much lower than your query rate.

> Is the timeout a guide for the client based on the server's expected load/configuration? Is continue_timeout in seconds, raft indices, or something else?

The timeout would be based on the server's load. And it should be very reasonable, like 10 seconds or so. We just need to recycle the resources at some point.

> Does the server have to limit the number of continuations (is it a vector for resource exhaustion attacks)?

Yes. But this is a more general concern. (get/put, etc. all share a similar issue, although their cost is less than that of an open consistent range query.)

> Is this leveraging delete tombstones, or an in-memory behavior?

The underlying database allows us to create a consistent read-only snapshot at low cost. However, the tricky part is that to get an up-to-date consistent snapshot, we need to force-commit all previous in-memory writes/deletes, etc. So a consistent range query costs more than a normal get, but it is not too bad.
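
A minimal sketch of the snapshot idea described above, assuming a Bolt-style backend purely for illustration; the backend choice, the bucket name, and the continuation type are assumptions of this sketch, not part of the RFC.

```go
// Sketch only: one way a server could pin a consistent read-only snapshot
// behind a continue_id, assuming a Bolt-style storage backend. All names
// here are illustrative.
package main

import (
	"bytes"
	"fmt"
	"log"
	"time"

	bolt "go.etcd.io/bbolt"
)

// continuation pins a read-only transaction, which in Bolt is a consistent
// snapshot of the database, until continue_timeout elapses.
type continuation struct {
	tx      *bolt.Tx
	nextKey []byte
	expires time.Time
}

// next returns at most limit keys in [nextKey, end) from the pinned snapshot.
func (c *continuation) next(end []byte, limit int) (kvs map[string][]byte, done bool) {
	kvs = map[string][]byte{}
	cur := c.tx.Bucket([]byte("keys")).Cursor()
	k, v := cur.Seek(c.nextKey)
	for ; k != nil && bytes.Compare(k, end) < 0; k, v = cur.Next() {
		if len(kvs) == limit {
			c.nextKey = append([]byte(nil), k...) // resume point for the next call
			return kvs, false
		}
		kvs[string(k)] = append([]byte(nil), v...)
	}
	return kvs, true
}

func main() {
	db, err := bolt.Open("demo.db", 0600, nil)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
	// Seed a bucket so the read-only snapshot has something to scan.
	if err := db.Update(func(tx *bolt.Tx) error {
		b, err := tx.CreateBucketIfNotExists([]byte("keys"))
		if err != nil {
			return err
		}
		for _, k := range []string{"a", "b", "c"} {
			if err := b.Put([]byte(k), []byte("v")); err != nil {
				return err
			}
		}
		return nil
	}); err != nil {
		log.Fatal(err)
	}

	tx, err := db.Begin(false) // read-only tx == consistent snapshot
	if err != nil {
		log.Fatal(err)
	}
	c := &continuation{tx: tx, nextKey: []byte("a"), expires: time.Now().Add(10 * time.Second)}
	defer c.tx.Rollback() // recycle the snapshot (what continue_timeout bounds)

	kvs, done := c.next([]byte("z"), 2)
	fmt.Println(len(kvs), done) // 2 false
}
```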

 }
 message RangeResponse {
 optional ResponseHeader header = 1;
 repeated KeyValue kvs = 2;
+optional int64 continue_id = 3;
+// the next range request with continue_id must
+// be issued within continue_timeout to the
+// same server.
+optional int64 continue_timeout = 4;
 }
 message PutRequest {
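
Taken together, RangeRequest and RangeResponse imply a client-side pagination loop like the following sketch. The Go structs and the doRange callback are illustrative stand-ins for generated RPC stubs, not an existing etcd client API.

```go
// Sketch only: a client paginating a consistent range with the proposed
// continue_id. Types and the doRange callback are illustrative stand-ins.
package main

import "fmt"

type KeyValue struct {
	Key, Value []byte
}

type RangeRequest struct {
	Key, EndKey []byte
	Limit       int64
	ContinueID  int64 // continue a previous consistent range if non-zero
}

type RangeResponse struct {
	Kvs             []KeyValue
	ContinueID      int64 // non-zero means more keys remain
	ContinueTimeout int64 // seconds within which the next request must arrive
}

// rangeAll keeps issuing range requests against the same server until the
// server stops handing back a continue_id.
func rangeAll(doRange func(RangeRequest) RangeResponse, key, end []byte, limit int64) []KeyValue {
	var all []KeyValue
	req := RangeRequest{Key: key, EndKey: end, Limit: limit}
	for {
		resp := doRange(req)
		all = append(all, resp.Kvs...)
		if resp.ContinueID == 0 {
			return all // no continuation: the consistent range is complete
		}
		// Must be sent to the same server within resp.ContinueTimeout seconds.
		req = RangeRequest{ContinueID: resp.ContinueID}
	}
}

func main() {
	// A fake server that serves keys "a".."e" two at a time.
	keys := []string{"a", "b", "c", "d", "e"}
	pos := 0
	fake := func(r RangeRequest) RangeResponse {
		var resp RangeResponse
		for i := 0; i < 2 && pos < len(keys); i++ {
			resp.Kvs = append(resp.Kvs, KeyValue{Key: []byte(keys[pos])})
			pos++
		}
		if pos < len(keys) {
			resp.ContinueID, resp.ContinueTimeout = 42, 10
		}
		return resp
	}
	fmt.Println(len(rangeAll(fake, []byte("a"), []byte("z"), 2))) // 5
}
```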
@@ -239,6 +250,9 @@ message Action {
 expire = 2;
 }
 optional ActionType event_type = 1;
+// a put action contains the put key-value
+// a delete/expire action contains the previous
+// key-value
 optional KeyValue kv = 2;
 }
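
As a rough illustration of the comment above, a watcher consuming these actions might maintain a local mirror like the sketch below; the types mirror the message, while the handling logic is an assumption of this sketch, not prescribed by the RFC.

```go
// Sketch only: interpreting watch actions per the comment above. A put
// carries the new key-value; a delete/expire carries the previous key-value.
package main

import "fmt"

type ActionType int

const (
	Put ActionType = iota
	Delete
	Expire
)

type KeyValue struct {
	Key, Value []byte
}

type Action struct {
	EventType ActionType
	Kv        KeyValue
}

// apply maintains a local mirror of the keyspace from a stream of actions.
func apply(store map[string][]byte, a Action) {
	switch a.EventType {
	case Put:
		store[string(a.Kv.Key)] = a.Kv.Value // kv is the key-value that was put
	case Delete, Expire:
		delete(store, string(a.Kv.Key)) // kv is the previous key-value
	}
}

func main() {
	store := map[string][]byte{}
	apply(store, Action{EventType: Put, Kv: KeyValue{Key: []byte("foo"), Value: []byte("bar")}})
	apply(store, Action{EventType: Delete, Kv: KeyValue{Key: []byte("foo"), Value: []byte("bar")}})
	fmt.Println(len(store)) // 0
}
```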