-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add GetKeysSummary Api Signature/protos #4
Conversation
message GetKeysSummaryResponse { | ||
|
||
// Fetched versions along with the corresponding keys in the request. | ||
repeated KeyValue key_versions = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider: creating a separate struct for KeyVersion instead of re-using KeyValue struct here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say here explicitly that the value will be empty. I think it's OK to reuse the struct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here are my initial thoughts.
we should say somewhere what are the consistency guarantees. for example, we should probably guarantee that the returned KeyValues will have been written at or after the returned globalVersion
, and that they may include writes after the returned globalVersion
.
app/src/main/proto/vss.proto
Outdated
// page's GetKeysSummaryResponse. | ||
// It will be used as exclusive_start_key to perform the new query and only keys greater than | ||
// the specified last_evaluated_key will be returned in the response. | ||
optional string last_evaluated_key = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider naming this be pagination_token
or similar, to discourage the client from assuming anything about the nature of this field. this will give us flexibility in the future to paginate in any order.
could probably drop the paragraph with exclusive_start_key
(it is not defined, and the client doesn't really care)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, some best practices here: https://cloud.google.com/apis/design/design_patterns#list_pagination
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this makes sense,
earlier version of this api was inspired by s3/ddb api.
but this approach is more generic and some of the new aws api's use a similar approach along with google cloud api's.
that best practices guide was also very helpful.
message GetKeysSummaryResponse { | ||
|
||
// Fetched versions along with the corresponding keys in the request. | ||
repeated KeyValue key_versions = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say here explicitly that the value will be empty. I think it's OK to reuse the struct.
I would still provide a paginating |
app/src/main/proto/vss.proto
Outdated
// page's GetKeysSummaryResponse. | ||
// It will be used as exclusive_start_key to perform the new query and only keys greater than | ||
// the specified last_evaluated_key will be returned in the response. | ||
optional string last_evaluated_key = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, some best practices here: https://cloud.google.com/apis/design/design_patterns#list_pagination
app/src/main/proto/vss.proto
Outdated
@@ -98,6 +98,64 @@ message PutObjectRequest { | |||
message PutObjectResponse { | |||
} | |||
|
|||
message GetKeysSummaryRequest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would name this using List
instead of Get
if the response can be paginated. Probably something like ListObjectVersionsRequest
to go with GetObjectRequest
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to ListKeyVersions
app/src/main/proto/vss.proto
Outdated
@@ -98,6 +98,64 @@ message PutObjectRequest { | |||
message PutObjectResponse { | |||
} | |||
|
|||
message GetKeysSummaryRequest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add message-level docs? I don't think it's obvious without them.
app/src/main/proto/vss.proto
Outdated
// page's GetKeysSummaryResponse. | ||
// It will be used as exclusive_start_key to perform the new query and only keys greater than | ||
// the specified last_evaluated_key will be returned in the response. | ||
optional string last_evaluated_key = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May also want to allow users to limit how many to return, as per best practices.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this makes sense, added a field for this.
app/src/main/proto/vss.proto
Outdated
// Use this value to start a new GetKeysSummary operation, excluding this value in the new result | ||
// by specifying this last_evaluated_key in the next request. | ||
// | ||
// If last_evaluated_key is empty, then the "last page" of results has been processed and there is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Empty or not set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some clarification.
Since this is a response object, we want to be as specific as possible, so that it is easier for client to understand. I don't want client to have to think whether it will be empty or not set (even though they are kinda same in protobuf)
Whereas in request structs, we want to be more forgiving of client's usage, both empty and "not set" serve the same purpose and even if client uses one or the other, things should just work and server will handle it.
app/src/main/proto/vss.proto
Outdated
// global_version is a sequence-number/version of the whole store. | ||
// | ||
// global_version is only returned in response for the first page of GetKeysSummaryResponse. | ||
optional int64 global_version = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if the global version is updated on the server before the call for the next page? Will the request fail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, not really.
So the current correct usage of this api is:
- read global_version from first response, store it in local_variable.
- apply all the key versions. (load objects lazy or eager)
- then apply the global_version from the previously stored local_variable.
i should probably add this in doc on server-side as well, to help in usage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, not really.
So the current correct usage of this api is:
- read global_version from first response, store it in local_variable.
- apply all the key versions. (load objects lazy or eager)
Could you describe the lazy case? Is it essentially marking the client-side version as out-of-date, requiring it to be fetched when it needs to read next?
- then apply the global_version from the previously stored local_variable.
i should probably add this in doc on server-side as well, to help in usage.
Hmmm... I'm not sure of this addressed my concern around pagination. Couldn't the second page correspond to a different global version if another client updated the global version after fetching the first page but before fetching the second?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just to clarify, since there's no way to do a batch-get of the entire store, there's no hope for getting a globally consistent snapshot incrementally, right? so perhaps it doesn't matter that each page may have a different global version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it essentially marking the client-side version as out-of-date, requiring it to be fetched when it needs to read next?
Yes, this is optional but on client side we can store the new-version-number and old-version-number. On get of key, if there is a diff, we can make a service call and fetch latest instead of returning the local version.
Couldn't the second page correspond to a different global version if another client updated the global version
The global version is only returned in first page of results. With that being said, there still exists a scenario where globalVersion/store is updated after first page's fetch. This constitutes a race-condition b/w ListKeyVersions on device-1 and put-operation on device-2.
In such a scenario, it is ensured that the api is used safely if client follows the above protocol.
I.e., the put requests and global-versioning still maintains correct versioning and does not let the primary store to get corrupted. (the api design decisions are made to enforce/encourage correct and safe usage by the client.)
It does not guarantee that client has fully updated view of store in all scenarios after refresh, it makes a best-effort to bring client to fully-updated state. In case of race-conditions, we make sure that versioning scheme is not compromised and subsequent refresh will bring client up-to-date. In case where client has slightly out-of-date view of key-versions, subsequent put-requests will fail and client will have to refresh.
Current Api contract is as following: "Here are the minimum key-versions by this
globalVersion. Which is globally consistent snapshot at instance of time when store's globalVersion was this
"
I did not want to return global-version in every page, since client could easily end up misusing that api.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just to be a bit more specific, here is my understanding:
- each page is internally consistent at a certain global version
- a later page can have a greater-than-or-equal global version to the previous page (i.e. monotonically increasing)
is that what you had in mind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does not guarantee that client has fully updated view of store in all scenarios after refresh, it makes a best-effort to bring client to fully-updated state. In case of race-conditions, we make sure that versioning scheme is not compromised and subsequent refresh will bring client up-to-date. In case where client has slightly out-of-date view of key-versions, subsequent put-requests will fail and client will have to refresh.
Few things came to mind:
- Is this robust enough to handle the case where the put-operation on device-2 adds data for a new key that would have been returned to device-1 in the first page?
- Will the server essentially exclude that item from the second page?
- For lightning, does that leave us in a state for instance when device-2 may open a new channel but device-1 may not know it exists for making a payment until either a refresh or an incoming payment? Or for the latter would it fail because the channel is unknown (i.e., it won't bother querying the server)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@devrandom
Not exactly, only first page contains the globalVersion. And the key-versions are guaranteed to be correct only for that global version. Since client would be using that global version in put requests, it will only work if there were no further updates during or after this api request.
Is this robust enough to handle the case where the put-operation on device-2 adds data for a new key that would have been returned to device-1 in the first page?
Yes, since client would be using an out-of-date globalVersion, a client-device not having updated view will not be able to write successfully.
Will the server essentially exclude that item from the second page?
Server does not do anything special, globalVersion ensures that client has complete and updated view. If some change has been made during or after this api, then client's globalVersion is rendered out-of-date.
For lightning, does that leave us in a state for instance when device-2 may open a new channel but device-1 may not know it exists for making a payment until either a refresh or an incoming payment? Or for the latter would it fail because the channel is unknown (i.e., it won't bother querying the server)?
The key is, client is not able to perform any changes or updates if they don't have fully-updated view.
If the primary state has had any change, client-device needs to update/refresh to proceed with any operations.
This refresh could be triggered because of unrelated change or some periodic refresh on client-side.
Client-device could also ensure that they have latest state by just querying and comparing the globalVersion.
Depending on implementation, this could be triggered upfront i.e. before doing the actual work/operation.
it won't bother querying the server
This cannot happen because in order to use anything or make any save, client would need to have fully updated view.
Current process enforces client to have up-to-date view of LN state-machine.
Note that this multi-device usage is intended for end-user devices and not for concurrently running routing nodes of same LN-state machine. In these cases, we do expect some contention and overhead but our main goal is to ensure safe state machine transitions. And since these points of parallel use are user-facing, we expect them to be occasional but at the same time not very high frequency.
Expect context switching to be from one device to another, and in worst case back to same device. But we should not expect back and forth again and again b/w same devices within a short interval. (Even though all transitions are ensured to be safe, there will be some overhead when switching and that should be fine)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, since client would be using an out-of-date globalVersion, a client-device not having updated view will not be able to write successfully.
Server does not do anything special, globalVersion ensures that client has complete and updated view. If some change has been made during or after this api, then client's globalVersion is rendered out-of-date.
My question here was not around writing but rather reading subsequent pages. Say device-1 gets page-1 with:
- key-1
- key-2
- key-4
Then device-2 puts key-3. Would device-1 get key-3 on when retrieving page-2?
The key is, client is not able to perform any changes or updates if they don't have fully-updated view. If the primary state has had any change, client-device needs to update/refresh to proceed with any operations. This refresh could be triggered because of unrelated change or some periodic refresh on client-side.
But say the client just wants to form a payment path using their channels first hop candidates. No write is involved. So they would have an outdated view of the channels (i.e., missing the newly opened channel). I see that if they actually go through and try to make the payment using the path --which would involve a write -- it would fail because of the global version is out of date. But after fetching the data, would they just go through with making the payment? Or would they start over with path finding now that they are aware of the previously missing channel? This is more of a question for users of VSS rather than how it is implemented, I suppose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would device-1 get key-3 on when retrieving page-2?
It might or it might not, depending on DB implementation being used and key ordering. But it shouldn't matter according to the current process. (In both cases, a refresh is still required.)
Second question
That is an implementation detail and yes it depends on how vss is being used and refreshed.
One way to go about it is, client will have some buffer window for which it will assume that it has the latest view.
If client-state was refreshed within T-x time, assume it is latest and proceed.
app/src/main/proto/vss.proto
Outdated
// In case of refreshing complete key-version view on the client-side, correct usage for | ||
// the returned global_version is as following: | ||
// 1. Read global_version from the first page of paginated response, store it as local_variable. | ||
// 2. Update all the key_versions on client-side from the paginated response. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this also mean calling GetObject
for each key? It isn't clear from my reading of this that it is required though it seems implied from my understanding of global versioning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see my other comment above about globally consistent snapshot
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes,
lets say client makes a put operation and it fails due to conflictException. Now client wishes to refresh their view of storage.
This can be done by computing the diff in versions and downloading the updated keys in lazy or eager fashion.
Since only few of the keys need to be updated to ensure fully-updated view, only those keys will be fetched.
app/src/main/proto/vss.proto
Outdated
// This ensures that on client-side, we can guarantee that all current key_versions are at least | ||
// from the corresponding global_version. This guarantee is helpful for ensuring the versioning |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does "current key_versions" mean only those returned by the first page (as this documentation is outlining)? Or for all pages, which seems to imply the docs are discussion fetching more than the first page? If the latter, seems like the above steps should be explicit about fetching subsequent pages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is for all pages,
since the key versions returned in all pages are atleast from globalVersion returned in first page.
I thought it was explicit in "from the paginated response" but reworded it for clarification.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/response/responses
Since there is more than one response, each with a single page. 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when we say paginated, its generally a "paginated response" even though it refers to multiple pages.
@@ -98,6 +98,79 @@ message PutObjectRequest { | |||
message PutObjectResponse { | |||
} | |||
|
|||
message ListKeyVersionsRequest { | |||
|
|||
// store_id is a keyspace identifier. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: might make sense to move repeating doc comments to the top of the file, and just link to them in each request proto.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, i want to do that but there wasn't a clear and nice way to do it.
If we do it the current way, protobuf generated classes have nicely added field docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, i want to do that but there wasn't a clear and nice way to do it. If we do it the current way, protobuf generated classes have nicely added field docs.
Are you saying any top-level docs are not copied anywhere in the generated code? In my past experience, the generated code didn't have any docs at all, so developers would just look at the proto file. But perhaps this has changed.
We should ideally end off in a place where the reader has a high-level summary somewhere and have more detailed explanations as they drill down into messages and fields, IMO. Not sure how this will ultimately look, but I could imagine a few possibilities:
- High-level summary at top, minimal request/response docs, detailed field docs
- High-level summary at top, detailed request/response docs, minimal field docs
- No high-level summary, high-level summary in request/response docs, detailed request/response docs
Regardless of which, we should have some high-level summary somewhere so developers don't have to exhaustively read every field doc to know how everything fits together. Feel free to do in a follow-up, of couse.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you saying any top-level docs are not copied anywhere in the generated code?
Yes, they get omitted, only field and struct level docs get into generated code.
My current plan is
"High-level summary in project docs, No high-level summary in proto, high-level summary in request/response docs, detailed field docs"
In this way, docs on top of fields and structs will get copied to generated code, and high-level project doc serves as reference for usage. It will also contain examples from client-side code.
Will take it as follow-up.
app/src/main/proto/vss.proto
Outdated
// | ||
// In case of refreshing complete key-version view on the client-side, correct usage for | ||
// the returned global_version is as following: | ||
// 1. Read global_version from the first page of paginated response, store it as local_variable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is no need to give a name to the variable - i.e. can just say "and save it."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 changed it to "and save it as local variable."
app/src/main/proto/vss.proto
Outdated
// 2. Update all the key_versions on client-side from all the pages of paginated response. | ||
// 3. Update global_version on client_side from the local_variable stored in step-1. | ||
// This ensures that on client-side, we can guarantee that all current key_versions are at least | ||
// from the corresponding global_version. This guarantee is helpful for ensuring the versioning |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"are at least from the corresponding global_version" -> "were stored at global_version or later"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@devrandom can you take an another look at this. |
Add GetKeysSummary Api Signature/protos
This will auto-generate required request/response objects.
This API can be used for computing version difference of keys b/w client and server-side and lazily load only necessary values.
Some notes from previous discussion on this api:
regarding "return everything with a certain key prefix" => This makes for an unbounded query on server-side hence we should be avoiding it and use pagination.
Also note that we could support storing values upto ~5MB. If we started returning value as well in this response, even a 10 item limit might mean 50MB response. So it is best to only load necessary key-values lazily depending on key-version differences.
Note regarding http/rest+protobuf protocol:
Here protobuf is only being used as a binary serialization format for request and response objects.(instead of writing custom binary serialization or json). This helps both client and server to create requests/response like class instances with ease. This payload will be used as binary body in http-rest post requests.