Endpoint to perform client state sync #43
Comments
I think as initial implementation (to simplify things) we can do something like this:
This will work for the purposes of the first testnet, and we can expand the request options later on to include tags, senders, etc. To state one assumption explicitly: once the client applies |
By the way, for the
In the above, a combination of And for the
In the above, |
Should issue #22 be closed then? It looks like account states are also being synced over this endpoint. |
As mentioned in 0xPolygonMiden/crypto#195 (comment), we may want to include one more field in the response. The new response schema would look like so:
When interpreting this response, the client would do something like this to update its local chain MMR:

```rust
let num_notes = /* based on the number of notes returned in the response */;

chain_mmr.update(mmr_delta).unwrap();
if num_notes > 0 {
    chain_mmr.add(block_num, block_header_hash, path_to_last_header).unwrap();
}

// TODO: check that new chain_mmr is consistent with `block_header.chain_root`
```

Thus, by repeatedly calling this endpoint until
|
Expanding on the ideas from #38 (comment) and simplifying some things, I think we can combine everything into a single endpoint which we can call
In the above:
Pros
Cons
Open questions
|
I think we should still keep it, but in my mind, it is no longer on the critical path. |
Some thoughts about the
Maybe user behavior could also affect the effectiveness of the approach above. I think the best way to address this is to actually anonymize the request for nullifiers, or to request from multiple nodes instead. |
Great points! My current thinking is that we can leave the exact strategies for how to rotate nullifier prefixes, how many decoys to use etc. to the future. And maybe these could be determined individually by the clients based on their specific needs. As you mentioned, these may need to rely on some additional data (current/historical TPS), but I'm thinking that this can be obtained from external sources or maybe even from the nodes themselves via a different endpoint. A couple of specific comments.
Yes, this is how I imagined it working - i.e., the client keeps using the same prefix until it receives at least several (or maybe several dozen) nullifiers for this prefix. At 20 TPS, this would mean that nullifier rotation could happen on a daily basis. But again, the exact strategies and parameters require much deeper analysis and can probably be left to the future.
Good idea, and I think it can be used together with (rather than instead of) the above.
My initial thinking was that in such situations the client would not rotate the set of nullifiers during the entire sync. But there could be other strategies too. |
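A minimal sketch of the rotation heuristic discussed above, assuming a 16-bit nullifier prefix and a purely illustrative threshold; the type, field, and constant names are assumptions, not part of any existing API:

```rust
/// Hypothetical client-side state for one nullifier prefix used in sync requests.
struct PrefixTracker {
    /// The 16-bit prefix currently sent to the node.
    prefix: u16,
    /// How many nullifiers matching this prefix the node has returned so far.
    matches_seen: usize,
}

impl PrefixTracker {
    /// Illustrative threshold: rotate only after the prefix has "blended in"
    /// with a few dozen unrelated nullifiers (the exact value needs deeper analysis).
    const ROTATION_THRESHOLD: usize = 24;

    /// Record the matches returned by one sync response and decide whether to
    /// switch to a fresh prefix for subsequent requests.
    fn record_matches(&mut self, new_matches: usize, fresh_prefix: u16) {
        self.matches_seen += new_matches;
        if self.matches_seen >= Self::ROTATION_THRESHOLD {
            // Rotate: start over with a new prefix and reset the counter.
            self.prefix = fresh_prefix;
            self.matches_seen = 0;
        }
    }
}
```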
I think I didn't do a good job at presenting my point. I'm trying to say it isn't a good strategy to rotate decoys. Here is another take:
With that said, the proposal is to go with 100 TPS as an estimate, so it seems the decoys won't be necessary. |
Recently we discussed that this endpoint could be sufficient for the state sync needs. I see a few issues:

- Issue 1: This assumes clients have a reference block number to start from, which is not true for new clients.
- Issue 2: The above can be read to imply it is possible to provide such a start block, which is not true. Example: the client wants to consume a DEX note produced prior to that reference point (it would never see the note's inclusion proof).
- Issue 3: This assumes blocks which have been synced over are no longer useful, which is not true. Example: the client has synced from genesis to the tip of the chain, and wants to consume a note that has already been produced. (The DEX example applies here too: the note's inclusion proof would be missing, not because the block wasn't seen, but because the request that saw the block didn't include the note's tag.)
- Issue 4: Issues 2/3 also happen for the nullifiers.

Because of the above, I think this endpoint is useful for a client to fetch state from the rollup when its filters have not changed, but we still need endpoints to look into the past. Here are some possible solutions:

- For issue 1: I guess we can use the block-header endpoint to fetch the latest block header, and use that as the reference point. Note: this doesn't fix the other issues.
- For issue 2, w/ side communication channel: The user would learn about the note's block height by the same means it learns about the note, or even better, the user would receive the authentication path for the note, in which case this endpoint is not necessary for this case.
- For issue 2, w/o side communication channel: If there is no out-of-band communication channel, then the user does not know the block height and has to start at genesis. This doesn't have great performance and needs some thought w.r.t. pruning.
- For issue 2, w/ pay-to-address: For a subset of the notes, which need the user's account id as a target, the issue goes away with some careful setup:
- For issue 3: The solutions for issue 2 apply. On top of that, the client would need to perform multiple concurrent requests, and hopefully the request for newly added items would move fast enough so the user could merge the syncs into a single call.
- For issue 4, w/ side communication channel: This could be a griefing attack vector, so the user shouldn't trust anything except the state of the head.
- For issue 4, w/o side communication channel: I think this would need an endpoint to fetch the nullifiers based on epochs, similar to this.
|
If the client wants to sync from genesis, then they can probably use 0 as the starting block number.

If the client doesn't care about syncing from genesis (e.g., this is a new client and they know for sure there are no nullifiers or notes addressed to them), then we'll need to provide another endpoint. Something that gives the client:
|
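To make the repeated-call pattern described above concrete, here is a minimal sketch of a client syncing from genesis by calling the endpoint in a loop; `SyncStateRequest`, `SyncStateResponse`, `send_sync_request`, and `apply_response` are hypothetical names, and the loop structure is only an assumption about how such a client could be written:

```rust
// Hypothetical request/response shapes for the sync endpoint.
struct SyncStateRequest {
    block_num: u32,               // reference block to sync from (0 = genesis)
    note_tags: Vec<u32>,          // note tags the client is interested in
    nullifier_prefixes: Vec<u16>, // nullifier prefixes the client is watching
}

struct SyncStateResponse {
    block_num: u32, // block the node advanced the client to
    chain_tip: u32, // latest block known to the node
    // ... notes, nullifiers, account updates, MMR delta, etc.
}

fn sync_from_genesis(
    mut send_sync_request: impl FnMut(&SyncStateRequest) -> SyncStateResponse,
    mut apply_response: impl FnMut(&SyncStateResponse),
    note_tags: Vec<u32>,
    nullifier_prefixes: Vec<u16>,
) {
    let mut block_num = 0; // start from genesis
    loop {
        let request = SyncStateRequest {
            block_num,
            note_tags: note_tags.clone(),
            nullifier_prefixes: nullifier_prefixes.clone(),
        };
        let response = send_sync_request(&request);
        apply_response(&response);

        // Stop once the node has caught the client up to the chain tip.
        if response.block_num >= response.chain_tip {
            break;
        }
        block_num = response.block_num;
    }
}
```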
For issues 2, 3, and 4, I think a solution would be to introduce the concept of epochs (similar to what you mention in your post). Let's say we define an epoch as two weeks; then the endpoint would work as follows:
With the above, the client will have reference points on epoch boundaries. So, if they would like to get some additional data from a past epoch, they'll be able to request it and merge it into their local database. I can implement this enhancement at a later date though. |
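A minimal sketch of how a client could derive such an epoch-boundary reference block, assuming a fixed (purely illustrative) number of blocks per epoch; the constant and function names are assumptions, not part of the proposal:

```rust
/// Illustrative only: with ~3-second blocks, a two-week epoch is roughly 403,200 blocks.
const BLOCKS_PER_EPOCH: u32 = 403_200;

/// Epoch index that a given block falls into.
fn epoch_of(block_num: u32) -> u32 {
    block_num / BLOCKS_PER_EPOCH
}

/// First block of a given epoch, usable as a reference point when requesting
/// historical data for that epoch and merging it into the local database.
fn epoch_start(epoch: u32) -> u32 {
    epoch * BLOCKS_PER_EPOCH
}
```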
I think this should be a list of accounts which have their last change in the range
|
Agreed! |
Why would the

If we apply the same design as the other tables, it should look like:

```sql
CREATE TABLE
    notes
(
    pk INTEGER NOT NULL,
    tag INTEGER NOT NULL,
    block_num INTEGER NOT NULL,
    note BLOB NOT NULL,
    PRIMARY KEY (pk),
    CONSTRAINT notes_tag_is_felt CHECK (tag >= 0 AND tag < 18446744069414584321),
    CONSTRAINT notes_block_num_is_u32 CHECK (block_num >= 0 AND block_num < 4294967296),
    CONSTRAINT notes_note_isnt_empty CHECK (length(note) > 0),
    FOREIGN KEY (block_num) REFERENCES block_header (block_num)
) STRICT, WITHOUT ROWID;
```

Instead of:

```sql
CREATE TABLE
    notes
(
    block_num INTEGER NOT NULL,
    note_index INTEGER NOT NULL,
    note_hash BLOB NOT NULL,
    sender INTEGER NOT NULL,
    tag INTEGER NOT NULL,
    num_assets INTEGER NOT NULL,
    merkle_path BLOB NOT NULL,
    PRIMARY KEY (block_num, note_index),
    CONSTRAINT notes_block_number_is_u32 CHECK (block_num >= 0 AND block_num < 4294967296),
    CONSTRAINT notes_note_index_is_u32 CHECK (note_index >= 0 AND note_index < 4294967296),
    CONSTRAINT notes_note_hash_is_digest CHECK (length(note_hash) = 32),
    CONSTRAINT notes_sender_is_felt CHECK (sender >= 0 AND sender < 18446744069414584321),
    CONSTRAINT notes_tag_is_felt CHECK (tag >= 0 AND tag < 18446744069414584321),
    CONSTRAINT notes_num_assets_is_u8 CHECK (num_assets >= 0 AND num_assets < 256),
    -- 32 is the size of a digest
    -- 20 is the value of NOTE_TREE_DEPTH
    CONSTRAINT notes_merkle_path_is_simple_tree CHECK (length(merkle_path) = 32 * 20),
    FOREIGN KEY (block_num) REFERENCES block_header (block_num)
) STRICT, WITHOUT ROWID;
```
|
I'm adding these optimizations. But I would like to reemphasize that for the protobuf and SQLite layers these changes don't make sense. In protobuf, all of the integer types in question are encoded using the same varint wire format, and the encoded size in bytes is variable, depending on the value of the integer. In other words, these changes only add unnecessary type casts when encoding the protobuf messages. SQLite is similar: all integer types are stored using the same INTEGER storage class, which is also variable in size.
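To illustrate the variable-size point, here is a small self-contained sketch (not tied to any particular protobuf library) of how many bytes a protobuf varint occupies for a given value; the function itself is illustrative:

```rust
/// Number of bytes needed to encode `value` as a protobuf varint.
/// The size depends only on the magnitude of the value, not on whether the
/// field was declared as uint32 or uint64 in the schema.
fn varint_len(mut value: u64) -> usize {
    let mut len = 1;
    while value >= 0x80 {
        value >>= 7;
        len += 1;
    }
    len
}

fn main() {
    assert_eq!(varint_len(1), 1);             // small tags/indices stay tiny
    assert_eq!(varint_len(300), 2);
    assert_eq!(varint_len(u32::MAX as u64), 5);
    assert_eq!(varint_len(u64::MAX), 10);     // worst case for 64-bit values
}
```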
|
There are several reasons:
Also, I probably wouldn't impose foreign key constraints. I remember reading that they are real performance killers in SQLite. |
I think it is good to specify more restrictive types - primarily for code clarity reasons. Also, in the future we might move to a different underlying database, and it would be easier to carry over if the types are specified more precisely. |
ref: #38
Tasks