One of the ways to reduce request size and improve privacy when requesting nullifiers is to send prefixes of the nullifiers we are interested in rather than full nullifier digests. For example, if instead of sending a full nullifier we send a 32-bit prefix, a request for 1000 nullifiers will require 4KB (vs. 32KB if we were to send full nullifiers).
We could reduce the prefix size even further - e.g., to 16 bits. The downside is that this could return a large number of nullifiers. For example, if there are 32 billion nullifiers in the system (at 1K TPS this many would be generated in about a year), and assuming they are uniformly distributed, a single 16-bit prefix would match almost 500K nullifiers. Even if we strip out Merkle authentication paths (which don't really make sense to send when we are requesting nullifiers by prefix), this would be about 15MB of data per prefix.
There is, however, an easy way to fix this: the client can send their last known block number, and the node would respond only with the new nullifiers created since then. Assuming the same TPS, and if the client has been offline for about a month, a single 16-bit prefix would match about 40K nullifiers (or about 1.2MB of data). Obviously, if we send a request for 100 prefixes, we would get 100x of that data back on average.
The table below shows how much data a client would need to download when requesting nullifiers for 100 16-bit nullifier prefixes, depending on how long the client has been offline, for various TPS assumptions:
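As a back-of-the-envelope calculation of these sizes - a sketch assuming 32-byte nullifier digests, uniformly distributed prefixes, and no Merkle paths in the response; the offline durations below are illustrative, not taken from any benchmark:

```rust
/// Estimated response size (in bytes) for a prefix-based nullifier request:
/// only nullifiers created while the client was offline are returned, and each
/// nullifier is assumed to be a 32-byte digest with no Merkle path attached.
fn estimated_response_bytes(tps: u64, offline_secs: u64, prefix_bits: u32, num_prefixes: u64) -> u64 {
    const NULLIFIER_SIZE: u64 = 32; // bytes per nullifier digest
    let new_nullifiers = tps * offline_secs; // nullifiers created while offline
    let matches_per_prefix = new_nullifiers >> prefix_bits; // uniform distribution
    matches_per_prefix * num_prefixes * NULLIFIER_SIZE
}

fn main() {
    const DAY: u64 = 24 * 3600;
    // 100 requested prefixes of 16 bits each, for a few illustrative scenarios.
    for tps in [100, 1_000, 10_000] {
        for (label, secs) in [("1 day", DAY), ("1 week", 7 * DAY), ("1 month", 30 * DAY)] {
            let mb = estimated_response_bytes(tps, secs, 16, 100) as f64 / 1_000_000.0;
            println!("{:>6} TPS, offline {:>7}: ~{:.1} MB", tps, label, mb);
        }
    }
}
```

For example, at 1K TPS and a month offline this works out to roughly 125MB for 100 prefixes, consistent with the ~1.2MB-per-prefix estimate above.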
The values for 100 and 1K TPS seem pretty reasonable to me. For 10K TPS we may need to use a longer prefix, but maybe by the time we get to this throughput downloading this much data won't be an issue. Based on the above, I think we can modify the endpoint to request nullifier/account data as follows:
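A minimal sketch of what the modified request/response could look like, assuming 16-bit prefixes (the names and types here are hypothetical, not the actual node API):

```rust
/// Hypothetical request for nullifier/account data by nullifier prefix
/// (illustrative only - not the actual endpoint definition).
struct SyncStateRequest {
    /// Last block known to the client; only nullifiers created after this
    /// block are returned.
    block_num: u32,
    /// 16-bit prefixes of the nullifiers the client is interested in.
    nullifier_prefixes: Vec<u16>,
    /// Accounts for which the client wants the latest state hashes.
    account_ids: Vec<u64>,
}

/// Hypothetical response: full digests of all new nullifiers matching the
/// prefixes (no Merkle paths), plus the latest hash of each requested account.
struct SyncStateResponse {
    chain_tip: u32,
    nullifiers: Vec<[u8; 32]>,
    account_hashes: Vec<(u64, [u8; 32])>,
}
```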
Related issues #10, #11, #21, #22, and #32.
When a client wants to sync up with the latest state of the network, the client is interested in the following:
1. New notes which have been created for the client (e.g., notes whose tags the client is listening for). For each such note, the client needs the following (a rough sketch of this per-note data is shown after the list):
a. A path from the note_root in a block header to the note.
b. A path from an MMR peak to the leaf representing the block header of the block the note was created in.
2. The notes the client has received previously. The information of interest here is whether any of these notes have already been consumed. We also want to update the MMR Merkle path for these notes to match the latest peaks (though, technically, we may get away without this).
3. The notes the client had previously sent to the network (and which have already been included in the chain). For these notes, we are interested in learning whether they have already been consumed.
4. The notes the client has just recently sent to the network (and which may not have been included in the chain yet). This is similar to learning whether a given transaction has been included in the chain. Our interest here is to confirm that the transaction has been included, and to get Merkle inclusion paths for any newly created notes.
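To make the above concrete, this is roughly the data a client would need to track per note (a sketch with illustrative names and placeholder types, not actual client types):

```rust
/// Hypothetical per-note record a client keeps in order to consume a note
/// later and to monitor its status (illustrative only).
struct TrackedNote {
    /// The note itself (or enough data to recompute its hash).
    note: Vec<u8>,
    /// Path from the note_root in the block header to the note (item 1a).
    note_path: Vec<[u8; 32]>,
    /// Number of the block in which the note was created.
    block_num: u32,
    /// Path from an MMR peak to the leaf for that block's header (item 1b).
    mmr_path: Vec<[u8; 32]>,
    /// The note's nullifier, used to check whether it has been consumed (items 2-4).
    nullifier: [u8; 32],
}
```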
In addition to the above requirements, there are other considerations to keep in mind, such as request and response sizes and how much information a request leaks about the client.
Specific-object approach
One approach to retrieve the above data could be to ask for specific objects (e.g., notes, accounts, nullifiers).
Note consumption info
Getting info on whether some notes have already been consumed is relatively straightforward in this approach (and this was already implemented in #10). We just send a list of nullifiers we are interested in to the network and get the response back. However, this approach has a drawback: we need to explicitly list all nullifiers we want to get the status of. This means:
1. The size of the request grows with the number of nullifiers we want to check.
2. The request tells the node exactly which nullifiers the client is interested in.
The first one is not a huge issue in my mind as clients will likely need to check on a relatively small number of nullifiers at any given time. The second one is a bigger problem as it could leak information about nullifiers for notes a client has sent to the network.
Account status info
We can do something similar to the above for account status checks as well. Specifically, we send a list of accounts we are interested in, and get back a list of their hashes. We could probably combine this with the request for nullifiers and have a single request which could work as follows:
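A rough sketch of such a combined request/response (hypothetical field names and placeholder types, not the actual API):

```rust
/// Hypothetical combined status request (illustrative only).
struct CheckStatusRequest {
    /// Nullifiers the client wants the status of.
    nullifiers: Vec<[u8; 32]>,
    /// Accounts the client wants the latest hashes of.
    account_ids: Vec<u64>,
}

/// Hypothetical response; all Merkle paths resolve against the same block header.
struct CheckStatusResponse {
    /// Latest block header (serialized; placeholder type).
    block_header: Vec<u8>,
    /// For each requested nullifier: the block in which it was consumed (if any)
    /// and a Merkle path into the nullifier tree committed to by the header.
    nullifiers: Vec<(Option<u32>, Vec<[u8; 32]>)>,
    /// For each requested account: its latest hash and a Merkle path into the
    /// account tree committed to by the header.
    account_hashes: Vec<([u8; 32], Vec<[u8; 32]>)>,
}
```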
The benefit of a single request is that Merkle paths for nullifiers and accounts resolve against the same block header.
This request would fulfill items 2 (partially), 3 and 4 in the list at the top of this post. I'm also assuming that for this request we don't need to support pagination as we can put a limit on the number of requested nullifiers/account_ids and that would also limit the response size.
Note info
Fulfilling item 1 and the rest of item 2 is a bit more tricky. There are three ways in which we may want to retrieve note info:
1. By note tag (e.g., all notes whose tags the client is listening for).
2. By sender (e.g., all notes created by a given account).
3. By note hash (e.g., specific notes the client already knows about).
Here, we could structure the request similarly to how we structured requests for accounts/nullifiers - i.e., list all items we are interested in. Specifically, it could look like so:
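For illustration, such a request might look roughly like this (hypothetical names and types only):

```rust
/// Hypothetical note sync request listing all items of interest (illustrative only).
struct SyncNotesRequest {
    /// Last block known to the client; only notes created after it are returned.
    block_ref: u32,
    /// Note tags the client is listening for.
    note_tags: Vec<u32>,
    /// Account IDs of senders the client is interested in.
    senders: Vec<u64>,
    /// Specific notes the client wants info about.
    note_hashes: Vec<[u8; 32]>,
}
```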
Here, we also include the block number (block_ref) of the last block known to the client. This limits the size of the response, as the node will need to send only the matching notes which were created after the specified block. However, we may still end up with a relatively large response, as a lot of notes could match a given tag or sender. So, we need to introduce some sort of pagination. One approach to this was described in #11: we use blocks as a natural pagination mechanism. Specifically, the response could look like this:
The user would then make repeated requests until chain_tip = block_header.block_num, at which point the user would have synced up fully to the latest state of the chain.
Bloom filter-based approach
One of the downsides of the above approach is that we have to explicitly request every object we are interested in. As mentioned before, this may result in relatively large requests, but more importantly, may leak more info than would be acceptable.
An alternative approach is to use bloom filters to specify the data we are interested in. For example, for the note info request, it would be nice to use something like this:
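For illustration, such a request might look roughly like this (hypothetical names; the filter size and encoding are assumptions):

```rust
/// Hypothetical Bloom-filter-based note sync request (illustrative only).
struct SyncNotesByFilterRequest {
    /// Last block known to the client; only notes created after it are considered.
    block_ref: u32,
    /// Bloom filter bits encoding the note tags, senders, and note hashes
    /// the client is interested in (e.g., 32768 bits = 4KB).
    filter: Vec<u8>,
    /// Number of hash functions used to build the filter.
    num_hashes: u8,
}
```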
In the above, the bloom filter would include note tags, senders, and note hashes the client is interested in. Then, the node would run all relevant notes through this bloom filter, and send the ones that pass back to the client.
While this is quite convenient, the main concern with the above scheme is how to make it performant.
The first consideration is how large to make the bloom filter. Assuming the client wants to request several hundred objects (across tags, senders, and hashes) and be able to vary the false positive rate between 1 in 1K and 1 in 1M, we'd need a bloom filter of about 4KB (32768 bits). From the request size standpoint, this is not bad at all; from the standpoint of checking whether a given item is in the bloom filter, it is probably not great.
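For reference, the standard Bloom filter sizing formulas, with illustrative n and p values (this is just a sketch of the arithmetic, not part of any proposed API):

```rust
/// Optimal Bloom filter size (in bits) and number of hash functions for
/// n inserted items and target false positive rate p:
///   m = -n * ln(p) / (ln 2)^2,   k = (m / n) * ln 2
fn bloom_params(n: f64, p: f64) -> (f64, f64) {
    let ln2 = std::f64::consts::LN_2;
    let m = -n * p.ln() / (ln2 * ln2); // number of bits
    let k = (m / n) * ln2;             // number of hash functions
    (m, k)
}

fn main() {
    // e.g., a few hundred items at a 1-in-1M false positive rate
    let (m, k) = bloom_params(500.0, 1e-6);
    println!("{:.0} bits (~{:.1} KB), {:.0} hash functions", m, m / 8192.0, k);
}
```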
More importantly, however, it is not clear how to efficiently apply the bloom filter on the node side. As I see it, there are two options: