P2P: Subnet content resolution protocol #475
Comments
For the record, we had a discussion about this on Zoom:
|
It's not entirely clear what
    /// Supported types of content
    enum ContentType {
        CrossMsgs(Vec<Msg>),
    }
First I thought it was a fully fledged raw message, but later usage seems to contradict this, because |
I had questions about Bitswap, notably how it handles recursion. I saw that for example Forest implements BitswapStore for RocksDB, which is a trait in the libp2p-bitswap library, a different implementation from iroh-bitswap linked above. If we look at the implementation it has this part:
    impl BitswapStore for RocksDb {
        type Params = libipld::DefaultParams;
        ...
        fn missing_blocks(&mut self, cid: &Cid) -> anyhow::Result<Vec<Cid>> {
            bitswap_missing_blocks::<_, Self::Params>(self, cid)
        }
    }
Following on to bitswap_missing_blocks we can see that what it does is load the CID as a To do so, the In the case of Forest, it will indeed try to follow all CIDs recursively by returning them as To be clear, the scenario is:
So, to cover all bases, I think we can use bitswap the following way:
Thinking about Fendermint where I want to include CIDs in the blocks for async resolution (ie. not resolve them before the block is executed, but by some future time when the same CID is re-proposed for execution, so that I can keep using Tendermint without stalling consensus while resolution is happening), option 1 should be fine, exactly because I am proposing a CID for resolution, so someone must have it, and Tendermint Core itself doesn't use Bitswap, so it won't get into trouble. |
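To make the recursion question above concrete, here is a minimal sketch of what "following all CIDs recursively" could look like. It is a toy model, not the libp2p-bitswap or Forest code: `Cid` is a plain string, `MemStore` is a hypothetical in-memory store, and each block is reduced to the list of CIDs it links to.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for a real CID type.
type Cid = String;

/// Toy block store: every stored block is reduced to the CIDs it links to.
struct MemStore {
    blocks: HashMap<Cid, Vec<Cid>>,
}

impl MemStore {
    /// Walk the DAG from `root` and return every referenced CID that is not
    /// available locally. The links of a missing block are unknown until it
    /// has been fetched, so the traversal stops at missing blocks.
    fn missing_blocks(&self, root: &Cid) -> Vec<Cid> {
        let mut missing = Vec::new();
        let mut seen = HashSet::new();
        let mut stack = vec![root.clone()];
        while let Some(cid) = stack.pop() {
            if !seen.insert(cid.clone()) {
                continue; // already visited
            }
            match self.blocks.get(&cid) {
                Some(links) => stack.extend(links.iter().cloned()),
                None => missing.push(cid),
            }
        }
        missing.sort(); // deterministic order for callers and tests
        missing
    }
}
```

In this model, resolution only terminates once every block reachable from the root is stored locally, which is why a store that reports links recursively keeps asking the network until the whole DAG is present.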
Correct. I abused the notation a bit. This is an unnecessary generalization for M2 where I considered supporting more than one content type already for the protocol. By the way, if you feel discussing on this issue is a bit inefficient (there is no support for threads, resolving conversations, etc.), we can move to a GitHub discussion, or we can kick off a proper design doc for this already. |
When Bitswap receives an IPLD block it inspects if there are pending links to be resolved and broadcasts a WANT message for those CIDs. As we don't need recursive resolution of CIDs for now, maybe it is easier not to use Bitswap and instead write a lighter libp2p protocol that checks, for each of our connections, whether the peer is part of the subnet and can serve the content we are looking for. Bitswap is known to flood the network with a lot of requests to overcome its lack of content routing capabilities, so if we choose to go with it maybe we should wrap it into some higher-level protocol of our own to work around this and focus the search for content a bit more on peers in a specific subnet. We should maybe take a first stab at writing a design doc so we can clarify all these details. WDYT?
Maybe I misunderstood you here, but quick clarification, there's no need for resolving a checkpoint from a Cid. The checkpoint for a subnet |
@adlrocha thank you for responding to my comments here. I think it's as good a place as any other, at least it's in an obvious place. Currently I am investigating the
In your design you don't need recursion because you put all messages into a single data structure assembled by the Gateway actor. However I think it would be great if we designed for recursiveness for message resolution between actors (and between Fendermint applications) because checkpointing is inherently recursive, we shouldn't discard it. At least I thought IPLD data can be forwarded by enveloping, so that the core of the message can travel on unchanged, and always be available, rather than as slightly modified copies.
I was thinking about the use case of Fendermint where I wanted to include CIDs as for resolution, so the first step would always be to resolve the CID. I am not at a place yet where I can enumerate to what messages they can resolve; I will probably create an enum just for that, where one of the options will be a cross message, the other a simple message which was turned into CID to lower the cost of re-inclusion in case a block could not be finalized. In that model it would be
Hm, I'm missing the step where the checkpoint was pushed into the state of parent. I thought that the parent asks the agent for anything to include, and it gets the full checkpoint, not a checkpoint with just a CID.
That might be true, although from what I understand about the MVP it also exhibits flooding behaviour, by broadcasting the Pull and then the Response as well, which in this case are larger messages, especially if the granularity of the request/response is a full checkpoint, without looking at the availability of individual messages. I am wary that we'll end up reinventing bitswap. At least in the case of Fendermint I believe Bitswap is the right fit and it's worth investigating. I am not skilled with libp2p so for me there's no such thing as a "lighter libp2p protocol" 🙂
Great point. I am looking at the |
What about https://github.com/retrieval-markets-lab/rs-graphsync ? EDIT: After reading your summary of graphsync using IPLD selectors, it's even more of an overkill for our situation than bitswap. |
I also notice that some of the improvements that you mention in your fantastic blog have been included in the library, for example during a |
Random notes regarding
NB the It's not immediately obvious how IPC agents are supposed to know where to connect to the peers of other subnet's agents. I assume with the GossipSub approach there would have been a single address for all agents, then you just select your topic, but here, if we wanted separate swarms where you can target all the agents of a subnet for resolution, you have to learn the address from somewhere first. Possibly we'd have to combine the two solutions as @adlrocha suggested with "Gossipsub for peer discovery": we could publish the participation of an agent in a subnet to GossipSub, which is cool because the membership info would spread, and then take that membership and use it for running Bitswap queries. With this combo, we can have a single swarm as well, just select the |
Amazing analysis, @aakoshh 🙌
I like this approach because it also offers a shared broadcast layer by all members for the case that you need to broadcast some control messages. It probably can't be used for the resolution itself as it may flood the network, but it can be used for heartbeat or proactive caching purposes. |
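As a rough illustration of the "Gossipsub for membership, then targeted queries" idea discussed above, the sketch below keeps a table of which peers announced participation in which subnets. Everything in it is an assumption made for illustration: `Membership` is a hypothetical announcement payload, peer and subnet IDs are plain strings, and no libp2p types are used.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical stand-ins for the real peer and subnet identifiers.
type PeerId = String;
type SubnetId = String;

/// Hypothetical announcement an agent would publish on a shared gossip topic.
struct Membership {
    peer: PeerId,
    subnets: Vec<SubnetId>,
}

#[derive(Default)]
struct MembershipTable {
    by_subnet: HashMap<SubnetId, BTreeSet<PeerId>>,
}

impl MembershipTable {
    /// Record the latest announcement of a peer, replacing what it announced before.
    fn update(&mut self, msg: Membership) {
        for peers in self.by_subnet.values_mut() {
            peers.remove(&msg.peer);
        }
        self.by_subnet.retain(|_, peers| !peers.is_empty());
        for subnet in msg.subnets {
            self.by_subnet
                .entry(subnet)
                .or_default()
                .insert(msg.peer.clone());
        }
    }

    /// Peers that could be asked for content belonging to `subnet`.
    fn providers(&self, subnet: &str) -> Vec<PeerId> {
        self.by_subnet
            .get(subnet)
            .map(|peers| peers.iter().cloned().collect())
            .unwrap_or_default()
    }
}
```

The list returned by `providers` is what a resolver would hand to whichever exchange protocol is chosen, so the gossip layer carries only small membership messages and never the content itself.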
@adlrocha thanks again for following up on all my comments on Slack and here as well. Here's a concrete proposal of something that I think would be general enough and work. It requires almost no new protocol development from our side, just stitching together existing tech, and we already have examples of such stitching; although I have to say it's a bit daunting. When I wrote my Forest School document, based on my description of handling events around Bitswap it looks like Forest was version 0.4, whereas now it's version 0.6 and a lot has changed: the behaviour of getting and inserting data into the blockstore has been moved into the library (not sure if it was there before as well as in Forest or not). If we look at the ForestBehaviour it has multiple parts:
Bitswap is used to resolve messages in blocks, while the block of CIDs is received either as a gossipsub message or pulled through chain exchange. The want_block method looks up all peers from the discovery behaviour and uses them to do a The way the events are handled internally is complicated 😵 So, in a similar vein we could do the following:
I am not sure how reusable something like |
One thing that isn't entirely clear to me is that Forest feeds the discovered addresses to Bitswap, but it doesn't do so with Gossipsub. Does Gossipsub have its own way to discover peers? |
From what I can see, Gossipsub must be using the on_swarm_events to maintain its list of connections. The The reason possibly lies further down where Kademlia is polled and can return a The question is:
|
I don't think this is the case. It knows about the, on average, 6 connections of its mesh, and from there it can know with certainty that their messages will be broadcast to the rest of the subscribers of the topic with high probability, but it doesn't know who these are.
I don't think I got this. Kademlia and Gossipsub are orthogonal. If we choose to use a DHT for membership management, each subnet should keep its own DHT and we can be sure that all members will be tracked there. The issue is that, while it is more reliable because the DHT keeps the whole membership, it is less dynamic than leveraging GossipSub.
This may be a bit of an overkill. We could maybe use Gossipsub for membership and a point-to-point request-response libp2p protocol like chain exchange for now for the actual exchange. We could then introduce Bitswap in the future (it seems harder to integrate). That being said, let me ping the CoD team, IIRC they had a similar issue, let's see what they ended up doing. Following up in Slack. |
I don't think that's exactly true. It definitely knows about more peers, but by default it only uses 6 connections per topic. I saw this yesterday in join, that when you join a topic, you pick 6 connections, filling them up with random ones if you have to, which means there are more to choose from. You can see this here: there are collections that track who is subscribed to what and then there is separately the mesh, but later on it says the mesh keys are the ones we are currently subscribed to. Also when we receive a subscription, it is recorded and then optionally added to the mesh if there are less than 6 peers in it.
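The distinction drawn above between "peers known to be subscribed" and "peers in the mesh" can be modelled in a few lines. This is a toy model of the behaviour as described in this comment, not the libp2p-gossipsub implementation; `TopicState`, `on_subscribe` and `MESH_TARGET` are hypothetical names.

```rust
use std::collections::{BTreeSet, HashMap};

/// The "6 connections per topic" mentioned above.
const MESH_TARGET: usize = 6;

#[derive(Default)]
struct TopicState {
    /// Every peer known to be subscribed to each topic.
    subscribers: HashMap<String, BTreeSet<String>>,
    /// The subset of subscribers that messages are actively exchanged with.
    mesh: HashMap<String, BTreeSet<String>>,
}

impl TopicState {
    /// Record a subscription and add the peer to the mesh only if there is room.
    /// Returns true if the peer is in the mesh afterwards.
    fn on_subscribe(&mut self, topic: &str, peer: &str) -> bool {
        self.subscribers
            .entry(topic.to_string())
            .or_default()
            .insert(peer.to_string());
        let mesh = self.mesh.entry(topic.to_string()).or_default();
        if mesh.len() < MESH_TARGET {
            mesh.insert(peer.to_string());
        }
        mesh.contains(peer)
    }
}
```

Under this model a node can enumerate more subscribers than it has mesh links, which is the point being argued: the membership view is wider than the 6 peers that receive messages directly.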
I don't think they are, not completely. I think Gossipsub only adds connected peers when the Swarm connects to them (see the link above), and the Swarm only connects to peers if you tell it to do so, which can be because they are explicit/persistent peers you configured the node to always connect to, or because the Kademlia behaviour told it to connect, because it had to run a query against them during the discovery process. They seem orthogonal, and they kind of are, but the Gossipsub connections exist as a byproduct of Kademlia (and the discovery behaviour driving it) doing its regular thing. My current thesis is that we need some kind of peer discovery process, because Gossipsub won't connect to anything on its own.
Not necessarily; I mean we can have a single DHT to discover all IPC agents in existence, then decide based on Gossipsub who to contact when we need to resolve content from certain subnets. I doubt that we'll run out of space in the K-buckets to be able to track so many agents. It's just possible that we need to prompt the Swarm to connect to the ones we want, not just hope that we are already connected to some. And here, we need an address, which we normally get from Kademlia. We cannot ask Gossipsub for addresses, just PeerIds; if we want to use Gossipsub for learning where to connect to, we have to publish that information explicitly, and ask the Swarm to connect; basically do Kademlia over Gossipsub. Another option would be to let Kademlia connect, but keep track of these addresses even if it disconnects, and make sure we are always connected to some of those peers in the target subnets by telling the Swarm to dial them.
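The last option above (remember addresses and prompt the Swarm to dial) could be sketched as a pure planning function. It assumes membership and addresses have already been learned elsewhere; `plan_dials` and all its parameters are hypothetical, and the actual dialing through the Swarm is left out.

```rust
use std::collections::{HashMap, HashSet};

/// Choose peers to dial so that every target subnet has at least
/// `min_per_subnet` connected members, using remembered addresses.
fn plan_dials(
    members: &HashMap<String, Vec<String>>, // subnet -> known member peer ids
    addresses: &HashMap<String, String>,    // peer id -> last known address
    connected: &HashSet<String>,            // peers we are connected to now
    min_per_subnet: usize,
) -> Vec<(String, String)> {
    let mut dials = Vec::new();
    let mut planned: HashSet<String> = HashSet::new();
    let mut subnets: Vec<&String> = members.keys().collect();
    subnets.sort(); // deterministic order
    for subnet in subnets {
        let peers = &members[subnet];
        // Members already reachable, either connected or about to be dialed.
        let mut have = peers
            .iter()
            .filter(|p| connected.contains(*p) || planned.contains(*p))
            .count();
        for peer in peers {
            if have >= min_per_subnet {
                break;
            }
            if connected.contains(peer) || planned.contains(peer) {
                continue;
            }
            // Peers without a remembered address cannot be dialed.
            if let Some(addr) = addresses.get(peer) {
                dials.push((peer.clone(), addr.clone()));
                planned.insert(peer.clone());
                have += 1;
            }
        }
    }
    dials
}
```

Running this periodically, and after disconnect events, would keep a minimum number of connections per subnet without relying on Kademlia happening to be connected to the right peers.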
I disagree, I think Bitswap is the right solution here and the one easier to integrate as well, because it needs zero custom development. I might be still misunderstanding how your approach for including the cross messages works, but I thought once you do your resolution, you have to put the CID of the cross message into an actual block for execution - otherwise how would any other peer know that this is the block a validator decided to do this. This means that during |
@adlrocha it has been a long time since I read it but https://github.com/ethereum/devp2p/blob/master/discv5/discv5.md combines peer discovery with topics. |
Just noticed that Forest now implements its own Bitswap: https://github.com/ChainSafe/forest/blob/main/node/forest_libp2p/bitswap/README.md |
It definitely knows about more connections, but it can only be sure about 6 of those being subscribed to a specific topic (I think we are saying the same thing :) )
I agree with this. There needs to be a bootstrap infrastructure for agents (or a protocol like mDNS for bootstrapping, although I am afraid that wouldn't work for us).
A peer could probably also populate its entry with information about the subnets it is subscribed to. Actually, I feel this approach is really similar to that of discv5 (it has also been a while since I last read the protocol). To summarize a bit the points made so far, we seem to agree on:
|
Not exactly, I meant that it knows others are subscribed, but it doesn't gossip that topic to them.
Yes, and thanks for reaching out to the Bacalhau people 🧜♂️ They use Gossipsub Peer Exchange to learn about addresses of other peers in the network during pruning; alas, this is not available in the Rust version of the library. I don't think this is the right time for us to add the feature, so I vote we start with Kademlia. Judging by the Forest code, it doesn't require too much code to wrap in a similar structure to their Assuming we know from Gossipsub who are the agents in a specific subnet that we can contact for getting our stuff, what may be difficult to balance is that we are actually able to maintain connections to some of them. If we try to connect to one and the connection pool of the Swarm is already full, the outgoing connection will be rejected. It's like we'd need a separate |
Most of the issues are completed, the library is ready to be tested. Let's close this and use the offshoots we created from now. |
Background
While child subnets, C, are required to sync with their parents for their operation, parent subnets, P, are not required to sync with all their children. This means that while C can directly pull the top-down messages that need to be proposed and executed in C by reading the state of the parent through the IPC agent and getting the raw messages, this is not possible for bottom-up messages. Bottom-up messages are propagated inside checkpoints as a cid that points to the aggregate of all the messages propagated from C to P. P does not have direct access to the state of C to get the raw messages behind that cid and conveniently propose them for execution. This is where the subnet content resolution protocol comes into play.
This protocol can be used by any participant of IPC to resolve content stored in the state of a specific subnet. The caller performs a request specifying the type of content being resolved and the cid of the content, and any participant of that subnet will pull the content from its state and share it in response to the request. This protocol is run by all IPC agents participating in an IPC subnet. Initially, the only type supported for resolution will be CrossMsgs, but in the future additional content types and handlers can be registered in the protocol.
Design
This design is inspired by the way we implemented the protocol in the MVP, but if you can come up with a simpler or more efficient design, by all means feel free to propose it and implement it that way. As we don't have a registry of all the IPC agents or peers participating in a subnet, we leverage GossipSub for the operation of the protocol. Each IPC agent is subscribed to an independent topic for each of the subnets it is syncing with. Thus, if an IPC agent is syncing with P and C, it will be automatically subscribed to /ipc/resolve/P and /ipc/resolve/C.
In the MVP, the protocol was designed as an asynchronous request-response protocol on top of a broadcast layer (i.e. GossipSub). We implemented three types of messages:
- Pull: an agent that wants to resolve content broadcasts a Pull message to the relevant broadcast topic for the destination subnet, sharing information about the content to be resolved and, optionally, either information about the source subnet or the multiaddress of the source agent making the request.
- Response: an agent that can serve the content sends a Response message to the topic of the source subnet if it was specified in the request; otherwise it connects directly to the MultiAddr of the initiator of the request and sends the Response directly to them.
- Push: when content is propagated from C to P, agents in C may choose to broadcast a Push message to the topic of P, for the case where agents of validators in P want to preemptively cache the content so they can propose the messages without having to resolve the content in a destination subnet.
Alternatives
Gossipsub + Bitswap
One alternative to this protocol would be to directly use Bitswap to resolve any cid from our existing connections. We could use GossipSub exclusively for peer discovery: all IPC agents would subscribe to an ipc/agents/<subnet_id> topic for each subnet, to mesh with other IPC agents syncing with these subnets and establish connections that can then be leveraged by Bitswap to resolve content. For this to work, all the content that we want to be "resolvable" in the IPC agent needs to be cached in a local datastore.
Point-to-point + DHT or Gossipsub for peer discovery.
Another option is to leverage a DHT for each subnet, or to subscribe to specific topics for each of the subnets in order to discover peers syncing with the same subnets, and then build a direct peer-to-peer protocol for the content resolution with the same kind of messages proposed above for the MVP implementation. Actually, a peer-to-peer libp2p protocol on top of some peer discovery protocol could be the most efficient in terms of number of messages and network load.
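To make the message flow concrete, here is a minimal, transport-agnostic sketch of the Pull, Response and Push messages and of how an agent might react to them. All type and field names (`Message`, `Reply`, `Agent`, `source_subnet`, `source_addr`) are assumptions for illustration, cids and messages are plain strings, and the sketch is not the MVP implementation. The same handler could sit behind a GossipSub topic or a point-to-point protocol.

```rust
use std::collections::HashMap;

/// Hypothetical stand-ins for the real identifiers.
type Cid = String;
type SubnetId = String;

/// Content types that can be resolved; only cross messages for now.
#[derive(Debug, Clone, PartialEq)]
enum Content {
    CrossMsgs(Vec<String>), // messages simplified to strings
}

#[derive(Debug, Clone, PartialEq)]
enum Message {
    Pull { cid: Cid, source_subnet: Option<SubnetId>, source_addr: Option<String> },
    Response { cid: Cid, content: Content },
    Push { cid: Cid, content: Content },
}

/// What the agent wants the network layer to do next.
#[derive(Debug, PartialEq)]
enum Reply {
    Publish { topic: String, msg: Message },
    Direct { addr: String, msg: Message },
}

struct Agent {
    store: HashMap<Cid, Content>,
}

impl Agent {
    fn handle(&mut self, msg: Message) -> Option<Reply> {
        match msg {
            Message::Pull { cid, source_subnet, source_addr } => {
                // Stay silent if we cannot serve the content.
                let content = self.store.get(&cid)?.clone();
                let msg = Message::Response { cid, content };
                match (source_subnet, source_addr) {
                    (Some(subnet), _) => Some(Reply::Publish {
                        topic: format!("/ipc/resolve/{}", subnet),
                        msg,
                    }),
                    (None, Some(addr)) => Some(Reply::Direct { addr, msg }),
                    (None, None) => None,
                }
            }
            // Responses and pushes both populate the local cache.
            Message::Response { cid, content } | Message::Push { cid, content } => {
                self.store.insert(cid, content);
                None
            }
        }
    }
}
```

Keeping the handler independent of the transport means the choice between broadcast and point-to-point delivery only affects how a `Reply` is sent, not how content is looked up or cached.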