This repository has been archived by the owner on Feb 1, 2023. It is now read-only.

Proposal: Streaming GetBlocks #228

Open
Stebalien opened this issue Dec 9, 2019 · 11 comments

@Stebalien
Member

Stebalien commented Dec 9, 2019

At the moment, it's a bit tricky to continuously prefetch blocks as one needs to launch a goroutine per batch.

Proposal: Change the signature of GetBlocks to take an input channel and return an output channel:

// Fetcher is an object that can be used to retrieve blocks (defined in go-ipfs-exchange-interface)
type Fetcher interface {
	// GetBlock returns the block associated with a given key.
	GetBlock(context.Context, cid.Cid) (blocks.Block, error)

	// GetBlocks returns a stream of blocks, given a stream of CIDs. It will
	// return blocks in any order.
	//
	// To wait for all remaining blocks, close the CID channel and wait for
	// the blocks channel to be closed. A closed channel does not mean that
	// _all_ blocks were retrieved, it just means that the fetcher is done
	// retrieving blocks.
	GetBlocks(context.Context, <-chan cid.Cid) (<-chan blocks.Block, error)
}
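
For illustration, a rough sketch of how a caller might drive this interface (the prefetch helper below is hypothetical, not part of the proposal): feed CIDs in as they become available, close the input channel when done, and range over the output. No goroutine per batch is needed, just one feeder goroutine and one consuming loop.

import (
	"context"

	cid "github.com/ipfs/go-cid"
)

// prefetch is a hypothetical caller of the proposed interface.
func prefetch(ctx context.Context, f Fetcher, cids []cid.Cid) error {
	in := make(chan cid.Cid)
	out, err := f.GetBlocks(ctx, in)
	if err != nil {
		// The error return only means the request couldn't be started,
		// e.g. the service is already closed.
		return err
	}

	go func() {
		// Closing `in` signals that no more CIDs will be sent.
		defer close(in)
		for _, c := range cids {
			select {
			case in <- c:
			case <-ctx.Done():
				return
			}
		}
	}()

	for blk := range out {
		// Blocks arrive in arbitrary order; `out` closing only means the
		// fetcher is done, not that every requested block was retrieved.
		_ = blk // hand off to whatever needs the block
	}
	return ctx.Err()
}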

Additional elements:

  • The go-blockservice.BlockGetter interface should be replaced with the Fetcher interface.
  • The output channel should have the same buffer as the input channel (I think?).
  • The error return type is for indicating that the request couldn't even be started (e.g., closed service). This interface doesn't really have a way to report runtime errors (and there's almost always nothing we can do about them anyways).

This will require changes to go-ipfs-exchange-interface, go-blockservice, and go-bitswap.

@Stebalien
Member Author

The output channel should have the same buffer as the input channel (I think?).

I'm really not sure about the best way to do this. We could also just have no backpressure at this layer and say that the caller shouldn't ask for more blocks than it can handle.

@dirkmc
Contributor

dirkmc commented Dec 9, 2019

This proposal LGTM - I'm not sure I understand what "the same buffer" means in this context - do you mean the size of the output channel buffer is the same as the size of the input channel buffer?

@Stebalien
Member Author

do you mean the size of the output channel buffer is the same as the size of the input channel buffer

Yes. The question is really: how do we apply backpressure (and do we)?

@dirkmc
Contributor

dirkmc commented Dec 10, 2019

Currently in Bitswap

If Session.GetBlocks() takes a channel of CIDs, it seems like it would make sense for it to do something similar, i.e. (see the sketch below the list):

  • pop wants off the incoming channel immediately and put them in a queue
  • buffer received blocks until the client is ready to process them
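
A minimal sketch of that buffering pattern, assuming nothing about go-bitswap internals (bufferBlocks is purely illustrative): drain whatever the producer sends into an in-memory queue and hand queued blocks to the consumer whenever it is ready.

import (
	"context"

	blocks "github.com/ipfs/go-block-format"
)

// Illustrative only, not go-bitswap code: an unbounded buffer between the
// goroutine that receives blocks and the client reading them, so the
// receiver never blocks on a slow client.
func bufferBlocks(ctx context.Context, in <-chan blocks.Block) <-chan blocks.Block {
	out := make(chan blocks.Block)
	go func() {
		defer close(out)
		var queue []blocks.Block
		for in != nil || len(queue) > 0 {
			// Only enable the send case when something is queued.
			var send chan<- blocks.Block
			var next blocks.Block
			if len(queue) > 0 {
				send = out
				next = queue[0]
			}
			select {
			case blk, ok := <-in:
				if !ok {
					in = nil // producer done; keep draining the queue
					continue
				}
				queue = append(queue, blk)
			case send <- next:
				queue = queue[1:]
			case <-ctx.Done():
				return
			}
		}
	}()
	return out
}

The same shape works for the first bullet, with cid.Cid in place of blocks.Block, so wants keep being popped off the incoming channel regardless of how fast they can be dispatched.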

@Stebalien
Member Author

SGTM.

@dirkmc
Contributor

dirkmc commented Dec 10, 2019

@aschmahmann and I talked it over and he suggested that there may be some scenarios in which back-pressure would be useful. For example, if an app uses Bitswap to download a movie, it would be useful to provide a buffer size to the session such that Bitswap stops popping wants off the input channel once the downloaded blocks exceed the buffer size. This would allow Bitswap to prioritize bandwidth for Sessions that need the data immediately.

On the other hand, because Bitswap works best when it can request a lot of blocks in parallel, and because it doesn't guarantee the order in which blocks arrive, the buffer size may not be that useful in practice.
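
For comparison, a hedged sketch of the bounded version being discussed, with bufSize as a hypothetical per-session option (and a synchronous fetch standing in for the real machinery): the want loop simply stops pulling CIDs off the input channel once bufSize fetched blocks sit unread.

import (
	"context"

	blocks "github.com/ipfs/go-block-format"
	cid "github.com/ipfs/go-cid"
)

// Sketch only. Back-pressure falls out of the buffered output channel: once
// bufSize blocks are waiting to be read, the send blocks, and therefore no
// further wants are popped off `in`.
func pumpBounded(ctx context.Context, in <-chan cid.Cid, fetch func(cid.Cid) blocks.Block, bufSize int) <-chan blocks.Block {
	out := make(chan blocks.Block, bufSize)
	go func() {
		defer close(out)
		for {
			select {
			case c, ok := <-in:
				if !ok {
					return
				}
				blk := fetch(c) // placeholder for the real, parallel fetch path
				select {
				case out <- blk:
				case <-ctx.Done():
					return
				}
			case <-ctx.Done():
				return
			}
		}
	}()
	return out
}

The synchronous fetch is exactly where the caveat above bites: once fetches are issued in parallel (which is what makes Bitswap fast), the number of in-flight blocks is no longer bounded by bufSize, so the buffer either stalls the pipeline or stops bounding much of anything.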

@Stebalien
Member Author

My concern is that Bitswap will keep popping wants off the queue until it actually starts downloading blocks, so we need an unbounded receive buffer anyway. Basically, the tradeoff is between backpressure and potential deadlocks, and the backpressure may not be effective enough to make it worth it.

We should consider some kind of outstanding wants limit (total and per session). But that's a more general issue that we can probably tackle separately.

@dirkmc
Contributor

dirkmc commented Dec 11, 2019

I agree. I think the glove doesn't quite fit, so it's better to punt on back-pressure.

@jacobheun jacobheun added the epic label Apr 20, 2020
@jacobheun jacobheun added this to the go-ipfs 0.6 milestone Apr 20, 2020
@jacobheun jacobheun modified the milestones: go-ipfs 0.6, go-ipfs 0.7 May 26, 2020
@Wondertan
Member

Wondertan commented Jun 1, 2020

How will NodeGetter work, then?

I am a bit ahead of this issue, as I am developing a small protocol that uses the same pattern with an input CID channel. The problem is that I still have not decided on the best way to make the pattern work with DAGService without changing its interface, and I am curious about your plan here.

By the way, my protocol has a nice dynamic buffer for back-pressure with ordering, which may match your needs.

@Stebalien
Member Author

Yeah, this would require changes in the DAGService. We'd likely add an optional StreamBlocks() method.
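
To make "optional" concrete, here is a hedged sketch using the usual extra-interface-plus-type-assertion pattern; StreamingNodeGetter, StreamBlocks, and streamOrFallback are all made-up names, and the exact channel types are a guess.

import (
	"context"

	cid "github.com/ipfs/go-cid"
	ipld "github.com/ipfs/go-ipld-format"
)

// Made-up names throughout: an optional extension interface that a
// DAGService implementation may or may not provide.
type StreamingNodeGetter interface {
	// StreamBlocks returns decoded nodes for the CIDs received on the
	// input channel, in no particular order.
	StreamBlocks(context.Context, <-chan cid.Cid) (<-chan ipld.Node, error)
}

// streamOrFallback uses the streaming path when available and otherwise
// falls back to fetching nodes one at a time through the existing API.
func streamOrFallback(ctx context.Context, ds ipld.DAGService, in <-chan cid.Cid) (<-chan ipld.Node, error) {
	if s, ok := ds.(StreamingNodeGetter); ok {
		return s.StreamBlocks(ctx, in)
	}
	out := make(chan ipld.Node)
	go func() {
		defer close(out)
		for c := range in {
			nd, err := ds.Get(ctx, c)
			if err != nil {
				continue // this sketch just skips nodes it can't fetch
			}
			select {
			case out <- nd:
			case <-ctx.Done():
				return
			}
		}
	}()
	return out, nil
}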

Is there a description somewhere of go-blockstream's design? I'd love to take a look.

@Wondertan
Member

Ok, thanks. I would like to help push this forward. Also, my intention for the PR here is to implement a new NavigableNode over the streaming interface. If DAGService gets the new method for that, we can simply reimplement it on top, or still go with a new one.

Currently, I don't have a detailed description of the protocol, but I am going to write one soon, after some changes needed for handling stream reset cases. Simply put, blockstream removes unnecessary complexity from Bitswap when the block providers are already known, and splits requests between them with ordering guarantees. Additionally, the codebase has convenient IPLD walking functionality over the streaming interface. Eventually, I want to integrate IPLD selectors into it - making a graphstream? 😄
