Skip to content

Commit

Permalink
remove unnecessary concurrency in last commit
Browse files Browse the repository at this point in the history
  • Loading branch information
whyrusleeping committed Dec 2, 2014
1 parent c80a064 commit 6a5b1e8
Show file tree
Hide file tree
Showing 2 changed files with 4 additions and 14 deletions.
16 changes: 3 additions & 13 deletions exchange/bitswap/bitswap.go
Original file line number Diff line number Diff line change
Expand Up @@ -253,20 +253,10 @@ func (bs *bitswap) HasBlock(ctx context.Context, blk *blocks.Block) error {
bs.wantlist.Remove(blk.Key())
bs.notifications.Publish(blk)

var err error
wg := &sync.WaitGroup{}
wg.Add(2)
child, _ := context.WithTimeout(ctx, hasBlockTimeout)

This comment has been minimized.

Copy link
@btc

btc Dec 2, 2014

Contributor

the context can be shared between the two functions. no need to create one for each

This comment has been minimized.

Copy link
@whyrusleeping

whyrusleeping Dec 3, 2014

Author Member

No, it cant. The timeout would then apply to both operations instead of having a separate timeout for each. For example, if our timeout was five seconds and sendToPeersThatWant takes four seconds, that would only leave Provide with 1 second to perform its task.

This comment has been minimized.

Copy link
@btc

btc Dec 3, 2014

Contributor

Yeesh. Do these tend to take that long in practice?

And I think this was a miscommunication. I read hasBlockTimeout and interpreted that as the timeout for the hasBlock operation. However, if it's used for two sequential operations, the actual operation timeout is 2 * hasBlockTimeout. If this is the desired behavior, it needs a different name to avoid this confusion.

This comment has been minimized.

Copy link
@whyrusleeping

whyrusleeping Dec 3, 2014

Author Member

Yeah, the naming is probably bad (have I ever been good at naming things? I should probably work on that), or I could have done something like hasBlockTimeout / 2 for each suboperation.

As for this happening in practice, if a large number of peers want the block we have just received, then sendToPeersThatWant will take quite some time (sending a 1M block to 100 other peers 😢 ), and Its imperative that we do announce to the DHT we are providing the new block.

This comment has been minimized.

Copy link
@whyrusleeping

whyrusleeping Dec 3, 2014

Author Member

I should note that I have also seen the Provides call taking a very long time when the outgoing pipe is backed up.

This comment has been minimized.

Copy link
@jbenet

jbenet Dec 5, 2014

Member

I think hasBlockTimeout / 2 is the better route for this sort of thing. callers should just specify the total deadline time (not 0.5 the total).

As for this happening in practice, if a large number of peers want the block we have just received, then sendToPeersThatWant will take quite some time (sending a 1M block to 100 other peers 😢 ), and Its imperative that we do announce to the DHT we are providing the new block.

Can't these happen in parallel? tons of sequential IO here, we should be able to start announcing to the dht simultaneously, with the same deadline.

This comment has been minimized.

Copy link
@whyrusleeping

whyrusleeping Dec 5, 2014

Author Member

I tried that (look at the previous commit) and a lot of tests started failing for some reason. So I erred on the side of "less concurrency = probably better" and reverted the change.

This comment has been minimized.

Copy link
@jbenet

jbenet Dec 5, 2014

Member

we should figure out why though. these two things are basically separate operations and should be able to be concurrent. which commit?

This comment has been minimized.

Copy link
@whyrusleeping

whyrusleeping Dec 5, 2014

Author Member

look at the removals for the one we are commenting on

This comment has been minimized.

Copy link
@btc

btc Dec 6, 2014

Contributor
  1. Since most bitswap operations are best-effort, the only mechanism that can cause operations to terminate is a context cancellation. For example, GetBlock will hang (and wait for incoming blocks on network) until the context is cancelled or the block is received. It won't return earlier than this.
  2. Thus, to simulate sequences of events between multiple bitswap clients, it is necessary to use context cancellation.
  3. Doing so makes the tests sensitive to wall-clock timing.
  4. Increasing concurrency increases non-determinism. If a goroutine misses its chance to run (randomness of CPU scheduler), desired actions in the test may not go according to plan.
  5. Failures may occur despite correctness of code.
go func() {
bs.sendToPeersThatWant(child, blk)
wg.Done()
}()
go func() {
err = bs.routing.Provide(child, blk.Key())
wg.Done()
}()
wg.Wait()
return err
bs.sendToPeersThatWant(child, blk)
child, _ = context.WithTimeout(ctx, hasBlockTimeout)
return bs.routing.Provide(child, blk.Key())
}

// receiveBlock handles storing the block in the blockstore and calling HasBlock
Expand Down
2 changes: 1 addition & 1 deletion exchange/bitswap/bitswap_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -235,7 +235,7 @@ func TestSendToWantingPeer(t *testing.T) {
t.Logf("%v should now have %v\n", w.Peer, alpha.Key())
block, err := w.Blockstore.Get(alpha.Key())
if err != nil {
t.Fatal("Should not have received an error")
t.Fatalf("Should not have received an error: %s", err)
}
if block.Key() != alpha.Key() {
t.Fatal("Expected to receive alpha from me")
Expand Down

0 comments on commit 6a5b1e8

Please sign in to comment.