TestFetchGraph failing randomly #4062

Closed
whyrusleeping opened this issue Jul 12, 2017 · 7 comments · Fixed by #4169
Labels
kind/bug A bug in existing code (including security flaws) topic/test failure Topic test failure

Comments

@whyrusleeping
Member

very occasionally we get a failure in TestFetchGraph: https://ci.ipfs.team/blue/organizations/jenkins/go-ipfs/detail/PR-4047/57/pipeline/

--- FAIL: TestFetchGraph (0.01s)
	merkledag_test.go:248: merkledag: not found
FAIL

This is likely a bug in FetchGraph or EnumerateChildren, and it definitely needs looking into.

@whyrusleeping whyrusleeping added kind/bug A bug in existing code (including security flaws) topic/test failure Topic test failure labels Jul 12, 2017
@Kubuxu
Member

Kubuxu commented Aug 9, 2017

@Stebalien
Member

This appears to be happening more often...

@Stebalien
Member

So, this is definitely a bug, not just a racy test. FetchGraph succeeds and then enumerating the graph fails. Possible locations:

  1. offline_ds.GetLinks or GetLinksDirect. The test case uses GetLinks, while FetchGraph uses GetLinksDirect. This seems unlikely.
  2. EnumerateChildrenAsync or EnumerateChildren. I'm betting it's EnumerateChildrenAsync (it's probably miscounting and not waiting for some subgraph to be enumerated).

@Stebalien
Member

So much for it being EnumerateChildrenAsync's fault. I've tried replacing EnumerateChildrenAsync (in FetchGraph) with EnumerateChildren and it still fails occasionally.

They could both be buggy...

@Stebalien
Member

It's not GetLinks versus GetLinksDirect either (not that surprising).

Also, I'm pretty sure the bug is not in EnumerateChildren as it's a simple recursive function.

@Stebalien
Member

So... bitswap is notifying bitswap sessions about blocks before adding them to the blockstore. We really need to clean up this logic.

(Actually, bitswap notifies the session about the block twice, once before adding it to the blockstore and once after, because bitswap is special.)
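
To make the ordering problem concrete, here is a minimal Go sketch. It is not the bitswap code: `toyStore`, `receiveBlock`, and `lookupOnNotify` are hypothetical names invented for illustration, and the session is modelled as a plain callback that reads the store the moment it is notified. Under those assumptions, notifying before storing reproduces a "not found" error, while storing first does not.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrNotFound stands in for the "merkledag: not found" failure in the test log.
var ErrNotFound = errors.New("not found")

// toyStore is a hypothetical in-memory stand-in for a blockstore.
type toyStore struct {
	mu     sync.Mutex
	blocks map[string][]byte
}

func newToyStore() *toyStore {
	return &toyStore{blocks: make(map[string][]byte)}
}

// Put records a block under key.
func (s *toyStore) Put(key string, data []byte) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.blocks[key] = data
}

// Get returns the block for key, or ErrNotFound if it was never stored.
func (s *toyStore) Get(key string) ([]byte, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	b, ok := s.blocks[key]
	if !ok {
		return nil, ErrNotFound
	}
	return b, nil
}

// receiveBlock models an incoming block (hypothetical helper). With
// notifyFirst set, the listener hears about the block before it is stored,
// which is the ordering described in the comment above.
func receiveBlock(s *toyStore, key string, data []byte, notifyFirst bool, notify func(key string)) {
	if notifyFirst {
		notify(key) // listener may read the store now and miss the block
		s.Put(key, data)
		return
	}
	s.Put(key, data) // store first, so any notified reader can find it
	notify(key)
}

// lookupOnNotify returns the error a listener sees if it reads the store
// as soon as it is notified.
func lookupOnNotify(notifyFirst bool) error {
	s := newToyStore()
	var err error
	receiveBlock(s, "block-a", []byte("data"), notifyFirst, func(key string) {
		_, err = s.Get(key)
	})
	return err
}

func main() {
	fmt.Println("notify first:", lookupOnNotify(true))
	fmt.Println("store first: ", lookupOnNotify(false))
}
```

In the real system the reader runs concurrently, so the failure is intermittent rather than guaranteed; the synchronous callback here just makes the bad interleaving deterministic.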

@Stebalien
Member

As a matter of fact, we had a race condition where we could end up returning a block to the user while bitswap was closing and never record the block in the blockstore.

Stebalien added a commit that referenced this issue Aug 24, 2017
…sessions.

fixes #4062 (yay!)

License: MIT
Signed-off-by: Steven Allen <steven@stebalien.com>

3 participants