Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ipfs pin add does not fetch raw leaves and thus pins incomplete dag #3688

Closed
kevina opened this issue Feb 14, 2017 · 16 comments · Fixed by #3691
Closed

Ipfs pin add does not fetch raw leaves and thus pins incomplete dag #3688

kevina opened this issue Feb 14, 2017 · 16 comments · Fixed by #3691
Assignees
Labels
kind/bug A bug in existing code (including security flaws)
Milestone

Comments

@kevina
Copy link
Contributor

kevina commented Feb 14, 2017

Version information: ipfs version 0.4.6-dev 2e116b4

Type: Bug

Priority:P1

Description:

In local or offline mode:

$ random 1048576 56 > afile
$ ipfs add --pin=false --raw-leaves afile
added Qmcd9X534wgpJEojZ1Tukngsrt7ySo1sZ2bBNXuyfBiWcb afile
$ ipfs ls Qmcd9X534wgpJEojZ1Tukngsrt7ySo1sZ2bBNXuyfBiWcb
zb2rhZWKCMcHCfCym41ocwE5BhNQsEVLrwBkLEQaxJzRHHZch 262144 
zb2rhdUpCuyoQPRtvBW8JhAEfPaNd9dBb3QQCMMWYS1V8hjBR 262144 
zb2rhnwyvreekPqEMAbKqs35KYHxkNi7pXED7L2TfJKveTTva 262144 
zb2rhnvAVvfDDnndcfQkwqNgUm94ba3zBSXyJKCfVXwU4FXx2 262144 
$ ipfs block rm zb2rhZWKCMcHCfCym41ocwE5BhNQsEVLrwBkLEQaxJzRHHZch
$ ipfs pin add Qmcd9X534wgpJEojZ1Tukngsrt7ySo1sZ2bBNXuyfBiWcb
pinned Qmcd9X534wgpJEojZ1Tukngsrt7ySo1sZ2bBNXuyfBiWcb recursively
$ ipfs cat Qmcd9X534wgpJEojZ1Tukngsrt7ySo1sZ2bBNXuyfBiWcb
Error: failed to fetch all nodes
@kevina kevina self-assigned this Feb 14, 2017
@kevina kevina added the kind/bug A bug in existing code (including security flaws) label Feb 14, 2017
@kevina kevina added this to the ipfs 0.4.6 milestone Feb 14, 2017
@kevina
Copy link
Contributor Author

kevina commented Feb 14, 2017

I noticed this when working on #3671. I think know how to fix it so am assigning it to myself.

@whyrusleeping
Copy link
Member

This is because of that optimization on GetLinks right? (as you probably already know) we should check if we have the block before short circuiting out.

@kevina
Copy link
Contributor Author

kevina commented Feb 14, 2017

Correct. My plan is to change the Visit function in:

func FetchGraph(ctx context.Context, c *cid.Cid, serv DAGService) error {
	return EnumerateChildren(ctx, serv, c, cid.NewSet().Visit, false)
}

To actually retrieve the node from the dag service, because just because we have the link id, doesn't mean we have the node.

@kevina
Copy link
Contributor Author

kevina commented Feb 14, 2017

(as you probably already know) we should check if we have the block before short circuiting out.

Sometimes we don't care if we have the links locally, but when we call FetchGraph we most certainly do.

@kevina
Copy link
Contributor Author

kevina commented Feb 14, 2017

Okay the fix I had in mind didn't work out. This is going to take some time to fix correctly with out defeating the original purpose I had in mind for GetLinks (filestore related and still relevant).

@whyrusleeping
Copy link
Member

@kevina I think its probably fine for the linkservice rawnode optimization to require that we at least have the block. Sure it won't be quite as fast, but it solves this problem much more efficiently.

@Kubuxu
Copy link
Member

Kubuxu commented Feb 14, 2017

Average time of Has call to Datastore on our gateways and storage servers from recent metrics is 140 microseconds.

@whyrusleeping
Copy link
Member

Yeah, its much faster to call Has than get, so the optimization is still quite fast.

@kevina
Copy link
Contributor Author

kevina commented Feb 15, 2017

Can we please slow down. The linkservice is more than an optimization when it comes the interaction of the filestore. I will try to do a proper writeup tonight.

In the mean time I have one question. Do you want the garbage collector to fail if a node in the filestore is no longer accessible? It does not have to be that way.

@whyrusleeping
Copy link
Member

The garbage collector should not be able to tell that a missing node is in the filestore or not.

@kevina
Copy link
Contributor Author

kevina commented Feb 15, 2017

The garbage collector should not be able to tell that a missing node is in the filestore or not.

Let me put this another way. Do you want the garbage collector it to just abort if a leaf node of a pinned object is no longer available for some reason? Please note that it is perfectly safe for it to continue as a leaf node can not possible have any links so there is no risk of it removing something it should not because it could not see it.

This is a very important question that will effect the usability of the filestore (at least how I envisioned it) so I would really like to have several people's opinion on this before we proceed.

@whyrusleeping
Copy link
Member

For now, yes. Abort. But make that an easy enough thing to change.

@kevina
Copy link
Contributor Author

kevina commented Feb 15, 2017

@whyrusleeping would you mind if I refactor EnumerateChildren to preserve the indented behavior of the LinkService while still fixing this bug? It should also be able to eliminate the need for the bestEffort Boolean parameter. It should only take me a day or so.

@whyrusleeping
Copy link
Member

It really seems to me like the fix for this should be as simple as adding "if !blockservice.Has(c) { blockservice.Get(c) }" into the linkservice GetLinks optimization codepath. Can we push that as a fix, add a couple tests for it, and get that in today so that we can proceed with releasing this fix (and a few others)? You can do the refactor of enumerate children after that

@kevina
Copy link
Contributor Author

kevina commented Feb 15, 2017

Okay, I will push a simple fix to EnumerateChildren that does what you suggest I will than work on a more proper fix. Expect something in the next few hours, by the end of the day at the latest.

@kevina
Copy link
Contributor Author

kevina commented Feb 15, 2017

And it looks like #3598 fixed the issue by using EnumerateChildrenAsync instead of EnumerateChildren. I will create a p.r. to add a test case. No code changes required now. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug A bug in existing code (including security flaws)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants