[CRDT] Reconnecting a peer causes issues receiving deltas and sending out pins #798

Closed
lanzafame opened this issue May 28, 2019 · 6 comments
Labels
kind/bug A bug in existing code (including security flaws) status/in-progress In progress

Comments

@lanzafame
Contributor

Additional information:

  • OS: Linux
  • IPFS Cluster version: master
  • Installation method: built from source

Describe the bug:
Three-peer cluster, with two peers bootstrapped to the first peer. Shut down the 2nd peer and then start it again without --bootstrap, since the trusted peers' PeerInfo should now be in the peerstore and it will connect to those peers when it starts back up. This worked fine, as confirmed with a peers ls against the restarted peer. The issue occurred when pin operations against the first peer started.

The following is an excerpt of the logs from the restarted peer:

15:57:00.335 ERROR       crdt: error getting root delta priority: %s failed to get block for QmVGD6aZQsbjpLpNTWEnK57kYLLtBaFSn4k39hGMw1nzmq: context deadline exceeded crdt.go:413
15:57:00.336 ERROR       crdt: error getting delta: context deadline exceeded crdt.go:420
15:57:01.456 ERROR       crdt: error getting root delta priority: %s failed to get block for Qmaz5xpm5wgMg4ULiE9WtpQz6xdURK2Wuafyi88F1dLAVS: context deadline exceeded crdt.go:413
15:57:01.456 ERROR       crdt: error getting delta: context deadline exceeded crdt.go:420
15:57:02.559 ERROR       crdt: error getting root delta priority: %s failed to get block for QmdfJ7ouWeCa88BYRgrXkGk6kjtkCyC1PHRyeHdYiGitqY: context deadline exceeded crdt.go:413
15:57:02.559 ERROR       crdt: error getting delta: context deadline exceeded crdt.go:420

If you run ipfs-cluster-ctl add <file> from the out-of-sync peer, it attempts to pin on all the peers, but at the out-of-sync peer's height, and it doesn't succeed:

### out-of-sync peer
16:10:27.023  INFO    cluster: pinning QmZkDEwrfwHKfoqWg3jnqSVwEWiFCU9TRnnRM2Xe1xYNqh everywhere: cluster.go:1227
16:10:27.026  INFO       crdt: new pin added: QmZkDEwrfwHKfoqWg3jnqSVwEWiFCU9TRnnRM2Xe1xYNqh consensus.go:199
16:10:27.027  INFO       crdt: adding new DAG head: QmUCBQPpMvTymSsuwJPU8KMXveUGHpweD4aamjc4iuj96U (height: 1) heads.go:114
16:10:27.028  INFO      adder: QmZkDEwrfwHKfoqWg3jnqSVwEWiFCU9TRnnRM2Xe1xYNqh successfully added to cluster adder.go:163
16:10:27.036  INFO   ipfshttp: IPFS Pin request succeeded:  QmZkDEwrfwHKfoqWg3jnqSVwEWiFCU9TRnnRM2Xe1xYNqh ipfshttp.go:306
### bootstrap peer
15:56:20.426  INFO    cluster: pinning QmQTzvHwJZ8N3G8tQdiZ9ykGD3Nnxb5cyUh9tJRSXLeAmT everywhere: cluster.go:1227
15:56:20.454  INFO       crdt: new pin added: QmQTzvHwJZ8N3G8tQdiZ9ykGD3Nnxb5cyUh9tJRSXLeAmT consensus.go:199
15:56:20.462  INFO       crdt: replacing DAG head: QmQHzBCnuvuSvyZhwpK16XoggavjvKZ4ebX2wXsZccHRLT -> QmPC53f4uj5vT2sBicyFt6NWPLxAWUA4CdSHcKQqvTaBcx (**new height: 19**) heads.go:82
16:10:27.036  INFO       crdt: new pin added: QmZkDEwrfwHKfoqWg3jnqSVwEWiFCU9TRnnRM2Xe1xYNqh consensus.go:199
16:10:27.038  INFO       crdt: adding new DAG head: QmUCBQPpMvTymSsuwJPU8KMXveUGHpweD4aamjc4iuj96U (**height: 1**) heads.go:114
16:11:03.795 WARNI    cluster: metric alert for ping: Peer: 12D3KooWBZjhNyT26AdHmvgvWeuUWBCs7ChBTuawhX8niPCpjtq2. cluster.go:321
16:11:03.799 WARNI    cluster: metric alert for freespace: Peer: 12D3KooWBZjhNyT26AdHmvgvWeuUWBCs7ChBTuawhX8niPCpjtq2. cluster.go:321

It appears that the out-of-sync peer never recovers...
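
The height mismatch in the logs above (19 on the bootstrap peer, 1 on the out-of-sync peer) can be illustrated with a toy model. This sketch assumes that a new delta gets a height one greater than the tallest head the peer knows about, and a height of 1 when it knows none. That assumption is consistent with the logs but is not confirmed in this thread; the `head` type and `nextHeight` function are hypothetical and are not the CRDT store's API.

```go
package main

import "fmt"

// head is a hypothetical, simplified DAG head: a CID plus its height.
type head struct {
	cid    string
	height uint64
}

// nextHeight returns the height a new delta would get under the assumed rule:
// one more than the tallest known head, or 1 when no heads are known.
func nextHeight(heads []head) uint64 {
	var tallest uint64
	for _, h := range heads {
		if h.height > tallest {
			tallest = h.height
		}
	}
	return tallest + 1
}

func main() {
	// The bootstrap peer knows the head it logged at height 19.
	inSync := []head{{cid: "QmPC53f4uj5vT2sBicyFt6NWPLxAWUA4CdSHcKQqvTaBcx", height: 19}}

	// The out-of-sync peer never managed to fetch any delta, so it knows no heads.
	var outOfSync []head

	fmt.Println(nextHeight(inSync))    // 20
	fmt.Println(nextHeight(outOfSync)) // 1
}
```

Under this model, a peer that could not fetch any existing deltas starts a fresh branch at height 1 instead of building on the height-19 head, which matches what both peers log for the 16:10:27 pin.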

@lanzafame lanzafame added kind/bug A bug in existing code (including security flaws) need/review Needs a review labels May 28, 2019
@lanzafame
Contributor Author

After restarting the out-of-sync peer again, this time with the --bootstrap flag added back, it appears to function correctly, but it still logs the "error getting root delta priority" error.

@lanzafame
Contributor Author

Checking back after fixing my test setup: this one is still broken.

@lanzafame
Contributor Author

The two in-sync peers don't even think that the out-of-sync peer should be pinning any of the pins...

@lanzafame
Contributor Author

#792 should have made it so that a peer in the trusted peerset only has to bootstrap once when using CRDT consensus.

@hsanjuan
Collaborator

hsanjuan commented Jun 7, 2019

The two in-sync peers don't even think that the out-of-sync peer should be pinning any of the pins...

Not sure what that means. If status is returning unpinned, it is because the out-of-sync peer does not think it should be pinning anything.

Let's discuss this during standup. I am seeing the "getting root delta" error by default, and other peers cannot sync at all because they cannot even get the root, so things don't work for me.

@hsanjuan
Collaborator

hsanjuan commented Jun 7, 2019

OK, I found an issue (probably the issue).

hsanjuan added a commit that referenced this issue Jun 7, 2019
Bitswap needs to exist before connections are opened!

Fixes #798
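
The thread does not spell out why the ordering in the commit message matters. One plausible reading, sketched below as a toy model, is that a block exchange which learns about peers only through "peer connected" notifications never hears about connections opened before it was created. Everything here (`network`, `exchange`, `newExchange`, the peer name) is hypothetical and is not the libp2p or Bitswap API.

```go
package main

import "fmt"

// network is a toy event source that notifies listeners when a peer connects.
type network struct {
	listeners []func(peer string)
}

// notify registers a callback for future connections only.
func (n *network) notify(fn func(peer string)) {
	n.listeners = append(n.listeners, fn)
}

// connect opens a connection and informs the listeners registered so far.
func (n *network) connect(peer string) {
	for _, fn := range n.listeners {
		fn(peer)
	}
}

// exchange is a hypothetical block exchange that only asks peers it was told about.
type exchange struct {
	peers []string
}

// newExchange creates the exchange and subscribes it to connection events.
func newExchange(n *network) *exchange {
	e := &exchange{}
	n.notify(func(p string) { e.peers = append(e.peers, p) })
	return e
}

func main() {
	// Connections first, exchange second: the exchange never learns about peerA.
	n1 := &network{}
	n1.connect("peerA")
	late := newExchange(n1)

	// Exchange first, connections second: the exchange sees peerA.
	n2 := &network{}
	early := newExchange(n2)
	n2.connect("peerA")

	fmt.Println(len(late.peers), len(early.peers)) // 0 1
}
```

In the "late" case the exchange has no peers to request blocks from, so every fetch would simply wait until its deadline, which would be consistent with the timeouts reported at the top of this issue.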
@hsanjuan hsanjuan self-assigned this Jun 10, 2019
@hsanjuan hsanjuan added status/in-progress In progress and removed need/review Needs a review labels Jun 10, 2019