IPFS being offline causes cascades of errors and potential memory usage bubble #1733

hsanjuan · 2022-07-08T17:40:37Z

Our call to "ipfs pin ls" streams pins which we collect in a map.

If the request dies halfway (because IPFS dies), we end up with an incomplete map and nothing but an error in the logs.

If this happens during a regular RecoverAll() check, the code may conclude that a huge number of IPFS pins are missing. All those items then get queued for pinning (so they go into memory).

While that is going on, we also attempt to pin things. Every request we open to IPFS fails immediately, causing huge load and errors, while the queue keeps filling up and memory balloons.

Cluster should be aware when IPFS is not reachable (connection refused) and introduce some sort of delay / retry logic so that it cannot hammer a dead node the way it does now. The ipfsconnector is probably the best place for this logic, as it is the component that makes requests to IPFS and has common methods for that.

The problem with too many things being queued due to missing ipfs-pins entries in the pintracker is separate and involves surfacing and acting on pin-ls streaming errors, so that we abort StatusAll calls when they happen.

hsanjuan mentioned this issue Sep 15, 2022

Behaviour improvements when the ipfs daemon is unavailable #1762

Merged

hsanjuan added this to the Release v1.0.3 milestone Sep 15, 2022

hsanjuan closed this as completed in #1762 Sep 15, 2022
