Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Retrieval deal fails with "deal data transfer failed: incomplete response" - Client Side #5829

Closed
ethanwu155 opened this issue Mar 17, 2021 · 19 comments · Fixed by #5941

Comments

@ethanwu155
Copy link

Describe the bug
Client cannot get stored file on FileCoin from miner. Stored file has StorageDealActive Status.

Version (run lotus version):
lotus version 1.5.0+mainnet+git.191a05da4

To Reproduce
Steps to reproduce the behavior:

  1. Run lotus client retrieve --miner <minerID> <data CID> <output file>
  2. Get the following error:
ERROR: retrieval failed: Retrieve: Retrieval Error: error generated by data transfer: deal │ data transfer failed: incomplete response

Expected behavior
Expect to be able to retrieve stored file.

Logs

$ lotus client retrieve --miner f066596 bafyaa6qsgafcmalqudsaeiekkmiypfxq6vgdwepdafg6jyacye46gurjichuapbzxwqhoywq7yjaagelucbyabasgafcmalqudsaeif3e6tsyc7c3anlmobix3744duq7cyasjsmrsmbkdn4besizu5ffyjaagf6zcc36aykcqeaeghh3sbl6bzaqcaibaaeedt5zav7am output.mp4                                                                                       
> Recv: 0 B, Paid 0 FIL, ClientEventOpen (DealStatusNew)                                  
> Recv: 0 B, Paid 0 FIL, ClientEventDealProposed (DealStatusWaitForAcceptance)            
> Recv: 122 B, Paid 0 FIL, ClientEventBlocksReceived (DealStatusWaitForAcceptance)        
> Recv: 122 B, Paid 0 FIL, ClientEventDealAccepted (DealStatusAccepted)                   
> Recv: 122 B, Paid 0 FIL, ClientEventPaymentChannelAddingFunds (DealStatusPaymentChannelAllocatingLane)                                                                            
> Recv: 122 B, Paid 0 FIL, ClientEventLaneAllocated (DealStatusOngoing)                   
> Recv: 122 B, Paid 0 FIL, ClientEventDataTransferError (DealStatusErrored)               
> Recv: 0 B, Paid 0 FIL, ClientEventOpen (DealStatusNew)                                  
ERROR: retrieval failed: Retrieve: Retrieval Error: error generated by data transfer: deal data transfer failed: incomplete response 

Screenshots
See attachment

Additional context
None
Screen Shot 2021-03-17 at 3 23 16 PM

@kernelogic
Copy link

This is caused by two things:
#5821
#5409

The miner you tested have set unseal price to be non-zero. So it just go back to non-fast-retrieval, and unsealing failed.

@flyworker
Copy link

we tested both on 1.5.0 and 1.5.3 both have similar issue.

@bq1024
Copy link

bq1024 commented Mar 17, 2021

This is caused by two things:
#5821
#5409

The miner you tested have set unseal price to be non-zero. So it just go back to non-fast-retrieval, and unsealing failed.

We have set unseal price to zero but failed.

@bq1024
Copy link

bq1024 commented Mar 17, 2021

We have exactly the same issue.

Client Side:

$ lotus client retrieve  --miner f019104 --pieceCid=baga6ea4seaqn6mlouekcaghi574xvkshq4kdkk57yboy5focxatabfio4iytwii mAXCg5AIgcSJzsW/ogYWKQQaYZruCwaGaqGmWMSNKDOKbR3W5x5s test-retrieval
> Recv: 0 B, Paid 0 FIL, ClientEventOpen (DealStatusNew)
> Recv: 0 B, Paid 0 FIL, ClientEventDealProposed (DealStatusWaitForAcceptance)
> Recv: 0 B, Paid 0 FIL, ClientEventDealAccepted (DealStatusAccepted)
> Recv: 0 B, Paid 0 FIL, ClientEventPaymentChannelAddingFunds (DealStatusPaymentChannelAllocatingLane)
> Recv: 0 B, Paid 0 FIL, ClientEventLaneAllocated (DealStatusOngoing)
> Recv: 0 B, Paid 0 FIL, ClientEventDataTransferError (DealStatusErrored)
> Recv: 0 B, Paid 0 FIL, ClientEventOpen (DealStatusNew)
ERROR: retrieval failed: Retrieve: Retrieval Error: error generated by data transfer: deal data transfer failed: incomplete response

Miner Side:

Mar 17 17:38:36 miner4 lotus-miner[14143]: 2021-03-17T17:38:36.041Z        INFO        markets        loggers/loggers.go:30        retrieval provider event        {"name": "ProviderEventOpen", "deal ID": "15", "receiver": "12D3KooWLb3vya3V3ZU39jZRfVsTEY6KB93czht8AdVAF3pMCgqA", "state": "DealStatusNew", "message": ""}
Mar 17 17:38:36 miner4 lotus-miner[14143]: 2021-03-17T17:38:36.069Z        INFO        markets        loggers/loggers.go:30        retrieval provider event        {"name": "ProviderEventDealAccepted", "deal ID": "15", "receiver": "12D3KooWLb3vya3V3ZU39jZRfVsTEY6KB93czht8AdVAF3pMCgqA", "state": "DealStatusUnsealing", "message": ""}
Mar 17 17:40:34 miner4 lotus-miner[14143]: 2021-03-17T17:40:34.586Z        INFO        markets        loggers/loggers.go:30        retrieval provider event        {"name": "ProviderEventUnsealComplete", "deal ID": "15", "receiver": "12D3KooWLb3vya3V3ZU39jZRfVsTEY6KB93czht8AdVAF3pMCgqA", "state": "DealStatusUnsealed", "message": ""}
Mar 17 17:40:34 miner4 lotus-miner[14143]: 2021-03-17T17:40:34.597Z        INFO        dt-impl        impl/impl.go:370        resume channel 12D3KooWLb3vya3V3ZU39jZRfVsTEY6KB93czht8AdVAF3pMCgqA-12D3KooWH5PDdPxQsPSegABD8FUkSvcg7czGYWwWsj666aEAhyDx-15
Mar 17 17:40:34 miner4 lotus-miner[14143]: 2021-03-17T17:40:34.598Z        INFO        graphsync        responsemanager/responsemanager.go:481        response failed: %!w(traversal.SkipMe={})
Mar 17 17:40:34 miner4 lotus-miner[14143]: 2021-03-17T17:40:34.608Z        INFO        markets        loggers/loggers.go:30        retrieval provider event        {"name": "ProviderEventDataTransferError", "deal ID": "15", "receiver": "12D3KooWLb3vya3V3ZU39jZRfVsTEY6KB93czht8AdVAF3pMCgqA", "state": "DealStatusErrored", "message": "deal data transfer failed: incomplete response"}

@pooja
Copy link
Contributor

pooja commented Mar 18, 2021

Some related miner logs:
Screen Shot 2021-03-18 at 11 54 34 AM

@ethanwu155
Copy link
Author

ethanwu155 commented Mar 19, 2021

I upgraded lotus to 1.5.3rc and tried to run the retrieval again and the result is the same.

Screen Shot 2021-03-19 at 9 16 05 PM

Daemon Logs:

Screen Shot 2021-03-20 at 8 40 25 AM

@dirkmc
Copy link
Contributor

dirkmc commented Mar 22, 2021

It seems that the error is getting lost somewhere in the unsealing code.
First step is to have better error logging on unseal failure: #5851

@dirkmc
Copy link
Contributor

dirkmc commented Mar 22, 2021

We need to bubble up the fix for a better message than "incomplete response" from filecoin-project/go-data-transfer#145
I made a release of go-data-transfer with this fix in v1.2.9: filecoin-project/go-data-transfer#167

@dirkmc
Copy link
Contributor

dirkmc commented Mar 22, 2021

In go-fil-markets close the reader after unsealing completes, and check the result of the close to see if it returns an error: filecoin-project/go-fil-markets#507

@dirkmc
Copy link
Contributor

dirkmc commented Mar 22, 2021

Release markets v1.2.0: filecoin-project/go-fil-markets#509
Next we need to bubble up to lotus and test out the release manually: #5853

@pooja
Copy link
Contributor

pooja commented Mar 22, 2021

Hey team, adding a couple other details:

  • The storage deal was made with an offline data transfer
  • The miner was storing an unsealed copy of the data. Not sure if an unseal was triggered. Have asked the team to chime in here!

@kernelogic
Copy link

@dirkmc it's pretty easy to reproduce - let me know if there's a release that can be tested with I'll be happy to do it.

@dirkmc
Copy link
Contributor

dirkmc commented Mar 22, 2021

Thanks, that would be great - we're testing out the release and once it's ready we'd love for you to test it.
I'm not sure that the change I've made will fix the bug, as it may be on the unsealing side. But it should at least give more information in the logs.

@jamerduhgamer
Copy link

Looks like unsealing did occur even though we had an unsealed copy of the data. Pictures of the log from the miner below.
Deal ID 4 Unseal
Deal ID 4 Unseal Complete
Graphsync ReceiveError

@jamerduhgamer
Copy link

This is the corresponding journal entry for the retrieval deal above.
Lotus Journal for DealID 4.txt

@jennijuju
Copy link
Member

assigned as blocked b/c according to #5829 (comment) May be fixed the new market release

@kernelogic
Copy link

@f8-ptrk stored a deal from me with fast-retrieval, he also set the unsealing price to 1 FIL.
When I do the retrieval with him, guess what, it immediately failed with the "incomplete response" message and he obtained 1 FIL from me.

@dirkmc if you think it is just a confusing message, no unsealing happened, I don't know how to explain the fee then

lotus client retrieve --miner f062353 bafykbzacebcilfo3sfu4o6re5lz7zh66xdz6la74nmp7y7idcofeysuvnb7p2 /mnt/ipfs/CC-MAIN-2018-47-index.tar.05

Recv: 0 B, Paid 0 FIL, ClientEventOpen (DealStatusNew)
Recv: 0 B, Paid 0 FIL, ClientEventDealProposed (DealStatusWaitForAcceptance)
Recv: 0 B, Paid 0 FIL, ClientEventUnsealPaymentRequested (DealStatusAccepted)
Recv: 0 B, Paid 0 FIL, ClientEventPaymentChannelAddingFunds (DealStatusPaymentChannelAllocatingLane)
Recv: 0 B, Paid 0 FIL, ClientEventLaneAllocated (DealStatusOngoing)
Recv: 0 B, Paid 0 FIL, ClientEventPaymentRequested (DealStatusFundsNeeded)
Recv: 0 B, Paid 0 FIL, ClientEventSendFunds (DealStatusSendFunds)
Recv: 0 B, Paid 0 FIL, ClientEventVoucherShortfall (DealStatusCheckFunds)
Recv: 0 B, Paid 0 FIL, ClientEventPaymentChannelAddingFunds (DealStatusPaymentChannelAddingFunds)
Recv: 0 B, Paid 0 FIL, ClientEventPaymentChannelReady (DealStatusOngoing)
Recv: 0 B, Paid 0 FIL, ClientEventPaymentRequested (DealStatusFundsNeeded)
Recv: 0 B, Paid 0 FIL, ClientEventSendFunds (DealStatusSendFunds)
Recv: 0 B, Paid 1 FIL, ClientEventPaymentSent (DealStatusOngoing)
Recv: 0 B, Paid 1 FIL, ClientEventDataTransferError (DealStatusErrored)
Recv: 0 B, Paid 0 FIL, ClientEventOpen (DealStatusNew)
ERROR: retrieval failed: Retrieve: Retrieval Error: error generated by data transfer: deal data transfer failed: incomplete response

https://filfox.info/en/message/bafy2bzaced7csrimf3r5kds6cmz3zw6o5pdaal6qw6gxtpa5bnhhktdimwerw

@kernelogic
Copy link

Re-tested retrieval on my own miner for both unseal price is zero (successful) and non-zero (failed). In the log there's some weirdness about unsealing (shouldn't even happen, deal is fast-retreival), also some weirdness about cancel-transfer saying channel not found.

lotus -v
lotus version 1.5.3+mainnet+git.358773e2b

unsealprice_nonzero_failure.txt
unsealprice_zero_success.txt

@dirkmc
Copy link
Contributor

dirkmc commented Mar 31, 2021

Likely root cause: filecoin-project/go-fil-markets#528

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants