On-chain Retrieval Expectations #861
Replies: 6 comments 4 replies
Most of this doesn't quite make sense to me, although I do think that some parts could after some iteration and clarification.

One problem you point to is that discussions about retrievability are imprecise. I can see that as a problem, but I don't see how consensus protocol changes are a great way to resolve it. Would the first step not be to upgrade the discussions to use precise concepts of bandwidth, latency, etc.? Similarly for clients negotiating storage deals: if the deal is to include expectations about bandwidth, latency, availability, etc., they should be explicit in the deal.

I mean "deal" in a very general way here. The built-in market actor specifically is very limited, and it's quite unfortunate that it is a built-in actor and currently holds a privileged position in data onboarding. FIP-0076 and others aim to resolve this so that user-programmed smart contracts are equally capable of brokering and holding the metadata for a deal. Such markets could define any terms they want, including expectations for retrieval frequency, bandwidth, etc. I can clearly see the value in an FRC defining some common standards around this across markets. (I would suggest that it directly address attributes like bandwidth and frequency rather than bundle them into a few categories, but I am far from expert in user needs here.) But change to the built-in market actor is expensive, disruptive, and limited. The linked FIP draft implies change to it by changing the DealProposal type that it deals in, but lacks detail around the necessary migrations etc.

Your point 2 identifies a problem that I'm not convinced needs to be solved (and certainly not at the consensus level). I think you're saying that a client needs to know about all of an SP's other deals with other clients in order to know if an SP can meet their obligations. This seems generally impossible, not solved elsewhere, and not much needed.
An Amazon/Google/Azure client has no insight into those services' other obligations, and it's unthinkable that they might know about all other customers. Whether the service meets client expectations is a question answered at the time clients request their data. I think it quite unlikely that any of these services is provisioned for all their clients to retrieve all data stored with them, even on a schedule that would be quite modest for any individual client and data. Can AWS export all their data in even a year? Who knows? The impossibility of knowing whether a service can meet some future obligations is not relevant to their clients.

I think it's impossible anyway, even given a public blockchain. SPs will be able to take on arbitrary obligations beyond those captured by any centrally mandated standard. If squeezed, one would expect them to honour the expectations that are most economically valuable to them, but those may not be the standardised ones.

I can see that discussion about retrieval is imprecise. I also agree that smart contracts acting as brokers, markets, or other deal-related things on chain should probably have some explicit encoding of expectations of retrieval (and other things). The problems I see are that those on-chain brokers don't exist or aren't doing this, and that even if they were, there would be value in a standardised representation. However, the built-in market isn't the right place to implement this (or anything, if we can avoid it). Almost everything in this discussion is an application-level concern to be addressed by user-programmed contracts and conventions among them.
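To illustrate what "explicit in the deal" might look like, here is a minimal Python sketch of a retrieval-terms record that a user-programmed market could attach to a deal. Every name, field, and unit below is hypothetical; none of it comes from the built-in market actor, FIP-0076, or the linked FIP draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievalTerms:
    """Hypothetical explicit retrieval expectations for one deal."""
    min_bandwidth_mbps: float        # sustained bandwidth the client expects
    max_first_byte_latency_s: float  # acceptable time to first byte
    min_availability: float          # fraction of requests that must succeed (0..1)

    def __post_init__(self):
        # Reject terms that cannot be meaningfully checked.
        if self.min_bandwidth_mbps <= 0:
            raise ValueError("bandwidth must be positive")
        if self.max_first_byte_latency_s <= 0:
            raise ValueError("latency bound must be positive")
        if not 0.0 <= self.min_availability <= 1.0:
            raise ValueError("availability must be in [0, 1]")


def meets_terms(terms: RetrievalTerms, bandwidth_mbps: float,
                first_byte_latency_s: float, availability: float) -> bool:
    """Compare observed retrieval measurements against the agreed terms."""
    return (bandwidth_mbps >= terms.min_bandwidth_mbps
            and first_byte_latency_s <= terms.max_first_byte_latency_s
            and availability >= terms.min_availability)


# Example: terms a client might negotiate (values are placeholders).
terms = RetrievalTerms(min_bandwidth_mbps=50.0,
                       max_first_byte_latency_s=2.0,
                       min_availability=0.99)
ok = meets_terms(terms, bandwidth_mbps=80.0,
                 first_byte_latency_s=1.2, availability=0.995)
```

The check happens when the client actually requests data, which matches the point above: whether a service meets expectations is answered at retrieval time, not by inspecting the provider's other obligations.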
i am not sure what this proposal aims at. i mentioned it in a now-deleted comment before: there is no course of action on L1 to enforce this. it is technically impossible to prove retrievability. as alex mentions here https://github.com/filecoin-project/FIPs/pull/862/files#r1405505256 there is a reason clients use off-chain agreements: they can, if needed, be litigated (let's be generous and classify whatever fil+ does as litigation...) to enforce them. if there is a way to enforce (incentivize/punish) retrievability on L1 in a mathematically sound way, i am happy to hear about it. everything i have seen so far is far away from what one would want to base an L1 consensus on. all this doesn't mean that i do not agree with the general idea behind this discussion - it for sure makes sense to have a discussion about how to categorize retrieval expectations. but i also agree with alex that the proposed tiers are way too unspecified in their current form. and i honestly do not see the community agreeing on tight enough specifications for this to be even an FRC. the pressing problem i see right now:
in short: this is all L2, not L1 (when i say L1 i mean the filecoin L1 we have and will have. there are for sure other projects that do something on their L1 in the data availability direction; the Binance storage chain thingy comes to mind here)
Related to this FIP and the discussion around the right places to push for improving the reliability of Filecoin retrievals, we're starting up a biweekly synchronous meeting to track active threads and keep ourselves accountable. If you're interested in participating, the meetings can be found at https://lu.ma/retrieval-wg
Have we considered adding the tiers to a user-deployed actor rather than a built-in actor? With the recent work from the FVM team towards actor upgrades, I think that gives a more flexible way to update tiers as new requirements evolve.
Have you guys watched my talk from IPFS Thing in Brussels? I have a proposal there for how to enforce and incentivize retrievals. Banyan will implement the above next year. But I think NONE of this info should be encoded into deal proposals at the actor level... we should do retrieval SLAs composably on the FVM... it should be an "added layer of agreement", not a "different deal".
Don’t over-engineer the consensus algo. This can be done at the application layer. Aggregators like Lighthouse are capable of serving retrievals just fine.
Background
There is no cryptographic protocol guaranteeing retrieval in the same way that there is for storage. As a consequence, the retrieval side of the Filecoin "storage+retrieval market" remains undeveloped.
This has led to two primary issues:
a. Fil+ discussions on retrievability do not specify what retrieval actually means.
b. The recent Fil+ proposal for the ac bot specifies retrieval via a "retrieval bot score" minimum, which does not specify the actual bandwidth provisioning expected of SPs.
Problem
There is no on-chain record of the retrieval characteristics of stored data. Without community buy-in, we reduce the ability of any L2 solution to succeed in promoting reliable retrieval on top of Filecoin.
Proposal
The full proposal is linked here.
We are proposing 5 'tiers' of retrieval that a deal is classified into:
This will reduce the implementation space, so that providers can provision against a small set of levels rather than guess at provisioning levels, given that most clients will not know a priori exactly what level they will need.
We encode these tiers in a deal proposal (or equivalent in a DDO world) so that there's a common location where clients and the network as a whole can understand an estimate of how much retrieval providers are signed up to provide based on their current storage load.
Including an initial tier estimate in the ask negotiation (which happens off chain, but is reflected in the need to include it in the proposal) means that SPs can better differentiate their pricing and charge against the bandwidth request associated with stored data.
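As a rough illustration of the "common location" idea, the sketch below estimates how much retrieval bandwidth each provider has signed up for, given the tier recorded on each deal. This is a minimal Python sketch under stated assumptions: the tier numbers, the per-TiB bandwidth figures, the `Deal` fields, and the provider IDs are all hypothetical placeholders and are not the tier definitions from the linked proposal.

```python
from collections import defaultdict
from dataclasses import dataclass

# HYPOTHETICAL mapping: tier -> Mbps of retrieval bandwidth per TiB stored.
# Placeholder values only; the real tier definitions live in the proposal.
TIER_MBPS_PER_TIB = {0: 0.0, 1: 0.5, 2: 1.0, 3: 10.0, 4: 100.0}


@dataclass(frozen=True)
class Deal:
    """Hypothetical minimal view of a deal carrying a retrieval tier."""
    provider: str    # provider ID (placeholder format)
    size_tib: float  # stored data size in TiB
    tier: int        # retrieval tier recorded in the deal proposal


def committed_bandwidth(deals: list[Deal]) -> dict[str, float]:
    """Sum the implied bandwidth commitment (Mbps) per provider."""
    totals: dict[str, float] = defaultdict(float)
    for deal in deals:
        totals[deal.provider] += deal.size_tib * TIER_MBPS_PER_TIB[deal.tier]
    return dict(totals)


deals = [
    Deal(provider="f01000", size_tib=10.0, tier=2),   # 10 * 1.0  = 10 Mbps
    Deal(provider="f01000", size_tib=5.0, tier=3),    # 5 * 10.0  = 50 Mbps
    Deal(provider="f02000", size_tib=100.0, tier=0),  # archival, no commitment
]
totals = committed_bandwidth(deals)
print(totals)  # f01000 totals 60.0 Mbps, f02000 totals 0.0 Mbps
```

A client or pricing tool could compare such a total against a provider's advertised capacity, though as the comments in this thread note, providers may carry obligations that are not recorded in any standard on-chain field.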