-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support http-path in index provider address #117
Comments
FWIW, the example provider shown above is running Curio.
|
@bajtos I have a question for understanding. |
Great question.
My understanding is different. That regexp is used to check if the index provider talks HTTP(s). I don't think we need any changes there. We are converting the multiaddr string to a URL in You can learn more about multiaddr and http-path here:
We must preserve the http-path segment and include it in the base URL. Test vector:
|
I see , thanks for clarifying. So if we convert a regular Addr field like for example: |
@bajtos is there a reason we never used the multiaddress javascript implementation to resolve multiaddresses? |
ContextThe error that the protocol is unsupported occurs while processing the next advertisement. It expects a string that matches the pattern Implementation PlanThere exists a dedicated javascript library for dealing with multiaddresses. We can use it to handle the different forms of multiaddresses and give us information about which protocols are used and what the values for these protocols are. There are multiple ways how to extract the information stored in the multiaddresses. One way is to use the function stringTuples, which yields an array of tuples containing the code for a certain protocol and the value for that protocol. You can find the codes for each protocol here. Similarly to how the function I am keen to hear your feedback on the proposed changes @bajtos @juliangruber @pyropy |
I think I understand now where the confusion comes from. The error message makes it look like the problem is in the code checking whether the address matches /^https?:///
However, that's not the real cause. Take a look here: piece-indexer/indexer/lib/ipni-watcher.js Lines 70 to 75 in 1d8ca94
If we cannot convert the multiaddress to URL ( You can think of enum ProviderAddress {
Supported { url: string },
Unsupported { multiaddr: string },
} All packed into a single JavaScript string variable, where the check In hindsight, it was not the best design decision on my side. To implement support for Essentially, we want to support two cases:
|
When I was looking into conversion of multiaddr into HTTP(S) URL, I reviewed existing npm packages and found them unsuitable to our needs: 1) how they handle non-http protocols like TCP 2) how easy it is to interpret parsing errors. Here are test-cases covering those requirements:
I did not find a I was looking at https://www.npmjs.com/package/@multiformats/multiaddr-to-uri, too. This is what they have in their README:
That's a problem for Spark, because an address like |
I think such solution would work, but it seems like a lot of extra work to me - wouldn't you have to write most parts of See #117 (comment) for my suggestion for a solution that would IMO require less work. |
Here's the structured GitHub comment with the second part in Bash formatting: @bajtos Thank you for your detailed answer! When I first examined how For example, consider the two cases that
If we only consider I attempted a quick integration of /dns/f03303347-market.duckdns.org/https/ Would be processed as usual, while the second part: http-path/%2Fipni-provider%2F12D3KooWJ91c6xQshrNe7QAXPFAaeRrHWq2UrgXGPf8UmMZMwyZ5 Contains the encoded path, which we can extract using This approach should be a minimal change and avoids the added complexity of introducing WDYT? |
Thank you for a detailed explanation. You are right, our current implementation in
I was thinking the same back in the day, that's how I discovered multiaddr-to-uri. (And found it's limitations.)
I like the idea to split the original multaddress at However, we still need to fix the current code parsing the protocol+hostname+port. It should support values like I see two tasks here:
Considering the original goal of this issue - support Curio multiaddr like In other words, I agree with your proposal 👍🏻 |
Sounds good, I created a POC for what we just discussed and all tests are passing including a new case for the |
Some IPNI index providers serve advertisements at a non-root location. We need to improve piece-indexer to support such providers.
Example provider:
https://cid.contact/providers/12D3KooWJ91c6xQshrNe7QAXPFAaeRrHWq2UrgXGPf8UmMZMwyZ5
Address in the multiaddr format:
Base URL we need to build from the above multiaddr:
Full advertisement URL:
https://f03303347-market.duckdns.org/ipni-provider/12D3KooWJ91c6xQshrNe7QAXPFAaeRrHWq2UrgXGPf8UmMZMwyZ5/ipni/v1/ad/baguqeerakqnjugtecgj5hhfmnh46pthk5k3vy6vf4bw3vpmwmudknlatun5a
Current piece-indexer ingestion status:
https://pix.filspark.com/ingestion-status/12D3KooWJ91c6xQshrNe7QAXPFAaeRrHWq2UrgXGPf8UmMZMwyZ5
The text was updated successfully, but these errors were encountered: