This repository has been archived by the owner on Apr 18, 2024. It is now read-only.

Improve resiliency around orchestrator.strn.pl #67

Open
3 tasks
lidel opened this issue Mar 27, 2023 · 2 comments

Comments

@lidel
Contributor

lidel commented Mar 27, 2023

iiuc multiple DoS vectors exist around cases when:

  • we run out of useful L1s
  • HTTP server running orchestrator is down (Amazon has hiccups every year..)
  • orchestrator.strn.pl has a hiccup (misconfiguration, attack) and returns too few, or no useful L1s

Each one of these events is unlikely, but the risks compound.
Some ideas for anticipatory action:

  • maintain a separate, append-only list of L1s returned by orchestrator, and use it as a backup when orchestrator is down, or does not provide any useful L1s
  • DNS idea 1: fallback to https://l1s.strn.pl or https://strn.pl
    • probably not the best idea, iiuc it always points at the closest L1, so using this could overload "best" L1s
  • DNS idea 2: use DNS-based orchestrator as a backup mirror of HTTP one
    • Instead of sending HTTP request to a centralized server, leverage DNS for caching and distribution.
    • Saturn has two domains, so we have two "DNS mirrors"
    • we could have all.l1s.strn.pl or nearby.l1s.strn.pl which returns more than one A record, replacing the need for the HTTP request https://orchestrator.strn.pl/nodes/nearby?count=9999

@guanzo thoughts on feasibility?
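
A minimal sketch of the first idea (the append-only backup list), to make the fallback behaviour concrete. Everything named here is an assumption for illustration: `fetch` stands in for whatever call hits the orchestrator, the backup is a plain text file with one L1 per line, and the "too few" threshold of 3 is arbitrary.

```python
from pathlib import Path

MIN_USEFUL = 3  # assumed threshold below which the orchestrator reply counts as "too few"


def load_backup(path):
    """Return every L1 recorded so far, oldest first."""
    p = Path(path)
    if not p.exists():
        return []
    lines = p.read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def remember(path, nodes):
    """Append L1s we have not seen before; existing entries are never removed."""
    seen = set(load_backup(path))
    with Path(path).open("a", encoding="utf-8") as f:
        for node in nodes:
            if node not in seen:
                f.write(node + "\n")
                seen.add(node)


def pick_l1s(fetch, path, min_useful=MIN_USEFUL):
    """Use the orchestrator reply when it is good enough, else top up from the backup.

    `fetch` is a hypothetical callable returning a list of L1 addresses.
    """
    try:
        nodes = list(fetch())
    except Exception:
        nodes = []  # orchestrator down or unreachable
    if nodes:
        remember(path, nodes)
    if len(nodes) >= min_useful:
        return nodes
    merged = list(nodes)
    for node in load_backup(path):
        if node not in merged:
            merged.append(node)
    return merged
```

The backup only grows, so it will accumulate dead L1s over time; a real client would still need to health-check whatever it reads from it.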

Other ideas how we can improve resiliency for the day Amazon fails?

cc @aarshkshah1992 @willscott @guanzo

@guanzo
Contributor

guanzo commented Mar 28, 2023

Idea: Saturn provides bootstrap nodes (could be L1s run by PL) that host a periodically updated list of L1s.

Smart clients like Caboose/serviceworker can hardcode the bootstrap node IP addresses in their code and fetch the list of L1s as a fallback.

That's similar to how the IPFS bootstrap nodes work, right? https://docs.ipfs.tech/how-to/modify-bootstrap-list/


DNS idea 2 is feasible. I like having 1 global list (ALL IPS), and 1 nearby list. If the nearby list returns 0 then you can fallback to the global list.
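
A rough sketch of the nearby-then-global lookup, assuming the two names proposed above (`nearby.l1s.strn.pl` and `all.l1s.strn.pl`) exist and return one A record per L1. Neither name is confirmed to be live; the resolver is passed in as a parameter so the fallback logic can be exercised without touching DNS.

```python
import socket

NEARBY = "nearby.l1s.strn.pl"  # hypothetical name from the proposal
GLOBAL = "all.l1s.strn.pl"     # hypothetical name from the proposal


def resolve_a(hostname):
    """Return the IPv4 addresses behind hostname, or [] if resolution fails."""
    try:
        infos = socket.getaddrinfo(
            hostname, 443, family=socket.AF_INET, type=socket.SOCK_STREAM
        )
    except socket.gaierror:
        return []
    # info[4] is the (address, port) pair; de-duplicate while keeping order.
    return list(dict.fromkeys(info[4][0] for info in infos))


def discover_l1s(resolve=resolve_a):
    """Try the nearby list first; if it returns 0 addresses, use the global list."""
    for name in (NEARBY, GLOBAL):
        ips = resolve(name)
        if ips:
            return ips
    return []
```

Calling `discover_l1s()` with no arguments uses the system resolver, so it benefits from whatever DNS caching sits between the client and the authoritative servers.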

@lidel
Contributor Author

lidel commented Mar 31, 2023

If possible, we should avoid hard-coding HTTP endpoints as "bootstrappers" – this is the very problem we are trying to solve :)

The libp2p bootstrappers are a bit different:

  • they are something we should rely on only during the first cold boot, until we learn about other peers.
  • Sidenote: bootstrappers use multiaddrs with a /p2p/peerid suffix, which makes them way more resilient than HTTP URLs. It may not matter much for the bootstrapper use case, but it matters for peering agreements: even when the IP is unreachable, or DNS is down, Kubo might be able to reach them if the LAN DHT or a local peer knows their alternative location

I think if we invest time in this, DNS idea 2 gives us the biggest return in resiliency, because of how DNS hierarchical caching works. And thanks to DoH, we can still use it in browser contexts: https://www.npmjs.com/package/dns-over-http-resolver

So we can still read orchestrator nodes over HTTP, but we no longer need to hardcode a specific URL: any DoH endpoint will work.
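
To illustrate the "any DoH endpoint will work" point, here is a sketch that tries a list of DoH endpoints in order and returns the first non-empty set of A records. It assumes the endpoints speak the JSON flavour of DoH (`application/dns-json`, as served by public resolvers such as https://cloudflare-dns.com/dns-query and https://dns.google/resolve), not the binary wire format, and that a name like `all.l1s.strn.pl` exists, which is still only a proposal.

```python
import json
import urllib.parse
import urllib.request


def doh_url(endpoint, hostname):
    """Build a JSON-style DoH query URL asking for A records."""
    return endpoint + "?" + urllib.parse.urlencode({"name": hostname, "type": "A"})


def parse_a_records(body):
    """Extract IPv4 addresses from a DoH JSON response (record type 1 is A)."""
    doc = json.loads(body)
    if not isinstance(doc, dict):
        return []
    return [
        answer["data"]
        for answer in doc.get("Answer", [])
        if answer.get("type") == 1 and "data" in answer
    ]


def resolve_via_doh(endpoints, hostname, timeout=5):
    """Ask each DoH endpoint in turn; the first one with answers wins."""
    for endpoint in endpoints:
        request = urllib.request.Request(
            doh_url(endpoint, hostname),
            headers={"Accept": "application/dns-json"},
        )
        try:
            with urllib.request.urlopen(request, timeout=timeout) as response:
                ips = parse_a_records(response.read())
        except (OSError, ValueError):
            continue  # endpoint unreachable or reply not valid JSON: try the next one
        if ips:
            return ips
    return []
```

Because the endpoint list is just data, a client can ship several independent resolvers and none of them is a single point of failure the way one hardcoded orchestrator URL is.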
