docs: clarify how to start a multi-node cluster #347

Closed
lanzafame opened this issue Mar 14, 2018 · 7 comments
Assignees
Labels
topic/docs Documentation

Comments

@lanzafame
Contributor

Just going to list out a few of the issues I hit and where I felt the docs fell down in assisting with troubleshooting them.

The first and main issue that I encountered was getting the ipfs-cluster-service nodes to be able to communicate with each other.

This was partly because I had incorrect expectations about how the cluster would form. I assumed that peers sharing the same CLUSTER_SECRET would auto-discover each other, as ipfs nodes do when they are on the same network, which these were.

Things that I found confusing about the cluster creation process:

  • The ipfs-cluster-service init command creates the config directory and the service.json file. There appears to be no way of providing an existing service.json to init, which with a docker container means you have to run the container so that it runs init, stop it, edit service.json, and then start it again. (Note: I was able to do this by mounting named volumes.)
  • The chicken-and-egg aspect of starting a cluster isn't very clear in the documentation. I found this really confusing because of the aforementioned false expectation about member auto-discovery based on the shared CLUSTER_SECRET. By 'chicken-and-egg' I mean the requirement to start one node first, then take that node's multiaddress and add it to the service.json of the other nodes as the bootstrap address (if doing dynamic cluster membership).
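
To make the chicken-and-egg step concrete, here is a minimal Python sketch of its second half: taking the first peer's multiaddress and writing it into another peer's configuration. The `cluster` and `bootstrap` key names and the example multiaddress are assumptions for illustration, not taken from this thread, so check them against the service.json that init generates.

```python
import copy


def with_bootstrap(config: dict, first_peer_addr: str) -> dict:
    """Return a copy of a service.json-style dict that bootstraps to first_peer_addr.

    Assumption: the bootstrap list lives under config["cluster"]["bootstrap"].
    """
    updated = copy.deepcopy(config)  # never mutate the caller's config
    cluster = updated.setdefault("cluster", {})
    bootstrap = cluster.setdefault("bootstrap", [])
    if first_peer_addr not in bootstrap:  # adding the same address twice is a no-op
        bootstrap.append(first_peer_addr)
    return updated


# Hypothetical multiaddress of the peer that was started first.
FIRST_PEER = "/ip4/192.0.2.10/tcp/9096/ipfs/QmFirstPeerIdPlaceholder"

second_peer_config = with_bootstrap({"cluster": {"secret": "<shared secret>"}}, FIRST_PEER)
```

The result would still need to be written back to the other peer's service.json before that peer is started.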

The other thing is that the documentation is split over several documents, and it isn't always clear why some topics are split and others are not.

@lanzafame lanzafame added the topic/docs Documentation label Mar 14, 2018
@hsanjuan
Collaborator

hsanjuan commented Mar 14, 2018

These are fair comments... init provides a default configuration for a single-peer cluster, which needs to be manually updated. In the case of docker, the configuration is expected to be mounted from a volume, as otherwise there is no control over it and it would be lost during restarts. But yeah, it's true it's not clear enough. The easiest way is outlined here: https://github.com/ipfs/ipfs-cluster/blob/master/docs/HOWTO_build_and_update_a_cluster.md, in which you bootstrap to a single cluster peer and can get away without touching the init-created config.

We have at least three fronts to address these issues:

Docs

  • Clarify how to configure a multi-peer cluster
  • Consolidate documentation (@lanzafame can you specify which several documents should be consolidated?)
  • We need more docs on how to run cluster in production. The cluster guide has too much information to be used as a checklist to bring up a cluster.
  • We need to re-visit the cluster guide and update some aspects about upgrading.

Initialization

  • Can we improve our init to make it more useful when creating a multi-peer cluster?
  • We use init to be similar to ipfs, but ipfs-cluster initialization requires more steps than init alone. Perhaps we should rename this command to config default or config --profile singlepeer, as the current name leads to confusion.
  • Tool to create cluster configurations for a whole cluster given the size and a template config? (and then the user can copy them?)
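
As a rough illustration of the last idea, a generator for a fixed peerset might look like the sketch below. This is hypothetical and not an existing ipfs-cluster tool. The `cluster` and `peers` key names and the placeholder multiaddresses are assumptions, and per-peer identity fields (IDs, keys) are deliberately left out.

```python
import copy


def generate_configs(template: dict, peer_addrs: list[str]) -> dict[str, dict]:
    """Build one config per peer from a shared template.

    Each peer's config lists every *other* peer's multiaddress.
    Assumption: the peer list lives under config["cluster"]["peers"].
    """
    configs = {}
    for addr in peer_addrs:
        cfg = copy.deepcopy(template)  # keep the template untouched
        cfg.setdefault("cluster", {})["peers"] = [a for a in peer_addrs if a != addr]
        configs[addr] = cfg
    return configs


# Placeholder addresses; real ones would include each peer's actual ID.
ADDRS = [f"/ip4/192.0.2.{i}/tcp/9096/ipfs/QmPeer{i}Placeholder" for i in (1, 2, 3)]
CONFIGS = generate_configs({"cluster": {"secret": "<shared secret>"}}, ADDRS)
```

The user would then copy each generated config to the matching machine, as suggested above.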

Autodiscovery

  • We have Ipfs-cluster with consul #158
  • We could also start using libp2p mDNS for host discovery before launching the cluster for the first time. We might need a custom mDNS service so that cluster peers do not mix with IPFS nodes or with cluster peers that use different secrets.

Let's discuss here a bit more about these things before I proceed to create issues...

@lanzafame
Contributor Author

I brain-dumped some thoughts/questions yesterday while trying to figure out where I had gone wrong in setting up the cluster. It may be worthwhile putting them in the FAQ, or not 😛:

Q: does ipfs-cluster-service live-reload configuration on changes to service.json?

Q: can a cluster use both a stable and dynamic cluster membership configuration?

Q: should there be a way to configure multiple peers at once, i.e. add a peer to all other cluster members? Associated with this, management of multiple ipfs-cluster-service nodes is rather cumbersome if there is no way of sending requests to them as a group <- sorry if this doesn't make sense, these are raw, minimally processed thoughts.

@hsanjuan
Collaborator

Q: does ipfs-cluster-service live-reload configuration on changes to service.json?

No

can a cluster use both a stable and dynamic cluster membership configuration?

It will fail to start. Either you bootstrap or you write the peers.
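
A pre-flight check for this rule could be sketched as follows. It only mirrors the statement above (bootstrap or peers, not both). The `cluster`, `bootstrap` and `peers` key names are assumptions, and this is not how ipfs-cluster-service itself validates its config.

```python
def membership_mode(config: dict) -> str:
    """Classify a service.json-style dict as 'bootstrap', 'peers' or 'single'.

    Raises ValueError when both lists are filled in, the case described
    above as failing to start. Key names are assumptions.
    """
    cluster = config.get("cluster", {})
    has_bootstrap = bool(cluster.get("bootstrap"))
    has_peers = bool(cluster.get("peers"))
    if has_bootstrap and has_peers:
        raise ValueError("set either 'bootstrap' or 'peers', not both")
    if has_bootstrap:
        return "bootstrap"
    if has_peers:
        return "peers"
    return "single"  # neither list set: a single-peer default config
```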

Should there be a way to configure multiple peers at once, i.e. add a peer to all other cluster members?

That's exactly what bootstrapping does...

@lanzafame
Contributor Author

Should there be a way to configure multiple peers at once, i.e. add a peer to all other cluster members?

That's exactly what bootstrapping does...

Sorry, I meant something like running the ipfs-cluster-ctl peers add ... command. As far as I understand, this only adds the new peer to the peers list of the node that ctl is talking to?

@lanzafame
Contributor Author

Consolidate documentation (@lanzafame can you specify which several documents should be consolidated?)
My thoughts are that the architecture and conceptual docs should be consolidated, with the operations docs as a separate set.
We need more docs on how to run cluster in production. The cluster guide has too much information to be used as a checklist to bring up a cluster.
We need to re-visit the cluster guide and update some aspects about upgrading.

Maybe, we could model the docs along the lines of the Hashicorp products, i.e. Consul? I find the sections they have are good for finding what you are after.

The format of the following list is Name of Consul section with link: ipfs-cluster equivalent

  • Installing: would be the several ways to get ipfs-cluster-service and ipfs-cluster-ctl
  • Upgrading: how to upgrade, what requires an upgrade, stuff about migrations
  • Internals: architecture and conceptual documents
  • Commands (CLI): ipfs-cluster-ctl documents
  • Agent: ipfs-cluster-service documents
  • Guides: operation documents, i.e. how to set up a multi-node cluster, network setup considerations, etc.
  • FAQ: faq

@hsanjuan
Collaborator

Sorry, I meant like running the ipfs-cluster-ctl peers add ... command, as far as I understand this only adds the new peer to the node's peers list that ctl is communicating to?

No, peer add adds a peer to the cluster (so to all peers), but I'm planning to remove peer add because it implies that your peer is up, which might involve divergent log entries, and it's a very good way to mess things up. That's why the docs don't mention peer add anymore and focus on bootstrap.

@nothingismagick

nothingismagick commented Mar 21, 2018

I have to agree with @lanzafame - the install process is not trivial and is full of gotchas. In my case I am quite new to Go, but not afraid of anything (I usually work in a well-linted, standard nodejs / yarn / webpack / vue / quasar / electron / cordova / mocha / chai environment). The highlight of my day was stumbling across "ipfs-completion.bash". How can that not be written in HUGE letters? Why do I have to post an issue to find out the "right" version of Go to use?

It seems that there are many assumptions about developer experience with other layers of the IPFS stack, and although I understand what I need (I think), I really have no idea where to look for resources that will get me where I want to go. Everything (except go-ipfs, I guess) is either experimental, subject to change or might be thrown out with the laundry... many of the links to examples (I am looking at you, IPLD) are either 2 years old or 404'd - the latter of which is ironic for a project where that shouldn't be possible if the dogfood tasted good.

Please, don't be offended. I really want to help! This is a HUGE project, and I am in it for the long run. However, I would like to recommend that there be a concerted effort across Protocol Labs properties to drop the barrier to entry with a visible, unified HOW-TO that starts at the beginning and follows a formula to get humans into it. I am not suggesting that it is important to explain what compilers are or how the internet works. That is a different audience. I think that the nodejs family of IPFS does a much better job of bootstrapping the user into a working environment - but that is something I generally find in node libs...

I think something like Consul would be good too, but extended across EVERYTHING with real-world examples of how to use IPFS, IPFS-CLUSTER and IPLD in harmony. (Speaking of harmony, wasn't orbitDB supposed to get integrated into the IPFS family too...?)

@ghost ghost assigned hsanjuan Apr 26, 2018
@ghost ghost added the status/in-progress In progress label Apr 26, 2018
@hsanjuan hsanjuan mentioned this issue Apr 27, 2018
@ghost ghost removed the status/in-progress In progress label May 1, 2018
3 participants