
Fix #4: Adds files needed for deployment on Kubernetes #494

Closed · wants to merge 1 commit into from

Conversation

@Reza-Rajabi
Contributor

Reza-Rajabi commented Dec 7, 2019

Fixes #4

Type of Change: Enhancement

Description

This PR will hopefully close the Kubernetes-related issue, but there are some goals I couldn't achieve. We may reach them through your reviews and help on this PR, or we may open new issues for them. We could also keep #4 open. This pull request is not quite the complete picture I had expected for #4.

What is missing in this PR:

  • The 'localhost not showing' issue I had with the Dockerfiles is present here as well: I can't load the website's UI even though the backend is working.
  • I couldn't find a way to pass variables to the .yaml file, since it doesn't (I think) support variable substitution, so there are many magic numbers. Helm seems to offer a solution for this, but I found it bulky and hard to maintain. Kubernetes ConfigMap and Secret resources didn't seem to help with this concern, and my attempts with them failed, although I think we will need a Secret resource in our deployment later (a rough sketch of what that could look like follows this list).
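
As a rough sketch of the kind of Secret resource I mean (the resource name, key, and value below are placeholders, not something this PR adds):

# Hypothetical example only: a Secret holding a sensitive value,
# consumed by the telescope container as an environment variable.
apiVersion: v1
kind: Secret
metadata:
  name: telescope-secrets          # placeholder name
type: Opaque
stringData:
  SESSION_SECRET: change-me        # placeholder key and value
# The container spec in the Deployment would then reference it:
#   env:
#     - name: SESSION_SECRET
#       valueFrom:
#         secretKeyRef:
#           name: telescope-secrets
#           key: SESSION_SECRET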

This will add two files to the repo.

  • Dockerfile-deploy: This builds a more secure image, at about half the size of our current development image, which we can eventually deploy, for example, on Kubernetes.
  • telescope.yaml: This file deploys the Telescope app using the image built from the previous file together with a Redis image. It orchestrates 3 instances of the app alongside a Kubernetes Service resource in a cluster, so that we have a resilient, manageable, and seamlessly updatable deployment. Moreover, this file specifies the health check Kubernetes uses to verify that each instance of the app is healthy and to restart it if required (a condensed sketch of these pieces follows this list).
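
A condensed illustration of those pieces (the probe path and exact field values here are illustrative; the telescope.yaml in this PR is the source of truth):

# Condensed sketch only; see telescope.yaml in this PR for the real file.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: telescope
spec:
  replicas: 3                        # three instances of the app
  selector:
    matchLabels:
      app: telescope
  template:
    metadata:
      labels:
        app: telescope
    spec:
      containers:
        - name: telescope
          image: <DockerID>/telescope-deploy:latest
          ports:
            - containerPort: 3000    # port the backend listens on
          livenessProbe:             # Kubernetes restarts the instance if this fails
            httpGet:
              path: /health          # assumed health-check path
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 30
        - name: redis                # Redis runs alongside the app in the same pod
          image: redis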

Documents:

  • I have provided instructions in both files about working with them and about my assumptions. I will open an issue about updating the related documentation.

About test:

  • I think the only way to test this PR is to run the files, and I expect everyone will get results like those demonstrated in the images below:

Creating the image and pushing it to Docker Hub:
[screenshot]

Deploying to the cluster and watching the pods run:
[screenshot]

Checking that one instance of the app is running and performing its health checks:
[screenshot]

Checklist

  • Quality: This PR builds and passes our npm test and works locally
  • Tests: This PR includes thorough tests or an explanation of why it does not
  • Screenshots: This PR includes screenshots or GIFs of the changes made or an explanation of why it does not (if applicable)
  • Documentation: This PR includes updated/added documentation to user exposed functionality or configuration variables are added/changed or an explanation of why it does not (if applicable)

@Reza-Rajabi
Contributor Author

@raygervais, please check this PR out.

@Reza-Rajabi
Contributor Author

documenting for future use:
https://developer.github.com/v3/guides/delivering-deployments/

@raygervais
Contributor

Sorry for the delay, holidays + work is never a fun combo.

Looking into the docker-compose file and the Kubernetes declaration, this looks like a great starting point that Telescope can iterate on. Regarding your variable substitution, a hacky way I had done it in the past was to leverage sed/awk to substitute values and place the results in a temp directory which would house the production files (never to truly be checked in). A cleaner way, albeit with slightly more code, is to leverage Node's string interpolation so that you can have an npm command, say npm run deploy-production, which would create the final files with the correct values provided by a config.json, for example. I'd opt for this route since it's more tangible to those who don't know YAML / Kubernetes syntax. Just a thought that we can explore.

For your localhost issue, I'm trying to explore and help debug but am running into the following issues while trying to start the container you uploaded.

PS > docker run rare2r/telescope-deploy

> telescope@0.0.1 start /telescope
> node src/backend

[ 2019-12-15 16:42:23.858 ] ERROR (18 on 454aefbd8666): Unable to load certs/key.pem
    error: {
      "errno": -2,
      "syscall": "open",
      "code": "ENOENT",
      "path": "/telescope/certs/key.pem"
    }
Sun, 15 Dec 2019 16:42:23 GMT express-session deprecated req.secret; provide secret option at src/backend/web/routes/login.js:17:12
Warning: connect.session() MemoryStore is not
designed for a production environment, as it will leak
memory, and will not scale past a single process.
[ 2019-12-15 16:42:23.876 ] INFO  (18 on 454aefbd8666): Telescope listening on port 3000
[ioredis] Unhandled error event: Error: connect ECONNREFUSED 127.0.0.1:6379
    at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1129:14)
[ 2019-12-15 16:42:23.879 ] ERROR (18 on 454aefbd8666): Queue feed-queue error
    module: "queue:feed-queue"
    err: {
      "type": "Error",
      "message": "connect ECONNREFUSED 127.0.0.1:6379",
      "stack":
          Error: connect ECONNREFUSED 127.0.0.1:6379
              at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1129:14)
      "errno": "ECONNREFUSED",
      "code": "ECONNREFUSED",
      "syscall": "connect",
      "address": "127.0.0.1",
      "port": 6379
    }
[ 2019-12-15 16:42:23.879 ] ERROR (18 on 454aefbd8666): Queue feed-queue error
    module: "queue:feed-queue"
    err: {
      "type": "Error",
      "message": "Error initializing Lua scripts",
      "stack":
          Error: Error initializing Lua scripts
              at /telescope/node_modules/bull/lib/queue.js:155:30
              at processTicksAndRejections (internal/process/task_queues.js:93:5)
    }
[ 2019-12-15 16:42:23.880 ] ERROR (18 on 454aefbd8666): Queue feed-queue error
    module: "queue:feed-queue"
    err: {
      "type": "Error",
      "message": "connect ECONNREFUSED 127.0.0.1:6379",
      "stack":
          Error: connect ECONNREFUSED 127.0.0.1:6379
              at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1129:14)
      "errno": "ECONNREFUSED",
      "code": "ECONNREFUSED",
      "syscall": "connect",
      "address": "127.0.0.1",
      "port": 6379
    }
[ 2019-12-15 16:42:23.880 ] INFO  (18 on 454aefbd8666): Received UNHANDLED REJECTION, starting shut down
[ 2019-12-15 16:42:23.880 ] ERROR (18 on 454aefbd8666):
    err: {
      "type": "Error",
      "message": "connect ECONNREFUSED 127.0.0.1:6379",
      "stack":
          Error: connect ECONNREFUSED 127.0.0.1:6379
              at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1129:14)
      "errno": "ECONNREFUSED",
      "code": "ECONNREFUSED",
      "syscall": "connect",
      "address": "127.0.0.1",
      "port": 6379
    }
[ 2019-12-15 16:42:23.881 ] INFO  (18 on 454aefbd8666): Web server shut down.
[ 2019-12-15 16:42:23.882 ] INFO  (18 on 454aefbd8666): Feed queue shut down.
[ 2019-12-15 16:42:23.882 ] INFO  (18 on 454aefbd8666): Completing shut down.
npm ERR! code ELIFECYCLE
npm ERR! errno 1
npm ERR! telescope@0.0.1 start: `node src/backend`
npm ERR! Exit status 1
npm ERR!
npm ERR! Failed at the telescope@0.0.1 start script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.

npm ERR! A complete log of this run can be found in:
npm ERR!     /home/node/.npm/_logs/2019-12-15T16_42_23_953Z-debug.log

This output comes from the following commands I'm executing locally:

docker pull rare2r/telescope-deploy
docker run rare2r/telescope-deploy

Am I missing anything? I see the cert issue and am curious whether it's an upstream fix that your branch is missing, or a file that's missing on my own W10 system.

@humphd
Contributor

humphd commented Dec 15, 2019

@raygervais this might be pieces missing from our docker setup that only landed recently in doc form at https://github.com/Seneca-CDOT/telescope/blob/master/docs/Local_Login.md. cc @Grommers00, @agarcia-caicedo, @ragnarokatz, who all worked on parts of this, and might have more details.

The tl;dr of this is that we require SSL certs to be generated locally for the SSO login flow to work. If we haven't already, we'll have to deal with this somehow in the docker setup. In production, we'd want real certs (Let's Encrypt or the like) vs. self-signed ones we generate.
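
On the Kubernetes side, one way to handle this (a sketch only, not something this PR implements; file names other than key.pem are assumed) would be to keep the cert and key in a Secret and mount them where the backend already looks:

# Sketch only. Create a Secret from the cert files, e.g.:
#   kubectl create secret generic telescope-certs \
#     --from-file=key.pem --from-file=certificate.pem
# then mount it into the telescope container in the Deployment:
spec:
  template:
    spec:
      containers:
        - name: telescope
          volumeMounts:
            - name: certs
              mountPath: /telescope/certs   # the path the error log above points at
              readOnly: true
      volumes:
        - name: certs
          secret:
            secretName: telescope-certs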

@humphd
Contributor

humphd commented Dec 15, 2019

Also, our errors when things are missing (certs, redis not running, etc.) could be way, way more helpful. I know what's going on, looking at those stacks, but there's no way someone not living and breathing all this code will. We should be logging useful messages: "Looks like you don't have certs, see this doc for details...", "Looks like Redis isn't running, see this doc for info ..."

@raygervais
Contributor

Thanks for the clarification @humphd, will look into doing that so I can run it locally. Likewise error messages can always be improved :).

I discovered this during a quick Google search; perhaps a GitHub ticket to raise and address? Though we use Node, the concept is still valid, I'd think, compared to NGINX. https://www.humankode.com/ssl/how-to-set-up-free-ssl-certificates-from-lets-encrypt-using-docker-and-nginx

@humphd
Contributor

humphd commented Dec 15, 2019

RE: errors, it looks like more recent changes on master would do the right thing when the certs files are missing:

logger.error({ error }, 'Unable to load certs/key.pem');

The Redis error is #122, and still needs a proper fix.

@humphd
Contributor

humphd commented Dec 15, 2019

Filed #519 on certs for staging. Thanks for the link, @raygervais.

@Reza-Rajabi
Contributor Author

@raygervais It looks like I had a combo too 😀
The docker commands that you mentioned only run the image that you pulled. The thing is, that image is only the image of Telescope itself; to run the project we also need the Redis image. In the development version we achieve this by running the docker-compose file, which adds the Redis image to the project. In this version (deployment), however, I have included the Redis image in the Kubernetes orchestration in the YAML file. So to run the image, we can follow the comments from the telescope.yaml file of this pull request:

 # Change <DockerID> with the Docker Hub namespace (ID/username) in this file
 # To deploy project on Kubernetes we must have telescope image from `Dockerfile-deploy` file pushed on Docker Hub (instruction in `Dockerfile-deploy`).
 # We need `kubectl` installed, and for local simulation of the deployment, `minikube` installed and running.
 # deploy the app using following command
 # $ kubectl create -f telescope.yaml
 # to watch the pods as they gets created
 # $ kubectl get pods --watch
 # to see stdout of containers (telescope or redis) in one specific instance (pod)
 # $ kubectl logs <pod name as shown by above command> -c [telescope | redis]   ---> ex. $ kubectl logs telescope-7b567445fc-pn7cb -c telescope
 # finally access to the app which is running in Kubernetes cluster using this command
 # $ minikube service telescope

Essentially, we only need the telescope.yaml file (nothing else) and to change line 77 of it to

 image: rare2r/telescope-deploy:latest

to use the image that I pushed (or follow the instructions in the Dockerfile-deploy file to create a new one), and then run:

kubectl create -f telescope.yaml
kubectl get pods --watch
kubectl logs <pod name as shown by above command> -c telescope

This should show the same output on the command line as in the screenshots above. To get the localhost (which is not working), we need to run:

minikube service telescope

@Reza-Rajabi
Contributor Author

@raygervais @humphd using the image I pushed to Docker Hub (rare2r/telescope-deploy:latest) runs the Telescope image I made at the time of this pull request. I will make a new image that reflects the updates on master since then, test it using this pull request, and let you know if there are any issues with certs or anything else.

# to watch the pods as they gets created
# $ kubectl get pods --watch
# to see stdout of containers (telescope or redis) in one specific instance (pod)
# $ kubctl logs <pod name as shown by above command> -c [telescope | redis] ---> ex. $ kubectl logs telescope-7b567445fc-pn7cb -c telescope
Contributor Author

spelling error, change to kubectl

@naiuhz

naiuhz commented Dec 19, 2019

Hi @sukhbeersingh @kartik-budhiraja and @manekenpix ,
I'm responsible for setting up the Kubernetes cluster for Telescope per @humphd 's request.
However, I'm still new to using Kubernetes so please feel free to teach me the ropes.
Right now, I have Kubernetes tools and Minikube installed on ireland.cdot.systems, which I assigned as the Kubernetes master. I'm still in the process of acquiring additional servers as worker nodes.

So I have some questions for you guys related to Telescope and Kubernetes deployment:
Am I missing any files that are not in the Telescope repository (e.g. telescope.yaml, certs/key.pem)?
And what are the steps to deploy Telescope on the Kubernetes cluster? Is it just this?

$ kubectl create -f telescope.yaml
$ kubectl get pods --watch
$ kubectl logs <pod name as shown by above command> -c [telescope | redis] ---> ex. $ kubectl logs telescope-7b567445fc-pn7cb -c telescope
$ minikube service telescope

@Reza-Rajabi
Contributor Author

@naiuhz
You may not need Minikube since you have a Kubernetes cluster.
I am not sure we can deploy using these commands yet; I couldn't get localhost working using the files in this pull request.
At this point, you may try deploying the project using an image that you can build from the docker-compose file on the project's master branch and the Dockerfile-deploy from this pull request.
You may want to verify that this suggestion works for your purpose, because this pull request has not been reviewed yet.

@humphd
Contributor

humphd commented Dec 20, 2019

I filed #526 to help prepare for this as well.

@c3ho
Contributor

c3ho commented Jan 7, 2020


Hey @naiuhz, I think you need another step

You created a deployment with
kubectl create -f telescope.yaml

However, don't you need to create a service for it with the following step?
kubectl expose deployment telescope

Then finally run the service you created with
minikube service telescope

@humphd
Contributor

humphd commented Jan 9, 2020

Can we use GitHub's new package registry for our Docker images? https://help.github.com/en/github/managing-packages-with-github-packages/configuring-docker-for-use-with-github-packages.

I wonder if this would be better than Docker Hub, since it's per-account.

@humphd
Contributor

humphd commented Jan 9, 2020

Filed #530 on getting the domains we need setup.

@c3ho
Contributor

c3ho commented Jan 9, 2020

@humphd We can actually build the images locally into minikube using the Dockerfile we have, if we choose to go that route. Changing some settings in the yaml file will cause minikube to look for a local image instead of going straight to Docker Hub.

@humphd
Contributor

humphd commented Jan 9, 2020

For continuous deployment, it's more common to have Kubernetes pull pre-built images from a registry. Otherwise we have to build tooling on the deployment machine to get the code out of github, build it, build the Docker image, etc.

@c3ho
Contributor

c3ho commented Jan 9, 2020

Understood

@Reza-Rajabi
Contributor Author

Can we use GitHub's new package registry for our Docker images? https://help.github.com/en/github/managing-packages-with-github-packages/configuring-docker-for-use-with-github-packages.

I wonder if this would be better than Docker Hub, since it's per-account.

I think it is better if we use it.
It connects the repo to the Docker workflow anyway, because it uses docker commands (login, pull, push).

@c3ho
Contributor

c3ho commented Jan 9, 2020

@Reza-Rajabi were you able to get this running on minikube?

# specifications of resource
spec:
  selector:
    name: telescope
Contributor

I had to use app: telescope here, otherwise an endpoint wasn't being assigned to the service.
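
In other words, the matching that has to line up looks roughly like this (a sketch; the app: telescope label key comes from this suggestion, everything else is illustrative):

# Sketch: the Service only gets endpoints when its selector matches the
# labels on the Deployment's pod template.
apiVersion: v1
kind: Service
metadata:
  name: telescope
spec:
  selector:
    app: telescope        # must match...
  ports:
    - port: 3000
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: telescope
spec:
  selector:
    matchLabels:
      app: telescope
  template:
    metadata:
      labels:
        app: telescope    # ...these pod labels
    spec:
      containers:
        - name: telescope
          image: rare2r/telescope-deploy:latest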

Contributor Author

I didn't need to have a named endpoint. I tried to avoid any Kubernetes vocabulary that I thought was unnecessary. Why do you think we need a named endpoint?

Contributor

@c3ho Jan 12, 2020

I was never able to get the deployment and the service up and running through the yaml at the same time: the deployment I verified works, the service not so much. Whenever I tried using the service with minikube service telescope, I wouldn't be able to access the website.

Until this change, these were my steps to get it running:

  1. kubectl create -f telescope.yaml
  2. kubectl delete services telescope
  3. kubectl expose deployment telescope --type=NodePort
    or kubectl expose deployment telescope --type=LoadBalancer
  4. minikube service telescope

Contributor Author

I was not able to get localhost working although the backend was working. So if you could get localhost working with this change, I should change it as well.

nodePort: 30000
# This resource needs to communicate with outside of the cluster
# and share traffic between instances of app
type: LoadBalancer
Contributor

Although this "works" for minikube, we should change this to NodePort, since we don't actually have a load balancer

Contributor Author

This is a LoadBalancer and I have it here on purpose. I actually commented about the reason on the line right above: to share traffic between instances of the app. We have 3 replicas of the app (line 58 of this file) running at the same time.

Contributor

I think there's a difference between having replicas and actually load balancing. If you want to use the load balancing service, you need to use minikube tunnel as well (I think).

https://minikube.sigs.k8s.io/docs/tasks/loadbalancer/
https://blog.codonomics.com/2019/02/loadbalancer-support-with-minikube-for-k8s.html

@humphd
Contributor

humphd commented Jan 20, 2020

A question I had, and I wonder if I should file it: do we want to split our Node app up into two parts? Specifically:

  1. the API web server
  2. the feed parser and worker processes

The two of these are connected via Redis, but they don't need to live in the same process. I wonder if we'd be smart to divide them so that the web server can scale separately from the feed parser, which won't really need to grow/shrink due to client traffic.
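
If we did split them, the Kubernetes side could look roughly like this (the Deployment names, image names, and REDIS_URL variable below are placeholders for illustration only):

# Illustration only: two separate Deployments, both pointing at the same
# Redis Service, so the web tier can scale independently of the worker.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: telescope-web                            # placeholder name
spec:
  replicas: 3                                    # scales with client traffic
  selector:
    matchLabels:
      app: telescope-web
  template:
    metadata:
      labels:
        app: telescope-web
    spec:
      containers:
        - name: web
          image: <DockerID>/telescope-web:latest # placeholder image
          env:
            - name: REDIS_URL                    # placeholder variable name
              value: redis://redis:6379          # assumes a Service named "redis"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: telescope-worker                         # placeholder name
spec:
  replicas: 1                                    # feed parser doesn't scale with traffic
  selector:
    matchLabels:
      app: telescope-worker
  template:
    metadata:
      labels:
        app: telescope-worker
    spec:
      containers:
        - name: worker
          image: <DockerID>/telescope-worker:latest  # placeholder image
          env:
            - name: REDIS_URL
              value: redis://redis:6379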

@Reza-Rajabi
Contributor Author

A question I had, and I wonder if I should file it: do we want to split our Node app up into two parts? Specifically:

  1. the API web server
  2. the feed parser and worker processes

The two of these are connected via Redis, but they don't need to live in the same process. I wonder if we'd be smart to divide them so that the web server can scale separately from the feed parser, which won't really need to grow/shrink due to client traffic.

That's a very smart catch.
I think it would be a wise enhancement.
That said, I can imagine 3 scenarios:

  1. One instance of the worker and, for example, 3 instances of the server: then the worker may not be stable.
  2. For example, 3 instances each of the worker and the server: then both are stable, but it is almost the same as the current implementation.
  3. For example, 3 instances of the worker and as many instances of the server as the traffic requires (e.g. 5): here it makes sense to separate the worker from the server.

I hope I didn't get the question wrong.

@humphd
Contributor

humphd commented Jan 21, 2020

The way that the "worker" portion behaves, we won't ever want more than one instance. It co-ordinates work for parallel processor functions, each of which runs in its own process. So with a single "worker" instance, we can still parallelize via separate processes.

The web server could need to scale horizontally, and it doesn't have any issues that will prevent it from working if there is more than one instance.

I think what I'm hearing here is that I should probably file an issue to split things up. Agree? We can figure out the details there vs. derailing this PR.

@Reza-Rajabi
Contributor Author

I understand that we only need one worker. What I meant by more than one instance of the worker is that the health check of the worker may fail at any point, so having more instances running guarantees the stability of the deployment. As I mentioned, the first scenario seems to have reliability concerns.

@c3ho
Contributor

c3ho commented Jan 21, 2020

Here's a problem: if both are connected via Redis, don't we still need all of them inside one pod for the shared volume?

@manekenpix
Member

@c3ho I don't think they need to be inside one pod; as long as different pods can see each other and communicate, it should be fine.

@humphd
Contributor

humphd commented Jan 21, 2020

@c3ho as @manekenpix says, as long as they share the same network, Redis will be the common "state" between all of the various pieces in our app.

@Reza-Rajabi thanks for your clarifications above, I wasn't understanding your concern, but I get it now. I'm going to file another issue, and we can all figure out how best to divide things up, and how to do it safely.

Sorry to invade this PR. It just made me think about this issue as I read the code.

@humphd
Contributor

humphd commented Jan 21, 2020

I moved the discussion to #566.

@Reza-Rajabi
Contributor Author

As I said, it is brilliant. The separation improves the design even though it doesn't make the deployment lighter. And if high traffic occurs (i.e. scenario 3), this approach is definitely the best.

@manekenpix
Member

@Reza-Rajabi Thank you for this contribution to the project. Telescope has changed significantly in the last few months and we're going to approach container deployment in a slightly different way. This PR will be the base for the new approach, so we really appreciate your work on this issue. We hope we'll be able to count on you, if not as an active contributor, at least as a reviewer.
Thanks again.

@manekenpix manekenpix closed this Feb 3, 2020