Kubernetes support? #19
The newer implementations of the Horizontal Pod Autoscaler (HPA 1.6+?) support scaling on Custom Metrics (k8s 1.8+ custom metrics), for which there are some preliminary implementations. I think one interesting one is the Prometheus Adapter: if I understand correctly, k8s can pull metrics from Prometheus, and then an HPA can be set to scale based on a metric in that set. Perhaps in this world the responsibility of Faktory would be to have a Prometheus exporter (à la oliver006/redis_exporter) that can send metrics to Prometheus. Since an exporter might live separately, I believe at a fundamental level the basic necessity would be an API in Faktory for exposing processing metrics. An obvious metric for scaling might be the size of the queues, but I have also found myself interested in the amount of time a job spends in the queue before processing, because I believe that's also an intelligent indicator of a need to scale up workers. Are there current ideas / plans for what internal metrics will be gathered / recorded? |
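For illustration, a rough sketch of how an HPA might consume such a metric once a Prometheus adapter is serving it through the custom metrics API. The metric name `faktory_queue_size`, the object names, and the threshold are placeholders, not anything Faktory exposes today:

```yaml
# Hypothetical HPA scaling a worker Deployment on a queue-size metric
# published by a Prometheus adapter. Names and thresholds are placeholders.
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: faktory-worker
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: faktory-worker
  minReplicas: 2
  maxReplicas: 20
  metrics:
  - type: Object
    object:
      target:
        kind: Service
        name: faktory
      metricName: faktory_queue_size
      targetValue: "500"
```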
Yeah, queue size and latency are easily implementable. What's the best approach to getting Kubernetes aware of Faktory? Are people using Helm? Would a Docker image or DEB/RPM binaries be most useful? |
@mperham Definitely a Docker image would enable this. |
We run Google Container Engine, a.k.a. GKE (managed Kubernetes), for all our services, which include Sidekiq workers. We have a situation where we may get high flash traffic (PDF rendering), for which we are going to use Google Cloud Functions, a.k.a. GCF, so we don't have to worry about scale. So Sidekiq handles typical jobs, and GCF handles high-scale/flash-traffic jobs like rendering. With the introduction of Faktory, I think one approach/architecture for pain-free scaling:
Faktory needs proper probes:
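As a rough illustration of what such probes could look like on the server container, assuming Faktory's default command port 7419 and web UI port 7420; the HTTP path is an assumption, not a documented health endpoint, and if the web UI requires a password a TCP check may be more appropriate:

```yaml
# Sketch only: liveness via the command port, readiness via the web UI.
livenessProbe:
  tcpSocket:
    port: 7419          # Faktory command/worker port
  initialDelaySeconds: 10
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /             # assumed; use tcpSocket if the web UI is password-protected
    port: 7420          # Faktory web UI port
  initialDelaySeconds: 5
  periodSeconds: 5
```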
Preparing for kubernetes:
Last but not least, consider an equivalent service to redislabs.com. We use GKE because we want to offload as much management as possible to focus on app development. For that reason, we didn't even try to deploy redis inside our cluster, but we used redislabs.com to deploy in the same region as our GKE cluster. Feel free to ping me via email if you want to dig into any of these areas. |
@rosskevin Thanks for the info, that's great detail. Running a Faktory SaaS is not in my immediate plans but like antirez and Redislabs, I've already had pings from people interested in partnering in building something and I'm always willing to chat: mike @ contribsys.com I'm still trying to wrap my head around what an enterprise-ready Faktory would look like and require; this helps a lot. |
@rosskevin @mperham A few notes to add about k8s support; let me know what you think and if I'm getting anything wrong. Faktory is the master process that also holds the RocksDB database. Remember that k8s might kill the Faktory pod at will, so there should be a Service with both the web and worker ports open, connected to the Faktory deployment (Line 52 in 01dadde).
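For illustration, a minimal Service exposing both ports might look like this; the `app: faktory` selector is an assumption about how the pod is labelled:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: faktory
spec:
  selector:
    app: faktory        # assumed pod label
  ports:
  - name: worker
    port: 7419
    targetPort: 7419
  - name: web
    port: 7420
    targetPort: 7420
```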
I am a heavy Helm user; it's pretty easy to set up a basic chart that will take care of setting up the Faktory server, but the workers might need to be a separate chart, so upgrading the worker chart won't require a redeploy of the Faktory server. Let me know if you need any help with PRs around these areas. |
I will look into configuration & deployment for OpenShift ... similar to Kubernetes ... |
Gonna close this because I'm not a fan of nebulous, open-ended issues. Please open a specific issue if there's something Faktory can do to make our k8s support better. |
@mperham I was going to open an issue regarding Kubernetes support, and I discovered you have one already open! Specifically, I'd like to see examples of Kubernetes YAMLs that people could copy, paste, and adapt. This would be especially cool to have for zero-downtime deploys of Faktory, as well as for Redis Gateway and replicated Faktory Pro. |
This could be a starting point. Couple of things to note:
A helm chart might be the right call for a more configurable and standardized deployment. |
Here's a helm chart helm/charts#13974 |
Here are the resources I used to get the Faktory server fully set up. I hope this can help someone else in the future, as @jbielick's example deployment YAML was a great help to me. A note about this configuration: we are using Datadog deployed as a DaemonSet on Kubernetes, so this setup will allow you to use the "pro metrics" statsd implementation. The references to DD_TRACE_AGENT_HOSTNAME are how we access the Datadog agent. Configs created:
Server Deployment:
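The original manifest isn't reproduced here; a minimal sketch of what such a server Deployment might look like, where the image tag, volume names, data path, and the Datadog host-IP env var are assumptions based on the description above:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: faktory
spec:
  replicas: 1
  strategy:
    type: Recreate            # single writer over the persistent volume
  selector:
    matchLabels:
      app: faktory
  template:
    metadata:
      labels:
        app: faktory
    spec:
      containers:
      - name: faktory
        image: contribsys/faktory:latest
        ports:
        - containerPort: 7419   # command port
        - containerPort: 7420   # web UI
        env:
        - name: DD_TRACE_AGENT_HOSTNAME   # assumed: reach the Datadog DaemonSet via the node IP
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP
        volumeMounts:
        - name: data
          mountPath: /var/lib/faktory
        - name: configs
          mountPath: /etc/faktory/conf.d
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: faktory-data
      - name: configs
        configMap:
          name: faktory
```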
Persistent Volume Claim:
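Again only a sketch; the claim name matches the Deployment sketch above, and size and storage class are placeholders:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: faktory-data
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi       # placeholder size
```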
This is a script for sending the HUP signal to the Faktory server to reset the cron when new jobs are added:
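The script itself isn't reproduced here; something along these lines would do it, assuming the server pod carries the label `app=faktory` and the faktory process is PID 1 in its container:

```sh
#!/bin/sh
# Find the Faktory server pod and send SIGHUP so it reloads /etc/faktory/conf.d,
# picking up newly added cron entries.
POD=$(kubectl get pods -l app=faktory -o jsonpath='{.items[0].metadata.name}')
kubectl exec "$POD" -- kill -HUP 1
```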
I hope this helps someone else with their setup! |
@dm3ch I saw that the incubator PR was closed. Did you end up adding your chart / repo to the Helm hub? I couldn't find it. Would you be opposed to me using some of the files from your PR to make a chart and publish it?

@ecdemis123 This is great. I assume this is a production setup? Glad you figured out the Datadog DaemonSet connection; I'll definitely be referencing that at some point to get it working correctly in one of our clusters. That said, I think I'll start on a helm chart, which could automate sending the HUP signal. |
Cheers! Yep, this is our production setup, however our staging setup is pretty much the same with the exception that it's running faktory in development mode. I'll be interested to see the helm chart. We aren't currently using helm but we hope to move onto it someday. |
Actually, that HUP command is not working to restart the cron. I suspect it's due to kubernetes/kubernetes#50345, since I'm using subPath to mount the conf.d files. Gonna dig more into it and see if I can come up with a workaround. |
I updated my deployment manifest above with a working version that will accept the HUP signal and update the cron schedule from a k8s ConfigMap. Everything seems to be working well in production now. |
@ecdemis123 You might consider writing the ConfigMap like so:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: faktory
data:
  cron.toml: |
    [[cron]]
      schedule = "*/5 * * * *"
      [cron.job]
        type = "FiveJob"
        queue = "critical"
        [cron.job.custom]
          foo = "bar"

    [[cron]]
      schedule = "12 * * * *"
      [cron.job]
        type = "HourlyReport"
        retry = 3

    [[cron]]
      schedule = "* * * * *"
      [cron.job]
        type = "EveryMinute"
  faktory.toml: ""
  test.toml: ""
```

Each key is a file name and its value the file's string contents. I found this pretty convenient when mounting to the pod:

```yaml
# ...
        volumeMounts:
        - name: faktory-configs
          mountPath: /etc/faktory/conf.d
# ...
      volumes:
      - name: faktory-configs
        configMap:
          name: faktory
```

Changes to the ConfigMap get pushed to the pod in less than a minute (10s in my case).
|
You might be able to write the statsd.toml in the init container and it won't get overwritten. That's kind of a tricky one :\ The reloading (sending the HUP signal from another container) might be doable if the pod shares its process namespace. I'll do an experiment and report back. This ability is mentioned here: https://kubernetes.io/docs/tasks/configure-pod-container/share-process-namespace/ |
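For reference, the approach the linked doc describes might look roughly like this. A sketch only: the sidecar image and its reload logic are placeholders, and with a shared process namespace the sidecar would signal the faktory process by name rather than by PID 1:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: faktory
spec:
  shareProcessNamespace: true   # containers in the pod can see each other's processes
  containers:
  - name: faktory
    image: contribsys/faktory:latest
  - name: config-reloader
    image: busybox
    # Placeholder: real logic would watch /etc/faktory/conf.d and run
    # `pkill -HUP faktory` when the mounted ConfigMap changes.
    command: ["sh", "-c", "while true; do sleep 3600; done"]
```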
Interesting. My ConfigMap was created from a file, not using a string, so I wonder if that would simplify the implementation a bit. Mine looks pretty similar to yours, but I'm not familiar enough with ConfigMaps to see any subtle differences. I like the idea of having a sidecar container to send the HUP signal. I added the HUP script referenced above to our deploy pipeline; it gets run manually when necessary.
|
I realized that I also could have posted our worker deployment YAML. It's just a basic implementation, and I've removed company-specific info.
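Roughly, such a worker Deployment might look like this; the application image and the FAKTORY_URL value are placeholders, assuming the workers reach the server through the `faktory` Service and that the client library reads FAKTORY_URL to locate it:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: faktory-worker
spec:
  replicas: 3
  selector:
    matchLabels:
      app: faktory-worker
  template:
    metadata:
      labels:
        app: faktory-worker
    spec:
      containers:
      - name: worker
        image: registry.example.com/myapp-worker:latest   # hypothetical app image
        env:
        - name: FAKTORY_URL
          value: tcp://faktory:7419    # points at the faktory Service
```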
|
🎉 Chart now available on the Helm hub 🎉

```sh
helm repo add adwerx https://adwerx.github.io/charts
helm install --name faktory adwerx/faktory
```

Datadog Agent HostIP support coming soon. |
Would it be useful if I PR'ed some example Kubernetes configs, or added them to the wiki? |
Either is ok, I’d prefer wiki. I can add a Kubernetes page if you’d like.
|
Sorry for the late reply. Unfortunately, I haven't yet had time to publish my chart in a separate repo. |
What should Faktory look like in a world full of Kubernetes? My understanding is that Kubernetes could be very useful in scaling worker processes as queues grow. How can Faktory make this easy?