add examples for interpod configurations #4557
Conversation
Force-pushed: b786c23 → feb8aea → bbd419e → c088e69
@bsalamat:
Thanks, @dhilipkumars. I have a couple minor suggestions.
@@ -206,6 +206,31 @@ If defined but empty, it means "all namespaces."
All `matchExpressions` associated with `requiredDuringSchedulingIgnoredDuringExecution` affinity and anti-affinity
must be satisfied for the pod to schedule onto a node.

#### More Practical Usage

Than directly using with pods Interpod Affinity and AnitAffinity will be more useful when they are used with higher
@bsalamat:
I would write this as:
Interpod Affinity and AntiAffinity can be even more useful when used with higher level collections such as ReplicaSets, StatefulSets, Deployments, etc.
#### More Practical Usage

Than directly using with pods Interpod Affinity and AnitAffinity will be more useful when they are used with higher
level objects such as Statefulset, Deployments, etc. One can easily configure and prefer if two workloads should
@bsalamat:
One can easily configure that a set of workloads should be co-located in the same defined topology, e.g., the same node.
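A minimal sketch of the kind of co-location rule being discussed here: a Deployment whose pods must land on nodes already running pods labeled `app=store`. All names and images below are illustrative, not taken from this PR's diff.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-server          # illustrative name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-store
  template:
    metadata:
      labels:
        app: web-store
    spec:
      affinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - store
            # co-locate with app=store pods on the same node
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: web-app
        image: nginx:1.12-alpine   # illustrative image
```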
|:--------------------:|:-------------------:|:------------------:|:------------------:|
| *DB-MASTER* | *DB-REPLICA-1* | *DB-REPLICA-2* | *DB-REPLICA-3* |

[Here](https://kubernetes.io/docs/tutorials/stateful-application/zookeeper/#tolerating-node-failure) is an example of zookeper statefulset configued with anti-affinity for high availablity.
@bsalamat:
s/configued/configured
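For context, the linked ZooKeeper tutorial enforces spreading with a pod anti-affinity rule along these lines (paraphrased, not part of this PR's diff; see the link for the authoritative manifest):

```yaml
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: "app"
          operator: In
          values:
          - zk
      # never schedule two zk pods onto the same node
      topologyKey: "kubernetes.io/hostname"
```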
Force-pushed: c088e69 → 3e35d35
@bsalamat Thanks for taking a look. Updated the PR with corrections, PTAL. I have also rebased it.
@bsalamat:
Thanks for addressing the comments.
#### More Practical Usage

Interpod Affinity and AnitAffinity can be even more useful when they are used with higher
level collections such as ReplicaSets, Statefulset, Deployments, etc. One can easily configure that a set of workloads should
@bsalamat:
s/Statefulset/StatefulSets
Interpod Affinity and AnitAffinity can be even more useful when they are used with higher
level collections such as ReplicaSets, Statefulset, Deployments, etc. One can easily configure that a set of workloads should
be co-located in the same defined topology, eg., the same node.
@bsalamat:
s/eg./e.g.
|:--------------------:|:-------------------:|:------------------:|
| *webserver-1* | *webserver-2* | *webserver-3* |
| *cache-1* | *cache-2* | *cache-3* |
@bsalamat:
I was thinking that it would be more helpful if we provided examples of ReplicaSet configurations for the web server and the cache here.
@dhilipkumars:
Sure, I think we want our users to use things like Deployments and StatefulSets directly instead of ReplicaSets. What do you think?
@bsalamat:
Sure. StatefulSets or Deployments are totally fine. The affinity/anti-affinity rules are the same for all of them, and that's the important piece of the config in this example. I have no particular preference about the type of the collection.
@dhilipkumars:
Thanks for the response, I'll update later today.
@@ -206,6 +206,31 @@ If defined but empty, it means "all namespaces."
All `matchExpressions` associated with `requiredDuringSchedulingIgnoredDuringExecution` affinity and anti-affinity
must be satisfied for the pod to schedule onto a node.

#### More Practical Usage
@bsalamat:
s/Usage/Use-cases
Force-pushed: 3e35d35 → 588bc64
@bsalamat PTAL.
@bsalamat:
Thank you! One more suggestion for the example.
  labels:
    app: web-store
spec:
  affinity:
@bsalamat:
If we go with the example of 3 web servers and 3 redis caches, the web server will need anti-affinity to `web-store` to make sure multiple pods of this ReplicaSet are not scheduled on the same node.
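A minimal sketch of the rule being requested, assuming the web server pods themselves carry the `app=web-store` label (an illustrative fragment of a pod template spec):

```yaml
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: app
          operator: In
          values:
          - web-store
      # keep replicas of this web server off nodes that already run one
      topologyKey: "kubernetes.io/hostname"
```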
@dhilipkumars:
IMHO, we may not need anti-affinity for `web-store` (even in production), as the default selector spreading in the scheduler might spread the workloads across the eligible nodes anyway; also, we may need more `web-store` replicas than the number of available nodes, depending on the traffic. Please let me know if you think it would still make sense to add it. (Plus, we are redirecting the reader to the ZooKeeper example with anti-affinity?)
@bsalamat:
Running multiple web servers on a single machine rarely makes sense, especially if the intent is to handle traffic load. Well-known web servers are multi-threaded, and a single process can use multiple CPU cores on a machine. Besides, there is a potential for port conflicts if you run multiple web servers to serve one service point.
The scheduler tries to spread pods, but users shouldn't assume that pods of a single collection will never land on a single machine without proper anti-affinity rules.
All that said, I agree with you that for the sake of this example, which is about co-location, we may want to keep the config simple. How about this:
- Change the number of instances of redis and web servers to 3 instances, which is the same as the number of nodes.
- Keep the rest of the config the same as it is now.
- Add a note at the bottom of the example to point out that if spreading of web servers or redis instances is desired, proper anti-affinity rules should be added, and refer to the next example for anti-affinity.
@dhilipkumars:
Agreed
@dhilipkumars:
Done, PTAL.
What I was trying to say is that stateless web servers can coexist on the same node if there is enough room for them, without hurting each other; they don't have to specifically prefer (using anti-affinity) not to be placed next to each other. On the other hand, stateful workloads such as DBs, KV stores, and caches should specify anti-affinity explicitly, as they need to be highly available.
metadata:
  labels:
    app: store
spec:
@bsalamat:
The spec needs an anti-affinity to 'store' to ensure that the pods are not scheduled on a single node.
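In concrete terms, assuming the redis pods carry the `app=store` label, the suggestion amounts to something like this (an illustrative fragment; container name and image are assumptions, not from the diff):

```yaml
template:
  metadata:
    labels:
      app: store
  spec:
    affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
            - key: app
              operator: In
              values:
              - store
          # spread app=store pods across distinct nodes
          topologyKey: "kubernetes.io/hostname"
    containers:
    - name: redis-server
      image: redis:3.2-alpine   # illustrative image
```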
@dhilipkumars:
Hmmm, I considered this too, but we were trying to emphasize `podAffinity`; in this example we are separately linking the user to a ZooKeeper example for `podAntiAffinity`. I was thinking we should keep this example simple and clear. We may very well add a sentence saying the best practice would be to add anti-affinity for the `app=store` workload. What do you think?
metadata:
  name: redis-cache
spec:
  replicas: 2
@bsalamat:
I think it would be a better example if we created the same number of cache instances as the web server instances. Given that this is a 3-node cluster, I would use 3 replicas for both the web server and the redis cache.
@dhilipkumars:
Sure, I did that first. But when I tried with only 2 `redis` replicas and 4 `web-store` replicas, it appeared to me that the affinity concept was better demonstrated, as `node-2` was ignored by the scheduler even though it had enough capacity. I felt it could make it easier for the reader to understand. What do you think?
Force-pushed: 588bc64 → c932208
Deploy preview ready! Built with commit 7525090 https://deploy-preview-4557--kubernetes-io-master-staging.netlify.com
web-server-1287567482-s330j 1/1 Running 0 7m 10.192.3.2 kube-node-2
```

Best practise is to configure these highly available stateful workloads such as redis with antiAffinity rules for more guaranteed spread, which we will see in the next section.
@bsalamat:
s/practise/practice
s/antiAffinity/AntiAffinity
s/more guaranteed spread/guaranteed spreading
Thanks, @dhilipkumars! Looks good. Just a minor comment.
Force-pushed: c932208 → 7525090
@bsalamat Thanks for your quick response, please take a look now.
Thanks! Please squash your commits. /lgtm
Commits:
* add examples for interpod configurations
* re-word and fix typo based on review comments
* explain podAffinity with examples
* review comments: make replicas 3 for both workload types
* Address final review comments
Add some illustrations for interpod affinity and anti-affinity, and link back to the ZooKeeper example.
As per the conclusion of the related issue, it was recommended to add more documentation in this section.