
Implement failover load balancing strategy #46

Closed
donovanmuller opened this issue Feb 24, 2020 · 0 comments · Fixed by #65
Labels: enhancement (New feature or request)
Milestone: 0.6

Comments

@donovanmuller (Contributor) commented:

As per the supported load balancing strategies in the initial design, a failover strategy should be implemented to ensure the guarantee stated there:

Failover - Pinned to a specified primary cluster until that cluster has no available Pods, at which point the next available cluster's Ingress node IPs will be resolved. When Pods are again available on the primary cluster, the primary cluster will once again be the only eligible cluster for which Ingress node IPs will be resolved.
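
For illustration only, here is a minimal sketch of the selection logic this guarantee implies (hypothetical types and function names, not the actual ohmyglb implementation):

package main

import "fmt"

// clusterStatus is a hypothetical helper type used only for this illustration.
type clusterStatus struct {
    name       string
    healthy    bool     // true if the Gslb host has available Pods on this cluster
    ingressIPs []string // Ingress node IPs of this cluster
}

// resolveFailover returns the primary cluster's Ingress node IPs while the
// primary is healthy; otherwise it returns the IPs of every other healthy cluster.
func resolveFailover(primary string, clusters []clusterStatus) []string {
    var fallback []string
    for _, c := range clusters {
        if c.name == primary && c.healthy {
            return c.ingressIPs // pinned to the primary while it has available Pods
        }
        if c.name != primary && c.healthy {
            fallback = append(fallback, c.ingressIPs...)
        }
    }
    return fallback
}

func main() {
    clusters := []clusterStatus{
        {name: "cluster-x", healthy: true, ingressIPs: []string{"10.0.1.10"}},
        {name: "cluster-y", healthy: true, ingressIPs: []string{"10.1.1.11"}},
    }
    fmt.Println(resolveFailover("cluster-x", clusters)) // [10.0.1.10]

    clusters[0].healthy = false
    fmt.Println(resolveFailover("cluster-x", clusters)) // [10.1.1.11]
}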

Scenario 1:

  • Given 2 separate Kubernetes clusters, X, and Y
  • Each cluster has a healthy Deployment with a backend Service called app, and that backend Service is exposed with a Gslb resource on both clusters as follows (a rough sketch of how these fields could map onto the CRD spec appears after this list):
apiVersion: ohmyglb.absa.oss/v1beta1
kind: Gslb
metadata:
  name: app-gslb
  namespace: test-gslb
spec:
  ingress:
    rules:
      - host: app.cloud.example.com
        http:
          paths:
            - backend:
                serviceName: app
                servicePort: http
              path: /
  strategy: failover
  primary: cluster-x
  • Each cluster has one worker node that accepts Ingress traffic. The worker node in each cluster has the following name and IP:
cluster-x-worker-1: 10.0.1.10
cluster-y-worker-1: 10.1.1.11
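
As a point of reference only, the strategy and primary fields above could map onto the Gslb CRD spec roughly as sketched below; the field name PrimaryGeoTag mirrors the wording of the commits that fixed this issue, and none of this should be read as a confirmed API:

// Hypothetical sketch of how the Gslb spec could carry the failover settings
// shown above; not the actual ohmyglb types.
package v1beta1

// GslbStrategySpec holds the load balancing settings of a Gslb resource.
type GslbStrategySpec struct {
    // Strategy selects the load balancing strategy, e.g. "roundRobin" or "failover".
    Strategy string `json:"strategy"`
    // PrimaryGeoTag names the cluster that is pinned while it has available Pods;
    // only meaningful when Strategy is "failover".
    PrimaryGeoTag string `json:"primary,omitempty"`
}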

When issuing the following command, curl -v http://app.cloud.example.com, I would expect the resolved IPs to be as follows (if this command were executed 3 times consecutively):

$ curl -v http://app.cloud.example.com # execution 1
*   Trying 10.0.1.10...
...

$ curl -v http://app.cloud.example.com # execution 2
*   Trying 10.0.1.10...
...

$ curl -v http://app.cloud.example.com # execution 3
*   Trying 10.0.1.10...
...

The resolved node IPs to which ingress traffic will be sent should be "pinned" to the primary cluster named explicitly in the Gslb resource above. Even though there is a healthy Deployment in cluster Y, the Ingress node IPs for cluster Y would not be resolved.

Scenario 2:

  • Same configuration as Scenario 1, except that the Deployment only has healthy Pods on one cluster, cluster Y; i.e. the Deployment on cluster X has no healthy Pods.

When issuing the following command, curl -v http://app.cloud.example.com, I would expect the resolved IPs to be as follows (if this command were executed 3 times consecutively):

$ curl -v http://app.cloud.example.com # execution 1
*   Trying 10.1.1.11...
...

$ curl -v http://app.cloud.example.com # execution 2
*   Trying 10.1.1.11...
...

$ curl -v http://app.cloud.example.com # execution 3
*   Trying 10.1.1.11...
...

In this scenario, only the Ingress node IPs for cluster Y are resolved, given that there is no healthy Deployment for the Gslb host on the primary cluster, cluster X. Therefore, the "failover" cluster(s) are resolved instead (cluster Y in this scenario).

Now, given that the Deployment on cluster X (the primary cluster) becomes healthy once again, I would expect the resolved IPs to be as follows (if this command were executed 2 times consecutively):

$ curl -v http://app.cloud.example.com # execution 1
*   Trying 10.0.1.10...
...

$ curl -v http://app.cloud.example.com # execution 2
*   Trying 10.0.1.10...
...

The primary cluster's Ingress node IPs are now resolved exclusively once again.

NOTE:

  • The design of the specification around how to indicate the primary cluster as described in this issue is solely for the purpose of describing the scenario. It should not be considered a design.
  • The existence of multiple "secondary" failover clusters should also be considered. For example, if there were 3 clusters (X, Y and Z) in Scenario 2 above, could the Ingress node IPs for both secondary clusters (Y and Z) be resolved, and if so, how (in terms of "load balancing") would the Ingress node IPs across those secondary/failover clusters be resolved? Would they use the default round robin strategy, or any strategy at all? One possible approach is sketched below.
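
One possible approach to that last point, for discussion only: while the primary is unhealthy, the Ingress node IPs of all healthy secondary clusters could be gathered and rotated round robin on each resolution. A hypothetical sketch follows (the address used for a cluster Z is made up for the example):

package main

import "fmt"

// rotate returns the candidate Ingress node IPs shifted round robin by the
// given resolution counter (hypothetical sketch, not a design).
func rotate(ips []string, rotation int) []string {
    if len(ips) == 0 {
        return ips
    }
    shift := rotation % len(ips)
    rotated := make([]string, 0, len(ips))
    rotated = append(rotated, ips[shift:]...)
    rotated = append(rotated, ips[:shift]...)
    return rotated
}

func main() {
    // Ingress node IPs of the healthy secondary clusters Y and Z while the
    // primary cluster X is down (the cluster Z address is illustrative).
    secondaries := []string{"10.1.1.11", "10.2.1.12"}
    for i := 0; i < 3; i++ {
        fmt.Println(rotate(secondaries, i))
    }
    // Prints [10.1.1.11 10.2.1.12], then [10.2.1.12 10.1.1.11], then the first order again.
}
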
@donovanmuller donovanmuller added the enhancement New feature or request label Feb 24, 2020
@donovanmuller donovanmuller added this to the 0.6 milestone Feb 24, 2020
ytsarev added a commit that referenced this issue Mar 17, 2020
* Extends `Strategy` CRD Spec
* Implements simple failover logic with respect to `PrimaryGeoTag`
* Associated test suite extension
* Resolves #46
ytsarev added a commit that referenced this issue Mar 18, 2020
* Extends `Strategy` CRD Spec
* Implements simple failover logic with respect to `PrimaryGeoTag`
* Associated test suite extension
* Resolves #46