
DNS based cross Gslb communication #30

Merged
ytsarev merged 4 commits into master from grab_external_records on Jan 21, 2020

Conversation

@ytsarev (Member) commented Jan 20, 2020

* We need to exchange information between multiple Gslb
  instances which are deployed to different clusters.
* Instead of exposing k8s or any other form of API, we
  can rely on DNS itself.
* We expose only the working IP addresses for a specific Gslb
  as an A record for the service `hostsz.$gslb.Name.$dnsZone`,
  which is created automatically by the operator.
* The data we expose is totally non-sensitive, so we simplify
  configuration: no service account tokens / TLS certificates
  or similar are required for Gslb information exchange.
* External Gslb enabled clusters are specified as a configuration
  environment variable in the operator deployment and abstracted
  as the `ohmyglb.extGslbClusters` value in the operator helm chart
  (see the sketch below).
* First and naive implementation of the `roundRobin` Gslb strategy.
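For illustration, a minimal sketch of how the operator could read that configuration; the environment variable name `EXT_GSLB_CLUSTERS` and the comma-separated format are assumptions made for this example, not taken from this PR:

```go
package gslb

import (
	"os"
	"strings"
)

// extGslbClusters returns the DNS endpoints of external Gslb enabled
// clusters, read from an environment variable set on the operator
// deployment (which the helm chart would render from the
// ohmyglb.extGslbClusters value). The variable name and the
// comma-separated format are illustrative assumptions.
func extGslbClusters() []string {
	raw := os.Getenv("EXT_GSLB_CLUSTERS")
	if raw == "" {
		return nil
	}
	var clusters []string
	for _, c := range strings.Split(raw, ",") {
		if c = strings.TrimSpace(c); c != "" {
			clusters = append(clusters, c)
		}
	}
	return clusters
}
```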

Example of this code working on a local cluster:

```
$ k -n test-gslb get dnsendpoints.externaldns.k8s.io -o yaml
...
  spec:
    endpoints:
    - dnsName: hostsz.test-gslb.example.com
      recordTTL: 30
      recordType: A
      targets:
      - 172.17.0.2
    - dnsName: app3.cloud.example.com
      recordTTL: 30
      recordType: A
      targets:
      - 172.17.0.2
      - 172.17.0.2
...
```

Here we observe the populated service `hostsz` entry and also the
extended target list for `app3.cloud.example.com` built by the `roundRobin`
strategy (the IPs are duplicates given the local testing scenario).
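For illustration only, a minimal Go sketch of the consuming side of this exchange, assuming the external cluster's DNS endpoint is reachable on plain 53/udp; the server address, record name and function names are placeholders rather than the actual operator code:

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// resolveExternalTargets asks an external Gslb cluster's DNS endpoint
// (plain 53/udp, no tokens or TLS needed since the data is non-sensitive)
// for the A record of the automatically created service hostname.
func resolveExternalTargets(server, host string) ([]string, error) {
	r := &net.Resolver{
		PreferGo: true,
		Dial: func(ctx context.Context, network, address string) (net.Conn, error) {
			d := net.Dialer{Timeout: 2 * time.Second}
			// Always dial the external cluster's DNS endpoint instead of
			// the local resolver.
			return d.DialContext(ctx, "udp", server)
		},
	}
	return r.LookupHost(context.Background(), host)
}

func main() {
	// Placeholder external DNS endpoint and record name.
	external, err := resolveExternalTargets("172.17.0.3:53", "hostsz.test-gslb.example.com")
	if err != nil {
		fmt.Println("external cluster unreachable, keeping local targets only:", err)
	}
	local := []string{"172.17.0.2"}
	// Naive roundRobin strategy: publish the union of local and external
	// targets in one A record and let DNS clients rotate among them.
	fmt.Println(append(local, external...))
}
```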

@ytsarev force-pushed the grab_external_records branch 4 times, most recently from 408ffbb to ca2f5f5 on January 20, 2020 17:07

@donovanmuller (Contributor) commented Jan 20, 2020

@ytsarev Given two clusters (A and B) with external Gslb enabled on both, and cluster A suffers network issues, what cleans up the records for the Gslb on cluster A so that no traffic is sent there?

I.e. if the Gslb controller cannot clean up due to catastrophic failure or network issues, how does garbage collection work so as to prevent stale records for the affected cluster (which, for argument's sake, cannot accept any ingress traffic)?

@ytsarev (Member, Author) commented Jan 20, 2020

@donovanmuller
Clusters perform a cross-check of each other.
Records are updated during each Gslb reconciliation.
If cluster B can't get anything from 53/udp of cluster A, then it contains only its own (B) targets in the A record.
Similarly, if cluster A is suffering from a network partition, it will 'think' that B is dead and will contain only its own records until the network connection recovers.

So as further implementation steps we need to think about:

  1. Enabling periodic reconciliation (currently it is based purely on reacting to in-cluster Events), or figuring out how to track external Event(s); see the requeue sketch below.
  2. Return only healthy targets for healthy services.

Speaking of 2), I think we can remove the special `hostsz` entry and append the extended targets directly to the matching ingress host FQDN on the external Gslb.
So the `app3.cloud.example.com` records on cluster A will be amended with the `app3.cloud.example.com` records of cluster B and vice versa. With this approach the backend service healthcheck is embedded.
What do you think?
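For point 1, a hedged sketch of how periodic reconciliation could be added via controller-runtime's requeue mechanism; the reconciler type name and the 30-second interval are illustrative assumptions, not taken from this PR:

```go
package gslb

import (
	"time"

	"sigs.k8s.io/controller-runtime/pkg/reconcile"
)

// ReconcileGslb is a stand-in for the operator's Gslb reconciler type.
type ReconcileGslb struct{}

// Reconcile still reacts to in-cluster Events, but additionally asks
// controller-runtime to requeue the request after a fixed interval so
// that changes on the external cluster (which produce no local Events)
// are eventually picked up.
func (r *ReconcileGslb) Reconcile(request reconcile.Request) (reconcile.Result, error) {
	// ... regular reconciliation of the Gslb and its DNSEndpoint would run here ...

	// Illustrative interval; RequeueAfter schedules the next reconciliation.
	return reconcile.Result{RequeueAfter: 30 * time.Second}, nil
}
```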

* Register per Gslb Ingress host `localtargets.*` records instead of the global `hostsz`
* The `localtargets.*` A record is populated only if the backend service is
  healthy
* Make Gslb return healthy records of the *external* Gslb even if the associated
  service in its own cluster is `Unhealthy/NotFound` (see the sketch below)
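A hedged sketch of the target selection described by this commit (function and parameter names are illustrative only): local targets appear only while the backend service is healthy, while healthy external targets are always kept, so the other cluster keeps serving traffic even when the local service is `Unhealthy/NotFound`.

```go
package gslb

// buildTargets assembles the A record targets for one Gslb host,
// e.g. app3.cloud.example.com.
//   localHealthy    - IPs of the local cluster's healthy backends
//                     (empty when the service is Unhealthy/NotFound)
//   externalHealthy - IPs resolved from the external cluster's
//                     localtargets.<host> record
func buildTargets(localHealthy, externalHealthy []string) []string {
	// Start with the local healthy targets (none if the local service is down).
	targets := append([]string{}, localHealthy...)
	// Always append the external cluster's healthy targets so traffic can
	// still be served from the other cluster.
	return append(targets, externalHealthy...)
}
```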
@ytsarev (Member, Author) commented Jan 20, 2020

@donovanmuller I've implemented 2) in 2bcff9f, please check it out. Not yet sure if we need 1) (scheduled reconciliation).

@donovanmuller (Contributor) left a comment


👍

@donovanmuller (Contributor) commented

@ytsarev understood, I like the updated implementation 👍

@ytsarev merged commit 3e2d8c2 into master on Jan 21, 2020
@ytsarev deleted the grab_external_records branch on January 21, 2020 09:07