CoreDNS provider ownership is broken with multiple A records of the same host #1414

Closed
ytsarev opened this issue Feb 10, 2020 · 4 comments
Labels: kind/bug (Categorizes issue or PR as related to a bug), lifecycle/rotten (Denotes an issue or PR that has aged beyond stale and will be auto-closed), provider/coredns

Comments

@ytsarev
Member

ytsarev commented Feb 10, 2020

Encountered in the context of k8gb-io/k8gb#38.

Setup context:

  • external-dns with the coredns provider and the txt registry backend
  • --txt-owner-id=ohmyglb (in practice this does not matter, as the problem also occurs with the default)
  • source == "CRD" (I believe the issue is source-agnostic, but I am specifying it just in case)

The issue itself:

  • Source DNSEndpoint CRD
apiVersion: externaldns.k8s.io/v1alpha1
kind: DNSEndpoint
....
spec:
  endpoints:
  - dnsName: localtargets.app3.cloud.example.com
    recordTTL: 30
    recordType: A
    targets:
    - 172.17.0.2
    - 172.17.0.3
    - 172.17.0.5
  - dnsName: app3.cloud.example.com
    recordTTL: 30
    recordType: A
    targets:
    - 172.17.0.2
    - 172.17.0.3
    - 172.17.0.5
  • Initially we observe the following in external-dns pod logs:
time="2020-02-10T16:59:53Z" level=info msg="Add/set key /skydns/com/example/cloud/app3/3f31b771 to Host=172.17.0.2, Text=\"heritage=external-dns,external-dns/owner=ohmyglb,external-dns/resource=crd/test-gslb/test-gslb\", TTL=30"
time="2020-02-10T16:59:53Z" level=info msg="Add/set key /skydns/com/example/cloud/app3/2c604519 to Host=172.17.0.3, Text=, TTL=30"
time="2020-02-10T16:59:53Z" level=info msg="Add/set key /skydns/com/example/cloud/app3/29a93fa8 to Host=172.17.0.5, Text=, TTL=30"
time="2020-02-10T17:00:53Z" level=info msg="Add/set key /skydns/com/example/cloud/app3/3f31b771 to Host=172.17.0.2, Text=\"heritage=external-dns,external-dns/owner=ohmyglb,external-dns/resource=crd/test-gslb/test-gslb\", TTL=30"
time="2020-02-10T17:00:53Z" level=info msg="Add/set key /skydns/com/example/cloud/app3/3f31b771 to Host=172.17.0.3, Text=, TTL=30"
time="2020-02-10T17:00:53Z" level=info msg="Add/set key /skydns/com/example/cloud/app3/3f31b771 to Host=172.17.0.5, Text=, TTL=30"
  • Notice that Text= is empty on the 2nd and 3rd A records in etcd.
  • During subsequent attempts external-dns "loses" ownership of its own entries, which causes overall operational trouble:
time="2020-02-10T17:01:53Z" level=debug msg="Skipping endpoint app3.cloud.example.com 30 IN A  172.17.0.2;172.17.0.3;172.17.0.5 [] because owner id does not match, found: \"\", required: \"ohmyglb\""
time="2020-02-10T17:01:53Z" level=debug msg="Skipping endpoint app3.cloud.example.com 30 IN A  172.17.0.5 [] because owner id does not match, found: \"\", required: \"ohmyglb\""
time="2020-02-10T17:01:53Z" level=info msg="Add/set key /skydns/com/example/cloud/app3/localtargets/7017c5bb to Host=172.17.0.2, Text=\"heritage=external-dns,external-dns/owner=ohmyglb,external-dns/resource=crd/test-gslb/test-gslb\", TTL=30"
time="2020-02-10T17:01:53Z" level=info msg="Add/set key /skydns/com/example/cloud/app3/localtargets/7017c5bb to Host=172.17.0.3, Text=, TTL=30"
time="2020-02-10T17:01:53Z" level=info msg="Add/set key /skydns/com/example/cloud/app3/localtargets/7017c5bb to Host=172.17.0.5, Text=, TTL=30"
time="2020-02-10T17:02:53Z" level=debug msg="Skipping endpoint app3.cloud.example.com 30 IN A  172.17.0.2;172.17.0.3;172.17.0.5 [] because owner id does not match, found: \"\", required: \"ohmyglb\""
time="2020-02-10T17:02:53Z" level=debug msg="Skipping endpoint app3.cloud.example.com 30 IN A  172.17.0.5 [] because owner id does not match, found: \"\", required: \"ohmyglb\""
time="2020-02-10T17:02:53Z" level=info msg="Add/set key /skydns/com/example/cloud/app3/localtargets/7017c5bb to Host=172.17.0.2, Text=\"heritage=external-dns,external-dns/owner=ohmyglb,external-dns/resource=crd/test-gslb/test-gslb\", TTL=30"
time="2020-02-10T17:02:53Z" level=info msg="Add/set key /skydns/com/example/cloud/app3/localtargets/7017c5bb to Host=172.17.0.3, Text=, TTL=30"
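
To illustrate why the empty Text= fields matter, here is a minimal sketch of an owner check over the three records written for app3.cloud.example.com. It is a simplified model based only on the label format and the "owner id does not match" messages visible in the logs above, not the actual external-dns code; the function names `parse_labels`, `owner_of` and `owned_targets` are hypothetical.

```python
def parse_labels(text: str) -> dict:
    """Parse a label string such as
    'heritage=external-dns,external-dns/owner=ohmyglb' into a dict.
    An empty string yields an empty dict."""
    labels = {}
    for part in text.strip('"').split(","):
        if "=" in part:
            key, value = part.split("=", 1)
            labels[key.strip()] = value.strip()
    return labels


def owner_of(text: str) -> str:
    """Return the owner id stored in the Text field, or '' if there is none."""
    return parse_labels(text).get("external-dns/owner", "")


def owned_targets(records, required_owner):
    """Keep only the hosts whose Text field carries the required owner id.
    `records` is a list of (host, text) tuples (an assumed, simplified shape)."""
    return [host for host, text in records if owner_of(text) == required_owner]


# Host/Text pairs taken from the "Add/set key" log lines above.
records = [
    ("172.17.0.2",
     "heritage=external-dns,external-dns/owner=ohmyglb,"
     "external-dns/resource=crd/test-gslb/test-gslb"),
    ("172.17.0.3", ""),
    ("172.17.0.5", ""),
]

print(owned_targets(records, "ohmyglb"))  # prints ['172.17.0.2']
```

Under this model only the first target passes the check; the other two report an owner of "", which matches the `found: ""` part of the debug messages.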

I did not find an obvious way to fix it in the code, but here are things I noticed that might help to track it

For now I will work around it by disabling ownership with the noop backend, since in my scenario CoreDNS is 'local', but this issue might be critical for other kinds of setup.
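
To check whether another setup is affected, one option is to scan the external-dns pod logs for "Add/set key" lines whose Text= field is empty. The following is a rough sketch that assumes the log format shown above; the regular expression and the function name `records_without_owner` are illustrative and not part of external-dns.

```python
import re

# Matches the "Add/set key ... to Host=..., Text=..., TTL=..." message
# as it appears in the log excerpts above (assumed format).
ADD_SET = re.compile(
    r"Add/set key (?P<key>\S+) to Host=(?P<host>[^,]+), "
    r"Text=(?P<text>.*), TTL=(?P<ttl>\d+)"
)


def records_without_owner(log_lines):
    """Return (etcd key, host) pairs that were written with an empty Text field."""
    missing = []
    for line in log_lines:
        # The logger escapes inner quotes as \" ; undo that before matching.
        match = ADD_SET.search(line.replace('\\"', '"'))
        if match and match.group("text").strip('"') == "":
            missing.append((match.group("key"), match.group("host")))
    return missing


# Three lines copied from the first sync in the logs above.
sample = [
    r'time="2020-02-10T16:59:53Z" level=info msg="Add/set key /skydns/com/example/cloud/app3/3f31b771 to Host=172.17.0.2, Text=\"heritage=external-dns,external-dns/owner=ohmyglb,external-dns/resource=crd/test-gslb/test-gslb\", TTL=30"',
    r'time="2020-02-10T16:59:53Z" level=info msg="Add/set key /skydns/com/example/cloud/app3/2c604519 to Host=172.17.0.3, Text=, TTL=30"',
    r'time="2020-02-10T16:59:53Z" level=info msg="Add/set key /skydns/com/example/cloud/app3/29a93fa8 to Host=172.17.0.5, Text=, TTL=30"',
]

for key, host in records_without_owner(sample):
    print(f"no ownership text: {key} -> {host}")
```

Any key reported by this scan is a candidate for the "owner id does not match" skip on a later run.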

ytsarev added a commit to k8gb-io/k8gb that referenced this issue Feb 10, 2020
* Disables ownership for local scenario
* Fixes #38
* Upstream issue details kubernetes-sigs/external-dns#1414
@ytsarev ytsarev changed the title from "Coredns provider ownership is broken with multiple A records" to "CoreDNS provider ownership is broken with multiple A records of the same host" Feb 10, 2020
ytsarev added a commit to k8gb-io/k8gb that referenced this issue Feb 11, 2020
* Disables ownership for local scenario
* Fixes #38
* Upstream issue details kubernetes-sigs/external-dns#1414
@njuettner njuettner added the provider/coredns and kind/bug labels Feb 12, 2020
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale label May 12, 2020
@ytsarev
Member Author

ytsarev commented May 12, 2020

This is something to retest with the fix from #1475. I will take care of it.

@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Jun 11, 2020
@ytsarev
Member Author

ytsarev commented Jun 28, 2020

#1475 appears to fix this issue as well: there is no loss of TXT ownership after the fix. Closing.

@ytsarev ytsarev closed this as completed Jun 28, 2020