DNS does not work in containers if host uses local server #3277

nertpinx · 2019-06-07T14:41:53Z

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

Steps to reproduce the issue:

1. Be in a network that prohibits external DNS queries, disable external DNS communication, or just use a hostname that is only available locally in step 3.
2. Set up a local DNS server/forwarder (e.g. systemd-resolved) so that the local address is in /etc/resolv.conf.
3. Start any container (without --network host) and try to resolve a hostname (e.g. podman run --rm -it fedora curl -v ifconfig.me).

Describe the results you received:
curl: (6) Could not resolve host: ifconfig.me

Describe the results you expected:
No error (some IP address)

Additional information you deem important (e.g. issue happens only occasionally):
The contents of /etc/resolv.conf inside the container are:

search virt
nameserver 8.8.8.8
nameserver 8.8.4.4
nameserver 2001:4860:4860::8888
nameserver 2001:4860:4860::8844
nameserver 10.0.2.3
options edns0

This would normally work (although I might not want to send my DNS requests elsewhere, because I might have services that are only available in a local network). However, I am in a network that prohibits external DNS queries, so it fails.

If I leave just the slirp4netns nameserver there (echo nameserver 10.0.2.3 >/etc/resolv.conf), it works in the VM where I am trying to reproduce this issue. However, on my original host, where I discovered this, 10.0.2.3 is still inaccessible (even though the version and command line of slirp4netns are identical, apart from the PID argument).

Output of podman version:

Version:            1.3.1
RemoteAPI Version:  1
Go Version:         go1.12.2
OS/Arch:            linux/amd64

Output of podman info --debug:

debug:
  compiler: gc
  git commit: ""
  go version: go1.12.2
  podman version: 1.3.1
host:
  BuildahVersion: 1.8.2
  Conmon:
    package: podman-1.3.1-1.git7210727.fc30.x86_64
    path: /usr/libexec/podman/conmon
    version: 'conmon version 1.12.0-dev, commit: c9a4c48d1bff85033b7fc9b62d25961dd5048689'
  Distribution:
    distribution: fedora
    version: "30"
  MemFree: 2884521984
  MemTotal: 4133556224
  OCIRuntime:
    package: runc-1.0.0-93.dev.gitb9b6cc6.fc30.x86_64
    path: /usr/bin/runc
    version: |-
      runc version 1.0.0-rc8+dev
      commit: e3b4c1108f7d1bf0d09ab612ea09927d9b59b4e3
      spec: 1.0.1-dev
  SwapFree: 644870144
  SwapTotal: 644870144
  arch: amd64
  cpus: 4
  hostname: fedora30.virt
  kernel: 5.0.9-301.fc30.x86_64
  os: linux
  rootless: true
  uptime: 19m 40.53s
registries:
  blocked: null
  insecure: null
  search:
  - docker.io
  - registry.fedoraproject.org
  - quay.io
  - registry.access.redhat.com
  - registry.centos.org
store:
  ConfigFile: /home/nert/.config/containers/storage.conf
  ContainerStore:
    number: 0
  GraphDriverName: overlay
  GraphOptions:
  - overlay.mount_program=/usr/bin/fuse-overlayfs
  GraphRoot: /home/nert/.local/share/containers/storage
  GraphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  ImageStore:
    number: 1
  RunRoot: /tmp/1000
  VolumePath: /home/nert/.local/share/containers/storage/volumes

Additional environment details (AWS, VirtualBox, physical, etc.):
I am trying this in a Fedora 30 VM, clean install, as that is the easiest and cleanest reproducer I can get. I cannot reproduce the issue related to my local environment in there.


nertpinx · 2019-06-07T14:42:52Z

@giuseppe This is the issue we were talking about. I hope it has all the relevant information; feel free to ask for anything else and I will gladly provide it.

mheon · 2019-06-07T15:15:02Z

For root containers, if you have 127.0.0.1 in resolv.conf, we remove it (and add default nameservers if there are none remaining). We can't expect to be able to connect to a DNS server running on the host's localhost address; it might not be listening on the bridge we created (this last bit isn't really relevant for rootless containers, so it could be safe there?).
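
The rule described here (drop loopback nameservers, then fall back to defaults if nothing is left) can be modelled roughly as below. This is only an illustrative sketch, not Podman's actual implementation: the function name `filter_nameservers` and the `DEFAULT_NAMESERVERS` constant are made up for this example, and the default addresses are simply the ones visible in the container resolv.conf quoted in this issue.

```python
import ipaddress

# Assumption: defaults copied from the container resolv.conf shown in this issue.
DEFAULT_NAMESERVERS = [
    "8.8.8.8",
    "8.8.4.4",
    "2001:4860:4860::8888",
    "2001:4860:4860::8844",
]


def filter_nameservers(resolv_conf_text):
    """Return the nameservers from resolv.conf text, minus loopback addresses.

    Falls back to DEFAULT_NAMESERVERS when no usable nameserver remains.
    """
    kept = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        # Only "nameserver <address>" lines matter; comments, "search" and
        # "options" lines are ignored in this sketch.
        if len(fields) < 2 or fields[0] != "nameserver":
            continue
        try:
            address = ipaddress.ip_address(fields[1])
        except ValueError:
            continue  # skip entries that are not plain IP addresses
        if not address.is_loopback:
            kept.append(fields[1])
    return kept or list(DEFAULT_NAMESERVERS)


# A systemd-resolved style host file: its only nameserver is a loopback address.
host_resolv_conf = "nameserver 127.0.0.53\noptions edns0\nsearch virt\n"
print(filter_nameservers(host_resolv_conf))
```

With a host file that lists only 127.0.0.53 or ::1, every entry is filtered out, so the container ends up with the public defaults. That is the situation reported above.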

nertpinx · 2019-06-10T07:52:53Z

@mheon Well, clearly slirp4netns, as a userspace process running in the default network namespace, should be able to connect to the same nameservers as other processes in that namespace, so I see no reason for having Google DNS servers in /etc/resolv.conf. I just do not know whether that is done by slirp4netns or podman.

rhatdan · 2019-06-10T10:17:46Z

That would most certainly be podman doing the resolv.conf.

nertpinx · 2019-06-10T11:57:57Z

> That would most certainly be podman doing the resolv.conf.

Great! So did I miss any reasoning behind providing default Google nameservers with slirp4netns instead of just using 10.0.2.3?

rhatdan · 2019-06-10T13:14:19Z

@giuseppe Any ideas?

mheon · 2019-06-10T13:51:13Z

@nertpinx So, to clarify - what does your host system's resolv.conf look like? 127.0.0.1, then 10.0.2.3? And we're dropping 127.0.0.1 in favor of the Google DNS servers, despite 10.0.2.3 being in resolv.conf already?

nertpinx · 2019-06-10T14:03:32Z

No, 10.0.2.3 is the DNS that slirp4netns provides on its emulated network stack, and that one works. On my machine resolv.conf has only ::1; on the clean Fedora install with systemd-resolved properly applied it has only 127.0.0.53.

mheon · 2019-06-10T14:06:17Z

Aha. Alright, I think we're probably seeing a bad interaction between our resolv.conf handling and the handling in slirp4netns, then.

giuseppe · 2019-06-11T06:34:40Z

I think the issue is that we block slirp4netns from accessing the loopback device for security reasons. If you check, you'll see that we are passing an explicit --disable-host-loopback to the slirp4netns process.

nertpinx · 2019-06-11T07:54:50Z

I would guess that it forbids accessing the host directly (through 10.0.2.2) from the child namespace, not from the slirp4netns process itself. Trying it out confirms this: if I leave 10.0.2.3 in resolv.conf, it works nicely (on the Fedora reproducer VM).

When using slirp4netns, be sure the built-in DNS server is the first one to be used. Closes: containers#3277 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>

giuseppe · 2019-06-12T08:32:16Z

We could probably just drop all the other DNS servers and use only 10.0.2.3, but let's be safe and keep the other DNS servers around. I've changed it so that 10.0.2.3 is now the first in the list:

#3305

@nertpinx, does it work if you place it as the first entry in the /etc/resolv.conf file?
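
As a rough illustration of the reordering described in this comment, the sketch below moves the slirp4netns built-in DNS to the front of the nameserver list and renders the result. It is not the code from #3305; the helper names `put_slirp_dns_first` and `render_resolv_conf` are hypothetical, and only the address 10.0.2.3 comes from this thread.

```python
SLIRP4NETNS_DNS = "10.0.2.3"  # built-in slirp4netns DNS address named in this thread


def put_slirp_dns_first(nameservers):
    """Return a new list with the slirp4netns DNS first and listed only once."""
    rest = [ns for ns in nameservers if ns != SLIRP4NETNS_DNS]
    return [SLIRP4NETNS_DNS] + rest


def render_resolv_conf(nameservers, search=None, options=None):
    """Render resolv.conf text from the given pieces (hypothetical helper)."""
    lines = []
    if search:
        lines.append("search " + " ".join(search))
    lines.extend("nameserver " + ns for ns in nameservers)
    if options:
        lines.append("options " + " ".join(options))
    return "\n".join(lines) + "\n"


# The order originally reported in this issue had 10.0.2.3 last.
reported = ["8.8.8.8", "8.8.4.4", "10.0.2.3"]
print(render_resolv_conf(put_slirp_dns_first(reported), search=["virt"], options=["edns0"]))
```

Putting 10.0.2.3 first only helps resolvers that try servers in listed order.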

nertpinx · 2019-06-14T10:53:00Z

@giuseppe Yes it does (at least the main issue), thank you!

intelfx · 2020-03-12T11:41:51Z

@giuseppe

> does it work if you place it as the first entry in the /etc/resolv.conf file?

This does not always work. The nameserver is chosen at random (or at least it is not defined how it is chosen), so chances are that a generic nameserver will be chosen instead of 10.0.2.3.

This has just hit me. Host resolv.conf:

$ cat /etc/resolv.conf
# This file is managed by man:systemd-resolved(8). Do not edit.
#
# This is a dynamic resolv.conf file for connecting local clients to the
# internal DNS stub resolver of systemd-resolved. This file lists all
# configured search domains.
#
# Run "resolvectl status" to see details about the uplink DNS servers
# currently in use.
#
# Third party programs must not access this file directly, but only through the
# symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a different way,
# replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

nameserver 127.0.0.53
options edns0
search bu1.loc

Container resolv.conf:

/ # cat /etc/resolv.conf
search bu1.loc
nameserver 10.0.2.3
nameserver 8.8.8.8
nameserver 8.8.4.4
nameserver 2001:4860:4860::8888
nameserver 2001:4860:4860::8844
options edns0

The host is connected to a corporate VPN which provides a custom nameserver that resolves several internal-only domains. The "try all nameservers" semantics are enforced by systemd-resolved on the host side. However, connections to internal domains from the container can (and do) arbitrarily fail. Manually changing the container's resolv.conf to include only 10.0.2.3 resolves this issue.

> we could probably just drop all the other DNS servers and use only 10.0.2.3

As far as I can see, this would be the fully correct behavior. Please reconsider.

mheon · 2020-03-12T13:25:07Z

Chosen at random? What? That does not sound correct. Per the resolv.conf manpage:

> If there are multiple servers, the resolver library queries them in the order listed.

Queried in order should be a safe assumption. How are you making DNS queries / what libc are you using?

mheon · 2020-03-12T13:25:59Z

Ah, I see. systemd-resolved has apparently decided that the rules for everyone else don't apply to it, and is doing its own thing. So that's fun.

intelfx · 2020-03-12T16:27:49Z

@mheon

> How are you making DNS queries / what libc are you using?

curl in an alpine container (so, musl).

> Ah, I see. systemd-resolved has apparently decided that the rules for everyone else don't apply to it, and is doing its own thing.

No, systemd-resolved is on the host, not in the container, and it's actually doing the right thing here (as tempting as it is to blame systemd for everything).
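
The disagreement above comes down to resolver behaviour inside the container. glibc tries the listed servers in order, as the man page says, whereas musl (per its documented differences from glibc) queries all listed nameservers in parallel and accepts whichever response arrives first. The toy model below contrasts the two strategies. It performs no real DNS traffic, and the latencies, the hostname `intranet.bu1.loc`, and the address 192.0.2.10 are invented for illustration.

```python
def resolve_in_order(servers, name):
    """glibc-style: ask servers in listed order, moving on only if one is unreachable."""
    for server in servers:
        if server["reachable"]:
            # Any answer is final, including "no such name" (modelled as None).
            return server["records"].get(name)
    return None


def resolve_first_reply(servers, name):
    """musl-style: ask every server at once; the fastest reply wins."""
    reachable = [s for s in servers if s["reachable"]]
    if not reachable:
        return None
    fastest = min(reachable, key=lambda s: s["latency_ms"])
    return fastest["records"].get(name)


# Hypothetical view from inside the container, in resolv.conf order.
SERVERS = [
    {   # slirp4netns DNS: forwards to the host resolver, which knows the VPN names
        "address": "10.0.2.3",
        "reachable": True,
        "latency_ms": 20,
        "records": {"intranet.bu1.loc": "192.0.2.10"},
    },
    {   # public resolver: answers quickly but has never heard of internal names
        "address": "8.8.8.8",
        "reachable": True,
        "latency_ms": 5,
        "records": {},
    },
]

print(resolve_in_order(SERVERS, "intranet.bu1.loc"))
print(resolve_first_reply(SERVERS, "intranet.bu1.loc"))
```

In this model the in-order strategy returns the internal address, while the first-reply strategy gets a negative answer whenever the public server responds faster. That is consistent with the intermittent failures reported from the alpine container.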

rhatdan · 2020-03-12T20:32:45Z

If the resolv.conf is properly formatted, there is little podman can do to fix that.
Is the issue that the processes inside of the container cannot reach the nameserver?

intelfx · 2020-03-14T18:42:37Z

@rhatdan

> If the resolv.conf is properly formatted, there is little podman can do to fix that.

I'm not quite sure what you mean by that, but earlier in this thread @giuseppe did something to reorder nameservers in the generated resolv.conf, which means that my suggestion is also possible.

> Is the issue that the processes inside of the container can not reach the nameserver?

No, the issue is that the processes inside of the container reach the wrong nameserver — instead of contacting systemd-resolved on the host they try to contact upstream nameservers directly.

giuseppe · 2020-03-16T08:11:16Z

If you'd like to have only 10.0.2.3, you can force it with --dns 10.0.2.3.
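
For anyone scripting this workaround, a small sketch of assembling the command line is shown below. The helper `podman_run_with_dns` is hypothetical; the only Podman feature it relies on is the `--dns` option of `podman run` mentioned in this comment. It prints the command rather than running it.

```python
import shlex


def podman_run_with_dns(image, command, dns_servers):
    """Build a `podman run` argument list with one --dns flag per server."""
    argv = ["podman", "run", "--rm"]
    for server in dns_servers:
        argv += ["--dns", server]
    argv.append(image)
    argv += command
    return argv


argv = podman_run_with_dns("alpine", ["cat", "/etc/resolv.conf"], ["10.0.2.3"])
print(shlex.join(argv))
# prints: podman run --rm --dns 10.0.2.3 alpine cat /etc/resolv.conf
```

Passing the resulting list to `subprocess.run(argv)` on a host with Podman installed should show a container resolv.conf that lists only the requested nameserver.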

bulhoes · 2020-11-06T09:10:14Z

I had the same issue on my setup, but I was able to overcome it with a simple fix.
The issue might be related to the acl on the named server.
Can you please let me know if you have the containers network allowed on the named acl?
That might be the issue.

fbezdeka · 2020-12-04T16:08:28Z

A /etc/resolv.conf generated by systemd-resolved looks like this:

nameserver 127.0.0.53
options edns0
search some.dom

As a result, podman seems to remove the nameserver line and add the "upstream" DNS servers directly.
Bypassing systemd-resolved on the host may work in some scenarios, but it breaks others.
Consider a corporate VPN connection, where "upstream" is not well defined. (We have at least two "upstreams" when connected.)

Bypassing the host's systemd-resolved has at least the following problems:

- Upstream/Internet DNS servers do not know about corporate specific DNS entries
- Corporate DNS servers may deliver different addresses than public DNS servers
- Corporate DNS servers may not deliver results for public stuff
- Corporate DNS servers may change, so unable to use fixed IP addresses

So whichever DNS server I choose by setting --dns=<ip>, I will never get the same results as talking to the systemd-resolved running on the host.
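
To make the point concrete: systemd-resolved can route queries to different upstream servers depending on the domain being looked up, which a single fixed --dns address cannot reproduce. The following toy model of per-domain routing is a sketch under that assumption only. The domains and addresses are invented (documentation ranges), and `pick_upstream` is a hypothetical helper, not part of systemd-resolved or Podman.

```python
def pick_upstream(name, routes, default):
    """Return the upstream whose routing domain is the longest suffix of name."""
    name = name.rstrip(".").lower()
    best_domain = None
    for domain in routes:
        if name == domain or name.endswith("." + domain):
            if best_domain is None or len(domain) > len(best_domain):
                best_domain = domain
    return routes[best_domain] if best_domain is not None else default


# Hypothetical setup with two VPN "upstreams" plus a public default.
ROUTES = {
    "corp.example": "192.0.2.53",
    "lab.corp.example": "198.51.100.53",
}
DEFAULT_UPSTREAM = "203.0.113.53"

for host in ("git.corp.example", "ci.lab.corp.example", "www.example.org"):
    print(host, "->", pick_upstream(host, ROUTES, DEFAULT_UPSTREAM))
```

Each of the three names lands on a different server here, so no single address passed to --dns would answer all of them the way the host resolver does.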

How to fix that?
Well, I guess it's not possible at all. podman would have to replace nameserver 127.0.0.53 with something that is forwarded or hosted on the host. But systemd-resolved listens on the loopback interface only and (AFAIK) does not allow that to be changed or configured.

[Edit]
The combination of systemd-resolved and podman is the default for Fedora users.
So privileged containers are quite unusable when corporate VPNs are in the game.

rhatdan · 2020-12-07T21:20:04Z

Docker has the same issue.

Although I think some new features have been added to allow users to share the host's localhost network, at least in rootless mode.

@mheon @giuseppe WDYT?

mheon · 2020-12-07T21:50:10Z

We do have a (limited) ability to do our own DNS via dnsname; maybe using that as a forwarder to the systemd-resolved server would be sufficient?

rhatdan · 2020-12-07T22:20:24Z

That would work, @baude WDYT

SISheogorath · 2021-02-13T11:00:50Z

I would love to see a DNS proxy, since my upstream DNS servers are DoT-only. Currently I pass --dns 1.1.1.1 to solve the problem, but it's not correct to assume that systemd-resolved's configured name servers are Do53 servers.

lfarkas · 2021-02-24T09:25:28Z

On Fedora (the main developer platform for podman people) and also on RHEL 8, systemd-resolved is the default. So a properly configured systemd-resolved on the host can't be used as a name server for containers run by root!? And the only solution is to use --net host!

why did you close this issue???

patrickbkr · 2021-06-01T09:17:52Z

@rhatdan Can this issue be reopened?

From what I read above podman is still unreliable when there are non-trivial dns setups on the host.

rhatdan · 2021-06-01T19:06:32Z

Please open a new issue, describing your exact issues.

brianjmurrell · 2022-12-07T20:41:42Z

So what is the new (still open, since I still see this problem) issue that covers a localhost (127.0.0.1) DNS resolver and rootful containers?

mheon · 2022-12-08T15:30:46Z

There is none, to the best of my knowledge, though I suspect that will be supported from Podman 4.4 onwards via the DNS changes in Aardvark.

jiridanek · 2023-01-11T15:20:49Z

> Please open a new issue, describing your exact issues.

@patrickbkr, @brianjmurrell reported as #17075

openshift-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Jun 7, 2019

mheon added the rootless label Jun 11, 2019

giuseppe mentioned this issue Jun 12, 2019

rootless: use the slirp4netns builtin DNS first #3305

Merged

openshift-merge-robot closed this as completed in #3305 Jun 12, 2019

matpen mentioned this issue Jun 19, 2019

Rootless buildah + slirp4netns = no network containers/buildah#1660

Closed

langdon mentioned this issue Nov 13, 2019

configuration option for dns server to use #4508

Closed

bobhenkel mentioned this issue Sep 13, 2020

podman run --rm -it alpine sh (apk update or any egress doesn't work) #7613

Closed

skateman mentioned this issue Mar 2, 2021

chore(container): Correct webpack binding in container RedHatInsights/compliance-frontend#1023

Merged

14 tasks

jiridanek mentioned this issue Jan 11, 2023

Rootful containers when using corporate VPN on systemd-resolved systems have broken DNS #17075

Closed

github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 4, 2023

github-actions bot locked as resolved and limited conversation to collaborators Sep 4, 2023
