CNI DNS config ignored for Docker task #11102

Closed
radriaanse opened this issue Aug 30, 2021 · 6 comments · Fixed by #20007
Labels
stage/accepted Confirmed, and intend to work on. No timeline committment though. theme/cni theme/networking type/bug

@radriaanse
radriaanse commented Aug 30, 2021

Nomad version

Nomad v1.1.2 (60638a0)

Operating system and Environment details

CentOS Stream release 8
Docker version 20.10.7, build f0df350

Issue

When name servers are set in a CNI network configuration, for example with the bridge plugin, Nomad does not appear to apply them when starting a Docker container.
At first glance the upstream bridge plugin doesn't seem to support setting DNS this way (it looks like this should be done via an ipam plugin, which isn't implemented), but it does work, as can be seen in the debug log that Nomad produces on receiving the CNI result.

I've marked this as a bug because, looking at the source, Nomad does parse this information, but it then apparently gets lost somewhere in the process.

Reproduction steps

Setup Nomad client CNI:

client {
  enabled = true
  cni_path = "/usr/libexec/cni"
  cni_config_dir = "/etc/cni/net.d"
}

And configure a CNI network:

{
	"cniVersion": "0.4.0",
	"name": "testnet",
	"plugins": [
		{
			"type": "bridge",
			"bridge": "testnet",
			"ipMasq": true,
			"isDefaultGateway": true,
			"forceAddress": true,
			"dns": {
				"nameservers": [
					"1.1.1.1"
				]
			},
			"ipam": {
				"type": "host-local",
				"ranges": [
					[
						{
							"subnet": "172.16.3.0/24",
							"rangeStart": "172.16.3.10",
							"rangeEnd": "172.16.3.250",
							"gateway": "172.16.3.1"
						}
					]
				]
			}
		},
		{
			"type": "firewall",
			"backend": "firewalld"
		},
		{
			"type": "portmap",
			"capabilities": {
				"portMappings": true
			},
			"snat": true
		}
	]
}

Expected Result

The name servers defined in the CNI conflist are added to the container's resolv.conf.

Actual Result

Docker adds its default/fallback name servers to the resolv.conf instead:

nomad exec 8b168412-5337-ccd7-2558-fd8cbe92d1b9 cat /etc/resolv.conf

nameserver 8.8.8.8
nameserver 8.8.4.4

Job file (if appropriate)

job "testnameserver" {
  datacenters = ["dc1"]

  group "test" {
    network {
      mode = "cni/testnet"
    }

    task "test" {
      driver = "docker"

      config {
        image = "docker.io/library/busybox"
        command = "sleep"
        args = ["infinity"]
      }
    }
  }
}

Nomad Server logs (if appropriate)

[DEBUG] client.fingerprint_mgr: detected CNI network: name=testnet
[DEBUG] client.alloc_runner.runner_hook: received result from CNI: alloc_id=8b168412-5337-ccd7-2558-fd8cbe92d1b9 result={"Interfaces":{"eth0":{"IPConfigs":[{"IP":"172.16.3.12","Gateway":"172.16.3.1"}],"Mac":"32:c5:28:82:c9:2a","Sandbox":"/var/run/docker/netns/d55bd9024671"},"testnet":{"IPConfigs":null,"Mac":"46:e0:9a:96:12:53","Sandbox":""},"vethc7fa5953":{"IPConfigs":null,"Mac":"aa:2e:eb:38:be:8b","Sandbox":""}},"DNS":[{"nameservers":["1.1.1.1"]}],"Routes":[{"dst":"0.0.0.0/0","gw":"172.16.3.1"}]}
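
The `DNS` field in the result logged above is what the task's resolv.conf would be expected to reflect. A minimal sketch of that expectation, using a trimmed copy of the logged result. The function name `expected_resolv_conf` is made up for illustration and is not part of Nomad:

```python
import json

# CNI result copied from the DEBUG log line above, trimmed to the fields used here.
cni_result = json.loads("""
{
  "Interfaces": {
    "eth0": {
      "IPConfigs": [{"IP": "172.16.3.12", "Gateway": "172.16.3.1"}],
      "Mac": "32:c5:28:82:c9:2a",
      "Sandbox": "/var/run/docker/netns/d55bd9024671"
    }
  },
  "DNS": [{"nameservers": ["1.1.1.1"]}],
  "Routes": [{"dst": "0.0.0.0/0", "gw": "172.16.3.1"}]
}
""")


def expected_resolv_conf(result):
    """Render the resolv.conf lines implied by the DNS entries of a CNI result."""
    lines = []
    for dns in result.get("DNS") or []:
        for server in dns.get("nameservers") or []:
            lines.append(f"nameserver {server}")
    # Return an empty string when the result carries no name servers.
    return "\n".join(lines) + "\n" if lines else ""


print(expected_resolv_conf(cni_result), end="")
```

Comparing this output with the `cat /etc/resolv.conf` output under "Actual Result" makes the mismatch explicit: the result carries 1.1.1.1, the container shows Docker's fallback servers.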

@jrasell
Member

jrasell commented Aug 31, 2021

Hi @radriaanse, and thanks for raising this issue. I have reproduced this locally and spent a fair amount of time investigating, but so far I have been unable to find a solution or the exact cause. It does, however, initially seem to be a problem with the CNI plugins repository rather than Nomad. Issue #431 seems to describe a similar problem, but there have been no responses from containernetworking members.

To test outside of Nomad and check whether we are manipulating state we shouldn't be, I used cnitool to attach a network to a network namespace. I first created a network namespace using ip netns add testnamespace. I then wrote the following example CNI config to disk, which was used for all the cnitool commands:

{
	"cniVersion": "0.4.0",
	"name": "testnet",
	"dns": {
		"nameservers": [
			"1.1.1.1"
		]
	},
	"type": "bridge",
	"bridge": "testnet",
	"ipMasq": true,
	"isDefaultGateway": true,
	"forceAddress": true,
	"ipam": {
		"type": "host-local",
		"ranges": [
			[{
				"subnet": "172.16.3.0/24",
				"rangeStart": "172.16.3.10",
				"rangeEnd": "172.16.3.250",
				"gateway": "172.16.3.1"
			}]
		]
	}
}

The cnitool add command, CNI_PATH=/opt/cni/bin/ cnitool add testnet /var/run/netns/testnamespace, completed successfully with the following detailed output, which includes the DNS configuration. Nomad logs this object without any manipulation, which explains why the log line suggests success.

{
    "cniVersion": "0.4.0",
    "interfaces": [
        {
            "name": "testnet",
            "mac": "de:b4:cf:9a:0d:e7"
        },
        {
            "name": "vethc75b9344",
            "mac": "1a:ee:4f:81:db:60"
        },
        {
            "name": "eth0",
            "mac": "86:d4:d1:aa:f5:8d",
            "sandbox": "/var/run/netns/testing"
        }
    ],
    "ips": [
        {
            "version": "4",
            "interface": 2,
            "address": "172.16.3.22/24",
            "gateway": "172.16.3.1"
        }
    ],
    "routes": [
        {
            "dst": "0.0.0.0/0",
            "gw": "172.16.3.1"
        }
    ],
    "dns": {
        "nameservers": [
            "1.1.1.1"
        ]
    }
}

When I exec into the network namespace to check resolv.conf, the file does not contain what we supplied and is instead a copy of the host machine's /etc/resolv.conf:

$ ip netns exec testnamespace cat /etc/resolv.conf
# This file is managed by man:systemd-resolved(8). Do not edit.
#
# This is a dynamic resolv.conf file for connecting local clients to the
# internal DNS stub resolver of systemd-resolved. This file lists all
# configured search domains.
#
# Run "systemd-resolve --status" to see details about the uplink DNS servers
# currently in use.
#
# Third party programs must not access this file directly, but only through the
# symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a different way,
# replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

nameserver 127.0.0.53
options edns0
search Home
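
One possible explanation for seeing the host file here: a network namespace does not isolate the filesystem, and `ip netns exec <name>` only swaps in files it finds under /etc/netns/<name>/ (per ip-netns(8)). A minimal sketch of writing such a per-namespace file. The helper name `write_netns_resolv_conf` and the `etc_root` parameter are hypothetical and exist only so the example can run without root:

```python
import os
import tempfile


def write_netns_resolv_conf(netns_name, nameservers, etc_root="/etc"):
    """Write <etc_root>/netns/<netns_name>/resolv.conf and return its path.

    `ip netns exec <name>` bind-mounts files found under /etc/netns/<name>/
    over their counterparts in /etc, so this file would replace the host copy
    for commands run that way.
    """
    netns_dir = os.path.join(etc_root, "netns", netns_name)
    os.makedirs(netns_dir, exist_ok=True)
    path = os.path.join(netns_dir, "resolv.conf")
    with open(path, "w") as f:
        for server in nameservers:
            f.write(f"nameserver {server}\n")
    return path


# Demonstrated against a temporary directory; writing to the real /etc needs root.
demo_root = tempfile.mkdtemp()
demo_path = write_netns_resolv_conf("testnamespace", ["1.1.1.1"], etc_root=demo_root)
with open(demo_path) as f:
    demo_contents = f.read()
print(demo_contents, end="")
```

This only affects processes started through `ip netns exec`. A container runtime mounts its own /etc/resolv.conf, so it would still need to be given the DNS values by whatever invoked the CNI plugin.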

I also tested by writing a temporary custom resolv.conf file to disk and referencing it within the ipam configuration block, as shown below. This didn't produce any different or better results.

{
	"cniVersion": "0.4.0",
	"name": "testnet",
	"type": "bridge",
	"bridge": "testnet",
	"ipMasq": true,
	"isDefaultGateway": true,
	"forceAddress": true,
	"ipam": {
		"type": "host-local",
		"resolvConf": "/tmp/cni_resolv.conf",
		"ranges": [
			[{
				"subnet": "172.16.3.0/24",
				"rangeStart": "172.16.3.10",
				"rangeEnd": "172.16.3.250",
				"gateway": "172.16.3.1"
			}]
		]
	}
}
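
For context on the `resolvConf` option used above: as I understand the host-local documentation, the file is parsed on the host and returned as the DNS portion of the plugin result, rather than being written into the namespace. A rough sketch of that kind of parsing, not the plugin's actual code. The function name `parse_resolv_conf` and the sample file contents are assumptions for illustration:

```python
def parse_resolv_conf(text):
    """Parse resolv.conf text into a CNI-style DNS dictionary."""
    dns = {"nameservers": [], "search": [], "options": []}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith(("#", ";")):
            continue
        fields = line.split()
        keyword, values = fields[0], fields[1:]
        if keyword == "nameserver" and values:
            dns["nameservers"].append(values[0])
        elif keyword == "domain" and values:
            dns["domain"] = values[0]
        elif keyword == "search":
            dns["search"].extend(values)
        elif keyword == "options":
            dns["options"].extend(values)
    return dns


# Hypothetical file contents; the thread does not show /tmp/cni_resolv.conf.
sample = """\
# example resolv.conf
nameserver 1.1.1.1
search example.internal
options edns0
"""
parsed = parse_resolv_conf(sample)
print(parsed)
```

If that reading is right, the option changes what the plugin reports but still relies on the caller to apply it, which would be consistent with seeing no difference in the namespace.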

Apologies that I am unable to provide a workaround or propose a solution at this time. I will keep the issue open; if we have time to continue the investigation, we will do so and respond with any updates.

@radriaanse
Author

Thanks @jrasell for looking into it; I didn't think about using something like cnitool to verify the behavior outside of Nomad.
It looks like there is indeed some oddness going on with the bridge plugin, and I mistakenly assumed it was Nomad since the output of the plugin looked just fine.

I'll also try to dig into this further and update here!

@SPROgster

SPROgster commented Nov 15, 2021

Good day, colleagues.

I have the same issue. I did some debugging and may have found the point where the DNS config goes missing.
At this point the DNS config is stored at ar.state.NetworkStatus.DNS (allocRunner), but I can't find any non-nil reference to the correct network configuration in the task struct.

In the end, I can't confirm that the ResolvConfPath file is created correctly via CNI.

It seems Nomad should create the resolv.conf file itself, as it does for the network->dns stanza, because I failed to find any actual DNS configuration in the CNI plugins. I also failed to find any DNS configuration mechanism in the CNI docs.

[1] https://unix.stackexchange.com/questions/443898/separate-dns-configuration-in-each-network-namespace

@sinisterstumble

Can confirm similar behavior with ipvlan and the host-local resolvConf option.

{
  "cniVersion": "0.4.0",
  "name": "vpc",
  "plugins": [
    {
      "type": "ipvlan",
      "master": "eth1",
      "mode": "l3s",
      "ipam": {
        "type": "host-local",
        "resolvConf": "/opt/cni/run/vpc-resolv.conf",
        "dataDir": "/var/run/cni",
        "ranges": [
          [
            {
              "subnet": "172.16.6.96/28"
            }
          ],
          [
            {
              "subnet": "2a05:d014:d9e:c300:4f2:0:0:0/80"
            }
          ]
        ],
        "routes": [
          {
            "dst": "::/0"
          },
          {
            "dst": "0.0.0.0/0"
          }
        ]
      }
    },
    {
      "type": "portmap",
      "capabilities": {
        "portMappings": true
      },
      "snat": true
    },
    {
      "type": "firewall",
      "backend": "iptables"
    }
  ]
}

nameserver 172.16.6.97
nameserver 2a05:d014:d9e:c300:4f2::1

$ nomad alloc exec -i -t -task redis 077c1c44 /bin/bash
root@8e87a7b70408:/data# cat /etc/resolv.conf 

nameserver 8.8.8.8
nameserver 8.8.4.4

jrasell self-assigned this Jan 7, 2022

@tgross
Member

tgross commented Feb 16, 2024

While working on #10628 I bumped into this. Something like #16624 might be the fix, but I'll need to get it sorted out sooner rather than later in any case.

tgross assigned tgross and unassigned jrasell Feb 16, 2024
tgross added stage/accepted Confirmed, and intend to work on. No timeline committment though. and removed stage/needs-investigation labels Feb 16, 2024
tgross added this to the 1.7.x milestone Feb 16, 2024
tgross added a commit that referenced this issue Feb 16, 2024
CNI plugins may set DNS configuration, but this isn't threaded through to the
task configuration so that we can write it to the `/etc/resolv.conf` file as
needed. Add the `AllocNetworkStatus` to the alloc hook resources so they're
accessible from the taskrunner, which will prepend the DNS entries to any
entries provided by the user.

Fixes: #11102
tgross added a commit that referenced this issue Feb 20, 2024
CNI plugins may set DNS configuration, but this isn't threaded through to the
task configuration so that we can write it to the `/etc/resolv.conf` file as
needed. Add the `AllocNetworkStatus` to the alloc hook resources so they're
accessible from the taskrunner. Any DNS entries provided by the user will
override these values.

Fixes: #11102
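
The precedence described in this commit message (CNI-provided DNS is used, but user-provided entries override it) could be sketched as follows. This is an illustration only, not Nomad's implementation: the `DNSConfig` type and `effective_dns` function are hypothetical, and the field-by-field override is an assumption, since the message does not say whether the override is per field or for the whole block.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DNSConfig:
    """Hypothetical stand-in for a DNS configuration (not Nomad's actual type)."""
    servers: List[str] = field(default_factory=list)
    searches: List[str] = field(default_factory=list)
    options: List[str] = field(default_factory=list)


def effective_dns(cni: Optional[DNSConfig], user: Optional[DNSConfig]) -> Optional[DNSConfig]:
    """Return the DNS config that would be written to /etc/resolv.conf.

    Assumption: user-provided values win field by field, and CNI values fill
    only the fields the user left empty.
    """
    if user is None:
        return cni
    if cni is None:
        return user
    return DNSConfig(
        servers=user.servers or cni.servers,
        searches=user.searches or cni.searches,
        options=user.options or cni.options,
    )


cni_dns = DNSConfig(servers=["1.1.1.1"])
# No user DNS block: the CNI value is used.
print(effective_dns(cni_dns, None).servers)
# User sets servers: the user value overrides the CNI value.
print(effective_dns(cni_dns, DNSConfig(servers=["10.0.0.2"])).servers)
```

With the conflist from the original report and no dns block in the job, this would yield 1.1.1.1, which is the result the reporter expected.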
tgross added a commit that referenced this issue Feb 20, 2024
…o release/1.5.x (#20012)

CNI plugins may set DNS configuration, but this isn't threaded through to the
task configuration so that we can write it to the `/etc/resolv.conf` file as
needed. Add the `AllocNetworkStatus` to the alloc hook resources so they're
accessible from the taskrunner. Any DNS entries provided by the user will
override these values.

Fixes: #11102

Co-authored-by: Tim Gross <tgross@hashicorp.com>
tgross added a commit that referenced this issue Feb 20, 2024
…o release/1.6.x (#20013)

CNI plugins may set DNS configuration, but this isn't threaded through to the
task configuration so that we can write it to the `/etc/resolv.conf` file as
needed. Add the `AllocNetworkStatus` to the alloc hook resources so they're
accessible from the taskrunner. Any DNS entries provided by the user will
override these values.

Fixes: #11102

Co-authored-by: Tim Gross <tgross@hashicorp.com>

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked as resolved and limited conversation to collaborators Dec 31, 2024