No network after installing systemd-genie until restart #130

Closed
fliespl opened this issue Mar 12, 2021 · 42 comments
Labels
help wanted Extra attention is needed

Comments

@fliespl

fliespl commented Mar 12, 2021

Using Ubuntu 20.04 on WSL2.

After the initial install, my networking goes down. I need to run wsl --shutdown, and then it works correctly. Any idea what I can do to fix this while deploying systemd-genie?

@fliespl
Author

fliespl commented Mar 12, 2021

Okay, I figured out that the problem is systemd-resolved.

Now I'm curious whether I should mask it, as I did with other services, or whether there is a better, more proper way.

@cerebrate
Member

Hm. What's systemd-resolved doing to break it, do you know? (Is your /etc/resolv.conf by chance going missing?)

@fliespl
Author

fliespl commented Mar 13, 2021

Once systemd starts after running genie -s, /etc/resolv.conf is replaced by systemd-resolved with broken DNS info:
nameserver 127.0.0.53

@cerebrate
Member

Ah, right. Yeah, that's normal for systemd-resolved; it's designed to take over DNS resolution from local clients, hence the rewriting of /etc/resolv.conf.

You can just disable it, which should restore the default without-systemd behavior. Myself, though, I prefer the systemd-resolved resolver over the default one, so I prefer to resolve this by configuring it appropriately for my network. You can get fuller details on how this is supposed to work with man 8 systemd-resolved and man 5 resolved.conf, but the short version is that you'll need to put the DNS resolvers you want in /etc/systemd/resolved.conf.

This is what mine looks like after doing this:

#  This file is part of systemd.
#
#  systemd is free software; you can redistribute it and/or modify it
#  under the terms of the GNU Lesser General Public License as published by
#  the Free Software Foundation; either version 2.1 of the License, or
#  (at your option) any later version.
#
# Entries in this file show the compile time defaults.
# You can change settings by editing this file.
# Defaults can be restored by simply deleting this file.
#
# See resolved.conf(5) for details

[Resolve]
DNS=172.16.0.128 172.16.0.130
#FallbackDNS=
Domains=arkane-systems.lan
LLMNR=yes
MulticastDNS=yes
DNSSEC=allow-downgrade
DNSOverTLS=opportunistic
#Cache=yes
DNSStubListener=yes
#ReadEtcHosts=yes

But the important bits you have to set are the DNS= and Domains= lines.
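
As a rough sketch of the step above, the same two settings could also be written to a drop-in file instead of editing resolved.conf itself. The function name, the override.conf file name, and the default directory are assumptions for illustration; resolved.conf(5) describes the resolved.conf.d drop-in directories.

```shell
#!/bin/sh
# Sketch only: write a systemd-resolved drop-in that sets DNS= and Domains=.
#   $1: drop-in directory (assumed default: /etc/systemd/resolved.conf.d)
#   $2: space-separated DNS servers
#   $3: space-separated search domains (may be empty)
write_resolved_dropin() {
    dir="${1:-/etc/systemd/resolved.conf.d}"
    dns="$2"
    domains="$3"
    mkdir -p "$dir" || return 1
    {
        echo "[Resolve]"
        echo "DNS=$dns"
        # Only emit Domains= when a search domain was given.
        if [ -n "$domains" ]; then
            echo "Domains=$domains"
        fi
    } > "$dir/override.conf"
}

# Usage (as root), with the values from the example above:
#   write_resolved_dropin /etc/systemd/resolved.conf.d \
#       "172.16.0.128 172.16.0.130" "arkane-systems.lan"
#   systemctl restart systemd-resolved
```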

@fliespl
Author

fliespl commented Mar 13, 2021

Yeah, I know about that. What I meant was: how should it work under WSL2?

The problem is that it's unreliable at start, because WSL generates its own resolv.conf by default.

So:

  • on a clean WSL2 instance (Ubuntu), resolv.conf is generated by Windows, since generateResolvConf = true is the default,
  • after installing systemd-genie, systemd-resolved takes its place, and doesn't work at start until I make changes.

I am just not sure whether it should take over if generateResolvConf is true.
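
For reference, a small sketch of how one might check which of the two modes a given instance is in. It assumes the key lives under [network] in /etc/wsl.conf and that a missing file or key means true, as described above; the function name is made up for illustration, and the case-insensitive matching is a convenience of this sketch rather than documented WSL behavior.

```shell
#!/bin/sh
# Sketch only: print "true" or "false" depending on whether wsl.conf leaves
# resolv.conf generation switched on. A missing file or key counts as "true".
#   $1: path to wsl.conf (assumed default: /etc/wsl.conf)
wsl_generates_resolv_conf() {
    conf="${1:-/etc/wsl.conf}"
    if [ ! -r "$conf" ]; then
        echo true
        return 0
    fi
    awk -F= '
        /^[ \t]*\[/ {                      # section header such as [network]
            section = tolower($0)
            gsub(/[ \t\r]/, "", section)
            next
        }
        section == "[network]" {
            key = tolower($1)
            gsub(/[ \t]/, "", key)
            if (key == "generateresolvconf") {
                val = tolower($2)
                sub(/#.*/, "", val)        # drop a trailing comment, if any
                gsub(/[ \t\r]/, "", val)
                result = val
            }
        }
        END {
            if (result == "false") print "false"
            else print "true"
        }
    ' "$conf"
}

# Usage:
#   wsl_generates_resolv_conf            # checks /etc/wsl.conf
#   wsl_generates_resolv_conf ./wsl.conf
```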

@cerebrate
Member

Ah, right, gotcha.

For myself, I just turn off generateResolvConf and generateHosts, since I essentially never use WSL without genie/systemd, so it doesn't matter to me; a systemd-resolved instance is always running. I can see why you might want to disable systemd-resolved and stick with the autogenerated resolv.conf if you do sometimes use WSL in the raw, though.

...I should probably put some "things you might want to consider" on the wiki one of these days, but time continues to be unfortunately scarce.

@fliespl
Author

fliespl commented Mar 13, 2021

Sorry to add to your plate :)

TBH I don't use generateResolvConf - I disable it and just create my own resolv.conf file.

I wanted to create a proper playbook for both Debian and Ubuntu. I will just make separate rules for Debian and Ubuntu then: on Debian, a custom resolv.conf file, and on Ubuntu, the suggested changes to systemd-resolved.

Thanks! Appreciate all your work here :)

@cerebrate
Member

That's how it should work, though. resolv.conf contains that to point DNS lookups at systemd-resolved (listening on 127.0.0.53), and then systemd-resolved does the lookup based on the nameservers you specify in resolved.conf.

My resolv.conf looks just the same.

@fliespl
Author

fliespl commented Mar 15, 2021

Thank you for all your help @cerebrate. Really appreciate it!

@fliespl
Author

fliespl commented Apr 28, 2021

The strange thing is that the newest build + systemd-genie made DNS resolution fail when:
generateResolvConf = false

and systemd-resolved is set up.

flies@flies-pc-wsl:/mnt/c/Users/flies$ ping google.com
ping: google.com: Name or service not known

/etc/systemd/resolved.conf
[Resolve]
DNS=8.8.8.8

I have to generate:
/etc/resolv.conf with

nameserver 127.0.0.53

on each boot.

@PavelSosin-320

PavelSosin-320 commented Apr 28, 2021

@fliespl @cerebrate This is expected. You should configure DNS in another file, /etc/systemd/resolved.conf; see the systemd-resolved service documentation (https://www.freedesktop.org/software/systemd/man/systemd-resolved.service.html). Put your DNS address into the DNS= property following the instructions and it will work well. You can test your configuration using resolvectl query. Use your DNS proxy, DNS box, or the service running on the WiFi router as the primary DNS, and the best available DynDNS cloud DNS service IP as the second.

@fliespl
Author

fliespl commented Apr 28, 2021

@PavelSosin-320 my configuration is correct. I think you missed the point or I didn't understand you properly.

/etc/systemd/resolved.conf
[Resolve]
DNS=8.8.8.8

$ resolvectl query wp.pl
wp.pl: 212.77.98.9 -- link: eth0

$ ping wp.pl
ping: wp.pl: Name or service not known

It was working this way, with generateResolvConf = false, up until the newest WSL + genie update.

But it started failing after the updates.

After I add the following to /etc/resolv.conf, it starts resolving correctly:

nameserver 127.0.0.53

$ ping wp.pl
PING wp.pl (212.77.98.9) 56(84) bytes of data.
64 bytes from www.wp.pl (212.77.98.9): icmp_seq=1 ttl=53 time=10.0 ms
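
The mismatch shown above (resolvectl succeeds, ping fails) can be reproduced without ping. A minimal sketch, assuming nsswitch.conf uses the plain dns module so that libc lookups go through /etc/resolv.conf; the function name and the hint text are illustrative only.

```shell
#!/bin/sh
# Sketch only: compare the two lookup paths for one hostname.
#   resolvectl query  asks systemd-resolved directly.
#   getent ahosts     goes through the libc resolver (getaddrinfo), which is
#                     the path ping takes.
compare_lookup_paths() {
    host="$1"
    if resolvectl query "$host" >/dev/null 2>&1; then
        resolved=ok
    else
        resolved=fail
    fi
    if getent ahosts "$host" >/dev/null 2>&1; then
        libc=ok
    else
        libc=fail
    fi
    echo "systemd-resolved: $resolved, libc: $libc"
    # resolved answers but libc does not: suspect /etc/resolv.conf.
    if [ "$resolved" = ok ] && [ "$libc" = fail ]; then
        echo "hint: check that /etc/resolv.conf exists and names 127.0.0.53"
    fi
}

# Usage:
#   compare_lookup_paths wp.pl
```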

@PavelSosin-320

P.S. Don't forget to enable and start the systemd-networkd service to ensure DNS reachability. Check it using networkctl:
eth0 ether routable unmanaged

@PavelSosin-320

@fliespl I see wp.pl via Quad9 DNS perfectly:
ping wp.pl
PING wp.pl (212.77.98.9) 56(84) bytes of data.
64 bytes from www.wp.pl (212.77.98.9): icmp_seq=1 ttl=53 time=85.2 ms

I use a DynDNS service running on my WiFi router, backed by the 9.9.9.9 DNS, which has an instance running on AWS infra close to me. So I have a double DNS cache: the router's OpenWRT plus the systemd-resolved cache.

@fliespl
Author

fliespl commented Apr 28, 2021

@PavelSosin-320 what is your Windows build + genie version + contents of /etc/resolv.conf?

@PavelSosin-320

@fliespl
I'm a Windows Insider, so the Windows build causes me more problems than it solves. Currently I use genie 1.39, but it has been stable since 1.37. My wsl.conf contains:
[network]
generateHosts = true <- keep it true or create blank /etc/hosts to avoid problems with NetworkManager.
generateResolvConf = false

P.S. From what I can see for your geolocation, the best choice is either Cloudflare's 1.1.1.1 (Frankfurt) or 9.9.9.9 (Warsaw, Frankfurt). Google DNS is too far.

@fliespl
Author

fliespl commented Apr 28, 2021

@PavelSosin-320 it doesn't matter which DNS I choose ;) It's not the root cause.

Gonna dig into it.

@fliespl
Author

fliespl commented Apr 28, 2021

Okay, just did a clean install.

It seems that:
resolvectl query works correctly

but direct system calls end up trying "127.0.0.1" for DNS (instead of the expected 127.0.0.53, which the DNS resolver uses).

Strace:
connect(7, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("127.0.0.1")}, 16) = 0

Ideas?

@cerebrate
Member

For systemd-resolved to be configured correctly (now that I engage in the absolute last resort of reading the documentation), what it really wants is for /etc/resolv.conf to be a symbolic link to /run/systemd/resolve/stub-resolv.conf. It uses the existence of this link to figure out what it ought to be doing, and if it's a file rather than a symlink to the right place, it will do things rather wrong.

See systemd-resolved(8) for more details, but that's probably the first thing to do.
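
A minimal sketch of checking and restoring that link, with made-up function names. It treats both the absolute target and the relative ../run/... form (as seen from /etc) as the stub link. As background, resolv.conf(5) says the libc resolver falls back to the name server on the local machine when no nameserver entry is available, which would be consistent with the 127.0.0.1 in the strace output earlier in the thread if /etc/resolv.conf had gone missing.

```shell
#!/bin/sh
# Sketch only: inspect and repair the resolv.conf -> stub-resolv.conf link.
STUB=/run/systemd/resolve/stub-resolv.conf

# Print one of: stub-symlink, other-symlink, regular-file, missing.
#   $1: path to inspect (assumed default: /etc/resolv.conf)
resolv_conf_state() {
    path="${1:-/etc/resolv.conf}"
    if [ -L "$path" ]; then
        target=$(readlink "$path")
        case "$target" in
            "$STUB"|"..$STUB") echo stub-symlink ;;   # absolute or ../run/... form
            *)                 echo other-symlink ;;
        esac
    elif [ -e "$path" ]; then
        echo regular-file
    else
        echo missing
    fi
}

# Point the path at the stub file, replacing whatever is currently there.
link_resolv_conf_to_stub() {
    path="${1:-/etc/resolv.conf}"
    ln -sfn "$STUB" "$path"
}

# Usage (as root):
#   [ "$(resolv_conf_state)" = stub-symlink ] || link_resolv_conf_to_stub
```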

@fliespl
Author

fliespl commented Apr 29, 2021

@cerebrate yep, that's what I do now after boot (create this symlink manually).

But the problem is that it doesn't stick between boots, so I'm a little confused, because I didn't have to do it with previous versions (not sure if it's a WSL change or a systemd-genie one?).

@fliespl
Author

fliespl commented May 4, 2021

@cerebrate did you have a chance to look at this issue?

Or did I misunderstand your point, and there is nothing to fix and it should be done manually on each boot (symlinking /etc/resolv.conf)?

It's fine by me, because I already have a custom service for that, but it might be tricky for others.

Happy to help - I can provide any info that you might require.

@cerebrate
Member

Where that one is concerned...

It turns out I've actually been suffering from that one for myself for a while, hacked a quick script together to fix it until I got around to solving the actual problem, then promptly forgot about it.

Unfortunately, I have now looked at it and concluded that I have Not A Bloody Clue as to what could be deleting the resolv.conf symlink; only the more so after finding reports of people using Debian not-under-WSL who are having the exact same problem. So, short version, I have and probably will again, but I'm not expecting success any time soon, I'm afraid. 😢

cerebrate reopened this May 4, 2021
cerebrate added the help wanted label May 4, 2021
@fliespl
Author

fliespl commented May 4, 2021

Do you think that setting up sysdig or auditctl with the highest priority (starting as early as possible) could help find what is deleting that symlink?

@fliespl
Author

fliespl commented May 10, 2021

@cerebrate just letting you know that none of my ideas worked :D Both required custom additional modules, which (obviously) didn't work right away with WSL.

Gonna try again if I have some other idea to pinpoint it.

@esgie

esgie commented Jun 6, 2021

> @cerebrate did you have a chance to look at this issue?
>
> Or did I misunderstand your point, and there is nothing to fix and it should be done manually on each boot (symlinking /etc/resolv.conf)?
>
> It's fine by me, because I already have a custom service for that, but it might be tricky for others.
>
> Happy to help - I can provide any info that you might require.

I have just solved the problem in the same way. According to the systemd documentation, systemd auto-detects whether /etc/resolv.conf is a symlink to /run/systemd/resolve/stub-resolv.conf or not and proceeds accordingly; in particular, it does not touch the symlink if it's valid, and it seems to remove it if it is a symlink to anything else. I have noticed that the same stub file is located in /usr/lib/systemd/resolv.conf. Please make sure you aren't linking to this one, even if the contents are the same; your /etc/resolv.conf symlink should point explicitly to /run/systemd/resolve/stub-resolv.conf. Then it should not disappear between "reboots".

@esgie

esgie commented Jun 8, 2021

Well, it seems that the /etc/resolv.conf symlink disappears one way or another by default, no matter what the target is, if generateResolvConf is disabled. And it actually seems that we have to recreate it on our own on each launch.
I feel that WSL's config option to disable generating resolv.conf automatically was designed to prevent WSL from touching the user's manual DNS config, if they decide to use one, so that it persists between shutdowns. If resolv.conf is still being deleted and needs to be restored, that aim isn't achieved.
Anyway, I prefer to avoid hacky custom fixes if possible and to stick with the features offered by systemd. The symlink to the systemd stub config can therefore be recreated easily without custom scripts, by using systemd-tmpfiles (tmpfiles.d). To achieve this, one must create a config, e.g. /etc/tmpfiles.d/00-stub-resolv.conf, with the contents:
L /etc/resolv.conf - - - - /run/systemd/resolve/stub-resolv.conf

Then the resolv.conf symlink should be recreated by systemd itself on genie init ;) This results in DNS resolution working fully via systemd-resolved, as designed (let me remind you that one should also put one's preferred DNS servers into /etc/systemd/resolved.conf or, if one likes to keep the default system config files untouched, /etc/systemd/resolved.conf.d/override.conf).

Please note that if using tmpfiles.d, when genie is initialized, /etc/resolv.conf will be replaced by the symlink even if it was created with generateResolvConf set to true (to avoid this, change the first part of the tmpfiles.d entry above from L to L+). However, I prefer the default behavior combined with auto-generation enabled. An auto-generated resolv.conf means DNS resolution will work normally when running without genie / systemd. When systemd is enabled, resolv.conf will be overwritten, and resolution will be handed over to systemd.
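
The tmpfiles.d step above can be sketched as a small helper; the function name is hypothetical, and the entry it writes is the one quoted above. The type letter is left as a parameter because tmpfiles.d(5) documents L as creating the symlink only when nothing exists at the path yet, and L+ as removing an existing file or directory first, so it is worth testing which of the two gives the behavior you want on your build.

```shell
#!/bin/sh
# Sketch only: write the tmpfiles.d entry that recreates the resolv.conf symlink.
#   $1: tmpfiles.d directory (assumed default: /etc/tmpfiles.d)
#   $2: entry type, "L" or "L+" (default: L)
write_stub_tmpfiles_entry() {
    dir="${1:-/etc/tmpfiles.d}"
    entry_type="${2:-L}"
    case "$entry_type" in
        L|L+) ;;
        *) echo "entry type must be L or L+" >&2; return 1 ;;
    esac
    mkdir -p "$dir" || return 1
    # Fields: Type Path Mode User Group Age Argument
    printf '%s /etc/resolv.conf - - - - /run/systemd/resolve/stub-resolv.conf\n' \
        "$entry_type" > "$dir/00-stub-resolv.conf"
}

# Usage (as root):
#   write_stub_tmpfiles_entry /etc/tmpfiles.d L
#   systemd-tmpfiles --create /etc/tmpfiles.d/00-stub-resolv.conf
```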

@cerebrate
Member

@esgie Thank you kindly for your investigation of this issue. Your solution to it will be included in genie 1.43.

cerebrate added a commit that referenced this issue Jun 9, 2021
@cerebrate
Member

@esgie

> Please note that if using tmpfiles.d, when genie is initialized, /etc/resolv.conf will be replaced by the symlink even if it was created with generateResolvConf set to true (to avoid this, change the first part of the tmpfiles.d entry above from L to L+). However, I prefer the default behavior combined with auto-generation enabled. An auto-generated resolv.conf means DNS resolution will work normally when running without genie / systemd. When systemd is enabled, resolv.conf will be overwritten, and resolution will be handed over to systemd.

Can I check which build you're on? On 21390, this doesn't appear to be the behavior: auto-generated hosts and/or resolv.conf will overwrite the genie/systemd-resolved versions even when inside the bottle/systemd has started successfully.

@esgie

esgie commented Jun 9, 2021 via email

@xuanruiqi
Contributor

Perhaps not everyone wants to use systemd-resolved, though. What would I do if I want to use something else, like unbound?

@cerebrate
Member

After installing genie, delete the /usr/lib/genie-stub-resolv.conf file to stop it from replacing resolv.conf with this link.

(I'd have included an option if genie did the job itself, but it doesn't, so. This is just optimizing for the default case at the moment, figuring that people interested in replacing their resolver with a non-default one can most easily figure out what to do.)

@esgie

esgie commented Aug 4, 2021 via email

@PavelSosin-320

@xuanruiqi When working with a systemd-based distro, people have no choice but to use systemd-resolved, because the entire networking stack is based on the pair of networkd and resolved services, which are started automatically as part of network.target and serve everybody via the D-Bus socket interface. If something is started via a systemd unit and needs networking, it would be better to add Requires=network.target and After=network-online.target to the unit to avoid a false start. This is the "systemd way" to avoid the described problem. Resolved itself works 10 times better than WSL's generated resolv.conf.

@xuanruiqi
Contributor

Well, I use Arch Linux, and it's not really necessary to use systemd-resolved. For instance I prefer to use unbound, and it's fully supported by Arch out of the box.

@PavelSosin-320

PavelSosin-320 commented Aug 4, 2021

@xuanruiqi According to the Arch Linux networking page, Arch Linux has crossed the front lines and now plays for the systemd team. I expect that all utilities and libraries for Arch Linux use systemd networking today, including ping.
P.S. The simple way to restore networking is systemctl daemon-reexec; it will restart systemd networking. Then check systemctl status to see if the system is running.

@xuanruiqi
Contributor

Using systemd (and the systemd networking stack) doesn't mean all of its functionality has to be used. systemd-resolved isn't enabled by default on Arch, and it also isn't really what most Arch users use. Many Arch users prefer unbound, for example. Arch provides out-of-box support for many DNS resolvers with systemd integration, including of course systemd-resolved but also many other resolvers.

That said, I agree with @esgie - "stub" mode shouldn't be the default as many users would rather not use systemd-resolved. It might be a good idea to make this configurable in genie.ini.

@PavelSosin-320

@xuanruiqi The straightforward way to disable services is systemctl [--user] disable. Just do it. It's persistent configuration and will be preserved for subsequent runs. I don't see any scenario in which genie -i is explicitly or implicitly executed at least once and resolved can be disabled. I hope that systemd is able to process dependencies correctly in this case. The proposed mechanism may contradict the way systemd works. If some unit requires or waits for the networking target, and this target is only partially reached for reasons invisible to systemd, it can break the entire unit chain. resolved.conf already lets you choose how /etc/resolv.conf is used. So it would be not only redundant functionality but also confusing documentation. In short, I don't think that genie can alter normal systemd functionality.

@xuanruiqi
Contributor

You seem to be missing my point. Many DNS resolvers have full systemd integration and can drop-in replace resolved; that's why Arch (and perhaps other dists) provide them as replacement for the default resolver. It will handle the networking target correctly, so there's no need to "alter the normal systemd functionality", because it is supposed to integrate with normal systemd functionality.

That said, the only difference from resolved is that it doesn't use the stub in /run/systemd/resolve/stub-resolv.conf, and apparently resolved is the only resolver that uses it. So I argue that a symlink shouldn't be the default, or should at least be configurable.

@PavelSosin-320

PavelSosin-320 commented Aug 5, 2021 via email

@cerebrate
Member

Okay, let's kick this discussion over to a new issue instead of repurposing this one. Now at #187.

@PavelSosin-320

The resolver package comes with the distro installation, directly from the distro repository. It is installed with its units spread over all the directories where systemd searches for unit files. I suppose that in 99% of cases it is initially enabled, but it can be disabled using systemctl disable unit-name. Certain unit attributes can be changed using systemctl edit unit-name. This doesn't change the original unit but creates a new drop-in file. Its content is merged with the original unit the next time systemd re-executes the daemon: when the machine is rebooted, the systemd daemon is re-executed, or genie -i runs. These are the precise boundaries that systemd draws for administrators. Since the systemd package is installed with the distro, it can't be changed using an outer config file, which systemd does not read. Once systemd --system or systemd --user is fired, it works according to its own highly standardized rules developed by Red Hat. Although systemd.io is an independent company today, the developers in this company are Red Hat, i.e. IBM, employees. The endorsement of genie by systemd.io last year is an enormous achievement because it opens the doors for genie into the enterprise software world. Red Hat is not only the main Linux provider but also the main security and IT automation provider in this world. Genie only fires systemd. A feature request for managing systemd using outer files is a de facto systemd feature request. If something can be done using systemctl, systemd-run, etc., and unit files, it shall be done using existing tools; otherwise the doors can close.

@cerebrate
Member

> The endorsement of genie by systemd.io last year

Out of curiosity, is this online somewhere? I managed to miss it entirely...
