Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Systemd services does not wait until it is ready #1539

Closed
jreidinger opened this issue Aug 14, 2024 · 3 comments
Closed

Systemd services does not wait until it is ready #1539

jreidinger opened this issue Aug 14, 2024 · 3 comments

Comments

@jreidinger
Copy link
Contributor

In general issue is that Agama has correct systemd dependencies, but due to configuration is just start service and does not wait to have it really started. Result is e.g. this issue ( quoted below report and relevant IRC discussion part):

Hi, for agama auto installation, sometimes it will stay at production selection page and haven't start the auto installation. https://openqa.opensuse.org/tests/4401941#step/auto/7 From the service log, it seems login issue. service log link: https://openqa.opensuse.org/tests/4401941/file/auto-agama-logs.tar.gz   Please help to have a look. (edited) 
openqa.opensuse.org
openQA: opensuse-agama-9.0-agama-installer-x86_64-Build10.3-agama_boot_iso@uefi test results
openQA is a testing framework mainly for distributions

<jreidinger> ancorgs: we probably need some time to setup web service...from logs I see that agama service starts at 10:57:23 and agama-auto also starts at 10:57:23...but service is startup is done at 10:57:32...but agama auto already failed at 10:57:24 ... I think issue is that systemd just start binary...but we need some time to set it up. So we either misconfired systemd services or need to change it somehow. It works in past as CLI use dbus 
<jreidinger> and dbus waits till it start answering...but now it failed with missing auth token for web API.
@mchf
Copy link
Contributor

mchf commented Aug 14, 2024

Little bit idea without further investigation:
In my POV and systemd understanding the After= directive should guarantee that agama-auto is started once (not only) agama-web-server service is activated (otherwise I dont see the difference between After and Requires) so question is if the token is generated and written (by the web server) before the web service switches to active state or not. May be we need to call e.g. explicit fsync call as part of the web server startup process or so.

We can use workaround with PreExec=/usr/bin/sleep 10 in agama-auto but that is really ugly.

@jreidinger
Copy link
Contributor Author

I think issue is how systemd should recognize that it is started. We use type Simple which consider unit started immediatelly after exec. Maybe we should consider using notify and send message when it really started as described in https://www.freedesktop.org/software/systemd/man/latest/systemd.service.html#Type=

@jreidinger
Copy link
Contributor Author

for notify approach, I would say that this crate should be enough https://docs.rs/sd-notify/latest/sd_notify/

jreidinger added a commit that referenced this issue Aug 19, 2024
## Problem

Problem description is in issue #1539 


## Solution

Implement sd_notify for agama-web-service to notify systemd when web
service is ready to serve requests.
Improve systemd dependencies to better reflect dependencies. For
agama-auto adapt dependencies to reflect that CLI switch from direct
systemd communication to HTTP API. related PR is
#1438


## Testing

- *Tested manually*


## Screenshots

snippet from log before change ( truncated, but to demonstrate that
agama-auto runs too soon):

```
Aug 14 13:10:47 dhcp118.suse.cz systemd[1]: Started Agama Installer Service.
Aug 14 13:10:47 dhcp118.suse.cz systemd[1]: Started Agama Web Server.
Aug 14 13:10:47 dhcp118.suse.cz systemd[1]: Started Agama automatic profile runner.
Aug 14 13:10:47 dhcp118.suse.cz systemd[1]: Starting Postpone login prompt after the SSL fingerprint issue is generated...
Aug 14 13:10:47 dhcp118.suse.cz agama-auto[2689]: + export YAST_SKIP_XML_VALIDATION=1
Aug 14 13:10:47 dhcp118.suse.cz agama-auto[2689]: + YAST_SKIP_XML_VALIDATION=1
Aug 14 13:10:47 dhcp118.suse.cz agama-auto[2689]: + '[' -z '' ']'
Aug 14 13:10:47 dhcp118.suse.cz agama-auto[2691]: ++ awk -F agama.auto= '{sub(/ .*$/, "", $2); print $2}'
Aug 14 13:10:47 dhcp118.suse.cz agama-auto[2689]: + url=
Aug 14 13:10:47 dhcp118.suse.cz agama-auto[2689]: + '[' -z '' ']'
Aug 14 13:10:47 dhcp118.suse.cz agama-auto[2689]: + echo 'no autoinstallation profile'
Aug 14 13:10:47 dhcp118.suse.cz agama-auto[2689]: no autoinstallation profile
Aug 14 13:10:47 dhcp118.suse.cz agama-auto[2689]: + exit 0
Aug 14 13:10:47 dhcp118.suse.cz systemd[1]: agama-auto.service: Deactivated successfully.
Aug 14 13:10:47 dhcp118.suse.cz sshd-gen-keys-start[2683]: ssh-keygen: generating new host keys: RSA ECDSA ED25519
Aug 14 13:10:47 dhcp118.suse.cz sshd[2702]: Server listening on 0.0.0.0 port 22.
Aug 14 13:10:47 dhcp118.suse.cz sshd[2702]: Server listening on :: port 22.
Aug 14 13:10:47 dhcp118.suse.cz systemd[1]: Started OpenSSH Daemon.
Aug 14 13:10:47 dhcp118.suse.cz systemd[1]: Starting Generate issue file for SSH host keys...
Aug 14 13:10:47 dhcp118.suse.cz systemd[1]: agama-ssh-issue.service: Deactivated successfully.
Aug 14 13:10:47 dhcp118.suse.cz systemd[1]: Finished Generate issue file for SSH host keys.
Aug 14 13:10:47 dhcp118.suse.cz agama-web-server[2688]: registering event source with poller: token=Token(94923250365696), interests=READABLE | WRITABLE
Aug 14 13:10:47 dhcp118.suse.cz agama-web-server[2688]: Initializing
Aug 14 13:10:47 dhcp118.suse.cz agama-web-server[2688]: Waiting for DATA or OK from server
Aug 14 13:10:47 dhcp118.suse.cz agama-web-server[2688]: Received OK from server
Aug 14 13:10:47 dhcp118.suse.cz agama-web-server[2688]: Waiting for Unix FD passing agreement from server
Aug 14 13:10:47 dhcp118.suse.cz agama-web-server[2688]: Unix FD passing agreed by server
```

and now after change:

```
Aug 15 10:19:15 dhcp118.suse.cz systemd[1]: Started Agama Web Server.
Aug 15 10:19:15 dhcp118.suse.cz agama-web-server[12467]: Starting Agama web server at :::80
Aug 15 10:19:15 dhcp118.suse.cz agama-web-server[12467]: Starting Agama web server at :::443
...
Aug 15 10:19:15 dhcp118.suse.cz agama-web-server[12467]: Waiting for message on the socket..
...
Aug 15 10:19:15 dhcp118.suse.cz agama-web-server[12467]: Listening for property changes on org.freedesktop.NetworkManager.Device...
Aug 15 10:19:15 dhcp118.suse.cz systemd[1]: Started Agama automatic profile runner.
Aug 15 10:19:15 dhcp118.suse.cz agama-web-server[12467]: Message received on the socket: Msg { type: Signal, sender: UniqueName(Str(Borrowed(":1.2"))), path: ObjectPath("/org/freedesktop/systemd1/unit/agama_2dauto_2eservice"), iface: InterfaceName(Str(Borrowed("org.freedesktop.DBus.Properties"))), member: MemberName(Str(Borrowed("PropertiesChanged"))), body: Signature("sa{sv}as") }
Aug 15 10:19:15 dhcp118.suse.cz agama-auto[12479]: + export YAST_SKIP_XML_VALIDATION=1
Aug 15 10:19:15 dhcp118.suse.cz agama-auto[12479]: + YAST_SKIP_XML_VALIDATION=1
Aug 15 10:19:15 dhcp118.suse.cz agama-auto[12479]: + '[' -z '' ']'
...
Aug 15 10:19:15 dhcp118.suse.cz agama-web-server[12467]: Waiting for message on the socket..
Aug 15 10:19:15 dhcp118.suse.cz agama-auto[12482]: ++ awk -F agama.auto= '{sub(/ .*$/, "", $2); print $2}'
Aug 15 10:19:15 dhcp118.suse.cz agama-auto[12479]: + url=
Aug 15 10:19:15 dhcp118.suse.cz agama-auto[12479]: + '[' -z '' ']'
Aug 15 10:19:15 dhcp118.suse.cz agama-auto[12479]: + echo 'no autoinstallation profile'
Aug 15 10:19:15 dhcp118.suse.cz agama-auto[12479]: no autoinstallation profile
Aug 15 10:19:15 dhcp118.suse.cz agama-auto[12479]: + exit 0
Aug 15 10:19:15 dhcp118.suse.cz systemd[1]: agama-auto.service: Deactivated successfully.
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants