init.krun does not reap zombie processes #189
Comments
This is a regression I also noticed last week. I have a fix here and will create a PR later.
Tell the kernel that we want to ignore SIGCHLD so it'll reap our children for us to avoid leaving zombie objects. Fixes #189 Signed-off-by: Sergio Lopez <slp@redhat.com>
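For readers unfamiliar with the trick in that commit message, here is a rough sketch of the behaviour it relies on, written in Python rather than in the language of init.krun itself. It is not the actual patch. The function name `count_auto_reaped` and its parameter are made up for illustration, and the sketch assumes a Linux-like system where setting SIGCHLD to SIG_IGN makes the kernel discard exited children instead of leaving them as zombies.

```python
import os
import signal


def count_auto_reaped(n_children: int = 3) -> int:
    """Fork n_children short-lived children with SIGCHLD ignored.

    Returns how many of them the kernel reaped on our behalf, i.e. how
    many times waitpid() reported "no such child" (ECHILD) instead of
    handing back an exit status from a zombie.
    (Hypothetical helper, for illustration only.)
    """
    # Ask the kernel to reap children for us. Must run in the main thread.
    previous = signal.signal(signal.SIGCHLD, signal.SIG_IGN)
    try:
        pids = []
        for _ in range(n_children):
            pid = os.fork()
            if pid == 0:
                # Child: exit immediately without running parent cleanup.
                os._exit(0)
            pids.append(pid)

        auto_reaped = 0
        for pid in pids:
            try:
                # Blocks until the child is gone; with SIGCHLD ignored there
                # is no zombie to collect, so this fails with ECHILD.
                os.waitpid(pid, 0)
            except ChildProcessError:
                auto_reaped += 1
        return auto_reaped
    finally:
        # Restore whatever SIGCHLD disposition was in place before.
        signal.signal(signal.SIGCHLD, previous)


if __name__ == "__main__":
    print("children reaped by the kernel:", count_auto_reaped())
```

The trade-off of this approach is that the parent can no longer collect exit statuses of its children, which may or may not matter for a given init.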
This issue came up in my feed. Never realized krun has its own init system. FWIW, systemd is as fast as init systems written from scratch if it's trimmed down like the one in this dracut module-setup.sh file: https://gitlab.com/CentOS/automotive/rpms/dracut-automotive/-/blob/main/module-setup.sh I've written an init system more than once, but when I compared them to these mini systemd setups, systemd was typically just as fast, and it's less maintenance to use systemd with some custom binaries for specific tasks run as systemd .service files. Now, the fat, complicated systemd configuration from Fedora isn't so fast, but that's just because by default there are like a million features and .service units configured, when only around 5 units are actually needed.
The above systemd is configured for initramfs, but just giving an example of how lean systemd can be. I am glad I wrote an init system at least once for the learning experience though...
I'm guessing it's because we want a statically compiled init system for all OSes 🤔
Since krun shares the filesystem with the host OS, I don't think systemd is a good idea... I don't think anyone tests multiple instances of systemd that both think they are PID 1 but share a filesystem, that sounds like a recipe for trouble...
This is one of the reasons podman was created: multiple instances of systemd, sharing filesystems, etc. This use case is regularly tested and deployed. I'm not saying this is why we should or shouldn't use systemd, but saying nobody tests this is just not true. systemd can be as simple as a binary that acts as a process manager that forks other processes; almost every feature in systemd is optional (and at run-time, just change around the systemd unit files to do what is desired).
systemd-nspawn does this kinda thing also but I'm more familiar with podman.
I thought the whole point of containers was that they run with a different filesystem (root)? We run with the same filesystem root.
Evidently the systemd people don't think this is supposed to work.
Yup and the podman equivalent is:
I do question the approach of sharing the whole root with both OSes. In the vast majority of VM/container solutions, it's pick and share what you need rather than share everything, even if the "share what you need" ends up being 80% of the host's contents. I also think we'd re-implement less this way. But if we want to try something unique, why not I guess :) This is very loosely related to other conversations going on at the moment, again the angle is more towards ephemeral containers though:
Like, for example, if we were designing systemd to be run in a microVM, it's basically just ensuring you don't include certain directories and populating those with alternate configs, etc., this kind of thing.
A bootc image for inside the microVM could be very well suited for this use case eventually also, even on a non-bootc OS.
It also wouldn't be a bad idea to reduce the pressure on virtiofs tbh.
But don't get me wrong, I'd be happy to see this approach continue with virtiofs by default. It's novel and it could be interesting to see a project like this one try something completely different.
This way of using PID 1, for example, doesn't initialize SELinux in the guest kernel. For this use case, maybe some people don't care, but some people care deeply about having SELinux on in all running kernel instances. If we had a systemd binary in there, we could pick whether to initialize SELinux or not.
That doesn't actually run systemd, it just sets up the environment to run systemd. It also doesn't actually use the host filesystem, instead it sets up an overlayfs. If you try to actually use the host FS:
So it doesn't work. So nobody can be testing this, by definition. It really is a very, very different use case from all the container stuff people do.
Yeah, I understand; the wording of this was open to interpretation:
I read this and was instantly surprised, as people do this all the time, but then you clarified with filesystem root; just a misunderstanding. It wouldn't make sense to have exactly the same root filesystem anyway using two systemds; the guest one would ideally have a modified /usr, /etc, etc. to trim it down so it's minimised. I assumed we were gonna do things more like the ChromeOS Linux environment approach, but I guess not. Interesting to see where this goes :)
This does run systemd: it sets up systemd and runs it. The /usr/lib/systemd/systemd part execs systemd.
This is interesting. This is supposed to work; pivot_rooting to yourself is weird though. This should probably be logged as a bug, if anyone cares about this feature.
(Moved from https://github.com/slp/krun/issues/16)
@asahilina:
Is there some documentation of what PID 1 is expected to do?
Relevant writeups: