-
Notifications
You must be signed in to change notification settings - Fork 912
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Upgrading installed C-Lightning package breaks running instance in potentially dangerous ways #4346
Comments
An idea would be for We could also have |
That would fail if |
Well, I feel this is a reasonably clean solution that does not require us to move to zygote/stemcell type. On the other hand, with the vague whispers of Android restricting process-forking, it may be needed to put everything in a single process anyway, which would mean something very much like a zygote would be needed as well. |
Yeah, I learned this the hard way, and I now shutdown before We could have subdaemons send their version string on startup, then restart lightningd (after a few seconds' sleep) if the version is wrong. If we restart and our own version hasn't changed, then simply fail. I think upgrading running instances is a better choice than trying to keep them going? |
Self-restarting is an interesting thought. One nice aspect is that the new process would retain the same PID as the old process, so system service managers (like OpenRC) wouldn't notice anything awry at any instant. |
I think this is important, is there any update? Should I avoid the PPA and just install manually (so that I can be sure ubuntu does not decide to upgrade automatically) @rustyrussell |
@mrmanpew: I'm not that well-versed in Debian-like distros, but I would think you can pin/freeze an installed package so that it will not be upgraded unless unpinned/thawed. |
We could also just add a protocol version (sha256 hash of the wire file?) to the init messages, so subdaemons can decide whether it matches their version, and throw a tantrum if it doesn't. That'd not be as seamless as the zygote or file descriptor option, but it'd be safe and noisy pointing the user to towards how to fix the issue (restart). |
Users would still need to freeze their installed version of C-Lightning so that the OS does not upgrade it automatically in the background. I can tell you, one of the greatest annoyances as a node operator is coming back to my machine only to notice that my node shit the bed half a day ago. Yes, I could use some sort of service babysitter that automatically restarts crashed services, but I'm reluctant to do that, as it would paper over problems. |
You still shouldn't do this (you could get some transient failures), but at least you have a decent chance if you reinstall over a running daemon, instead of getting confusing internal errors if message formats have changed. Changelog-Added: lightningd: we now try to restart if subdaemons are upgraded underneath us. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Fixes: ElementsProject#4346
You still shouldn't do this (you could get some transient failures), but at least you have a decent chance if you reinstall over a running daemon, instead of getting confusing internal errors if message formats have changed. Changelog-Added: lightningd: we now try to restart if subdaemons are upgraded underneath us. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Fixes: ElementsProject#4346
You still shouldn't do this (you could get some transient failures), but at least you have a decent chance if you reinstall over a running daemon, instead of getting confusing internal errors if message formats have changed. Changelog-Added: lightningd: we now try to restart if subdaemons are upgraded underneath us. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Fixes: ElementsProject#4346
While you all are considering a fundamental redesign of C-Lightning's daemon organization, I would request that you please consider what happens when a C-Lightning installation is upgraded on disk while an instance of that installation is running.
It is customary in the Linux world for the system package manager to upgrade installed packages simply by replacing executable files on disk with newer versions, irrespective of whether those executables may currently be running. This is fine under POSIX semantics, as the mapped VMAs of any running processes will maintain references to the inodes of the replaced versions, preventing them from being deallocated on disk, even after their link counts drop to zero, until no more VMAs reference them. Thus, running daemons will continue running the old code until they are subsequently restarted. While this works just fine for most daemons, it fails spectacularly for C-Lightning, which spawns subdaemons at arbitrary times by
execve
ing their executable images on disk without regard for whether those images are of the same (or compatible) version as their parent daemons.One potential solution would be to use a "zygote" process in the manner of Chromium/Chrome. A zygote process starts once upon startup of the top-level process and has the sole purpose of forking child processes, which in turn subsequently differentiate into whatever subdaemon specialization is needed, notably without
execve
ing anything on disk. However, this pattern may not be well-suited to C-Lightning, given its capability for dynamically loading and unloading plugins.Another possibility would be for the top-level
lightningd
process toopen
every critical subdaemon executable file at startup, and then when spawning a subdaemon, the appropriate/proc/self/fd/<n>
symlink should beexecve
d rather than/usr/libexec/c-lightning/<whatever>
. This would have the effect of executing whatever version of the subdaemon was installed at the time whenlightningd
was launched, irrespective of whatever version of the subdaemon might be installed at the time of spawning. The downsides are: this only works on Linux, and the process name of the subdaemon will then be some opaque number, though I think there may be some tricks to rewrite that for the benefit ofps
,top
, and friends. Spawning non-critical plugins could still be done byexecve
ing whatever happens to be on disk at the time, with the understanding that the version of the plugin on disk may speak a different protocol than the parent process.One last potential solution (really more of a punt) would be for subdaemons simply to handshake on a protocol version upon spawning. This wouldn't allow continued operation in the event of a version mismatch, but it would at least protect against accidental catastrophe due to internal protocol changes.
Impetus for this request? I have personally experienced my C-Lightning node dying upon attempting to spawn a new subdaemon after I had upgraded the installed package on disk. Note that this does not happen immediately upon upgrading the package on disk but will happen at some unpredictable time afterward. To mitigate this problem, I have hacked up the Gentoo ebuild for C-Lightning so that it cowardly refuses to upgrade an installation of C-Lightning if it notices that an instance may be running, but this is gross and runs contrary to the expectations of users, who expect that they can upgrade packages on disk and then restart their services afterward to switch to the new versions.
The text was updated successfully, but these errors were encountered: