frankenphp is exiting on SIGPIPE failure #1020
Comments
Hi, this is likely caused by an extension that doesn't block the SIGPIPE signal. We fixed the amqp extension (php-amqp/php-amqp#550, #682) as well as FrankenPHP itself (#651) in the past. Could you try to gather a stack trace, so we can identify (and hopefully fix) the offender? The easiest way is to use a debug binary with GDB: https://frankenphp.dev/docs/contributing/#debugging-segmentation-faults-with-static-builds But in this case using Thanks. |
Um... I don't really understand any of that?
|
Ok a bit of research later and I've set up strace (I think):
So now I'll wait for a day or so and see if we catch any fish... |
@geoidesic, any fish? |
Nope. The strace output file is empty but the frankenphp pid has changed. So I guess I don't know how to capture that. I can see that it's only been up and running for 2 days, which means it's still falling over periodically:
|
Anything in |
A lot of the TLS handshake errors are originating from the server itself e.g.:
This is the IP address of the server itself: 82.180.154.41 |
Here's some more logs. Note this is a standard request made to the app. After the request completes successfully there seems to be a number of requests from 127.0.0.1 to 127.0.0.1 that fail the TLS lookup. These are not generated by our application code. Something – presumably frankenphp – is causing those spurious requests. I don't understand it. This is happening frequently. Very frequently.
|
FrankenPHP doesn't make any requests by itself. To be honest, if your application is doing nonsensical things, you might want to check if you have been hacked. I remember a client once where I saw stuff like this and discovered a whole cialis spam website in a hidden folder. But to reiterate, FrankenPHP doesn't make any requests. It forwards requests from Caddy to PHP. So, if you see a request in your logs, that request was made by a client. |
Yeah. Fair comment really. The legacy code is a bit of a custom job which is hard to evaluate fully because of how it's been written. Definitely not hacked but quite possibly some rogue legacy code doing weird things. Also maybe a red-herring because it's not related to the SIGPIPE fault, it's just been getting in the way as I'm studying the logs. |
I've been through the logs again today. I can't see any SIGPIPE failures. My logs go back to 8 September; nothing. So maybe that's no longer an issue. Frankenphp is still stopping and restarting often but it seems to be a deactivation, e.g.:
It's not clear why this might be happening. Any suggestions on how to monitor this? |
In the next release, there will be some metrics: https://github.com/dunglas/frankenphp/blob/main/docs/metrics.md That might be helpful. |
I think we are observing the very same issue with the static build 1.2.5. Regardless of whether it is run by systemd or not, frankenphp is killed after a few minutes of runtime by a SIGPIPE. Absolutely no journal entry when frankenphp gets cut off:
Attached strace output. |
@Eckieck could you try to replace your static build with a debug version (the binary with the |
Could you run the |
Of course, here is another, complete one. |
Thank you very much. It looks like this happens in the Go runtime itself, not in FrankenPHP or PHP. This is very weird. This seems to occur when Go is doing some crypto things to handle a TLS connection. Is there anything special on your system (processor for instance) that may be uncommon? |
According to the manual, this should not happen: https://pkg.go.dev/os/signal#hdr-SIGPIPE Maybe similar: |
There is something very weird in the logs. The thread is named |
The system is Rocky Linux 9.4 in a QEMU VM on Proxmox 8.2. CPU is Xeon Silver 4416+. The crash happens after anywhere between a few seconds and three minutes. Barely usable. |
Ok, systemd seems to do weird things: https://github.com/moby/moby/blob/1b7c209de1f158937480f675e85d075fda1c9743/cmd/dockerd/docker.go#L87-L90 Let's try to totally ignore SIGPIPE Go-side (we already ignore it C-side) to see if this fixes the issue: #1101 Could you try this patch @Eckieck? The CI will build a test binary. |
But while debugging what is happening, I started frankenphp without systemd and it got killed, too. Should I retry this with the old build?
Using the generated debug-build I would say it is now fixed, new runtime record:
Thank you! |
@dunglas How can I get hold of that binary for my environment? That way I can also test it. (I don't know what you mean by Tx. |
The fix-build is still working for me. It is segfaulting sometimes after a few hours, but I have to investigate this further. The SIGPIPE problem is gone. @geoidesic Have a look here: https://github.com/dunglas/frankenphp/actions/runs/11374351086 |
@Eckieck @dunglas That looks like a docker build. I'm using static binaries. Maybe I just don't understand the UI? I can't find where to download the binary. [Update: nvm. they are linked at the bottom of that page you linked.] |
FYI @geoidesic (only seeing this now)
That looks like normal bot traffic. To explain, Caddy needs to choose a certificate to perform the TLS handshake. Many bots scanning the open internet don't bother to send TLS SNI to declare a domain; they just try to get a default certificate from the server. Caddy will look for a certificate matching the RemoteAddr of the connection when TLS SNI is empty, but doesn't find one (because you didn't configure it to generate one) so the handshake fails. Totally normal. |
What happened?
The problem we’re experiencing now with this new setup is that the frankenphp service keeps shutting down without any reason that I can see.
The systemd unit file is configured to restart on failure, but I guess this shutdown is not seen as a failure because it’s not restarting.
Build Type
Official static build
Worker Mode
No
Operating System
GNU/Linux
CPU Architecture
x86_64
PHP configuration
Relevant log output
From
journalctl -xeu frankenphp.service