appservice: Replace nginx with Starlette reverse proxy, add session status API #37
Conversation
I've spent some time trying the workarounds in aio-libs/aiohttp#4581, and I can't get it to work. I keep getting
This works well enough with podman now, but aio-libs/aiohttp#4581 still completely breaks deployment on 3scale/k8s. I've wasted another two hours trying to find a workaround.
@tiran created a starlette PoC yesterday, many thanks! I rewrote the appservice/proxy based on his code, and it seems to work nicely, at least with local podman. I'm going to try it on kubernetes next. However, I would like to keep the aiohttp implementation in the git history, just in case we need it again some day. It's also a nice reference to look at.
This works with our podman emulation now, but still not on actual k8s. Needs a bigger hammer.
3scale sets this to "Upgrade", not "upgrade".
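Since the two proxies only differ in the capitalization of the header *value*, the robust fix is a case-insensitive comparison. A minimal sketch (the function name is illustrative, not code from this PR):

```python
def is_connection_upgrade(value: str) -> bool:
    """Return True if a Connection header value requests an upgrade.

    Header values like "Upgrade" vs. "upgrade" differ between proxies
    (e.g. the real 3scale vs. a local nginx fake), and "Connection" may
    carry a comma-separated token list, so compare tokens case-insensitively.
    """
    return any(token.strip().lower() == "upgrade" for token in value.split(","))
```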
nginx does not get along well with unavailable proxy_pass targets -- as soon as one session pod unexpectedly goes away (crashes, idle timeouts, networking flakes), it errors out and no session routes can be resolved any more. There are workarounds [1], but I can't get them to work properly. Also, nginx does not allow us to hook into (dis)connection events to implement the session status API (see issue #28). So it's finally time to grow to something more flexible.

Rewrite multiplexer.py using aiohttp [2], which now handles both our own session control API and the dynamic reverse proxying to the session pods. This should still be considered a PoC, but at least there are fewer moving parts and an easier-to-understand architecture now.

Check behaviour with broken session pods in the tests: crash a session pod, and validate that the other sessions still behave correctly.

Fixes #36

[1] https://sandro-keil.de/blog/let-nginx-start-if-upstream-host-is-unavailable-or-down/

[2] https://docs.aiohttp.org/en/stable/index.html
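The key behavioural difference from nginx described above is per-session routing: one dead pod should only fail its own session, not the whole routing table. A hypothetical sketch of that resolution step (names are illustrative, not the PR's actual multiplexer code):

```python
class UnknownSession(Exception):
    """Raised when a request path does not map to a known session pod."""


def resolve_target(path: str, sessions: dict[str, str]) -> str:
    """Map a /sessions/<id>/... request path to that session pod's address.

    Only the session named in the path can fail; other sessions keep their
    routes, unlike nginx, where one unavailable proxy_pass target broke
    resolution for all routes.
    """
    parts = path.strip("/").split("/")
    if len(parts) < 2 or parts[0] != "sessions" or parts[1] not in sessions:
        raise UnknownSession(path)
    return sessions[parts[1]]
```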
This is larger than Debian, but using UBI is a requirement for deploying to 3scale.
Track session status. Right after creation, it starts as `wait_target`. After the target machine has connected (first connection to the /ws path), it changes to `running`. We'll probably add more states in the future.

Add a /sessions/ID/status API to query it, so that e.g. the target machine and the UI can poll this to know when to connect.

Fixes #28
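The state transitions above can be sketched as a tiny state holder; the status names come from the commit message, while the class and method names are hypothetical:

```python
class Session:
    """Tracks the lifecycle status that /sessions/ID/status reports."""

    def __init__(self) -> None:
        # right after creation, the session waits for its target machine
        self.status = "wait_target"

    def target_connected(self) -> None:
        """First connection on the /ws path moves the session to running."""
        self.status = "running"
```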
aiohttp has a long-standing race condition with reading replies [1]. This is completely killing cockpit on kubernetes, and so far there is no workaround. Thus move to a different framework.

Starlette [2] is quite pleasant and performant, and supports async and websockets. This also allows us to export the API to consoledot as OpenAPI.

Many thanks to Christian Heimes @tiran for his initial code!

[1] aio-libs/aiohttp#4581

[2] https://www.starlette.io/
That sole `Connection: upgrade` header gets attached to all requests on /wss/* paths, even for normal HTTP requests which don't have an `Upgrade:` header. uvicorn rejects these with "Unsupported upgrade request". Monkey-patch the h11 module to deliver patched headers. See https://issues.redhat.com/browse/RHCLOUD-21326
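The monkey-patch itself reaches into h11 internals; what follows is only a hedged sketch of the header fix-up it effectively performs, written as a pure function (name and exact semantics are assumptions): drop a stray `Connection: upgrade` when the request carries no `Upgrade:` header, so plain HTTP requests are not rejected as broken upgrade attempts.

```python
def fix_headers(headers: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Drop a spurious "Connection: upgrade" from non-upgrade requests.

    3scale attaches "Connection: upgrade" to all requests on /wss/* paths;
    if no "Upgrade:" header is present, the request is not actually a
    websocket upgrade, and the Connection header must not claim it is.
    """
    has_upgrade = any(name.lower() == "upgrade" for name, _ in headers)
    if has_upgrade:
        return headers
    return [
        (name, value)
        for name, value in headers
        if not (name.lower() == "connection" and value.strip().lower() == "upgrade")
    ]
```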
Huzzah! Fortunately that was simple, just a differently capitalized header value between our fake 3scale and the actual one. This now works fine on k8s!
Ah, interesting (but unrelated) failure due to a race condition with redis startup. I can reproduce locally with:

```diff
--- webconsoleapp-local.yaml
+++ webconsoleapp-local.yaml
@@ -35,6 +35,7 @@ spec:
     - name: redis
       image: docker.io/redis
+      command: ["sh", "-ec", "sleep 2; redis-server"]
     - name: 3scale
       image: docker.io/library/nginx
```
The pod starts up in parallel with appservice, so redis may not yet be ready. This caused an occasional CI failure, which can be reproduced by delaying the redis startup in webconsoleapp-local.yaml: `command: ["sh", "-ec", "sleep 2; redis-server"]`
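One common way to tolerate such a startup race is to retry the initial connection with a short backoff instead of assuming the dependency is already up. A hedged sketch, where `connect` stands in for the real redis client factory (this is not the fix the PR ships, just an illustration of the pattern):

```python
import time


def connect_with_retry(connect, attempts: int = 10, delay: float = 0.5):
    """Call connect() until it succeeds, retrying on ConnectionError.

    Re-raises the last error once all attempts are exhausted, so a
    genuinely broken redis still fails loudly instead of hanging.
    """
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts - 1:
                raise
            time.sleep(delay)
```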
```python
'user': 'cockpit-wsinstance',
}

async with httpx.AsyncClient(transport=httpx.AsyncHTTPTransport(uds=PODMAN_SOCKET)) as podman:
```
I guess if the podman socket is gone, this just throws an exception? Otherwise this will return an uninitialized status, content.
Yes, it does. With `uds=PODMAN_SOCKET + 'x'`, you get a `httpcore.ConnectError: [Errno 2] No such file or directory`.
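A defensive pattern for this case is to make "podman is unreachable" an explicit response rather than risk `status`/`content` being uninitialized. A hypothetical sketch, where `call` stands in for the real httpx request against the podman unix socket (the 502 mapping and function name are assumptions, not this PR's code):

```python
def query_podman(call):
    """Return (status, body); map an unreachable socket to an explicit 502.

    Catching OSError covers FileNotFoundError for a missing socket file,
    so the caller always gets a well-defined (status, body) pair.
    """
    try:
        return call()
    except OSError as exc:
        return 502, f"podman unavailable: {exc}"
```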
@tiran FYI