Query hangs when using async_exec #583
Comments
Is it possible to create a Dockerfile with a repro?
The Ruby side is easy to reproduce with Docker, but triggering the hang requires a specific state on the CockroachDB side, which I haven't been able to recreate so far. I'll try to do that. Is it expected behaviour that pgconn_async_get_last_result would ever block for a significant period? From the comment at lines 3138 to 3139 in beafa09, it sounds like it shouldn't block.
However, since it calls wait_socket_readable without specifying a timeout, it can actually end up permanently blocked waiting for the socket to become readable.
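To illustrate the difference between a wait with and without a timeout, here is a minimal sketch using plain Ruby sockets. It does not use ruby-pg or its C code; `IO.select` on a `UNIXSocket` pair just stands in for waiting on the connection's socket, and the variable names are made up for the example.

```ruby
require "socket"

# A connected socket pair. Nothing has been written yet, so `reader`
# has no pending data, much like a connection with nothing left to read.
reader, writer = UNIXSocket.pair

# With a finite timeout, IO.select gives up and returns nil.
timed_out = IO.select([reader], nil, nil, 0.2).nil?

# With a nil timeout, IO.select waits indefinitely. It runs in a thread
# here so the example itself does not hang.
waiter = Thread.new { IO.select([reader], nil, nil, nil) }
still_blocked = waiter.join(0.2).nil? # join returns nil while the thread is still running

# The untimed wait only returns once the peer actually sends something.
writer.write("x")
woke_up = !waiter.join(2).nil?

puts "finite timeout expired: #{timed_out}"
puts "untimed wait still blocked: #{still_blocked}"
puts "untimed wait returned after write: #{woke_up}"
```

If the peer never sends anything further, the untimed wait never returns, which matches the symptom described in this issue.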
Yes, it is intended that it blocks with no timeout. I improved the comment in 7ced092. I could imagine that there's a bad interaction between OpenSSL and libpq, as described in #325 (comment). Is it possible for you to test it without SSL?
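Since the untimed wait is intentional, a caller that wants an upper bound has to impose it from the Ruby side. The following is a rough sketch of one way that might look, built on ruby-pg's non-blocking calls (`send_query`, `consume_input`, `is_busy`, `get_result`, `socket_io`). The method `exec_with_deadline` and the error class `QueryDeadlineExceeded` are hypothetical names, not part of ruby-pg, and this has not been verified against the hang reported here.

```ruby
# Sketch only: bounds the wait in Ruby instead of patching the C extension.
# QueryDeadlineExceeded and exec_with_deadline are hypothetical names.
class QueryDeadlineExceeded < StandardError; end

def exec_with_deadline(conn, sql, timeout)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  conn.send_query(sql) # queue the query without waiting for results
  last = nil
  loop do
    conn.consume_input
    while conn.is_busy # true while get_result would block
      remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
      # Wait for the socket to become readable, but never past the deadline.
      ready = remaining > 0 && IO.select([conn.socket_io], nil, nil, remaining)
      raise QueryDeadlineExceeded, "no result within #{timeout}s" unless ready
      conn.consume_input
    end
    result = conn.get_result # nil once every result has been fetched
    break if result.nil?
    last = result
  end
  last
end
```

Here `conn` is expected to be a `PG::Connection`. After a `QueryDeadlineExceeded` the query is still in flight on that connection, so it is probably safest to close or reset the connection rather than reuse it.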
I've been debugging an issue where we have intermittent hangs of our Ruby processes when running queries on CockroachDB using the ruby-pg driver. This is with ruby-3.3.4, libpq-dev 15.8-0+deb12u1, ruby-pg 1.5.7, and Linux 6.8.0-1012-gcp; however, I have been able to reproduce the issue on other versions as well.
We are able to reproduce the issue ~80% of the time with a simple query like this:
This will hang forever, tying up a Ruby thread in our Sidekiq workers until the worker is shut down, even days later.
If I disable async_exec, the same query always works, i.e. this works 100% of the time:
The triggering of this is related to the content returned from the query - if I leave out certain columns, or remove the ORDER BY, then the issue doesn't show up.
Here's the gdb trace when the query is hung:
It seems the issue is that wait_socket_readable is getting called when there is nothing more to read from the socket, and as pgconn_async_get_last_result is passing NULL as the timeout value (https://github.com/ged/ruby-pg/blob/master/ext/pg_connection.c#L3139), the process just waits forever. I compiled ruby-pg with this change to hardcode a wait_timeout, and afterwards I would see a hang of a few seconds when the issue was triggered, rather than an infinite one:
This issue seems related to #325 (comment), and some investigation there suggested it was a result of the frame sizes.