-
-
Notifications
You must be signed in to change notification settings - Fork 210
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Erroring with "Too many open files" when good_job tries reconnecting to database #530
Comments
@adibsaad thanks for reporting this! that's not expected. I'll see if I can figure out what's happening here. Briefly poking at it, I expected there to be a problem with the Notifier holding open file descriptors, but it also appears that execution threads are as well. This is unexpected because the execution thread database connections are wholly managed by Rails' execution wrapper, which afaik are supposed to manage connection lifecycle. |
Here is what I'm seeing running ruby 50923 bensheldon 12u IPv6 0x935f30b3386e4bab 0t0 TCP localhost:53047->localhost:postgresql (CLOSED)
ruby 50923 bensheldon 13u IPv6 0x935f30b333d2596b 0t0 TCP localhost:53048->localhost:postgresql (CLOSED)
ruby 50923 bensheldon 14u IPv6 0x935f30b333d2294b 0t0 TCP localhost:53049->localhost:postgresql (CLOSED)
ruby 50923 bensheldon 15u IPv6 0x935f30b333d2302b 0t0 TCP localhost:53051->localhost:postgresql (CLOSED)
ruby 50923 bensheldon 16u IPv6 0x935f30b333d2370b 0t0 TCP localhost:53052->localhost:postgresql (CLOSED)
ruby 50923 bensheldon 17u IPv6 0x935f30b333d23deb 0t0 TCP localhost:53053->localhost:postgresql (CLOSED)
ruby 50923 bensheldon 18u IPv6 0x935f30b3386c604b 0t0 TCP localhost:53054->localhost:postgresql (CLOSED)
ruby 50923 bensheldon 19u IPv6 0x935f30b3386c3deb 0t0 TCP localhost:53055->localhost:postgresql (CLOSED)
ruby 50923 bensheldon 20u IPv6 0x935f30b3386c294b 0t0 TCP localhost:53056->localhost:postgresql (CLOSED)
ruby 50923 bensheldon 21u IPv6 0x935f30b3386c302b 0t0 TCP localhost:53057->localhost:postgresql (CLOSED)
ruby 50923 bensheldon 22u IPv6 0x935f30b333d2b02b 0t0 TCP localhost:53059->localhost:postgresql (CLOSED)
ruby 50923 bensheldon 23u IPv6 0x935f30b333d2b70b 0t0 TCP localhost:53060->localhost:postgresql (CLOSED)
ruby 50923 bensheldon 24u IPv6 0x935f30b333d2bdeb 0t0 TCP localhost:53061->localhost:postgresql (CLOSED)
ruby 50923 bensheldon 25u IPv6 0x935f30b333d2cbab 0t0 TCP localhost:53066->localhost:postgresql (CLOSED)
ruby 50923 bensheldon 26u IPv6 0x935f30b333d2d28b 0t0 TCP localhost:53067->localhost:postgresql (CLOSED)
ruby 50923 bensheldon 27u IPv6 0x935f30b333d2e04b 0t0 TCP localhost:53069->localhost:postgresql (CLOSED)
ruby 50923 bensheldon 28u IPv6 0x935f30b333d2a94b 0t0 TCP localhost:53070->localhost:postgresql (CLOSED)
ruby 50923 bensheldon 29u IPv6 0x935f30b3386dcbab 0t0 TCP localhost:53071->localhost:postgresql (CLOSED)
ruby 50923 bensheldon 30u IPv6 0x935f30b3386dd96b 0t0 TCP localhost:53072->localhost:postgresql (CLOSED)
ruby 50923 bensheldon 31u IPv6 0x935f30b3386da94b 0t0 TCP localhost:53073->localhost:postgresql (CLOSED)
ruby 50923 bensheldon 32u IPv6 0x935f30b3386d84cb 0t0 TCP localhost:53074->localhost:postgresql (CLOSED)
ruby 50923 bensheldon 33u IPv6 0x935f30b3386c128b 0t0 TCP localhost:53079->localhost:postgresql (CLOSED) I'm a bit stumped because when looking at a normal |
Doing some more research and experimentation. It looks like the file descriptors are cleaned up when the socket objects are garbage collected: ❯ bin/rails c
Loading development environment (Rails 6.1.4.6)
[1] pry(main)> ObjectSpace.each_object(PG::Connection).count
=> 0
[2] pry(main)> ObjectSpace.each_object(BasicSocket).count
=> 0
[3] pry(main)> 10.times { ActiveRecord::Base.connection rescue nil }
=> 10
[4] pry(main)> ObjectSpace.each_object(PG::Connection).count
=> 10
[5] pry(main)> ObjectSpace.each_object(BasicSocket).count
=> 10
[6] pry(main)> GC.start
=> nil
[7] pry(main)> ObjectSpace.each_object(PG::Connection).count
=> 0
[8] pry(main)> ObjectSpace.each_object(BasicSocket).count
=> 0 I can't find an obvious way to clean up the file descriptors without manually triggering GC. When debugging this, I was frequently interrupted by the GC running (even in the middle of a [1] pry(main)> ActiveRecord::Base.connection
ActiveRecord::ConnectionNotEstablished: connection to server at "::1", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
from /Users/ben@codeforamerica.org/.rbenv/versions/3.0.3/lib/ruby/gems/3.0.0/gems/activerecord-6.1.4.6/lib/active_record/connection_adapters/postgresql_adapter.rb:83:in `rescue in new_client'
Caused by PG::ConnectionBad: connection to server at "::1", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
from /Users/ben@codeforamerica.org/.rbenv/versions/3.0.3/lib/ruby/gems/3.0.0/gems/pg-1.3.1/lib/pg/connection.rb:637:in `async_connect_or_reset'
[2] pry(main)> conn = ObjectSpace.each_object(PG::Connection).first
=> #<PG::Connection:0x000000013791bdc8>
# The `#socket` method is deprecated
[3] pry(main)> fd = conn.socket
=> 13
[4] pry(main)> io = IO.for_fd(fd)
=> #<IO:fd 13>
[5] pry(main)> io.close
=> nil
[6] pry(main)> io
=> #<IO:(closed)> @adibsaad are you doing anything custom with the Ruby Garbage Collector? e.g. |
Nope, not doing any GC magic. Forgot to mention, I'm on OSX Monterey. |
It looks like the root cause here is |
Tracking ged/ruby-pg#439 |
Fix was released in pg v1.3.4. |
Problem
My guess is it's not closing the fail descriptor after failing to reconnect to the db.
Stack trace
The text was updated successfully, but these errors were encountered: