Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IO::FileDescriptor & Socket finalizers do far too much #14807

Closed
ysbaddaden opened this issue Jul 12, 2024 · 1 comment · Fixed by #14882
Closed

IO::FileDescriptor & Socket finalizers do far too much #14807

ysbaddaden opened this issue Jul 12, 2024 · 1 comment · Fixed by #14882

Comments

@ysbaddaden
Copy link
Contributor

ysbaddaden commented Jul 12, 2024

The initial idea for the finalizer was to avoid leaking file descriptors when you'd call File.new instead of File.open(&) and then forgot to close it. See #780 and 3bfd62a.

The actual implementation called the #close method, and it hasn't changed ever since, and... it's actually doing a lot more things than the original idea meant to:

  1. it flushes any buffered data;
  2. it re-enqueues pending readers & writers;
  3. it finally closes the file descriptor.

When called from regular code, this is the expected behavior, but let's contemplate that it may happen in a GC finalizer, that is be called during a GC collection, while the world is stopped and we want/need to resume it ASAP.

Note: while I think the unflushed buffer can happen in practice (we can write to a socket, forget to flush, then lose the reference) I don't think the pending reader/write can happen: it would mean a fiber got suspended trying to read or write, and they must have the file/socket reference in the function call somewhere (i.e. at least one pointer on the fiber stack) so the GC won't collect it.

Trying to flush means that it can try to write, hence call into the event loop, that may have to wait for the fd to be writable, which means calling into epoll_wait 😱 (while the world is stopped). The event loop implementation may need to allocate, or we get an error and try to raise an exception that will also try to allocate memory... during a GC collection 😱

Proposal: I'd merely call #file_descriptor_close in the finalizer and either always print a warning to STDERR when we do, or maybe only when there was pending buffered data (that won't be sent).

@straight-shoota
Copy link
Member

We need non-raising variants of file_descriptor_close. The default implementations raise if the system close errors. In a finalizer we shouldn't care about this error though (and it would involve an allocation for the exception).

I don't think printing a warning is a good idea. Closing in the finalizer doesn't really contribute to whether the buffer is flushed. If there was no finalizer, the file descriptor would never be closed (and no data flushed).
Maybe warnings could be useful as a tool while transitioning away from the current semantics with flush. But it shouldn't be an eternal thing. And then I don't see a good mechanism to decide when to enable it and when not. So I'd rather leave it away. Maybe we can provide an opt-in for such warnings.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants