
Execute file open/close in executors #96229

Closed
elupus opened this issue Aug 24, 2022 · 7 comments
Labels
pending (The issue will be closed if no feedback is provided), topic-asyncio, type-feature (A feature request or enhancement)

Comments

@elupus

elupus commented Aug 24, 2022

Feature or enhancement

Execute the opening/closing of Unix file descriptors/protocols in executors.

Pitch

A Unix file descriptor close() call can perform I/O and flush data. As such it may take a little bit of time. On top of that, some protocols explicitly need sleeps after closing to make sure the close is done.

Here is a location where explicit sleeps are added and are hard to avoid:
https://github.com/pexpect/ptyprocess/blob/master/ptyprocess/ptyprocess.py#L403

A similar fix was done for pyserial-asyncio's protocols here:
pyserial/pyserial-asyncio#82

The culprit in this case is the pipe transports' handler:

def _call_connection_lost(self, exc):

But all the pipe transports could benefit from executing their closing functions in an executor.
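
The idea can be sketched roughly like this (a minimal illustration, not a proposed patch; the helper name close_fd_in_executor is made up here):

```python
import asyncio
import os

async def close_fd_in_executor(fd: int) -> None:
    # os.close() can block if the kernel flushes buffered data,
    # so run it in the default thread-pool executor and await it,
    # keeping the event loop responsive while guaranteeing the
    # close has completed once the coroutine returns.
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(None, os.close, fd)

async def main() -> None:
    r, w = os.pipe()
    await close_fd_in_executor(w)
    await close_fd_in_executor(r)

asyncio.run(main())
```

The key property is that the caller can await completion, unlike a fire-and-forget `run_in_executor` call.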

Previous discussion

@kumaraditya303
Contributor

This is a bit surprising. It would be helpful if you can provide a script to reproduce the issue as we need more data about this.

@gvanrossum gvanrossum added the pending label (The issue will be closed if no feedback is provided) Oct 23, 2022
@gvanrossum
Member

> A Unix file descriptor close() call can perform I/O and flush data. As such it may take a little bit of time. On top of that, some protocols explicitly need sleeps after closing to make sure the close is done.

The evidence you give below doesn't really prove that.

> Here is a location where explicit sleeps are added and are hard to avoid: https://github.com/pexpect/ptyprocess/blob/master/ptyprocess/ptyprocess.py#L403

This code is managing a subprocess. There's no evidence that closing the file descriptor takes any time (in the process doing the OS-level close) that would block an event loop (if it were an asyncio app, which it isn't). The sleep 0.1 here is to give the subprocess that's hanging off the file descriptor time to respond to the EOF it receives, etc. Of course if you were to call this user-level close() method in an event loop there's a problem, but that's because the user-level close() method sleeps. In such cases, yes, an asyncio app would have to run that close() method in an executor -- it's not asyncio's responsibility.

> A similar fix was done for pyserial-asyncio's protocols here: pyserial/pyserial-asyncio#82

That's a blocking flush() call (on a serial port) -- again it is user-level code that blocks and hence should be run in an executor.

> The culprit in this case is the pipe transports' handler:
>
> def _call_connection_lost(self, exc):
>
> But all the pipe transports could benefit from executing their closing functions in an executor.

I'm not even sure what you're proposing here. Do you want the self._pipe.close() call to be executed in an executor?

Unless you come up with an example where closing a pipe file descriptor actually blocks, I'm inclined to close this as "won't fix".

@kumaraditya303
Contributor

@elupus Can you provide a reproducer for this without third-party libraries? If not, we'll have to close this, as there is no reproducer.

@elupus
Author

elupus commented Oct 26, 2022

I will try to get something to reproduce. However, part of the problem is third-party libraries adhering to the API set here. But the point is really that close() per se IS file I/O, and as such could block the loop.

Also, this occurs in destructor/cleanup code outside the control of the third-party library. It's the core that closes the file down. So if closing takes some time, it's hard to avoid blocking the loop while also making sure close is actually finished.

I can see solutions that schedule the close in an executor, but then you are not sure it's fully closed, and re-opening the same file would lead to the file being busy.

@elupus
Author

elupus commented Oct 26, 2022

I will see if I can reproduce it with lingering sockets, since a socket with linger on should block on close according to: https://pubs.opengroup.org/onlinepubs/9699919799/functions/close.html
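
A minimal experiment along those lines might look like this (a sketch only; whether close() actually blocks here depends on the platform and on how much data is still queued):

```python
import socket
import struct
import time

# Enable SO_LINGER with a 5-second timeout on one end of a socket pair.
# Per POSIX, close() on a blocking socket with linger enabled may block
# until queued data is transmitted or the linger timeout expires.
a, b = socket.socketpair()
a.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
             struct.pack("ii", 1, 5))  # l_onoff=1, l_linger=5 seconds

a.send(b"x" * 4096)  # queue some data the peer has not read yet
start = time.monotonic()
a.close()
print(f"close() returned after {time.monotonic() - start:.6f}s")
b.close()
```

If close() here returned immediately even with unread data, that would match the observation below that the lingering-socket route is not a usable repro.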

@elupus
Author

elupus commented Nov 2, 2022

I was unable to reproduce it using lingering sockets; it seems the close completes immediately. I find it a bit strange, but maybe the sockets are non-blocking and the close call internally fails with EWOULDBLOCK. Anyway, not a usable way to repro.

The ptyprocess issue stems from its use in the async path of pexpect. So yes, pexpect should probably be fixed to avoid calling the ptyprocess close in the event loop, which would solve the blocking of the loop. However, if the closing is pushed to an executor and the event loop is not awaiting its result, any user code that does something that results in:

while True:
  something.open("serial")
  # do some funky stuff
  await something.read()
  something.close()

would fail, since something.close() has not actually completely closed the resource: part of its close call was pushed to an executor whose result it cannot await, so you need some random await asyncio.sleep(x) to ensure the close, running in a detached executor, actually finishes.

That is why I proposed converting _call_connection_lost to an async function here and running the close() functions in an executor.

That said, these things can be worked around in userspace, and executors add overhead. So we can close this issue if you don't think such a change makes sense.

@gvanrossum
Member

Thanks for trying. For backwards compatibility we can't really change close() to become async. If a particular API requires waiting for a close, they will have to implement a new, async method. (For writable files, this could be flush(), perhaps.)

So yeah, let's close this as "won't fix".

@gvanrossum gvanrossum closed this as not planned (won't fix, can't repro, duplicate, stale) Nov 2, 2022
Repository owner moved this from Todo to Done in asyncio Nov 2, 2022