starting a thread in __del__ hangs at interpreter shutdown #87950

Open
sylikc mannequin opened this issue Apr 9, 2021 · 12 comments
Labels
3.8 (EOL): end of life; 3.9: only security fixes; 3.10: only security fixes; stdlib: Python modules in the Lib dir; topic-multiprocessing; type-bug: An unexpected behavior, bug, or error

Comments

@sylikc
Mannequin

sylikc mannequin commented Apr 9, 2021

BPO 43784
Nosy @rhettinger, @pfmoore, @pitrou, @tjguk, @zware, @eryksun, @zooba, @sylikc
Files
  • win_subprocess_hang.py: Sample code that hangs
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2021-04-09.05:25:49.909>
    labels = ['3.8', 'type-bug', 'library', '3.9', '3.10']
    title = 'starting a thread in __del__ hangs at interpreter shutdown'
    updated_at = <Date 2021-04-10.06:05:35.590>
    user = 'https://github.com/sylikc'

    bugs.python.org fields:

    activity = <Date 2021-04-10.06:05:35.590>
    actor = 'eryksun'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2021-04-09.05:25:49.909>
    creator = 'sylikc'
    dependencies = []
    files = ['49947']
    hgrepos = []
    issue_num = 43784
    keywords = []
    message_count = 4.0
    messages = ['390587', '390593', '390658', '390695']
    nosy_count = 8.0
    nosy_names = ['rhettinger', 'paul.moore', 'pitrou', 'tim.golden', 'zach.ware', 'eryksun', 'steve.dower', 'sylikc']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = None
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue43784'
    versions = ['Python 3.8', 'Python 3.9', 'Python 3.10']

    @sylikc
    Mannequin Author

    sylikc mannequin commented Apr 9, 2021

    I've noticed an issue (or user error) in which a call that otherwise usually works freezes when it is made from a class's __del__ method while the Python interpreter is exiting.

    I've attached sample code that I've run against Python 3.9.1 on Windows 10.

    The code below runs a process and communicates via the pipe.

    import subprocess
    
    class SubprocTest(object):
    	def run(self):
    		print("run")
    		proc_args = ["cmd.exe"]
    		self._process = subprocess.Popen(proc_args, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    	
    	def __del__(self):
    		print("__del__")
    		if self._process is not None:
    			self.terminate()
    	
    	def terminate(self):
    		print("terminate")
    		self._process.communicate(input=b"exit\n", timeout=1)
    		print("kill")
    		self._process.kill()
    		self._process = None
    
    if __name__ == "__main__":
    	s = SubprocTest()
    	s.run()
    	del s
    	print("s done")
    	
    	t = SubprocTest()
    	t.run()
    	print("t done")

    Current output:
    run
    __del__
    terminate
    kill
    s done
    run
    t done
    __del__
    terminate
    <<<<<< hangs indefinitely here, even though timeout=1

    Expected output:
    run
    __del__
    terminate
    kill
    s done
    run
    t done
    __del__
    terminate
    kill

    In normal circumstances, when you del the object and force a run of __del__(), the process ends properly and the terminate() method completes.

    When the Python interpreter exits, Python calls the __del__() method of the class. In this case, terminate() never completes and the script freezes indefinitely on the communicate() call.
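
The `del s` case above shows that cleanup completes when it runs before interpreter shutdown. A minimal sketch of one way to guarantee that ordering is to move the cleanup from __del__ into a context manager. The child command (a Python process that exits once its stdin closes, standing in for cmd.exe) and the 5-second timeout are assumptions made for illustration, not part of the attached sample.

```python
import subprocess
import sys


class SubprocTest:
    """Variant of the class above that is cleaned up by a with-block, not __del__."""

    def __init__(self):
        self._process = None

    def __enter__(self):
        # Assumption: a child Python that exits once its stdin closes
        # stands in for cmd.exe so the sketch runs on any platform.
        child_code = "import sys; sys.stdin.read()"
        self._process = subprocess.Popen(
            [sys.executable, "-c", child_code],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.terminate()
        return False  # do not swallow exceptions raised in the with-body

    def terminate(self):
        if self._process is None:
            return
        try:
            # Empty input closes stdin, which lets the child exit on its own.
            self._process.communicate(input=b"", timeout=5)
        except subprocess.TimeoutExpired:
            # Fall back to a hard kill, then reap the child.
            self._process.kill()
            self._process.communicate()
        self._process = None


if __name__ == "__main__":
    with SubprocTest() as t:
        print("t running")
    print("t done")
```

Because terminate() runs when the with-block ends, communicate() is called while the interpreter is still fully alive, so it does not depend on what the interpreter allows during finalization.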

    @sylikc sylikc mannequin added OS-windows 3.9 only security fixes stdlib Python modules in the Lib dir labels Apr 9, 2021
    @eryksun
    Contributor

    eryksun commented Apr 9, 2021

    It's not a subprocess bug, per se. It's due to creating the stdout/stderr worker threads from the __del__ finalizer while the interpreter is shutting down. Minimal reproducer, confirmed in both Linux and Windows:

        import threading
    
        class C:
            def __del__(self):
                t = threading.Thread(target=print, args=('spam',), daemon=True)
                t.start()
    
        c = C()
        #del c # uncomment to prevent hanging

    @eryksun eryksun added 3.8 (EOL) end of life 3.10 only security fixes and removed OS-windows labels Apr 9, 2021
    @eryksun eryksun changed the title from "[Windows] interpreter hangs indefinitely on subprocess.communicate during __del__ at script exit" to "starting a thread in __del__ hangs at interpreter shutdown" Apr 9, 2021
    @eryksun eryksun added the type-bug An unexpected behavior, bug, or error label Apr 9, 2021
    @sylikc
    Mannequin Author

    sylikc mannequin commented Apr 9, 2021

    eryksun, wow, that's speedy analysis, but there might be more to it. I went and tested a bunch of test cases. My subprocess code doesn't seem to hang on Linux where the thread example code does?

    Linux - Python 3.6.8 - your threading example DOESN'T hang
    Linux - Python 3.6.8 - My subprocess code also DOESN'T hang

    Linux - Python 3.8.5 - thread example HANGs
    Linux - Python 3.8.5 - My subprocess code DOESN'T hang

    Windows - Python 3.9.1 - thread example HANGs
    Windows - Python 3.9.1 - subprocess code HANGs
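
A matrix like this can be collected without a stuck terminal by running the reproducer in a child interpreter under a timeout. This is a rough sketch: the `classify` helper, its three result labels, and the 5-second timeout are all hypothetical choices made for illustration.

```python
import subprocess
import sys

# eryksun's reproducer from this thread, run in a child interpreter.
REPRODUCER = """
import threading

class C:
    def __del__(self):
        t = threading.Thread(target=print, args=('spam',), daemon=True)
        t.start()

c = C()
"""


def classify(code, timeout=5.0):
    """Run `code` in a fresh interpreter and report how it ended."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # run() kills the child before re-raising, so nothing is left behind.
        return "hang"
    if "RuntimeError" in result.stderr:
        return "runtime-error"
    return "clean"


if __name__ == "__main__":
    # Prints the interpreter version next to the observed outcome.
    print(sys.version.split()[0], classify(REPRODUCER))
```

Running the same file under each installed interpreter gives one row per version. A "hang" result only means the child did not exit within the timeout, so a slow machine may need a larger value.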

    @eryksun
    Contributor

    eryksun commented Apr 10, 2021

    my subprocess code doesn't seem to hang on Linux where the
    thread example code does?

    Reader threads for stdout and stderr are only used in Windows, since there's no equivalent to select/poll for synchronous pipes in Windows.

    Without using threads, I/O with synchronous pipes requires a busy loop. It has to poll PeekNamedPipe() to get the number of bytes available to read without blocking. For stdin, on the other hand, the Windows API does not allow getting the WriteQuotaAvailable from the PipeLocalInformation [1]. Without knowing how much can be written without blocking, the input pipe would have to be made non-blocking, which in turn requires an inner loop that tries to write a successively smaller chunk size to stdin until either it succeeds or the size is reduced to 0.
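
The inner write loop described above can be sketched in isolation. This is only an illustration of the control flow: `try_write` is a hypothetical callable that raises BlockingIOError when a chunk does not fit, `FakePipe` is a toy stand-in rather than a Windows pipe, and halving the chunk size is an assumed shrinking strategy.

```python
def write_shrinking(try_write, data):
    """Offer successively smaller prefixes of data to a non-blocking writer.

    try_write(chunk) is a hypothetical callable that returns the number of
    bytes accepted, or raises BlockingIOError when the chunk does not fit.
    Returns the number of bytes written (0 if nothing fit).
    """
    size = len(data)
    while size > 0:
        try:
            return try_write(data[:size])
        except BlockingIOError:
            size //= 2  # assumed strategy: retry with half the chunk
    return 0


class FakePipe:
    """Toy stand-in for a pipe with a fixed amount of free buffer space."""

    def __init__(self, free):
        self.free = free
        self.buffer = b""

    def try_write(self, chunk):
        if len(chunk) > self.free:
            raise BlockingIOError("chunk does not fit")
        self.buffer += chunk
        self.free -= len(chunk)
        return len(chunk)


pipe = FakePipe(free=5)
written = write_shrinking(pipe.try_write, b"0123456789abcdef")
print(written, pipe.buffer)  # 4 b'0123'
```

With 5 bytes free, the 16-byte and 8-byte attempts are rejected and the 4-byte attempt succeeds. The caller would still need an outer loop to send the remainder, which is the busy loop the paragraph above describes.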

    If we wanted to change communicate() in Windows to not use threads and not require a busy loop, we'd have to switch to using named pipes opened in asynchronous (overlapped) mode on our side of each pipe. But that's an integration problem. For normal use as proc.stdout, etc, we would need an adapter or a new raw file type that implements a synchronous interface by waiting for I/O completion and maintaining its own file pointer. Such files would return the Windows file handle for fileno(), like sockets do, since a C file descriptor requires a synchronous-mode file.

    ---

    [1] https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/ns-ntifs-_file_pipe_local_information

    @sylikc

    sylikc commented Feb 7, 2023

    Hi, just curious, how far down in the backlog is this issue, and how likely is it to get fixed? It still exhibits this behavior on the most recent releases.

    @arhadthedev
    Member

    Reader threads for stdout and stderr are only used in Windows, since there's no equivalent to select/poll for synchronous pipes in Windows.

    Adding the OS-windows tag to notify those who can design around the race between a worker thread and the interpreter shutdown routine. Perhaps a critical section or mutex is warranted? Also tagging expert-multiprocessing for expertise on possible deadlocks here.

    @zooba
    Member

    zooba commented Feb 8, 2023

    It's not Windows-specific, apart from the fact that subprocess.Popen.communicate on Windows relies on threads and on Linux it doesn't.

    A threading/state expert is needed to figure out the best way to cause thread creation to fail during finalization (either object or interpreter - I'm not sure which). And it's likely that this will still leave the original code broken, just more quickly and with an unhandled exception.

    @zooba zooba removed the OS-windows label Feb 8, 2023
    @sylikc

    sylikc commented Feb 8, 2023

    🙄 Looks like this is a truly complex issue.

    Anyway, I think an unhandled exception is preferable to hanging the interpreter. At least that way control flow can continue: an external caller running a Python script doesn't wait indefinitely, and can instead read an error code when the interpreter fails.

    @gpshead
    Member

    gpshead commented May 26, 2023

    A threading/state expert is needed to figure out the best way to cause thread creation to fail during finalization (either object or interpreter - I'm not sure which). And it's likely that this will still leave the original code broken, just more quickly and with an unhandled exception.

    #104826 from #104690 may be of interest...

    That was focusing on the atexit finalizer case. I'm not sure if it does enough to raise an exception for the case outlined in this issue but it's at least a step in that direction.

    @sylikc

    sylikc commented Jun 2, 2023

    Interesting issues you referred to. If PR #104826 goes through, I'd be curious whether it would be a partial fix... possibly.

    After all, it touches static PyObject * thread_PyThread_start_new_thread(PyObject *self, PyObject *fargs) and catches interpreter shutdown... I've never looked into the Python source code, but it looks close to my issue 😂

    @sylikc

    sylikc commented Nov 13, 2023

    So, I tested the code chunk from #87950 (comment) on CPython 3.12.0 on Windows x64, and it definitely doesn't hang anymore.

        import threading
    
        class C:
            def __del__(self):
                t = threading.Thread(target=print, args=('spam',), daemon=True)
                t.start()
    
        c = C()
        #del c # uncomment to prevent hanging

    Instead, it reports an ignored exception:

    Exception ignored in: <function C.__del__ at 0x000002953146E3E0>
    Traceback (most recent call last):
      File "C:\TEMP\test.py", line 6, in __del__
      File "C:\Python312\Lib\threading.py", line 971, in start
    RuntimeError: can't create new thread at interpreter shutdown
    

    Which is much better. I can now trap this error and do additional processing.
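
Trapping the error could look roughly like the sketch below. The `start_or_fallback` helper and the synchronous fallback are hypothetical; the only behavior taken from the output above is that Thread.start() raised RuntimeError at shutdown on CPython 3.12.0.

```python
import threading


def start_or_fallback(thread, fallback):
    """Start thread; if start() raises RuntimeError, call fallback() instead.

    Returns True when the thread was started, False when fallback ran.
    """
    try:
        thread.start()
    except RuntimeError:
        # CPython 3.12.0 raised "can't create new thread at interpreter
        # shutdown" here in the test above; older versions may hang instead.
        # Note: start() also raises RuntimeError if called twice on a thread.
        fallback()
        return False
    return True


class C:
    def __del__(self):
        t = threading.Thread(target=print, args=("spam",), daemon=True)
        # Hypothetical fallback: do the work synchronously in this thread.
        start_or_fallback(t, lambda: print("spam (no thread)"))
```

On versions that still hang rather than raise, this does not help, so the except branch should be treated as a 3.12+ safety net and not as a general fix.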

    Is it possible to backport this to older versions, or is this issue considered resolved but only for v3.12.0+?

    @gpshead
    Member

    gpshead commented Nov 13, 2023

    The details on this kind of change are often quite a challenge to get right. We're not likely to backport any changes that may have helped fix this to 3.11.x (which is going to move to security fix only status soon anyways).

    (keeping open for now as this being fixed deserves further verification.)
