ENH: fix Python multi-processing hang on unix #2649

Merged: 1 commit, Jul 20, 2021

@dzenanz dzenanz commented Jul 15, 2021

Closes #2069.

Initial (old) description:

This turns a hang into:

(pyEnv) dzenan@corista:~/nn_work_dir$ python itkMultiProcessingMONAI.py 
main pid: 2578395 140116265719616
Resampling image_0.nii pid: 2578395 140116265719616...
Resampling image_0.nii pid: 2578404 140116265719616...
Resampling image_1.nii pid: 2578404 140116265719616...
In ThreadPool::atfork_prepareIn ThreadPool::atfork_resumeTraceback (most recent call last):
  File "itkMultiProcessingMONAI.py", line 44, in <module>
    p.map(f, paths)
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 771, in get
    raise self._value
multiprocessing.pool.MaybeEncodingError: Error sending result: '[<itk.itkImagePython.itkImageUC3; proxy of <Swig Object of type 'itkImageUC3 *' at 0x7f6ee94955a0> >]'. Reason: 'TypeError("cannot pickle 'SwigPyObject' object")'
(pyEnv) dzenan@corista:~/nn_work_dir$

@github-actions bot added the area:Core and type:Enhancement labels Jul 15, 2021
@blowekamp

Am I understanding correctly that with this PR, when a fork occurs, all the threads are terminated in the parent, and new threads are created in both processes? I would expect the parent to keep its threads, and the child process to "abandon" the parent's threads and create new ones.

dzenanz commented Jul 16, 2021

> parent keeps its threads and the child process "abandons" the parent's threads

That might be possible, but it would take more experimentation and effort. The only thing I know of that forks with ITK is Python multiprocessing, and I am trying to address the hang in that situation with minimal effort. Stopping and cleaning up the thread pool before the fork greatly simplifies the reasoning.
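
For illustration only, here is a minimal Python sketch of the stop-before-fork idea described above. It is not the C++ change in this PR: `ToyThreadPool` and `fork_and_use_pool` are hypothetical names, and Python's `os.register_at_fork` (POSIX only, Python 3.7+) is assumed as a stand-in for the native fork hooks. The `before` hook roughly corresponds to what the log above prints as `atfork_prepare`, and the two `after_*` hooks to `atfork_resume`.

```python
import os
import queue
import threading


class ToyThreadPool:
    """Hypothetical stand-in for a native thread pool (not ITK's API)."""

    def __init__(self, size=2):
        self.size = size
        self.tasks = None
        self.workers = []
        self.start()

    def start(self):
        # Build a fresh queue and fresh threads, so nothing is inherited
        # from before the fork in a half-used state.
        self.tasks = queue.Queue()
        self.workers = [
            threading.Thread(target=self._run, args=(self.tasks,), daemon=True)
            for _ in range(self.size)
        ]
        for worker in self.workers:
            worker.start()

    @staticmethod
    def _run(tasks):
        while True:
            job = tasks.get()
            if job is None:  # sentinel: shut this worker down
                return
            job()

    def stop(self):
        # One sentinel per worker, then wait until every worker has exited.
        for _ in self.workers:
            self.tasks.put(None)
        for worker in self.workers:
            worker.join()
        self.workers = []

    def submit_and_wait(self, fn, timeout=5.0):
        done = threading.Event()
        result = []

        def job():
            result.append(fn())
            done.set()

        self.tasks.put(job)
        if not done.wait(timeout):
            raise TimeoutError("no worker picked up the job")
        return result[0]


pool = ToyThreadPool()

# Stop the pool before every fork, then restart it in both processes.
os.register_at_fork(
    before=pool.stop,
    after_in_parent=pool.start,
    after_in_child=pool.start,
)


def fork_and_use_pool():
    """Fork, use the pool in the child, and report whether the child succeeded."""
    pid = os.fork()
    if pid == 0:  # child process
        code = 1
        try:
            if pool.submit_and_wait(lambda: 6 * 7) == 42:
                code = 0
        finally:
            os._exit(code)  # never fall through into the parent's code path
    _, status = os.waitpid(pid, 0)
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0


if __name__ == "__main__":
    print("child used the pool:", fork_and_use_pool())
    print("parent still works:", pool.submit_and_wait(lambda: 1 + 1) == 2)
```

Because no worker thread exists at the moment of the fork, the child never inherits a queue lock held by a thread that was not copied into it. The cost is the one raised in the question above: the parent's threads are also torn down and recreated on every fork.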

dzenanz commented Jul 16, 2021

I guess that #1948 is a blocker for this. I am waiting for it to be done before continuing work on this PR.

@blowekamp

@dzenanz I think if you take the test code from #2069, and just write the image instead of returning it, the pool issue should still be reproduced. This would separate this issue from #1948, to move part of it forward.

dzenanz commented Jul 16, 2021

@blowekamp you are right! Changing `return` to `imwrite` avoids pickling, which allows me to verify that this fix works. I cleaned it up as per Hans' and Matt's suggestions. Please re-review.
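
A small Python sketch of why writing the result instead of returning it sidesteps the pickling error, under stated assumptions: `FakeImage`, `resample_and_return`, `resample_and_write`, and `can_be_sent_between_processes` are hypothetical names, and a `threading.Lock` attribute stands in for the unpicklable SWIG object from the traceback. The sketch calls `pickle.dumps` directly, which is the step a `multiprocessing` pool performs when sending a worker's result back to the parent.

```python
import os
import pickle
import tempfile
import threading


class FakeImage:
    """Hypothetical stand-in for a wrapped native image that cannot be pickled."""

    def __init__(self, data):
        self.data = data
        self._native_handle = threading.Lock()  # lock objects cannot be pickled


def can_be_sent_between_processes(obj):
    """Return True if obj survives the pickling step used to ship results."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError):
        return False


def resample_and_return(path):
    # Returning the image forces the pool to pickle it, which fails.
    return FakeImage(b"pixels of " + path.encode())


def resample_and_write(path, out_dir):
    # Writing the image in the worker means only a plain string travels back.
    image = resample_and_return(path)
    out_path = os.path.join(out_dir, "resampled_" + os.path.basename(path))
    with open(out_path, "wb") as f:
        f.write(image.data)
    return out_path


if __name__ == "__main__":
    print(can_be_sent_between_processes(resample_and_return("image_0.nii")))
    with tempfile.TemporaryDirectory() as out_dir:
        print(can_be_sent_between_processes(resample_and_write("image_0.nii", out_dir)))
```

With this shape, a worker passed to `Pool.map` returns only file paths, so the thread pool behavior across the fork can be tested independently of the pickling problem.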

@dzenanz dzenanz marked this pull request as ready for review July 16, 2021 20:03
@dzenanz dzenanz requested a review from hjmjohnson July 16, 2021 20:03
@thewtex thewtex left a comment

🎉

@dzenanz dzenanz merged commit 597a6df into InsightSoftwareConsortium:master Jul 20, 2021
@dzenanz dzenanz deleted the atforkUnix branch July 20, 2021 18:04
thewtex commented Aug 6, 2021

Applied to the release branch

Linked issue: ITK Python Pool Threader deadlocks with multiprocess module