Execution hangs in module level map call when pool is built in a submodule and submodule is imported in module's __init__ #19

Open
grayfall opened this issue May 20, 2016 · 28 comments

@grayfall

Suppose we've got a project:

test/
     test.py
     pkg/
         __init__.py
         lib/
             __init__.py
             workers.py
         test/
             __init__.py
             workers.py

In pkg/lib/workers.py we have:

import multiprocess.pool as mp


class Test:
    def __init__(self, f):
        self.f = f

    def method(self, data):
        return workers.map(self.f, data)


if __name__ == "__main__":
    raise RuntimeError
else:
    workers = mp.Pool(processes=2)

In pkg/test/workers.py:

from ..lib.workers import *


print(Test(max).method([[1,2,3], [1,2,3]]))

In test.py:

print("Hello")

import pkg.test.workers

print("Goobye")

When I run test.py I get:

$ python test.py 
Hello
[3, 3]
Goobye

If I change the second line of code in pkg/test/workers.py from print(Test(max).method([[1,2,3], [1,2,3]])) to print(Test(lambda x: max(x)).method([[1,2,3], [1,2,3]])), I get

$ python test.py 
Hello

And the process freezes. Nothing happens for hours. No errors, no messages.

P.S.

This is a stripped down version of my real project, where I use the pool of workers inside a bound method of an instance and pass an attribute to the pool as a function to use. I believe this example reproduces the exact same problem.

P.P.S.

I've also asked the question on Stack Overflow (link)

@mmckerns mmckerns changed the title Execution hangs when an unpickleable object is passed to a Pool. Execution hangs when map is called in a different file than where the pool is built Jun 2, 2016
@mmckerns
Member

mmckerns commented Jun 2, 2016

I'm not sure what the issue is… I've not seen this before. So here are some tests...

@mmckerns
Member

mmckerns commented Jun 2, 2016

# trial1.py

import multiprocess
workers = multiprocess.Pool(2)
x = (i for i in range(4))
y = workers.map(lambda x:x, [x]*2)
print(y)
workers.close()
workers.join()

This has an unpicklable object (the generator), and it fails by throwing an error (i.e., it doesn't hang).
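As a quick check that the generator is the part that cannot be serialized, here is a minimal sketch using the standard-library pickle module. It is only a stand-in: multiprocess serializes with dill, which is assumed here to reject generators in the same way. The helper name can_pickle is hypothetical.

```python
import pickle


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
    except (TypeError, pickle.PicklingError):
        return False
    return True


gen = (i for i in range(4))   # same shape as x in trial1.py
lst = [i for i in range(4)]   # a plain list holding the same values

print(can_pickle(gen))
print(can_pickle(lst))
```

The generator is refused while the list is accepted, which matches the error (rather than a hang) reported for trial1.py.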

@mmckerns
Member

mmckerns commented Jun 2, 2016

# trial2.py

import multiprocess
workers = multiprocess.Pool(2)

class Test:
    def __init__(self, f):
        self.f = f
    def method(self, data):
        return workers.map(self.f, data)

x = (i for i in range(4))
y = Test(lambda x:x).method([x]*2)
print(y)
workers.close()
workers.join()

This is basically your code, all in a single file. It throws an error due to the generator, but doesn't hang.

@mmckerns
Member

mmckerns commented Jun 2, 2016

# trial3.py

If I set x to a list comprehension, it works just fine.

@mmckerns
Member

mmckerns commented Jun 2, 2016

Now, replicating what you have, but in a reduced form…

# trial5.py

from trial.trial4 import Test

x = [i for i in range(4)]
y = Test(lambda x:x).method([x]*2)
#y = Test(max).method([x]*2)
print(y)

where:

# trial4.py

import multiprocess
workers = multiprocess.Pool(2)

class Test:
    def __init__(self, f):
        self.f = f
    def method(self, data):
        return workers.map(self.f, data)

if __name__ == '__main__':
    x = [i for i in range(4)]
    y = Test(lambda x:x).method([x]*2)
    print(y)

You can infer the package structure from the import. trial/trial4.py and trial5.py are the two relevant files.

Running trial4.py as main works. Running trial5.py as main hangs for the lambda, but succeeds for max (commented out). Actually, let me clarify: max works in python3.5, but hangs in python2.7.

@mmckerns
Member

mmckerns commented Jun 2, 2016

Going to the extreme case:

# trial7.py

from trial.trial6 import workers

x = [i for i in range(4)]
y = workers.map(lambda x:x, [x]*2)
#y = workers.map(max, [x]*2)
print(y)

where:

# trial6.py

import multiprocess
workers = multiprocess.Pool(2)

Then trial7.py also fails similarly to trial5.py.

So it has nothing to do with the class Test, and everything to do with using the pool across an import.

@mmckerns
Member

mmckerns commented Jun 2, 2016

Now, going back to using a Test class:

# trial9.py

from trial.trial8 import Test
import multiprocess
workers = multiprocess.Pool(2)

x = [i for i in range(4)]
y = Test(lambda x:x, workers).method([x]*2)
#y = Test(max, workers).method([x]*2)
print(y)
workers.close()
workers.join()

where:

# trial8.py

class Test:
    def __init__(self, f, workers):
        self.f = f
        self.workers = workers
    def method(self, data):
        return self.workers.map(self.f, data)

if __name__ == '__main__':
    import multiprocess
    workers = multiprocess.Pool(2)
    x = [i for i in range(4)]
    y = Test(lambda x:x, workers).method([x]*2)
    print(y)
    workers.close()
    workers.join()

Then trial9.py also fails similarly to trial7.py.

@mmckerns
Member

mmckerns commented Jun 2, 2016

If I edit the class Test to take a map instead of the pool, then it has exactly the same behavior.

@mmckerns
Member

mmckerns commented Jun 2, 2016

So… I think I might have the error behavior well understood, but I still have no idea what would be causing it.

@mmckerns
Member

mmckerns commented Jun 2, 2016

It might be related to this issue: uqfoundation/dill#128, or I think less likely to this issue: uqfoundation/dill#56.

@mmckerns
Member

mmckerns commented Jun 2, 2016

Turning on dill.detect.trace points to the failure to look up the lambda function. So it might just be an issue with lambda.

# trial10.py

import multiprocess
workers = multiprocess.Pool(2)
import dill
dill.detect.trace(True)

x = [i for i in range(4)]
y = workers.map(lambda x:x, [x]*2)
#y = workers.map(max, [x]*2)
print(y)

This produces:

dude@hilbert>$ python trial10.py 
F2: <function mapstar at 0x10c0abe18>
# F2
F1: <function <lambda> at 0x10c6b6488>
F2: <function _create_function at 0x10c642510>
# F2
Co: <code object <lambda> at 0x10be08db0, file "trial10.py", line 7>
T1: <class 'code'>
F2: <function _load_type at 0x10c642400>
# F2
# T1
# Co
D3: <dict object at 0x10bdcddc8>
# D3
D2: <dict object at 0x10c6bd3c8>
# D2
# F1
D2: <dict object at 0x10be324c8>
# D2
F2: <function mapstar at 0x10c0abe18>
# F2
F1: <function <lambda> at 0x10c6b6488>
F2: <function _create_function at 0x10c642510>
# F2
Co: <code object <lambda> at 0x10be08db0, file "trial10.py", line 7>
T1: <class 'code'>
F2: <function _load_type at 0x10c642400>
# F2
# T1
# Co
D3: <dict object at 0x10bdcddc8>
# D3
D2: <dict object at 0x10c6bd3c8>
# D2
# F1
D2: <dict object at 0x10c6bd2c8>
# D2
[[0, 1, 2, 3], [0, 1, 2, 3]]

While adding the same dill.detect.trace(True) to trial7.py yields:

dude@hilbert>$ python trial7.py 
F2: <function mapstar at 0x106c96048>
# F2

… and then hangs on the lookup of the lambda.

@mmckerns
Member

mmckerns commented Jun 2, 2016

Hmm… both the use of a function and a class method also fail as before.

# trial11.py

from trial.trial6 import workers
import dill
dill.detect.trace(True)

class Foo(object):
  def doit(self, x):
    return x

def bar(x):
  return x

x = [i for i in range(4)]
y = workers.map(bar, [x]*2)
#y = workers.map(Foo().doit, [x]*2)
#y = workers.map(lambda x:x, [x]*2)
#y = workers.map(max, [x]*2)
print(y)

@mmckerns
Member

mmckerns commented Jun 3, 2016

Modifying to move the functions to pickle into the imported module makes a difference…

# trial13.py

from trial.trial12 import workers, Foo, bar, lamb
import dill
dill.detect.trace(True)

x = [i for i in range(4)]
y = workers.map(bar, [x]*2)
#y = workers.map(Foo().doit, [x]*2)
#y = workers.map(lamb, [x]*2)
#y = workers.map(max, [x]*2)
print(y)

where:

# trial12.py

import multiprocess
workers = multiprocess.Pool(2)

class Foo(object):
  def doit(self, x):
    return x

def bar(x):
  return x

lamb = lambda x:x

Results are as follows:

For bar:

dude@hilbert>$ python trial13.py 
F2: <function mapstar at 0x1031f2048>
# F2
F2: <function bar at 0x1037ea620>

then hangs.

For Foo.doit:

dude@hilbert>$ python trial13.py 
F2: <function mapstar at 0x105c51048>
# F2
Me: <bound method Foo.doit of <trial.trial12.Foo object at 0x105bdf4e0>>
T1: <class 'method'>
F2: <function _load_type at 0x1061da598>
# F2
# T1
F1: <function Foo.doit at 0x1062486a8>
F2: <function _create_function at 0x1061da6a8>
# F2
Co: <code object doit at 0x105be8030, file "/Users/mmckerns/dev/svn/pathos/multiprocess/tmp/test/trial/trial12.py", line 5>
T1: <class 'code'>
# T1
# Co

then hangs.

For lamb:

dude@hilbert>$ python trial13.py 
F2: <function mapstar at 0x10514c048>
# F2
F1: <function <lambda> at 0x105740730>
F2: <function _create_function at 0x1056d06a8>
# F2
Co: <code object <lambda> at 0x1050e31e0, file "/Users/mmckerns/dev/svn/pathos/multiprocess/tmp/test/trial/trial12.py", line 11>
T1: <class 'code'>
F2: <function _load_type at 0x1056d0598>
# F2
# T1
# Co

then hangs.

@mmckerns
Member

mmckerns commented Jun 3, 2016

So it appears that the failure occurs when the lookup of a function fails. Maybe coincidentally, when I hit Ctrl-C, the tracebacks ended with this:

  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pickle.py", line 1384, in find_class
    __import__(module, level=0)
  File "<frozen importlib._bootstrap>", line 205, in _lock_unlock_module
  File "<frozen importlib._bootstrap>", line 114, in acquire

Which seems relevant, but maybe it's coincidence.

I need to keep digging, but it might just be that the lookup of the function is happening in the wrong place.
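To make the find_class step in that traceback concrete, here is a minimal sketch using only the standard-library pickle module (not dill, whose internals are not shown here). A function pickled by reference is stored as a module name plus an attribute name, and unpickling has to import that module before it can fetch the attribute. The RecordingUnpickler class is a hypothetical helper, and json.dumps is just a convenient importable function.

```python
import io
import json
import pickle


class RecordingUnpickler(pickle.Unpickler):
    """Unpickler that records every (module, name) lookup it performs."""

    def __init__(self, file):
        super().__init__(file)
        self.lookups = []

    def find_class(self, module, name):
        self.lookups.append((module, name))
        # The default implementation imports `module`, then fetches `name`.
        return super().find_class(module, name)


payload = pickle.dumps(json.dumps)            # stored by reference
unpickler = RecordingUnpickler(io.BytesIO(payload))
restored = unpickler.load()

print(unpickler.lookups)
print(restored is json.dumps)
```

If that import step blocks or finds an incomplete module, the lookup cannot complete, which is consistent with the traceback ending inside the import machinery.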

@grayfall
Author

grayfall commented Jun 3, 2016

First of all, thank you for this awesome feedback.

I modified your trial13/trial12 test by moving the Pool initialisation down below the creation of that lambda.

# trial14.py

from trial.libtrial14 import workers, Foo, bar, lamb
import dill
dill.detect.trace(True)

x = [i for i in range(4)]
print([x]*2)
print(workers.map(bar, [x]*2))
print(workers.map(Foo().doit, [x]*2))
print(workers.map(lamb, [x]*2))
print(workers.map(max, [x]*2))

Where

# libtrial14.py

import multiprocess


class Foo(object):
  def doit(self, x):
    return x


def bar(x):
  return x


lamb = lambda x:x

workers = multiprocess.Pool(2)

Surprisingly, it works:

(py35) $ python trial14.py 
[[0, 1, 2, 3], [0, 1, 2, 3]]
F2: <function mapstar at 0x1014bf0d0>
# F2
F2: <function bar at 0x100797840>
# F2
D2: <dict object at 0x10079fdc8>
# D2
F2: <function mapstar at 0x1014bf0d0>
# F2
F2: <function bar at 0x100797840>
# F2
D2: <dict object at 0x10301a708>
# D2
[[0, 1, 2, 3], [0, 1, 2, 3]]
F2: <function mapstar at 0x1014bf0d0>
# F2
Me: <bound method Foo.doit of <trial.libtrial14.Foo object at 0x10115bc88>>
T1: <class 'method'>
F2: <function _load_type at 0x101792378>
# F2
# T1
F1: <function Foo.doit at 0x101429e18>
F2: <function _create_function at 0x101792488>
# F2
Co: <code object doit at 0x1007ae5d0, file "/Users/ilia/Desktop/test/trial/libtrial14.py", line 7>
T1: <class 'code'>
# T1
# Co
D4: <dict object at 0x101169708>
# D4
D2: <dict object at 0x10301ad08>
# D2
# F1
T4: <class 'trial.libtrial14.Foo'>
# T4
# Me
D2: <dict object at 0x10301a708>
# D2
F2: <function mapstar at 0x1014bf0d0>
# F2
Me: <bound method Foo.doit of <trial.libtrial14.Foo object at 0x10115bc88>>
T1: <class 'method'>
F2: <function _load_type at 0x101792378>
# F2
# T1
F1: <function Foo.doit at 0x101429e18>
F2: <function _create_function at 0x101792488>
# F2
Co: <code object doit at 0x1007ae5d0, file "/Users/ilia/Desktop/test/trial/libtrial14.py", line 7>
T1: <class 'code'>
# T1
# Co
D4: <dict object at 0x101169708>
# D4
D2: <dict object at 0x10301ad08>
# D2
# F1
T4: <class 'trial.libtrial14.Foo'>
# T4
# Me
D2: <dict object at 0x10301acc8>
# D2
[[0, 1, 2, 3], [0, 1, 2, 3]]
F2: <function mapstar at 0x1014bf0d0>
# F2
F1: <function <lambda> at 0x101429ea0>
F2: <function _create_function at 0x101792488>
# F2
Co: <code object <lambda> at 0x1007b4780, file "/Users/ilia/Desktop/test/trial/libtrial14.py", line 13>
T1: <class 'code'>
F2: <function _load_type at 0x101792378>
# F2
# T1
# Co
D4: <dict object at 0x101169708>
# D4
D2: <dict object at 0x10301ad48>
# D2
# F1
D2: <dict object at 0x10301acc8>
# D2
F2: <function mapstar at 0x1014bf0d0>
# F2
F1: <function <lambda> at 0x101429ea0>
F2: <function _create_function at 0x101792488>
# F2
Co: <code object <lambda> at 0x1007b4780, file "/Users/ilia/Desktop/test/trial/libtrial14.py", line 13>
T1: <class 'code'>
F2: <function _load_type at 0x101792378>
# F2
# T1
# Co
D4: <dict object at 0x101169708>
# D4
D2: <dict object at 0x10301ad48>
# D2
# F1
D2: <dict object at 0x1014340c8>
# D2
[[0, 1, 2, 3], [0, 1, 2, 3]]
F2: <function mapstar at 0x1014bf0d0>
# F2
B1: <built-in function max>
F2: <function _get_attr at 0x101792d08>
# F2
# B1
D2: <dict object at 0x1014340c8>
# D2
F2: <function mapstar at 0x1014bf0d0>
# F2
B1: <built-in function max>
F2: <function _get_attr at 0x101792d08>
# F2
# B1
D2: <dict object at 0x10079fdc8>
# D2
[3, 3]

@mmckerns mmckerns added the bug label Jun 3, 2016
@mmckerns
Member

mmckerns commented Jun 3, 2016

Wow. Seriously??? I can only theorize why that works at this point...

Does this let you work around the problem in your code? Knowing why the above works is probably important, and would enable me to figure out what would need to be changed and how -- but as long as there's a feasible workaround, then it's not as imperative to figure out.

@mmckerns
Member

mmckerns commented Jun 3, 2016

@matsjoyce: I know this isn't a dill ticket, but it's a serialization issue. Any thoughts here?

@mmckerns
Member

mmckerns commented Jun 3, 2016

I think I found the real underlying issue.

I was using a structure like this for my tests:

test/
     trial7.py
     trial/
         __init__.py
         trial6.py

And the pool was hanging… as noted above.

Then I tried your modification to my files and it still didn't work for me.

I realized, however, you'd probably used a blank __init__.py while I was importing the submodule (i.e. trial6) into the __init__.py. So, I deleted the import in __init__.py, and then it worked.

As a matter of fact, all the tests succeed when the __init__.py is blank.

So that's the real thing to watch out for. Don't import the submodule with the pool into the module's __init__.py. Why that makes a difference, I'm not sure yet, but it's probably just that it confuses the namespace lookup within pickle. So, I'm not certain, just right now, whether this is a dill issue or a multiprocess issue, but I'm thinking probably one for dill. We'll see.

@mmckerns mmckerns changed the title Execution hangs when map is called in a different file than where the pool is built Execution hangs when pool is built in a submodule and submodule is imported in module's __init__ Jun 3, 2016
@mmckerns mmckerns changed the title Execution hangs when pool is built in a submodule and submodule is imported in module's __init__ Execution hangs I'm module level map call when pool is built in a submodule and submodule is imported in module's __init__ Jun 3, 2016
@mmckerns mmckerns changed the title Execution hangs I'm module level map call when pool is built in a submodule and submodule is imported in module's __init__ Execution hangs in module level map call when pool is built in a submodule and submodule is imported in module's __init__ Jun 3, 2016
@matsjoyce

I can't get it to hang 😟 I have trial5.py at the top level and trial4.py inside a directory called trial. Is that right? It might be an OS thing (or I just set it up wrong), as the exception does look like a low-level locking thing. I'm on Arch Linux 4.4.5-1, dill 0.2b2.dev and multiprocess 0.70.4.

@mmckerns
Member

mmckerns commented Jun 3, 2016

That's right. Now if you have import trial4 inside trial/__init__.py, it should hang.

@matsjoyce

Works fine for me, except that I need from . import trial4 for Python 3. It looks like I'm using the same versions as you (3.5, 2.7), but I'm on Linux, and you're still on Windows?

@grayfall
Author

grayfall commented Jun 4, 2016

I'm on OS X and the execution hangs with blank __init__.py just as well if the Pool is initialised before the functions.

@mmckerns
Member

mmckerns commented Jun 4, 2016

Hmm… Above, I tested python 3.5.1 and 2.7.11, and both work without hanging, as long as the __init__.py is blank. My multiprocess and dill are the latest dev versions from github. I'm on OSX. I do have a Windows VM that I can test on, but it seems none of us are using Windows.

@matsjoyce

OK, I got it to hang (yay!). It seems to freeze in the serialisation thread while in _import_module:

Thread-2     <------------------------------------------ exit function save_int (/usr/lib/python2.7/pickle.py:443)
Thread-2     <---------------------------------------- exit function save (/usr/lib/python2.7/pickle.py:269)
Thread-2     ----------------------------------------> call function save (/usr/lib/python2.7/pickle.py:269)
Thread-2     ------------------------------------------> call function persistent_id (/usr/lib/python2.7/pickle.py:333)
Thread-2     <------------------------------------------ exit function persistent_id (/usr/lib/python2.7/pickle.py:333)
Thread-2     ------------------------------------------> call function get (/home/matthew/GitHub/dill/testing/dill/dill.py:386)
Thread-2     <------------------------------------------ exit function get (/home/matthew/GitHub/dill/testing/dill/dill.py:386)
Thread-2     ------------------------------------------> call function save_function (/home/matthew/GitHub/dill/testing/dill/dill.py:1278)
Thread-2     --------------------------------------------> call function _locate_function (/home/matthew/GitHub/dill/testing/dill/dill.py:768)
Thread-2     ----------------------------------------------> call function _import_module (/home/matthew/GitHub/dill/testing/dill/dill.py:754)
Thread-1     ------------------------------------------------> call function _maintain_pool (/home/matthew/GitHub/dill/testing/multiprocess/pool.py:226)
Thread-1     --------------------------------------------------> call function _join_exited_workers (/home/matthew/GitHub/dill/testing/multiprocess/pool.py:195)

As you can see, Thread-2 stops after the call to _import_module. From what I can gather, it freezes while trying to import multiprocess.pool using __import__, which causes some type of deadlock in the import mechanism (so maybe a python bug). I've only got it to freeze under 2.7, and not 3.5, though.
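For anyone trying to reproduce this on Python 3, a lighter way to see where each thread is stuck is the standard-library faulthandler module, which can dump every thread's stack after a delay without needing Ctrl-C. This is a sketch under that assumption. The two wrapper names are hypothetical, and faulthandler is not available on Python 2.7.

```python
import faulthandler
import sys


def dump_stacks_if_stuck(seconds, file=sys.stderr):
    """Arrange for all thread stacks to be written to `file` after `seconds`."""
    faulthandler.dump_traceback_later(seconds, repeat=False, file=file, exit=False)


def cancel_dump():
    """Call once the guarded code has finished normally."""
    faulthandler.cancel_dump_traceback_later()
```

A typical use would be to call dump_stacks_if_stuck(10) just before the workers.map(...) line and cancel_dump() right after it, so a dump appears only when the map call has not returned within ten seconds.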

@grayfall
Author

grayfall commented Jun 5, 2016

OK, here is my setup. I'm running Python 3.5.1 under OS X 10.11.5. multiprocess is installed via pip in a virtualenv environment. The directory structure is:

test/
       trial14.py
       trial15.py
       trial/
              __init__.py  # blank
              libtrial14.py
              libtrial15.py

where

# trial14.py
from trial.libtrial14 import workers, Foo, bar, lamb
# import dill
# dill.detect.trace(True)

x = [i for i in range(4)]
print([x]*2)
print(workers.map(bar, [x]*2))
print(workers.map(Foo().doit, [x]*2))
print(workers.map(lamb, [x]*2))
print(workers.map(max, [x]*2))
# trial15.py

from trial.libtrial15 import workers, Foo, bar, lamb
# import dill
# dill.detect.trace(True)

x = [i for i in range(4)]
print([x]*2)
print(workers.map(bar, [x]*2))
print(workers.map(Foo().doit, [x]*2))
print(workers.map(lamb, [x]*2))
print(workers.map(max, [x]*2))
# trial/libtrial14.py

import multiprocess

class Foo(object):
  def doit(self, x):
    return x

def bar(x):
  return x

lamb = lambda x:x

workers = multiprocess.Pool(2)
# trial/libtrial15.py

import multiprocess


workers = multiprocess.Pool(2)

class Foo(object):
  def doit(self, x):
    return x

def bar(x):
  return x

lamb = lambda x:x

As before, python trial14.py runs perfectly fine, while trial15.py hangs (nothing new here). I tried to trace the execution while redirecting the output with python -m trace --trace trial15.py > trial15_stdout.txt 2> trial15_stderr.txt. I kept that running for a while and then killed it. Here are the logs:
trial15_stderr.txt
trial15_stdout.txt

First of all, as seen in the stderr it fails to find bar (AttributeError: module 'trial.libtrial15' has no attribute 'bar'), but the error doesn't halt the execution for some reason. Then in stdout there is a block that keeps repeating before the process is killed:

pool.py(367):             time.sleep(0.1)
pool.py(365):         while thread._state == RUN or (pool._cache and thread._state != TERMINATE):
pool.py(366):             pool._maintain_pool()
 --- modulename: pool, funcname: _maintain_pool
pool.py(239):         if self._join_exited_workers():
 --- modulename: pool, funcname: _join_exited_workers
pool.py(208):         cleaned = False
pool.py(209):         for i in reversed(range(len(self._pool))):
pool.py(210):             worker = self._pool[i]
pool.py(211):             if worker.exitcode is not None:
 --- modulename: process, funcname: exitcode
process.py(177):         if self._popen is None:
process.py(179):         return self._popen.poll()
 --- modulename: popen_fork, funcname: poll
popen_fork.py(26):         if self.returncode is None:
popen_fork.py(27):             while True:
popen_fork.py(28):                 try:
popen_fork.py(29):                     pid, sts = os.waitpid(self.pid, flag)
popen_fork.py(35):                     break
popen_fork.py(36):             if pid == self.pid:
popen_fork.py(42):         return self.returncode
pool.py(209):         for i in reversed(range(len(self._pool))):
pool.py(210):             worker = self._pool[i]
pool.py(211):             if worker.exitcode is not None:
 --- modulename: process, funcname: exitcode
process.py(177):         if self._popen is None:
process.py(179):         return self._popen.poll()
 --- modulename: popen_fork, funcname: poll
popen_fork.py(26):         if self.returncode is None:
popen_fork.py(27):             while True:
popen_fork.py(28):                 try:
popen_fork.py(29):                     pid, sts = os.waitpid(self.pid, flag)
popen_fork.py(35):                     break
popen_fork.py(36):             if pid == self.pid:
popen_fork.py(42):         return self.returncode
pool.py(209):         for i in reversed(range(len(self._pool))):
pool.py(217):         return cleaned

My pip freeze output:

appnope==0.1.0
bcbio-gff==0.6.2
biom-format==2.1.5
biopython==1.66
bipython==0.1.2
blessings==1.6
bpython==0.15
click==6.6
Comparable==1.0
curtsies==0.2.6
decorator==4.0.9
dill==0.2.5
future==0.15.2
gnureadline==6.3.3
greenlet==0.4.9
h5py==2.6.0
ipykernel==4.3.1
ipython==4.1.2
ipython-genutils==0.1.0
ipywidgets==4.1.1
Jinja2==2.8
joblib==0.9.4
jsonschema==2.5.1
jupyter==1.0.0
jupyter-client==4.2.2
jupyter-console==4.1.1
jupyter-core==4.1.0
Lasagne==0.1
llvmlite==0.11.0
MarkupSafe==0.23
matplotlib==1.4.3
mistune==0.7.2
multiprocess==0.70.4
nbconvert==4.1.0
nbformat==4.0.1
nose==1.3.7
notebook==4.1.0
numba==0.26.0
numpy==1.11.0
pandas==0.18.0
path.py==8.1.2
permute==0.1a3
pexpect==4.0.1
pickleshare==0.6
ppft==1.6.4.6
ptyprocess==0.5.1
Pygments==2.1.3
pyparsing==2.1.4
python-dateutil==2.5.2
pytz==2016.3
pyzmq==15.2.0
qtconsole==4.2.1
requests==2.9.1
scikit-learn==0.17.1
scipy==0.17.0
simplegeneric==0.8.1
six==1.10.0
sklearn==0.0
terminado==0.6
Theano==0.8.1
tornado==4.3
traitlets==4.2.1
urwid==1.3.1
wcwidth==0.1.6

@matsjoyce

The repeated section is just the polling loop waiting for something to happen, which never does.

@mmckerns
Member

mmckerns commented Jun 6, 2016

@grayfall: the one process is just polling the other, and waiting for results. A timeout could be added, so if the timeout is met, an error is thrown, but that is not the root issue. @matsjoyce: It seems that 2.7 fails in more cases than 3.5, and sometimes with different errors.
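The timeout idea can be sketched with the existing map_async API, so that a stalled map raises instead of polling forever. This is an illustration and does not fix the root issue. The map_with_timeout helper is a hypothetical name, and ThreadPool stands in for the pool so the sketch runs without extra dependencies. A multiprocess.Pool is assumed to expose the same map_async(...).get(timeout=...) interface as the standard library.

```python
import multiprocessing
import time
from multiprocessing.pool import ThreadPool


def map_with_timeout(pool, func, data, timeout):
    """Like pool.map, but raise multiprocessing.TimeoutError after `timeout` seconds."""
    return pool.map_async(func, data).get(timeout=timeout)


pool = ThreadPool(2)  # stand-in for the real worker pool
try:
    print(map_with_timeout(pool, max, [[1, 2, 3], [1, 2, 3]], timeout=5))
    try:
        # A task that outlives the timeout, standing in for a stalled worker.
        map_with_timeout(pool, time.sleep, [0.5], timeout=0.05)
    except multiprocessing.TimeoutError:
        print("gave up waiting")
finally:
    pool.terminate()
```

The caller then gets an exception to handle or report, rather than a silent freeze.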

In the case of libtrial15 as @grayfall has coded above, you can see there are actually two issues going on.

  1. With the __init__.py blank, both 2.7 and 3.5 fail with an AttributeError trying to find bar. Could this be a side effect of the rule that multiprocessing is only supposed to be run from within the if __name__ == '__main__' block? Either way, as @grayfall points out, it's only looking above the creation of the Pool.
  2. When the __init__.py includes an import, 2.7 fails (hangs) as in issue #1 ("are the 4-5 errors in multiprocessing unit tests harmless?"), with the same error. However, 3.5 hangs on the import hook.

@mmckerns
Member

Revisiting this... it seems that in python 3.7+ (and on MacOS, and with current versions of multiprocess and dill) everything works, except for trial13, which still hangs.

To summarize:
This fails:

# trial/trial13.py
import multiprocess
workers = multiprocess.Pool(2)

class Foo(object):
  def doit(self, x):
    return x

def bar(x):
  return x

lamb = lambda x:x

while this succeeds:

# trial/trial14.py
import multiprocess

class Foo(object):
  def doit(self, x):
    return x
    
def bar(x):
  return x
  
lamb = lambda x:x

workers = multiprocess.Pool(2)

With main of:

from trial.trial13 import workers, bar

x = [i for i in range(4)]
y = workers.map(bar, [x]*2)

and similarly for trial.trial14 (however, the latter succeeds while the former fails).

No idea why the position of the Pool instance makes a difference, but apparently it does.
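One hypothesis, offered as a sketch rather than a confirmed diagnosis: code that runs at the top of a module sees only the names defined above it, so anything that captures the module's state at the line where the Pool is created (for example, worker processes started by fork) would not see bar if it is defined further down. The snippet below shows just that snapshot effect with a throwaway module. The module name early_pool_demo and the helper import_snapshot are hypothetical, and no pool is created.

```python
import importlib
import os
import sys
import tempfile
import textwrap

# Hypothetical module mirroring trial13: code runs before bar is defined.
SOURCE = textwrap.dedent("""
    import sys
    # Names visible at the point where the Pool would be created.
    snapshot = sorted(n for n in vars(sys.modules[__name__])
                      if not n.startswith('_'))

    def bar(x):
        return x
""")


def import_snapshot(name="early_pool_demo"):
    """Write SOURCE to a temporary directory, import it, return the module."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, name + ".py"), "w") as fh:
            fh.write(SOURCE)
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            return importlib.import_module(name)
        finally:
            sys.path.remove(tmp)


mod = import_snapshot()
print("bar" in mod.snapshot)   # state at the early line
print(hasattr(mod, "bar"))     # state after the import completed
```

This would be consistent with the AttributeError for bar reported earlier in the thread, though whether it is the actual mechanism behind the hang remains unverified.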
