
foo.bar.tasks.my_task fails with "No module named bar.tasks" when running in AWS / Elastic Beanstalk #105

Closed
kdmukai opened this issue Oct 29, 2015 · 5 comments


kdmukai commented Oct 29, 2015

So now I've got qcluster happily running under supervisord both in local dev and on Elastic Beanstalk (EB). Local testing works great.

I have a simple test task:

def task_test(user):
    logger.debug("Hello, from the task!!")

And I can make an async call on it in local dev:

async('myapp.member.tasks.task_test', request.user)

And it runs fine:

22:15:51 [Q] INFO Process-1:1 processing [colorado-eleven-lima-skylark]
2015-10-28 22:15:51,112 DEBUG    myapp.member.tasks:task_test(9): Hello, from the task!!
22:15:51 [Q] INFO Processed [colorado-eleven-lima-skylark]

But up on Elastic Beanstalk something strange is going on with traversing the app structure:

21:23:04 [Q] INFO Process-1:2 processing [potato-mars-connecticut-ink]
21:23:04 [Q] ERROR Failed [potato-mars-connecticut-ink] - No module named member.tasks

I also tried passing the function directly:

from myapp.member.tasks import task_test
async(task_test, request.user)

But I end up with a similar error:

22:09:07 [Q] INFO Process-1:10 pushing tasks at 3695
Process Process-1:10:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/python/run/venv/local/lib/python2.7/site-packages/django_q/cluster.py", line 300, in pusher
    task = signing.SignedPackage.loads(task[1])
  File "/opt/python/run/venv/local/lib/python2.7/site-packages/django_q/signing.py", line 31, in loads
    serializer=PickleSerializer)
  File "/opt/python/run/venv/local/lib/python2.7/site-packages/django/core/signing.py", line 145, in loads
    return serializer().loads(data)
  File "/opt/python/run/venv/local/lib/python2.7/site-packages/django_q/signing.py", line 44, in loads
    return pickle.loads(data)
ImportError: No module named member.tasks

Same problem if I do a relative import:

from .tasks import task_test
async(task_test, request.user)

The EB supervisord.conf is straightforward:

[program:qcluster]
command=/opt/python/run/venv/bin/python manage.py qcluster
numprocs=1
directory=/opt/python/current/app/myapp
environment=$djangoenv

($djangoenv injects the environment variables elsewhere in the deploy script; it didn't make a difference with or without them):

# Convert the exported vars in /opt/python/current/env into supervisord's comma-separated environment= format.
djangoenv=`cat /opt/python/current/env | tr '\n' ',' | sed 's/export //g' | sed 's/$PATH/%(ENV_PATH)s/g' | sed 's/$PYTHONPATH//g' | sed 's/$LD_LIBRARY_PATH//g'`
# Strip the trailing comma.
djangoenv=${djangoenv%?}

I also tried SSHing into EB and doing a manual async call through the manage.py shell, but saw the same errors in the qcluster logs.
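
For reference, the manual test looked roughly like this (the venv path and the user lookup are specific to my setup):

$ /opt/python/run/venv/bin/python manage.py shell
>>> from django_q.tasks import async
>>> from django.contrib.auth.models import User
>>> async('myapp.member.tasks.task_test', User.objects.first())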

Nothing crazy about the project structure:

approot
|
+---myapp
|    +---member
|    |    +---__init__.py
|    |    +---tasks.py
|    |    +---urls.py
|    |    +---views.py
|    +---__init__.py
|    +---admin.py
|    +---forms.py
|    +---models.py
|    +---settings.py
|    +---urls.py
|    +---views.py
|    +---wsgi.py
+---manage.py

Running on Python 2.7.9.

This seems most likely to be something with the EB environment, but let me know if you have any ideas. I'm all out at this point!


kdmukai commented Oct 29, 2015

Did another test: I tried moving the tasks.py up to the app root so the call is now:

async('myapp.tasks.test_task', request.user)

And the error message:

00:04:42 [Q] INFO Process-1:2 processing [snake-nineteen-artist-berlin]
00:04:42 [Q] ERROR Failed [snake-nineteen-artist-berlin] - No module named tasks

Also tried moving the task function into views.py:

async('myapp.views.test_task', request.user)

Same problem:

00:15:22 [Q] INFO Process-1:2 processing [nevada-princess-friend-ack]
00:15:22 [Q] ERROR Failed [nevada-princess-friend-ack] - No module named views

I'm stumped.


Koed00 commented Oct 29, 2015

You will need exactly the same environment variables as you have on your web server, most importantly the DJANGO_SETTINGS_MODULE=MyProject.settings part.

You could try running python manage.py check to see if Django gives you any errors.
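
For example, something along these lines in the supervisord program block would pin it (the settings module and PYTHONPATH values below are placeholders; substitute your own project's):

[program:qcluster]
command=/opt/python/run/venv/bin/python manage.py qcluster
directory=/opt/python/current/app/myapp
; Pin the same settings module and import root the web server uses
; (placeholder values; use your project's own).
environment=DJANGO_SETTINGS_MODULE="myapp.settings",PYTHONPATH="/opt/python/current/app"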

kdmukai pushed a commit to kdmukai/django-q that referenced this issue Oct 29, 2015
Explicitly copies the environment from the Cluster into the Sentinel
and all worker processes.

Fixes Koed00#105

Koed00 commented Oct 30, 2015

I'm not convinced yet about this pull request. It feels like quite a big change for something that might not even be a real issue. I administer several Heroku deployments and Digital Ocean, Docker, Docker Compose, and Amazon ECS setups with multiple web instances and redundant worker clusters, and I've never seen this import problem before. If we spend a little time on it, we can probably fix your problem without a pull request. The reasoning is that the environment does not need to be copied to the individual worker processes: each is a complete fork of the spawning process and even uses the same memory space, so they run in the exact same environment.
If the imports are failing, it is because the environment at the root of the cluster has not been set up properly, not because it is not propagated to the child processes.
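
As a minimal illustration (not django-q code, just the standard library): a worker process started with multiprocessing sees whatever environment the parent had at that point, without anything being copied explicitly:

import os
from multiprocessing import Process

def worker():
    # The child process inherits whatever the parent set before it was started.
    print("worker sees:", os.environ.get("DJANGO_SETTINGS_MODULE"))

if __name__ == "__main__":
    os.environ["DJANGO_SETTINGS_MODULE"] = "myapp.settings"
    p = Process(target=worker)
    p.start()
    p.join()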


kdmukai commented Oct 30, 2015

Yes, you were right. My fix was not actually the solution and the pull request should be deleted. But the good news is that I now have a workaround!

Here's a better understanding of what seems to be happening:

  • I deploy an update to Elastic Beanstalk.
  • It seems to load everything in an isolated environment (this makes sense; the server is still actively serving the previous code).
  • As part of the loading process, supervisorctl re-creates the qcluster. However, the qcluster initializes itself against the LIVE app environment, not the new isolated environment.
  • When the new environment is ready, the existing environment is destroyed and the new one takes over.
  • At this point the qcluster seems to lose its ability to reference my app and I get the "No module named" errors.

So, weird as it sounds, when the original environment is destroyed, qcluster acts as if it can no longer find my app, even though the new app is now serving exactly where the old one used to be! Perhaps it's some internal EB symlinking that's leaving the qcluster pointing to the old code that has since been deleted.

I've confirmed that if I manually kill the qcluster after the new code is deployed, everything works as expected when it respawns. This is how I inadvertently tricked myself into thinking that my os.environ changes had helped; it wasn't my changes, it was all the manual killing/respawning I was doing during testing.

My workaround:

  • In EB's post-deploy hook (the new code is now LIVE), I kill the qcluster and let supervisord regenerate it. Because the new code is now the live version, the qcluster will properly point to it and it's able to complete tasks as expected.
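
Roughly what the hook looks like (the file name and paths are from my setup and may differ across EB platform versions):

#!/usr/bin/env bash
# e.g. /opt/elasticbeanstalk/hooks/appdeploy/post/99_restart_qcluster.sh
# At this point the new code is live, so the respawned qcluster picks it up.
supervisorctl -c /opt/python/etc/supervisord.conf restart qcluster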


Koed00 commented Oct 31, 2015

I've been reading up on supervisor, because I've mostly been using Mozilla's Circus. I think your thoughts on the situation are correct. According to supervisor's docs, it doesn't like daemonizing or forking processes. It actually modifies the environment with some of its own settings, and I'm hypothesizing this might be happening in a way that leads to the problems you're seeing.

Btw, you always need to restart the clusters after deploying new code, because you might be trying to queue tasks that don't exist yet in the cluster's copy of your Django project. I've just never seen the environment completely disappear. Good stuff to learn about, though.

kdmukai closed this as completed Dec 7, 2015