Worker is closed for computation longer than 1 minute #228

Closed
Benjamin3381 opened this issue Mar 16, 2023 · 20 comments
Assignees
Labels
bug Something isn't working

Comments

@Benjamin3381

Benjamin3381 commented Mar 16, 2023

Hi... very cool app!

I've got a problem. It starts OK, but when I try to run my app (on Windows), the worker seems to lose its connection and the screen freezes with the message "Waiting for worker" below the Run button. The same problem occurs when I try to download a PDF from the demo dashboards.

@Benjamin3381 Benjamin3381 changed the title Waiting worker Waiting for worker Mar 16, 2023
@pplonski
Contributor

Hi @Benjamin3381,

Thank you for reporting the issue. I will need a few details from you to reproduce the problem. What operating system are you using? How did you install Mercury? Could you please run Mercury with the --verbose flag:

mercury run --verbose

# or run demo
mercury run demo --verbose

Please attach the verbose output. Thank you!

@Benjamin3381
Author

Thank you for your response. It's Windows 10. I installed it via pip install mercury at the command prompt.

Not sure if this part of the output is useful.

[image: screenshot of part of the verbose output]

@pplonski
Contributor

Thank you @Benjamin3381, could you please attach the output from a few lines above and from the beginning?

The worker is starting, but it is switched off because the ping/pong timed out. I don't know why yet ...

@olekniewiarowski

olekniewiarowski commented Mar 21, 2023

Hi, I have the same problem when trying to run the demo in Google Colab. I enabled custom widgets:

from google.colab import output
output.enable_custom_widget_manager()

and get the port as:

from google.colab.output import eval_js
print(eval_js("google.colab.kernel.proxyPort(8000)"))

I see a "waiting for worker" message. Hovering over the worker icon gives me a session id but "worker: unknown". The websocket is disconnected.

After a few minutes the runtime crashes and I have to reinstall Mercury.

@pplonski
Contributor

Hi @olekniewiarowski,

Thank you for reporting the issue. I haven't tested Mercury Server on Google Colab. However, Mercury Widgets should work in Colab.

I treat Colab as a place for developing notebooks. Notebook apps should then be served outside the notebook IDE (Colab in this case). What do you think? What is your use case?

We are working on the Mercury Cloud service. You will be able to set up a website with a custom URL address with a few clicks. Notebook deployment will be as easy as file upload. There will be a free plan for quick tests :)

@olekniewiarowski

olekniewiarowski commented Mar 23, 2023

Hi, we are testing different dashboarding solutions for our group and like the modularity of Mercury. I was hoping to be able to test it on Colab. I also tried installing Mercury locally via conda-forge, but I ran into errors when trying to run the demo. I'm on Windows 10.

(mercury) C:\Users\olek.niewiarowski>mercury run demo


     _ __ ___   ___ _ __ ___ _   _ _ __ _   _
    | '_ ` _ \ / _ \ '__/ __| | | | '__| | | |
    | | | | | |  __/ | | (__| |_| | |  | |_| |
    |_| |_| |_|\___|_|  \___|\__,_|_|   \__, |
                                         __/ |
                                        |___/

Traceback (most recent call last):
  File "C:\Users\olek.niewiarowski\Anaconda3_WINDOWS\envs\Mercury\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\olek.niewiarowski\Anaconda3_WINDOWS\envs\Mercury\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\olek.niewiarowski\Anaconda3_WINDOWS\envs\mercury\Scripts\mercury.exe\__main__.py", line 7, in <module>
  File "C:\Users\olek.niewiarowski\Anaconda3_WINDOWS\envs\Mercury\lib\site-packages\mercury\mercury.py", line 115, in main
    execute_from_command_line(["mercury.py", "migrate", "-v", 0])
  File "C:\Users\olek.niewiarowski\Anaconda3_WINDOWS\envs\Mercury\lib\site-packages\django\core\management\__init__.py", line 419, in execute_from_command_line
    utility.execute()
  File "C:\Users\olek.niewiarowski\Anaconda3_WINDOWS\envs\Mercury\lib\site-packages\django\core\management\__init__.py", line 395, in execute
    django.setup()
  File "C:\Users\olek.niewiarowski\Anaconda3_WINDOWS\envs\Mercury\lib\site-packages\django\__init__.py", line 24, in setup
    apps.populate(settings.INSTALLED_APPS)
  File "C:\Users\olek.niewiarowski\Anaconda3_WINDOWS\envs\Mercury\lib\site-packages\django\apps\registry.py", line 91, in populate
    app_config = AppConfig.create(entry)
  File "C:\Users\olek.niewiarowski\Anaconda3_WINDOWS\envs\Mercury\lib\site-packages\django\apps\config.py", line 124, in create
    mod = import_module(mod_path)
  File "C:\Users\olek.niewiarowski\Anaconda3_WINDOWS\envs\Mercury\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 850, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "C:\Users\olek.niewiarowski\Anaconda3_WINDOWS\envs\Mercury\lib\site-packages\daphne\apps.py", line 6, in <module>
    import daphne.server  # noqa: F401
  File "C:\Users\olek.niewiarowski\Anaconda3_WINDOWS\envs\Mercury\lib\site-packages\daphne\server.py", line 7, in <module>
    from twisted.internet import asyncioreactor  # isort:skip
  File "C:\Users\olek.niewiarowski\Anaconda3_WINDOWS\envs\Mercury\lib\site-packages\twisted\internet\asyncioreactor.py", line 19, in <module>
    from twisted.internet.posixbase import (
  File "C:\Users\olek.niewiarowski\Anaconda3_WINDOWS\envs\Mercury\lib\site-packages\twisted\internet\posixbase.py", line 16, in <module>
    from twisted.internet import error, tcp, udp
  File "C:\Users\olek.niewiarowski\Anaconda3_WINDOWS\envs\Mercury\lib\site-packages\twisted\internet\tcp.py", line 38, in <module>
    from twisted.internet._newtls import (
  File "C:\Users\olek.niewiarowski\Anaconda3_WINDOWS\envs\Mercury\lib\site-packages\twisted\internet\_newtls.py", line 18, in <module>
    from twisted.protocols.tls import TLSMemoryBIOFactory, TLSMemoryBIOProtocol
  File "C:\Users\olek.niewiarowski\Anaconda3_WINDOWS\envs\Mercury\lib\site-packages\twisted\protocols\tls.py", line 42, in <module>
    from OpenSSL.SSL import Connection, Error, SysCallError, WantReadError, ZeroReturnError
  File "C:\Users\olek.niewiarowski\Anaconda3_WINDOWS\envs\Mercury\lib\site-packages\OpenSSL\__init__.py", line 8, in <module>
    from OpenSSL import crypto, SSL
  File "C:\Users\olek.niewiarowski\Anaconda3_WINDOWS\envs\Mercury\lib\site-packages\OpenSSL\crypto.py", line 3268, in <module>
    _lib.OpenSSL_add_all_algorithms()
AttributeError: module 'lib' has no attribute 'OpenSSL_add_all_algorithms'

@pplonski
Contributor

pplonski commented Mar 23, 2023

Hi @olekniewiarowski,

Thank you for the logs. Maybe you have updated pyopenssl? I found a similar issue. Please try updating pyopenssl:

conda update pyopenssl

The latest version of pyopenssl is 23.0.0. Please let me know if that fixes the problem for you; it may be worth adding a version constraint to requirements.txt.
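
Editor's note: an AttributeError like the one above usually points to a version mismatch between pyOpenSSL and the cryptography package it wraps. Before and after updating, it can help to print what is actually installed in the active environment. The snippet below is a minimal sketch using only the standard library; the helper name `installed_versions` is made up for this example and is not part of Mercury.

```python
from importlib import metadata


def installed_versions(packages=("pyOpenSSL", "cryptography")):
    """Return {package: version or None} for each requested distribution."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Run it inside the same conda environment that runs Mercury, otherwise it reports the versions from a different interpreter.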

What library do you plan to use for visualizations? What requirements do you have for dashboards? Do you need authentication?

@olekniewiarowski

olekniewiarowski commented Mar 23, 2023

Hi, now the demo starts successfully with pyopenssl 23.0.0, but I still have the "waiting for worker..." problem. The websocket is connected and the worker is queued.

Ideally, we are seeking a modular solution so that anyone can write a notebook with any Python libraries (for starters, matplotlib is enough) and then add the results to an interactive dashboard. Mercury looks like a good candidate with respect to those needs. However, it needs to work with data stored on Google Drive. I am not familiar with the authentication needs for that outside of Colab, but hopefully it can be done with a Jupyter extension. The purpose is mainly to document and compare results between many versions of numerical models (with visual and tabular output), along with accessory documentation and annotation. For non-Python users, the option to upload and arrange static text/image files would be a plus.

@pplonski
Contributor

Thank you for checking this. Could you please run the server with a verbose flag:

mercury run --verbose

and attach the output. There might still be some issues.

Sounds like a challenging project. You can access Google Drive in Python with the PyDrive2 package, so it should work. We are working on the Mercury Cloud version: you set up the website with a few clicks, add users, and upload files. Would you like to test it when it is ready?

@nagatushar

nagatushar commented Mar 26, 2023

I got to the bitter end of getting my Mercury notebook to run on Heroku, but I also get a "waiting for worker" prompt and then just a white screen. I think something is disconnecting and crashing. I got my application to work once, but if it runs too long it simply gives me a white screen with no other error message.

@pplonski
Contributor

Hi @nagatushar,

Please provide steps to reproduce or output logs, so I can fix the issue. To get logs from the server please start mercury with the verbose flag:

mercury run --verbose

We are working on a cloud version to make deployment easier. I hope to make it available this week.

@nagatushar

Sounds like Heroku has a 30-second limit, which may be why I am having issues. Is there any way around this?

@Benjamin3381
Author

Benjamin3381 commented Mar 29, 2023

Hi @pplonski

I don't have the worker problem any longer. However, when I try to fit a bambi Bayesian model, the app restarts, specifically when I run the model.fit line:

import bambi as bmb
model = bmb.Model('y ~ x', df2[['y', 'x']])
results = model.fit(chains=4, cores=1)

I was checking the prompt and got this:

DJ INFO 2023-03-28 21:13:47,017 runserver WebSocket HANDSHAKING /ws/client/6/68dd59bc-5c3b-4d5b-8fbd-bb9f68603f31/ [127.0.0.1:62016]
DJ INFO 2023-03-28 21:13:47,042 runserver WebSocket CONNECT /ws/client/6/68dd59bc-5c3b-4d5b-8fbd-bb9f68603f31/ [127.0.0.1:62016]
NB 2023-03-28 21:13:47,236 Exception when check if worker id=319 is stale
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\site-packages\mercury\apps\nbworker\db.py", line 132, in is_worker_stale
    self.worker = Worker.objects.get(pk=self.worker_id)
  File "C:\Users\user\AppData\Roaming\Python\Python39\site-packages\django\db\models\manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "C:\Users\user\AppData\Roaming\Python\Python39\site-packages\django\db\models\query.py", line 435, in get
    raise self.model.DoesNotExist(
apps.ws.models.Worker.DoesNotExist: Worker matching query does not exist.
DJ INFO 2023-03-28 21:13:47,251 runserver WebSocket DISCONNECT /ws/worker/6/68dd59bc-5c3b-4d5b-8fbd-bb9f68603f31/319/ [127.0.0.1:51534]

@pplonski
Contributor

Hi @nagatushar,

The 30-second limit sounds strange ... Mercury Cloud should be available for testing soon. The developer version is already running at https://cloud.runmercury.com (you can create an account and upload notebooks, but it is a developer version and I am still making many fixes there ...), and I will need ~2 weeks to polish it.

An alternative is a docker-compose deployment on AWS, GCP, or Azure.

@pplonski
Contributor

Hi @Benjamin3381,

Thank you for reporting the issue. Apologies for the problems! There is a bug with long-running jobs: the worker is closed after 1 minute ... Not sure if this is connected with @nagatushar's issue?

I will fix it! I will do my best to have a fix this week @aplonska

Thank you!

@pplonski pplonski self-assigned this Mar 29, 2023
@pplonski pplonski added the bug Something isn't working label Mar 29, 2023
@pplonski pplonski changed the title Waiting for worker Worker is closed for computation longer than 1 minute Mar 29, 2023
@pplonski
Contributor

I fixed the issue. When a job ran long (> 1 min), there was no update about the worker status.
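
Editor's note: the general pattern behind this kind of bug can be sketched as follows. If a worker only reports its status between jobs, a long job looks like a dead worker to whatever checks for staleness. One common remedy is to refresh the status from a background thread while the job runs. This is an illustrative sketch only, not Mercury's actual fix; `WorkerStatus` and `run_with_heartbeat` are hypothetical names invented for the example.

```python
import threading
import time


class WorkerStatus:
    """Hypothetical stand-in for a worker record with a last-update timestamp."""

    def __init__(self):
        self._lock = threading.Lock()
        self.updated_at = time.monotonic()

    def touch(self):
        # Record that the worker is still alive.
        with self._lock:
            self.updated_at = time.monotonic()

    def is_stale(self, timeout):
        # A worker is stale if it has not reported within `timeout` seconds.
        with self._lock:
            return time.monotonic() - self.updated_at > timeout


def run_with_heartbeat(job, status, interval):
    """Run job() while a background thread refreshes `status` every `interval` seconds."""
    stop = threading.Event()

    def beat():
        # Event.wait returns False on timeout and True once stop is set.
        while not stop.wait(interval):
            status.touch()

    thread = threading.Thread(target=beat, daemon=True)
    thread.start()
    try:
        return job()
    finally:
        stop.set()
        thread.join()


if __name__ == "__main__":
    status = WorkerStatus()
    result = run_with_heartbeat(lambda: time.sleep(0.5) or "done", status, interval=0.1)
    print(result, status.is_stale(timeout=0.3))
```

A thread-based heartbeat assumes the job releases the GIL often enough for the heartbeat thread to run; a computation that holds it for long stretches would need a separate process instead.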

I will release the new version later today.

@nagatushar

I'll review it when the app updates! I will show you what I've been up to :)

thank you

@pplonski
Contributor

Version 2.1.3, which includes the fix, has just been released.

@Benjamin3381
Author

Benjamin3381 commented Mar 31, 2023

Thank you @pplonski

Just one last question. I don't know what happened, but after I upgraded Mercury I got this error:

(base) C:\Users\user>mercury --version

Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\ProgramData\Anaconda3\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\ProgramData\Anaconda3\Scripts\mercury.exe\__main__.py", line 7, in <module>
  File "C:\ProgramData\Anaconda3\lib\site-packages\mercury\mercury.py", line 115, in main
    execute_from_command_line(["mercury.py", "migrate", "-v", 0])
  File "C:\Users\user\AppData\Roaming\Python\Python39\site-packages\django\core\management\__init__.py", line 419, in execute_from_command_line
    utility.execute()
  File "C:\Users\user\AppData\Roaming\Python\Python39\site-packages\django\core\management\__init__.py", line 413, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "C:\Users\user\AppData\Roaming\Python\Python39\site-packages\django\core\management\base.py", line 354, in run_from_argv
    self.execute(*args, **cmd_options)
  File "C:\Users\user\AppData\Roaming\Python\Python39\site-packages\django\core\management\base.py", line 398, in execute
    output = self.handle(*args, **options)
  File "C:\Users\user\AppData\Roaming\Python\Python39\site-packages\django\core\management\base.py", line 89, in wrapped
    res = handle_func(*args, **kwargs)
  File "C:\Users\user\AppData\Roaming\Python\Python39\site-packages\django\core\management\commands\migrate.py", line 95, in handle
    executor.loader.check_consistent_history(connection)
  File "C:\Users\user\AppData\Roaming\Python\Python39\site-packages\django\db\migrations\loader.py", line 306, in check_consistent_history
    raise InconsistentMigrationHistory(
django.db.migrations.exceptions.InconsistentMigrationHistory: Migration storage.0001_initial is applied before its dependency workers.0001_initial on database 'default'.

@pplonski
Contributor

pplonski commented Mar 31, 2023

You have old table migrations in your database. Please run:

mercury run clear

It will clear the local database.
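
Editor's note: for readers wondering why the upgrade triggered InconsistentMigrationHistory, the check boils down to this: every migration recorded as applied must have all of its dependencies recorded as applied too. The following is a simplified model of that rule in plain Python, not Django's actual code; `find_inconsistency` is a hypothetical helper, and the sample data only mirrors the two migration names from the error message above.

```python
def find_inconsistency(applied, dependencies):
    """Return (migration, missing_dependency) for the first applied migration
    whose dependency is not recorded as applied, or None if history is consistent."""
    for migration in sorted(applied):
        for parent in dependencies.get(migration, ()):
            if parent not in applied:
                return migration, parent
    return None


# State modeled on the error message: storage.0001_initial is recorded as
# applied, but its dependency workers.0001_initial is not.
applied = {("storage", "0001_initial")}
dependencies = {("storage", "0001_initial"): [("workers", "0001_initial")]}

print(find_inconsistency(applied, dependencies))
# prints (('storage', '0001_initial'), ('workers', '0001_initial'))
```

Clearing the local database removes the old migration records, so the new migrations can be applied from scratch in dependency order.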

@pplonski pplonski reopened this Mar 31, 2023
@pplonski pplonski closed this as completed Apr 3, 2023