
Fix: asyncio.run was sometimes used within a coroutine #560

Merged: 4 commits merged into main from hoh-pool-side-effects on Mar 8, 2024

Conversation


@hoh hoh commented Mar 6, 2024

`asyncio.run` was called when initializing a pool object using `VmPool.__init__(...)`.

This caused several issues:

  1. The pool was sometimes created from within a coroutine in the context of tests, and this would raise an error.
  2. Having side effects inside the `__init__` method makes objects more difficult to manipulate and test.
  3. Tests should not load persistent executions automatically.
  4. The network was configured after loading persistent executions, which could cause networking issues.

A related issue is the snapshot manager being started when initializing the `VmPool`, while this is not always desirable.

Solution proposed:

  1. Explicitly load the persistent executions using `pool.load_persistent_executions()` from the `supervisor.run()` function. This is now called after `VmPool.setup()` and therefore after the networking of the host has been configured.
  2. The snapshot manager is now started by `VmPool.setup()` instead of `VmPool.__init__`. This function is almost always called just after initializing the pool (see the sketch after the traceback below).
  3. Configuring `settings.SNAPSHOT_FREQUENCY` to zero now disables the snapshot manager.
  4. `SnapshotManager.run_snapshots` is renamed `SnapshotManager.run_in_thread` to make its behaviour more explicit.
Example traceback:

RuntimeError: asyncio.run() cannot be called from a running event loop
(5 additional frame(s) were not displayed)
...
  File "<frozen runpy>", line 88, in _run_code
  File "__main__.py", line 4, in <module>
    main()
  File "aleph/vm/orchestrator/cli.py", line 353, in main
    asyncio.run(benchmark(runs=args.benchmark), debug=args.debug_asyncio)
  File "aleph/vm/orchestrator/cli.py", line 197, in benchmark
    pool = VmPool()
  File "aleph/vm/pool.py", line 73, in __init__
    asyncio.run(self._load_persistent_executions())
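
For illustration, a minimal sketch of the resulting structure is shown below. It is not the actual aleph-vm code: the class bodies are placeholders and the `snapshot_frequency` argument stands in for `settings.SNAPSHOT_FREQUENCY`; only the method names and the ordering come from the description above.

```python
import asyncio


class SnapshotManager:
    def run_in_thread(self) -> None:
        # Placeholder: the real manager schedules periodic snapshots in a thread.
        print("snapshot manager started")


class VmPool:
    def __init__(self, snapshot_frequency: int = 60) -> None:
        # No asyncio.run() in the constructor anymore: creating a pool has no
        # side effects, so it is safe to instantiate from within a coroutine
        # (for example in tests).
        self.snapshot_manager = SnapshotManager() if snapshot_frequency else None

    def setup(self) -> None:
        # Host networking would be configured here, then the snapshot manager
        # is started. A snapshot frequency of zero disables it entirely.
        if self.snapshot_manager:
            self.snapshot_manager.run_in_thread()

    async def load_persistent_executions(self) -> None:
        # Called explicitly by the caller, after setup(), so the network is ready.
        ...


def run() -> None:
    # Rough equivalent of the new supervisor.run() ordering.
    pool = VmPool()
    pool.setup()
    asyncio.run(pool.load_persistent_executions())


run()
```

Keeping the constructor free of coroutines means `asyncio.run()` is only ever called from synchronous entry points such as `supervisor.run()` or the CLI, which avoids the RuntimeError shown in the traceback above.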

@hoh hoh requested a review from nesitor March 6, 2024 11:30

codecov bot commented Mar 6, 2024

Codecov Report

Attention: Patch coverage is 21.42857%, with 11 lines in your changes missing coverage. Please review.

Project coverage is 34.61%. Comparing base (d6025f5) to head (3c384e0).

Files                                       Patch %   Lines
src/aleph/vm/pool.py                        20.00%    8 Missing ⚠️
src/aleph/vm/orchestrator/supervisor.py     0.00%     3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #560      +/-   ##
==========================================
- Coverage   34.65%   34.61%   -0.04%     
==========================================
  Files          52       52              
  Lines        4776     4781       +5     
  Branches      558      561       +3     
==========================================
  Hits         1655     1655              
- Misses       3103     3108       +5     
  Partials       18       18              


@aleph-im aleph-im deleted a comment from github-actions bot Mar 6, 2024
Review comment on src/aleph/vm/orchestrator/supervisor.py:

@@ -156,6 +156,10 @@ def run():
app.on_cleanup.append(stop_balances_monitoring_task)
app.on_cleanup.append(stop_all_vms)

logger.info("Loading existing executions ...")
asyncio.run(pool.load_persistent_executions())
nesitor (Member) commented:

Putting the load in this step will cause some side effects to fail, like the payment tasks, which depend on running executions.
Maybe putting it in the setup step would be better. Also, I noticed that loading the executions does not restart the snapshots for the running executions. I will create a PR to fix that issue.

hoh (Member Author) replied:

Should we call `await self.snapshot_manager.start_for(vm=execution.vm)` inside `pool.load_persistent_executions` regarding the snapshot issue?

I looked at `monitor_payments` and I don't see an issue with only loading persistent executions after starting it.

nesitor (Member) replied:

Yes, I think so. If the `snapshot_manager` is set, we need to call `await self.snapshot_manager.start_for(vm=execution.vm)` for every execution already running.

And about the `monitor_payments` method, the issue can appear when the executions aren't loaded yet. It will not raise any error, but we will need to wait for the next loop to check the payments for running executions. In this case it isn't a big deal, but other logic apart from payments will be involved.
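
As a rough illustration of that suggestion, a minimal sketch is shown below. It is hypothetical code, not the actual change that landed in fefede9: `FakePool`, `FakeSnapshotManager` and `fetch_saved_executions` are stand-ins, and only `start_for(vm=...)` and the idea of skipping a disabled snapshot manager come from this thread and the PR description.

```python
import asyncio


class FakeSnapshotManager:
    async def start_for(self, vm) -> None:
        print(f"snapshots scheduled for {vm}")


class FakePool:
    def __init__(self) -> None:
        self.snapshot_manager = FakeSnapshotManager()  # None if snapshots are disabled
        self.executions = []

    async def fetch_saved_executions(self):
        # Stand-in for reading persistent executions back from storage.
        return ["vm-1", "vm-2"]

    async def load_persistent_executions(self) -> None:
        for vm in await self.fetch_saved_executions():
            self.executions.append(vm)
            # The point discussed above: re-schedule snapshots for executions
            # that are already running, but only when the manager is enabled.
            if self.snapshot_manager:
                await self.snapshot_manager.start_for(vm=vm)


asyncio.run(FakePool().load_persistent_executions())
```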

hoh (Member Author) replied:

Snapshot manager started in fefede9

A further review conversation on src/aleph/vm/pool.py is marked as outdated and resolved.
hoh and others added 4 commits March 7, 2024 16:14
Schedule snapshots for loaded persistent executions.
Co-authored-by: nesitor <amolinsdiaz@yahoo.es>
@hoh hoh force-pushed the hoh-pool-side-effects branch from 6afa917 to 3c384e0 Compare March 7, 2024 15:16
@hoh hoh requested a review from nesitor March 7, 2024 16:20
@hoh hoh merged commit 80c22ff into main Mar 8, 2024
18 of 20 checks passed
@hoh hoh deleted the hoh-pool-side-effects branch March 8, 2024 07:33