
Fix: asyncio.run was sometimes used within a coroutine #560

Merged: 4 commits merged into main from hoh-pool-side-effects on Mar 8, 2024

Conversation


@hoh hoh commented Mar 6, 2024

`asyncio.run` was called when initializing a pool object using `VmPool.__init__(...)`.

This caused several issues:

  1. The pool was sometimes created from within a coroutine in the context of tests, and this would raise an error.
  2. Having side effects inside the `__init__` method makes objects more difficult to manipulate and test.
  3. Tests should not load persistent executions automatically.
  4. The network was configured after loading persistent executions, which could cause networking issues.

A related issue is the snapshot manager being started when initializing the `VmPool`, while this is not always desirable.

Solution proposed:

  1. Explicitly load the persistent executions using `pool.load_persistent_executions()` from the `supervisor.run()` function. This is now called after `VmPool.setup()` and therefore after the networking of the host has been configured.
  2. The snapshot manager is now started by `VmPool.setup()` instead of `VmPool.__init__`. This function is almost always called just after initializing the pool (see the sketch after the traceback below).
  3. Configuring `settings.SNAPSHOT_FREQUENCY` to zero now disables the snapshot manager.
  4. `SnapshotManager.run_snapshots` is renamed `SnapshotManager.run_in_thread` to make its behaviour more explicit.
Example traceback:

RuntimeError: asyncio.run() cannot be called from a running event loop
(5 additional frame(s) were not displayed)
...
  File "<frozen runpy>", line 88, in _run_code
  File "__main__.py", line 4, in <module>
    main()
  File "aleph/vm/orchestrator/cli.py", line 353, in main
    asyncio.run(benchmark(runs=args.benchmark), debug=args.debug_asyncio)
  File "aleph/vm/orchestrator/cli.py", line 197, in benchmark
    pool = VmPool()
  File "aleph/vm/pool.py", line 73, in __init__
    asyncio.run(self._load_persistent_executions())
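
For illustration, a minimal sketch of the resulting structure is shown below. It is not the actual aleph-vm code: the class bodies are placeholders and the `snapshot_frequency` argument stands in for `settings.SNAPSHOT_FREQUENCY`; only the method names and the ordering come from the description above.

```python
import asyncio


class SnapshotManager:
    def run_in_thread(self) -> None:
        # Placeholder: the real manager schedules periodic snapshots in a thread.
        print("snapshot manager started")


class VmPool:
    def __init__(self, snapshot_frequency: int = 60) -> None:
        # No asyncio.run() in the constructor anymore: creating a pool has no
        # side effects, so it is safe to instantiate from within a coroutine
        # (for example in tests).
        self.snapshot_manager = SnapshotManager() if snapshot_frequency else None

    def setup(self) -> None:
        # Host networking would be configured here, then the snapshot manager
        # is started. A snapshot frequency of zero disables it entirely.
        if self.snapshot_manager:
            self.snapshot_manager.run_in_thread()

    async def load_persistent_executions(self) -> None:
        # Called explicitly by the caller, after setup(), so the network is ready.
        ...


def run() -> None:
    # Rough equivalent of the new supervisor.run() ordering.
    pool = VmPool()
    pool.setup()
    asyncio.run(pool.load_persistent_executions())


run()
```

Keeping the constructor free of coroutines means `asyncio.run()` is only ever called from synchronous entry points such as `supervisor.run()` or the CLI, which avoids the RuntimeError shown in the traceback above.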

@hoh hoh requested a review from nesitor March 6, 2024 11:30

codecov bot commented Mar 6, 2024

Codecov Report

Attention: Patch coverage is 21.42857%, with 11 lines in your changes missing coverage. Please review.

Project coverage is 34.61%. Comparing base (d6025f5) to head (3c384e0).

Files                                       Patch %   Lines
src/aleph/vm/pool.py                        20.00%    8 Missing ⚠️
src/aleph/vm/orchestrator/supervisor.py     0.00%     3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #560      +/-   ##
==========================================
- Coverage   34.65%   34.61%   -0.04%     
==========================================
  Files          52       52              
  Lines        4776     4781       +5     
  Branches      558      561       +3     
==========================================
  Hits         1655     1655              
- Misses       3103     3108       +5     
  Partials       18       18              


@aleph-im aleph-im deleted a comment from github-actions bot Mar 6, 2024
Review comment on src/aleph/vm/orchestrator/supervisor.py:

@@ -156,6 +156,10 @@ def run():
app.on_cleanup.append(stop_balances_monitoring_task)
app.on_cleanup.append(stop_all_vms)

logger.info("Loading existing executions ...")
asyncio.run(pool.load_persistent_executions())
nesitor (Member) commented:

Putting the load in this step will cause some side effects to fail, like the payment tasks, which depend on running executions.
Maybe putting it in the setup step would be better. Also, I noticed that loading the executions does not restart the snapshots for the running executions. I will create a PR to fix that issue.

hoh (Member Author) replied:

Should we call `await self.snapshot_manager.start_for(vm=execution.vm)` inside `pool.load_persistent_executions` regarding the snapshot issue?

I looked at `monitor_payments` and I don't see an issue with only loading persistent executions after starting it.

nesitor (Member) replied:

Yes, I think so. If the `snapshot_manager` is set, we need to call `await self.snapshot_manager.start_for(vm=execution.vm)` for every execution already running.

And about the `monitor_payments` method, the issue can appear when the executions aren't loaded yet. It will not raise any error, but we will need to wait for the next loop to check the payments for running executions. In this case it isn't a big deal, but other logic apart from payments will be involved.
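
As a rough illustration of that suggestion, a minimal sketch is shown below. It is hypothetical code, not the actual change that landed in fefede9: `FakePool`, `FakeSnapshotManager` and `fetch_saved_executions` are stand-ins, and only `start_for(vm=...)` and the idea of skipping a disabled snapshot manager come from this thread and the PR description.

```python
import asyncio


class FakeSnapshotManager:
    async def start_for(self, vm) -> None:
        print(f"snapshots scheduled for {vm}")


class FakePool:
    def __init__(self) -> None:
        self.snapshot_manager = FakeSnapshotManager()  # None if snapshots are disabled
        self.executions = []

    async def fetch_saved_executions(self):
        # Stand-in for reading persistent executions back from storage.
        return ["vm-1", "vm-2"]

    async def load_persistent_executions(self) -> None:
        for vm in await self.fetch_saved_executions():
            self.executions.append(vm)
            # The point discussed above: re-schedule snapshots for executions
            # that are already running, but only when the manager is enabled.
            if self.snapshot_manager:
                await self.snapshot_manager.start_for(vm=vm)


asyncio.run(FakePool().load_persistent_executions())
```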

hoh (Member Author) replied:

Snapshot manager started in fefede9

A further review conversation on src/aleph/vm/pool.py is marked as outdated and resolved.
hoh and others added 4 commits March 7, 2024 16:14
Schedule snapshots for loaded persistent executions.
Co-authored-by: nesitor <amolinsdiaz@yahoo.es>
@hoh hoh force-pushed the hoh-pool-side-effects branch from 6afa917 to 3c384e0 Compare March 7, 2024 15:16
@hoh hoh requested a review from nesitor March 7, 2024 16:20
@hoh hoh merged commit 80c22ff into main Mar 8, 2024
18 of 20 checks passed
@hoh hoh deleted the hoh-pool-side-effects branch March 8, 2024 07:33