
Fix detected blocking call inside the event loop #1120

Closed
wants to merge 1 commit

Conversation

chemelli74

Description of Change

Fix detected blocking call inside the event loop. Reference HomeAssistant latest build:

2024-06-12 09:40:41.237 WARNING (MainThread) [homeassistant.util.loop] Detected blocking call to listdir inside the event loop by integration 'aws' at homeassistant/components/aws/notify.py, line 38: return await session.get_available_regions(service) (offender: /usr/local/lib/python3.12/site-packages/botocore/loaders.py, line 311: api_versions = os.listdir(full_dirname)), please create a bug report at https://github.com/home-assistant/core/issues?q=is%3Aopen+is%3Aissue+label%3A%22integration%3A+aws%22
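The warning shows the offending call path: `session.get_available_regions()` ends up in botocore's `os.listdir()`. A minimal sketch of the workaround this PR proposes, deferring the blocking call to the default executor (`get_regions_blocking` and the fake session are hypothetical stand-ins, not the integration's actual code):

```python
import asyncio


def get_regions_blocking(session, service):
    # Hypothetical stand-in for the botocore call that scans the model
    # data directory with os.listdir() the first time it runs.
    return session.get_available_regions(service)


async def get_regions(session, service):
    # Defer the blocking file I/O to the default thread pool so the
    # event loop stays responsive during the first model load.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, get_regions_blocking, session, service)
```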

Assumptions

none

Checklist for All Submissions

  • I have added change info to CHANGES.rst
  • If this is resolving an issue (needed so future developers can determine if change is still necessary and under what conditions) (can be provided via link to issue with these details):
    • Detailed description of issue
    • Alternative methods considered (if any)
    • How issue is being resolved
    • How issue can be reproduced
  • If this is providing a new feature (can be provided via link to issue with these details):
    • Detailed description of new feature
    • Why needed
    • Alternative methods considered (if any)

Checklist when updating botocore and/or aiohttp versions

  • I have read and followed CONTRIBUTING.rst
  • I have updated test_patches.py where/if appropriate (also check if no changes necessary)
  • I have ensured that the awscli/boto3 versions match the updated botocore version

@thehesiod
Collaborator

Sessions should be cached for the life of use, so if this is a major issue, the session should be kept around for a longer time. Really, botocore needs to speed this up on their end, because it's way too slow.

@chemelli74
Author

Until then, we have a fix anyway.
Maybe we should add a comment about that.

@thehesiod
Collaborator

I'm hesitant to do this, as run_in_executor triggers a thread pool to be created (if you haven't called it before), which can cause overhead. Further, the executor should be configurable... this really opens a can of worms.

@bdraco
Member

bdraco commented Jun 14, 2024

It does look like load_service_model calls are cached, but it does do blocking I/O in the event loop the first time it's created. Ideally we would only call it in the executor the first time it needs to be created.

run_in_executor being called with None (the default executor) seems fine to me, as we do it all over the place in aiohttp, and it is the prescribed way to run blocking code: https://docs.python.org/3/library/asyncio-dev.html#running-blocking-code
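The pattern the asyncio docs prescribe, in miniature (the `blocking_io` function is a placeholder simulating a slow disk read, not botocore code):

```python
import asyncio
import time


def blocking_io():
    # Simulates blocking work, such as reading model JSON from disk.
    time.sleep(0.05)
    return "loaded"


async def main():
    loop = asyncio.get_running_loop()
    # None selects the loop's default ThreadPoolExecutor, which is the
    # documented way to run blocking code without stalling the loop.
    return await loop.run_in_executor(None, blocking_io)
```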

@chemelli74
Author

Any feedback, @thehesiod? Thank you in advance.

@thehesiod
Collaborator

So this seems like an issue with the aws component: why doesn't it do the session/client creation during async_setup instead? That seems like the correct fix, no?

@thehesiod
Collaborator

Like I was saying, it should cache the session/client there, or if you don't want to hold onto the client, perhaps just pre-warm the botocore cache.

@thehesiod
Collaborator

Actually, this is invalid, as botocore is not thread-safe: with this you could have two requests, running in two different threads, trying to initialize the cache at the same time. Closing.

@thehesiod thehesiod closed this Jun 25, 2024
@bdraco
Member

bdraco commented Jun 25, 2024

@chemelli74 Thanks for trying to fix this. This problem must be solved eventually (unless the intent is WONTFIX). Please open an issue instead to track this in the long term. Thanks.

@thehesiod
Collaborator

Yeah, let me know if I missed something; I truly believe this is a botocore issue. The fact that they're reading a ton of JSON files is unsustainable. It probably should be done on the fly, per service, when you use it... and even then it should be in a post-processed state; e.g. their build should generate Python code based on those JSON files.

@chemelli74 chemelli74 deleted the chemelli74-fix-io branch June 25, 2024 07:11
@thehesiod
Collaborator

So I looked a little into this. It seems like they use an instance_cache, which basically just sets an attribute on the class, so presumably the worst-case scenario with this fix is that you'd initialize the cache multiple times. However, I really want to avoid adding threads to this library if possible. Looks like there are some async file alternatives, like https://github.com/qweeze/uring_file for Debian; I'd be open to something like that.
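The attribute-caching pattern described above can be sketched like this (a simplified illustration in the spirit of botocore's `instance_cache`, not its actual implementation): a race between two threads at worst computes the value twice, and the last write wins, which is harmless for deterministic loads.

```python
import functools


def instance_cache(func):
    # Cache the result on the instance; not thread-safe by design, so
    # the worst case under a race is computing the same value twice.
    @functools.wraps(func)
    def wrapper(self, *args):
        cache = self.__dict__.setdefault("_instance_cache", {})
        key = (func.__name__, args)
        if key not in cache:
            cache[key] = func(self, *args)
        return cache[key]
    return wrapper


class Loader:
    calls = 0

    @instance_cache
    def load_data(self, name):
        # Stand-in for the expensive JSON read; counts invocations.
        Loader.calls += 1
        return {"name": name}
```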

@chemelli74
Author

What about pathlib?
It's used all over.

@thehesiod
Collaborator

That uses aiofiles, which just uses a thread pool again: https://github.com/Tinche/aiofiles/tree/main/src/aiofiles

@thehesiod
Collaborator

I guess at the aiofiles level, though, it doesn't matter. But it sounds like there's a better solution.

@chemelli74
Author

chemelli74 commented Jul 17, 2024

> i guess at the aiofiles level though it doesn't matter. But it sounds like there's a better solution

If you point me towards the solution you prefer, I'll try to work on it.
I really would like to fix this issue and its effect on Home Assistant.

EDIT: Opened issue on botocore repo

@chemelli74
Author

Here's the official answer from the botocore repo: boto/botocore#3222

What can we do to progress, then?

@jakob-keller
Collaborator

jakob-keller commented Jul 24, 2024

edit: sorry, this is basically what @thehesiod suggested in #1120 (comment). I also think it's the obvious solution for that HomeAssistant-specific issue.

This might be a crazy idea, but if

  1. botocore caches the stuff it loads from disk in global state (not thread-local)
  2. aiobotocore uses the same cache
  3. HomeAssistant supports set-up hooks for the AWS integration

then

  1. The setup code in HomeAssistant could pre-populate the botocore cache in a non-blocking fashion, i.e. via asyncio.to_thread()
  2. Eventual calls to aiobotocore would no longer block and leverage the pre-warmed cache

This would be a fix over at HomeAssistant. What do you think?
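The two-step idea above could be sketched as follows (`prewarm_cache` and `async_prewarm` are hypothetical names; there is no such helper in botocore or HomeAssistant today):

```python
import asyncio


def prewarm_cache(loader, services):
    # Hypothetical synchronous helper: touch every model once so the
    # shared (global, non-thread-local) cache is populated up front.
    for service in services:
        loader(service)


async def async_prewarm(loader, services):
    # Step 1: run the blocking pre-warm in a worker thread at setup time.
    # Step 2: later event-loop calls then hit the warm cache and no
    # longer block.
    await asyncio.to_thread(prewarm_cache, loader, services)
```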

@bdraco
Member

bdraco commented Jul 24, 2024

This doesn’t seem like a Home Assistant specific issue. Any asyncio application is going to have its event loop blocked. The only difference is that Home Assistant is aware of its loop being blocked, whereas another application may not be able to detect it.

@jakob-keller
Collaborator

botocore (or us?) could provide some sort of synchronous helper function to pre-populate the cache. Home Assistant, or anyone else, could then choose to run it as part of their setup routine, e.g. via asyncio.to_thread(), and no one else would be affected. Does that make sense?

Btw, there are Python platforms that lack multithreading support. That's why I think @thehesiod has a point in not resorting to helper threads in this project unless absolutely necessary.

@thehesiod
Collaborator

I don't think this is an aiobotocore problem, but a botocore problem. Or do you need to go through an async call to get to the cache warm-up? Secondly, they can already do this themselves without the need for a helper.

@chemelli74
Author

> this isn't an aiobotocore problem I don't think, but a botocore problem.

Why do you think so?
This is an async issue, and botocore is sync.
The main point of aiobotocore is to transform botocore to async, so IMHO it's an issue in this library.

@thehesiod
Collaborator

I suppose if we find a good async file API to use, then it would be an async problem, but then that wouldn't require an init function. In terms of an init function, that would equally apply to sync; however, you'll then potentially run into locking issues.

@chemelli74
Author

@thehesiod, how can we progress?
I really want to find a fix.

@thehesiod
Collaborator

@chemelli74 I suggest doing research on a good async file API for aiobotocore to use. This will be a huge project, though. Furthermore, there are better ways to warm up aiobotocore; if HomeAssistant wants to do that, I already gave some suggestions, like #1120 (comment). They should init and keep the aiobotocore clients around instead of creating them on the fly.

@thehesiod
Collaborator

This is really a solved issue; we do the same thing, e.g. in an aiohttp server, using the on_startup callback to init your resources.

@jakob-keller
Collaborator

There is another approach we could investigate:

  1. botocore.loaders contains relevant code for "loading various model files", including blocking file IO. We have not turned that codepath asynchronous yet for reasons stated above.
  2. We could implement aiobotocore.loaders with async codepaths, while initially retaining the blocking low-level file IO calls. The event loop would still be blocked, albeit more granularly and for shorter periods of time. This would already be a substantial undertaking for us.
  3. We could then provide an alternative thread-pool loader which uses asyncio.to_thread() or similar to prevent blocking the event loop. This behaviour would be configurable (opt-in) and could be useful for many use cases, such as the one raised by @chemelli74.
  4. Advanced users could implement custom loaders and use an async file IO library of their choice. I doubt this will add much value for regular users today, but maybe async file IO becomes more standardised over time.
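A rough shape for the opt-in loader from step 3 might be (the `ThreadPoolLoader` class name and its wrapping of a synchronous loader are hypothetical; no `aiobotocore.loaders` module exists yet):

```python
import asyncio


class ThreadPoolLoader:
    """Hypothetical sketch: wraps a synchronous loader and moves its
    blocking file I/O into a worker thread via asyncio.to_thread()."""

    def __init__(self, sync_loader):
        self._sync_loader = sync_loader

    async def load_service_model(self, service_name, type_name):
        # Delegate to the blocking loader off the event loop.
        return await asyncio.to_thread(
            self._sync_loader.load_service_model, service_name, type_name
        )
```

Advanced users could follow the same interface with an async file IO library of their choice, per step 4.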

@thehesiod: What is your opinion?

@chemelli74: Does that make sense to you?

@bdraco
Member

bdraco commented Aug 25, 2024

> 3. We could then provide an alternative thread-pool loader which uses asyncio.to_thread() or similar to prevent blocking the event loop. This behaviour would be configurable (opt-in) and could be useful for many use cases, such as the one raised by @chemelli74.

Something like this sounds like a good solution. Maybe a flag to enable loading models in the executor. Rough untested proof of concept.

diff --git a/aiobotocore/client.py b/aiobotocore/client.py
index 762ae52..4428e88 100644
--- a/aiobotocore/client.py
+++ b/aiobotocore/client.py
@@ -1,3 +1,4 @@
+import asyncio
 from botocore.awsrequest import prepare_request_dict
 from botocore.client import (
     BaseClient,
@@ -40,26 +41,17 @@ class AioClientCreator(ClientCreator):
         api_version=None,
         client_config=None,
         auth_token=None,
+        load_executor=False,
     ):
         responses = await self._event_emitter.emit(
             'choose-service-name', service_name=service_name
         )
         service_name = first_non_none_response(responses, default=service_name)   
-        service_model = self._load_service_model(service_name, api_version)
-        try:
-            endpoints_ruleset_data = self._load_service_endpoints_ruleset(
-                service_name, api_version
-            )
-            partition_data = self._loader.load_data('partitions')
-        except UnknownServiceError:
-            endpoints_ruleset_data = None
-            partition_data = None
-            logger.info(
-                'No endpoints ruleset found for service %s, falling back to '
-                'legacy endpoint routing.',
-                service_name,
-            )
-
+        if load_executor:
+            model_data = await asyncio.get_running_loop().run_in_executor(None, self._load_models, service_name, api_version)
+        else:
+            model_data = self._load_models(service_name, api_version)
+        service_model, endpoints_ruleset_data, partition_data = model_data
         cls = await self._create_client_class(service_name, service_model)
         region_name, client_config = self._normalize_fips_region(
             region_name, client_config
@@ -104,6 +96,23 @@ class AioClientCreator(ClientCreator):
         )
         return service_client
     
+    def _load_models(self, service_name, api_version):
+        service_model = self._load_service_model(service_name, api_version)
+        try:
+            endpoints_ruleset_data = self._load_service_endpoints_ruleset(
+                service_name, api_version
+            )
+            partition_data = self._loader.load_data('partitions')
+        except UnknownServiceError:
+            endpoints_ruleset_data = None
+            partition_data = None
+            logger.info(
+                'No endpoints ruleset found for service %s, falling back to '
+                'legacy endpoint routing.',
+                service_name,
+            )
+        return service_model, endpoints_ruleset_data, partition_data
+
     async def _create_client_class(self, service_name, service_model):
         class_attributes = self._create_methods(service_model)
         py_name_to_operation_name = self._create_name_mapping(service_model)
