Add clarification on multithreading and multiprocessing for resources
ryansonshine committed May 18, 2021
1 parent 8a6bd89 commit e46955a
Showing 3 changed files with 68 additions and 11 deletions.
49 changes: 49 additions & 0 deletions docs/source/guide/clients.rst
@@ -107,3 +107,52 @@ with the method's appropriate parameters passed in::

# Begin waiting for the S3 bucket, mybucket, to exist
s3_bucket_exists_waiter.wait(Bucket='mybucket')

Multithreading or multiprocessing with clients
----------------------------------------------

Unlike Resources and Sessions, clients **are** generally *thread-safe*.
There are some caveats to be aware of, described below.

Caveats
~~~~~~~

**Multi-Processing:** While clients are *thread-safe*, they cannot be
shared across processes due to their networking implementation. Doing so
may lead to incorrect response ordering when calling services.

**Shared Metadata:** Clients expose metadata to the end user through a
few attributes (namely ``meta``, ``exceptions`` and ``waiter_names``).
These are safe to read but any mutations should not be considered
thread-safe.

**Custom**\ `Botocore Events`_\ **:** Botocore (the library Boto3 is
built on) allows advanced users to provide their own custom event hooks
which may interact with Boto3's client. The majority of users will not
need to use these interfaces, but those who do should no longer
consider their clients thread-safe without careful review.
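
One way to respect the multi-processing caveat is to build the client lazily
and rebuild it whenever the process ID changes. The following is only a
minimal sketch under stated assumptions: ``PerProcessClient`` and its
``factory`` argument are hypothetical names, not part of Boto3, and
``factory`` is assumed to be a zero-argument callable such as
``lambda: boto3.session.Session().client('s3')``.

```python
import os
import threading


class PerProcessClient:
    """Lazily build one client per process (hypothetical helper, not a Boto3 API).

    `factory` is an assumed zero-argument callable that returns a new client.
    """

    def __init__(self, factory):
        self._factory = factory
        self._lock = threading.Lock()
        self._pid = None
        self._client = None

    def get(self):
        pid = os.getpid()
        with self._lock:
            # A different PID means this is another process, so a client
            # inherited from the parent must not be reused here.
            if self._client is None or self._pid != pid:
                self._client = self._factory()
                self._pid = pid
            return self._client
```

Threads within one process share the client returned by ``get()``, while a
forked child gets a fresh one on its first call. Creating the client
explicitly at the start of each worker process is an equally reasonable
design.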

General Example
~~~~~~~~~~~~~~~

.. code:: python

    import boto3.session
    from concurrent.futures import ThreadPoolExecutor

    def do_s3_task(client, task_definition):
        # Put your thread-safe code here
        pass

    def my_workflow():
        # Create a session and use it to make our client
        session = boto3.session.Session()
        s3_client = session.client('s3')

        # Define some work to be done, this can be anything
        my_tasks = [ ... ]

        # Dispatch work tasks with our s3_client
        with ThreadPoolExecutor(max_workers=8) as executor:
            futures = [executor.submit(do_s3_task, s3_client, task) for task in my_tasks]

.. _Botocore Events: https://botocore.amazonaws.com/v1/documentation/api/latest/topics/events.html
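
The general example submits work to the executor but does not show how the
outcomes are gathered. As a rough sketch, the helper below collects results
and exceptions per task with ``concurrent.futures.as_completed``. The name
``run_tasks`` and the ``worker`` parameter are hypothetical, and ``client``
can be any object that is safe to share across threads, such as a Boto3
client.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_tasks(client, tasks, worker, max_workers=8):
    """Run worker(client, task) for every task and collect the outcomes.

    Returns (results, errors): two dicts keyed by each task's index.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each future back to the index of the task it was created for
        futures = {
            executor.submit(worker, client, task): index
            for index, task in enumerate(tasks)
        }
        for future in as_completed(futures):
            index = futures[future]
            try:
                results[index] = future.result()
            except Exception as exc:
                # Keep the failure so that one bad task does not hide the rest
                errors[index] = exc
    return results, errors
```

Keying by task index keeps the mapping stable even though futures complete
in arbitrary order.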
25 changes: 16 additions & 9 deletions docs/source/guide/resources.rst
@@ -198,23 +198,30 @@ keyword arguments. Examples of waiters include::
instance.wait_until_running()


Multithreading and multiprocessing
----------------------------------
It is recommended to create a resource instance for each thread / process in a multithreaded or multiprocess application rather than sharing a single instance among the threads / processes. For example::
Multithreading or multiprocessing with resources
------------------------------------------------

Resource instances are **not** thread-safe and should not be shared
across threads or processes. These special classes contain additional
metadata that cannot be shared. It's recommended to create a new
Resource for each thread or process::

    import boto3
    import boto3.session
    import threading

    class MyTask(threading.Thread):
        def run(self):
            # Here we create a new session per thread
            session = boto3.session.Session()

            # Next, we create a resource using our thread's session object
            s3 = session.resource('s3')
            # ... do some work with S3 ...

In the example above, each thread would have its own Boto3 session and its own instance of the S3 resource. This is a good idea because resources contain shared data when loaded and calling actions, accessing properties, or manually loading or reloading the resource can modify this data.
# Put your thread-safe code here

.. note::
    Resources are **not** thread safe. These special classes contain additional meta data that cannot be shared between threads. When using a Resource, it is recommended to instantiate a new Resource for each thread, as is shown in the example above.

    Low-level clients **are** thread safe. When using a low-level client, it is recommended to instantiate your client then pass that client object to each of your threads.
In the example above, each thread would have its own Boto3 session and
its own instance of the S3 resource. This is a good idea because
resources contain shared data when loaded, and calling actions, accessing
properties, or manually loading or reloading the resource can modify
this data.
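
When threads come from a pool rather than a ``threading.Thread`` subclass,
one possible way to keep a resource per thread is ``threading.local``. This
is only a sketch under stated assumptions: ``get_thread_resource`` and its
``factory`` argument are hypothetical names, and ``factory`` is assumed to
be a zero-argument callable such as
``lambda: boto3.session.Session().resource('s3')``.

```python
import threading

# Each thread sees its own independent set of attributes on this object
_thread_state = threading.local()


def get_thread_resource(factory):
    """Return the calling thread's resource, creating it on first use.

    `factory` is an assumed zero-argument callable that builds a new
    resource; it is invoked at most once per thread.
    """
    resource = getattr(_thread_state, "resource", None)
    if resource is None:
        resource = factory()
        _thread_state.resource = resource
    return resource
```

Each pool thread pays the construction cost once and then reuses its own
instance, so no resource object is ever shared between threads.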
5 changes: 3 additions & 2 deletions docs/source/guide/session.rst
@@ -62,8 +62,9 @@ You can configure each session with specific credentials, AWS Region information
Multithreading or multiprocessing with sessions
-----------------------------------------------

Similar to ``Resource`` objects, ``Session`` objects are not thread safe and should not be shared across threads and processes. You should create a new ``Session`` object for each thread or process::

Similar to ``Resource`` objects, ``Session`` objects are not thread-safe
and should not be shared across threads or processes. It's recommended
to create a new ``Session`` object for each thread or process::

    import boto3
    import boto3.session