Is the Python client library thread safe when using gRPC? #3272
Comments
Hi @AdamLazarus, the short answer is: we think so. :-)
Just for information, it seems that you cannot share your
@philipperemy I'd love to see an example that reproduces this. I've used
Sure! This is roughly the code where I have one

from google.cloud import datastore

def get_data(symbol_):
    print('Init...')
    data_store_client = datastore.Client()
    print('Done...')
    query = data_store_client.query(kind=symbol_)
    query_iter = query.fetch()
    print_once = True
    for entity in query_iter:
        print(entity)

def parallel_function(f, sequence, num_threads=None):
    from multiprocessing import Pool
    pool = Pool(processes=num_threads)
    result = pool.map(f, sequence)
    cleaned = [x for x in result if x is not None]
    pool.close()
    pool.join()
    return cleaned

def run_query():
    [...]
    parallel_function(f=get_data, sequence=symbols, num_threads=4)

The other code is very similar except that I define a global variable
Neither version works. When
Has this ever been addressed? Creating a new client for each thread can effectively double the number of threads in the system.
@speedplane if you want something that can run in production, you might want to use something else. Those libs are not very stable, unfortunately.
@philipperemy what other options are there for accessing the datastore? Isn't this the official library?
I'm looking at the code now, and it's much worse than one new thread per client. It seems that when using gRPC, there are four threads: a consumption thread, a channel spin thread, a delivering thread, and a polling thread. (I'm not sure what these threads do or whether they're always used.) This seems to be per client, and it can get bad. Take the following example:
That results in 720 threads (
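If gRPC-based clients are in fact thread safe, as the maintainers say they expect, then one way to avoid multiplying per-client background threads is to create a single client and share it across worker threads. The following is only a sketch of that pattern: `CountingClient`, its `fetch` method, and the kind names are hypothetical stand-ins, not part of the real library, and a real client would go in their place.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


class CountingClient:
    """Hypothetical stand-in for a real client; counts how many get created."""

    instances = 0
    _lock = threading.Lock()

    def __init__(self):
        with CountingClient._lock:
            CountingClient.instances += 1

    def fetch(self, kind):
        # A real client would issue an RPC here.
        return f"results for {kind}"


def fetch_all(client, kinds, num_threads=4):
    """Run client.fetch for every kind on a thread pool, reusing one client."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(client.fetch, kinds))


shared_client = CountingClient()  # created once, used by every worker thread
results = fetch_all(shared_client, ["AAPL", "GOOG", "MSFT"])
```

With this layout the number of clients stays at one no matter how many worker threads are used, so any per-client threads are paid for only once.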
@speedplane We expect gRPC-based clients to be thread safe: the issues we know of have to do with multiprocessing (forking after creating a client).
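Since the known problems involve forking after a client has been created, one possible workaround is to build the client lazily and rebuild it whenever the process ID changes, so a forked worker never reuses the parent's client. This is a minimal sketch, not the library's own mechanism: `get_client` and `FakeClient` are hypothetical names, and in real code the factory passed in might be something like `datastore.Client`.

```python
import os

_client = None
_client_pid = None


def get_client(factory):
    """Return a client built in the current process, creating one if needed.

    No lock is used: if two threads race on the first call, each may build
    a client and one is simply discarded.
    """
    global _client, _client_pid
    pid = os.getpid()
    # A client inherited across fork() was recorded under the parent's PID,
    # so the mismatch triggers a rebuild in the child.
    if _client is None or _client_pid != pid:
        _client = factory()
        _client_pid = pid
    return _client


class FakeClient:
    """Hypothetical stand-in for a real client class."""

    def __init__(self):
        self.created_in = os.getpid()


first = get_client(FakeClient)
second = get_client(FakeClient)  # same process, so the cached client is reused
```

Worker functions passed to a process pool would call `get_client(...)` inside the function body instead of capturing a client created at import time.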
Hello all,
The documentation [1] makes it clear that httplib2 objects aren't thread safe in the Python client library. Are clients that have gRPC support (such as Pubsub) thread safe when using gRPC? This question has been asked about the Java client library before [2], but I'd appreciate a firm answer for Python too.
Thank you.
[1] https://developers.google.com/api-client-library/python/guide/thread_safety
[2] googleapis/google-cloud-java#1320