Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[s3_endpoint]: avoid problems arising from keeping endpoint reference alive in hash table #205

Closed
wants to merge 3 commits into from

Conversation

grrtrr
Copy link
Contributor

@grrtrr grrtrr commented Aug 23, 2022

This pull request fixes #202 and arose from comments in #203.

The client keeps references to s3_endpoints alive in a hash table. This requires complicated logic which is hard to
get right:

The conclusion is that the small gains in performance by hashing a frequently-used endpoint do not
justify the greatly increased complexity, which has caused race conditions, segmentation faults,
deadlocks - which are both difficult to debug and to get right.

Hence remove the hash table and use only one endpoint_release function.

Also fix the argument when calling s_s3_endpoint_http_connection_manager_shutdown_callback in
s_s3_endpoint_ref_count_zero (function requires an s3_endpoint).

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@grrtrr
Copy link
Contributor Author

grrtrr commented Aug 29, 2022

@TingDaoK - when you have time, could you please take a look?

@TingDaoK
Copy link
Contributor

TingDaoK commented Sep 1, 2022

Thank you for the deep dive.

I agree the hash map increased the complexity quit a lot. But, I doubt that it's a small gains in performance.

The approach you used here will create an s3_endpoint for every meta request created. And I think each s3_endpoint is not cheap. It has the connection pool under the hood.
Ideally, one client should be used for a large number of requests.

I'll look into the issue and see if we have a better solution.

@grrtrr
Copy link
Contributor Author

grrtrr commented Sep 1, 2022

I'll look into the issue and see if we have a better solution.

Until there is a better solution, can we remove the old one, since it currently creates problems (#202)?

@TingDaoK
Copy link
Contributor

TingDaoK commented Sep 2, 2022

#208 I created this PR for the fix. I think it's a more proper fix here.

@grrtrr
Copy link
Contributor Author

grrtrr commented Sep 5, 2022

#208 I created this PR for the fix. I think it's a more proper fix here.

Thank you. We need some time of testing before putting this into production.
I also thought whether the problem is architectural, i.e. whether a restructuring of the s3 client could make it simpler and more straightforward to retain references to s3 endpoints.

@grrtrr
Copy link
Contributor Author

grrtrr commented Sep 15, 2022

Has been replaced by #209 in v0.1.48.

@grrtrr grrtrr closed this Sep 15, 2022
@grrtrr grrtrr deleted the issue_202.revised branch September 15, 2022 14:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[s3 endpoint] aws_s3_client_endpoint_release race condition produces segmentation faults
2 participants