Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve async write (fewer copies, polling API) #430

Merged
merged 11 commits into from
May 13, 2024
Merged

Conversation

graebm
Copy link
Contributor

@graebm graebm commented May 9, 2024

Issue:
Mountpoint's upload throughput took a 20% hit when they started using the new async write API (PR #418).

We knew the new API could, in the worst case, do an additional copy (see TODOs in PR description and code.). Some quick experimentation showed this was the cause.

Description of changes:

  • Remove additional copy.
  • Add new poll_write() function, which is simpler to use from Mountpoint.
    • Mountpoint typically does 1MiB or 256KiB writes. So we pretty much always need to copy the data immediately. So let's optimize for that.
    • Rust needed tricky code to cope with the original write() API's demand that data stay alive until the write-future completes. And aws-c-s3 needed tricky code to guarantee cancel() would synchronously fire any pending write-futures. If we just offer a rust-polling-style API that always copies synchronously, we can simplify a lot of code. So let's do that.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@graebm graebm changed the title Asyncwrite poll Imrove async write (fewer copies, polling API) May 9, 2024
@graebm graebm changed the title Imrove async write (fewer copies, polling API) Improve async write (fewer copies, polling API) May 9, 2024
@codecov-commenter
Copy link

codecov-commenter commented May 9, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.61%. Comparing base (3647b4b) to head (fed0631).

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #430      +/-   ##
==========================================
+ Coverage   89.54%   89.61%   +0.06%     
==========================================
  Files          20       20              
  Lines        6036     6045       +9     
==========================================
+ Hits         5405     5417      +12     
+ Misses        631      628       -3     
Files Coverage Δ
source/s3_auto_ranged_put.c 92.80% <100.00%> (+0.03%) ⬆️
source/s3_meta_request.c 93.47% <100.00%> (+0.33%) ⬆️

* The waker callback will be invoked when you can call poll_write() again.
* Do not call poll_write() again before the waker is invoked.
*
* 2) Else if `result.error_code != 0` then poll_write() did not succeed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are there any recoverable errors, or error means everything is on fire and the only way forward is to kill things?

Copy link
Contributor Author

@graebm graebm May 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"result.is_pending" is the only "recoverable" error, which is why I broke it out into its own variable, instead of it making it a special error code.

Full text is:

Else if result.error_code != 0 then poll_write() did not succeed
and you should not call it again. The meta request is guaranteed to finish soon
(you don't need to worry about canceling the meta request yourself after a failed write).
A common error code is AWS_ERROR_S3_REQUEST_HAS_COMPLETED, indicating
the meta request completed for reasons unrelated to the poll_write() call
(e.g. CreateMultipartUpload received a 403 Forbidden response).
AWS_ERROR_INVALID_STATE usually indicates that you're calling poll_write()
incorrectly (e.g. not waiting for waker callback from previous poll_write() call).

My personal philosophy on APIs where users need to call a sequence of functions correctly to complete the operation is: If anything goes wrong with a function in that sequence, the operation should cancel itself. We shouldn't require the user to inspect the result of every code, to determine whether or not they need to cancel the operation. It's a streaming operation, so random failure was always a possibility.

So users can write simple code like:

if meta_request.write() failed:
    # something went wrong, bail out of write loop, meta-request will finish soon
    return

Instead of:

if meta_request.write() failed:
    # something went wrong, be sure to cancel the meta-request
    # in case this is our fault, then bail out of write loop, meta-request will finish soon
    if error_code != AWS_ERROR_S3_REQUEST_HAS_COMPLETED:
        meta_request.cancel()

    return

* AWS_ERROR_INVALID_STATE usually indicates that you're calling poll_write()
* incorrectly (e.g. not waiting for waker callback from previous poll_write() call).
*
* 3) Else `result.bytes_processed` tells you how much data was processed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

calls to write in this case can be done one after the other, without waiting for waker or any sort of backoff?

Copy link
Contributor Author

@graebm graebm May 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. This works well for mountpoint because, if the user does a big write, the FUSE connector will just repeatedly feed Mountpoint 1MiB at a time, which they can keep passing to aws-c-s3 until the part is full.

It's not till they try and write the 9th MiB that they'll get result.is_pending

*
* @param user_data Pointer to be passed to the waker callback.
*
* WARNING: This feature is experimental.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

side question: should we consider something like experimental annotation that warns people when they use it? openssl has a deprecated flag and users must opt in if they want to avoid warning from those functions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤷‍♀️

source/s3_auto_ranged_put.c Outdated Show resolved Hide resolved
source/s3_meta_request.c Outdated Show resolved Hide resolved
@graebm graebm merged commit 774999f into main May 13, 2024
30 checks passed
@graebm graebm deleted the asyncwrite-poll branch May 13, 2024 15:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants