
use REAPI batch API for small blob writes #12537

Merged: 6 commits merged into pantsbuild:main on Aug 13, 2021

Conversation

@tdyas (Contributor) commented on Aug 9, 2021

Problem

Pants currently uses the ByteStream API exposed by REAPI servers to write all blobs, including small blobs. This incurs unnecessary network overhead when writing small blobs, which could have been inlined into the request if BatchUpdateBlobs had been used.

Solution

  1. Add a --remote-store-batch-api-size-limit option that controls the size threshold at which writes switch from BatchUpdateBlobs to ByteStream.Write.
  2. If both the local configuration (i.e., --remote-store-batch-api-size-limit) and the REAPI server capabilities (i.e., max_batch_total_size_bytes) allow a write via the batch API, write via the batch API; otherwise, continue to use ByteStream.Write. (A sketch of this decision logic follows below.)
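As a rough illustration of step 2, here is a minimal Rust sketch of the decision. The function and parameter names are hypothetical, not the actual Pants implementation; it assumes the REAPI convention that a reported max_batch_total_size_bytes of 0 means the server imposes no limit.

```rust
/// Illustrative sketch (not the actual Pants code): decide whether a blob of
/// `blob_len` bytes should go through BatchUpdateBlobs or ByteStream.Write.
///
/// `batch_api_size_limit` models --remote-store-batch-api-size-limit, and
/// `server_max_batch_total_size_bytes` models the value reported in
/// ServerCapabilities.cache_capabilities.max_batch_total_size_bytes
/// (0 means the server imposes no limit).
fn use_batch_api(
    blob_len: usize,
    batch_api_size_limit: usize,
    server_max_batch_total_size_bytes: i64,
) -> bool {
    // The local configuration must allow the batch API for this blob size.
    let allowed_locally = blob_len <= batch_api_size_limit;
    // The server must either report no limit (0) or a limit large enough for
    // this blob (a real implementation should leave headroom for protobuf
    // overhead beyond the raw blob bytes).
    let allowed_by_server = server_max_batch_total_size_bytes == 0
        || blob_len as i64 <= server_max_batch_total_size_bytes;
    allowed_locally && allowed_by_server
}

fn main() {
    // A 1 KiB blob with a 1 MiB local limit and a 4 MiB server limit: batched.
    assert!(use_batch_api(1024, 1024 * 1024, 4 * 1024 * 1024));
    // A 16 MiB blob falls back to ByteStream.Write.
    assert!(!use_batch_api(16 * 1024 * 1024, 1024 * 1024, 4 * 1024 * 1024));
    println!("decision logic sketch ok");
}
```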

@illicitonion (Contributor) left a comment

LGTM

One thought - in terms of latency reduction, we probably want to spawn a thread doing a GetCapabilities in the background when a remote Store/ProcessExecutor is new'd up, rather than waiting on the first use. Or if not, we may want to race the first file upload using ByteStream against the GetCapabilities call.

Looking forward to seeing the batching!
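A minimal sketch of the background-fetch idea from the review comment above, using a tokio task and a tokio::sync::OnceCell. The RemoteStore type and the get_capabilities_rpc stand-in are hypothetical, not the real Pants Store or the real GetCapabilities client call.

```rust
use std::sync::Arc;
use tokio::sync::OnceCell;

#[derive(Clone, Debug)]
struct Capabilities {
    max_batch_total_size_bytes: i64,
}

// Hypothetical stand-in for the real GetCapabilities RPC.
async fn get_capabilities_rpc() -> Capabilities {
    Capabilities { max_batch_total_size_bytes: 4 * 1024 * 1024 }
}

struct RemoteStore {
    capabilities: Arc<OnceCell<Capabilities>>,
}

impl RemoteStore {
    fn new() -> Self {
        let capabilities = Arc::new(OnceCell::new());
        // Fire-and-forget background fetch at construction time, so the
        // answer is usually ready before the first write needs it.
        let cell = Arc::clone(&capabilities);
        tokio::spawn(async move {
            let caps = get_capabilities_rpc().await;
            let _ = cell.set(caps);
        });
        RemoteStore { capabilities }
    }

    async fn capabilities(&self) -> Capabilities {
        // If the background task has not finished yet, fetch on demand.
        self.capabilities
            .get_or_init(get_capabilities_rpc)
            .await
            .clone()
    }
}

#[tokio::main]
async fn main() {
    let store = RemoteStore::new();
    println!("server capabilities: {:?}", store.capabilities().await);
}
```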

(Review thread on src/python/pants/option/global_options.py, resolved)
@Eric-Arellano (Contributor) left a comment

Awesome

@tdyas (Contributor, Author) commented on Aug 10, 2021

> One thought - in terms of latency reduction, we probably want to spawn a thread doing a GetCapabilities in the background when a remote Store/ProcessExecutor is new'd up, rather than waiting on the first use. Or if not, we may want to race the first file upload using ByteStream against the GetCapabilities call.

There may be runs where the local cache is hit entirely, in which case there would be no need to send the GetCapabilities RPC at all. The idea of racing the first write against GetCapabilities seems interesting, but I would prefer to hold off on premature optimization until there is a clearer view of what actually needs optimizing.

For example, maybe Pants could remember the capabilities between runs (via pantsd or whatever) and just avoid making the call if it already has a recently-obtained ServerCapabilities response? This would also point to future work to not re-init the remote store / command runner on every run.
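A hypothetical sketch of remembering capabilities between runs: persist the last response to disk and reuse it while it is fresh. The file path, JSON payload, and one-hour freshness window are illustrative assumptions, not anything this PR implements.

```rust
use std::fs;
use std::path::Path;
use std::time::{Duration, SystemTime};

// Return the cached capabilities payload if the cache file is newer than
// `max_age`; otherwise return None so the caller re-issues GetCapabilities.
fn cached_capabilities_json(path: &Path, max_age: Duration) -> Option<String> {
    let metadata = fs::metadata(path).ok()?;
    let modified = metadata.modified().ok()?;
    if SystemTime::now().duration_since(modified).ok()? <= max_age {
        fs::read_to_string(path).ok()
    } else {
        None
    }
}

fn store_capabilities_json(path: &Path, json: &str) -> std::io::Result<()> {
    fs::write(path, json)
}

fn main() -> std::io::Result<()> {
    // Illustrative cache location and payload.
    let path = Path::new("/tmp/reapi_capabilities.json");
    store_capabilities_json(path, r#"{"max_batch_total_size_bytes": 4194304}"#)?;
    // Reuse the cached response if it is less than an hour old.
    if let Some(json) = cached_capabilities_json(path, Duration::from_secs(3600)) {
        println!("reusing cached capabilities: {json}");
    }
    Ok(())
}
```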

@tdyas (Contributor, Author) commented on Aug 10, 2021

Does --batch-api-size-limit need to be limited by any Tonic message size configs?

@illicitonion (Contributor) replied

> Does --batch-api-size-limit need to be limited by any Tonic message size configs?

I think the default in gRPC servers is a max message size of 4 MB, but that's technically configurable to be embiggened. I'd say we should trust that if a server has a limit, it will return it in its capabilities.

hyperium/tonic#264 suggests tonic itself doesn't introduce its own limits by default, so I don't think we need to worry about it on the client-side.
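Purely as an illustration of the trade-off discussed here, this is the more conservative alternative the thread decided against: clamping the effective batch limit to the common 4 MiB gRPC default as well as to the server-reported limit, treating a reported 0 as "no limit" per the REAPI spec. The constant and function are hypothetical.

```rust
// Conventional gRPC server default; not something tonic enforces client-side.
const DEFAULT_GRPC_MAX_MESSAGE_BYTES: u64 = 4 * 1024 * 1024;

fn effective_batch_limit(local_limit: u64, server_limit: i64) -> u64 {
    let mut limit = local_limit.min(DEFAULT_GRPC_MAX_MESSAGE_BYTES);
    // A server-reported max_batch_total_size_bytes of 0 means "no limit".
    if server_limit > 0 {
        limit = limit.min(server_limit as u64);
    }
    limit
}

fn main() {
    // With a 10 MiB local option and no server limit, the clamp keeps 4 MiB.
    assert_eq!(effective_batch_limit(10 * 1024 * 1024, 0), 4 * 1024 * 1024);
    // A 1 MiB server limit wins over both.
    assert_eq!(effective_batch_limit(10 * 1024 * 1024, 1024 * 1024), 1024 * 1024);
    println!("effective limit sketch ok");
}
```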

@tdyas (Contributor, Author) commented on Aug 13, 2021

Future Work

  • This PR does not port the read path to BatchReadBlobs. That will occur in a subsequent PR.
  • Merging multiple blob writes into a single call to BatchUpdateBlobs (a sketch of this idea follows below).
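A minimal sketch of the second future-work item: coalescing small blobs into batches whose total size stays under the batch limit, where each batch would become one BatchUpdateBlobs call. The function is hypothetical and ignores per-request protobuf overhead.

```rust
// Illustrative sketch of the future-work idea (not implemented in this PR):
// group small blobs into batches whose total size stays under the limit.
fn group_into_batches(blobs: Vec<Vec<u8>>, max_batch_bytes: usize) -> Vec<Vec<Vec<u8>>> {
    let mut batches: Vec<Vec<Vec<u8>>> = Vec::new();
    let mut current: Vec<Vec<u8>> = Vec::new();
    let mut current_bytes = 0usize;
    for blob in blobs {
        // Blobs larger than the limit would really fall back to
        // ByteStream.Write; this sketch just gives each its own batch.
        if !current.is_empty() && current_bytes + blob.len() > max_batch_bytes {
            batches.push(std::mem::take(&mut current));
            current_bytes = 0;
        }
        current_bytes += blob.len();
        current.push(blob);
    }
    if !current.is_empty() {
        batches.push(current);
    }
    batches
}

fn main() {
    let blobs = vec![vec![0u8; 300], vec![0u8; 300], vec![0u8; 300]];
    // With a 700-byte cap, the first two blobs share a batch; the third gets its own.
    let batches = group_into_batches(blobs, 700);
    assert_eq!(batches.len(), 2);
    println!("batching sketch ok: {} batches", batches.len());
}
```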

@tdyas tdyas merged commit 79cd70e into pantsbuild:main Aug 13, 2021
@tdyas tdyas deleted the remote_store_use_batch_update_blobs branch August 13, 2021 14:48
tdyas pushed a commit that referenced this pull request Jun 28, 2022
In #12537, Pants learned how to send small blob writes to a remote store via the `BatchUpdateBlobs` API of the REAPI instead of the `ByteStream.Write` streaming API. This PR similarly teaches Pants how to use the `BatchReadBlobs` API for small blob reads instead of the `ByteStream.Read` API. 

The goal is to reduce the overhead of small blob reads by avoiding the multiple message exchanges required by the streaming API.

Note: This PR does not batch multiple small blob reads together, which is an approach that could be worthwhile to evaluate in the future.
tdyas pushed a commit to tdyas/pants that referenced this pull request Jun 29, 2022
…d#15969)

tdyas pushed a commit that referenced this pull request Jun 29, 2022
…ck of #15969) (#15997)
