Add workaround for Bazel committed_size check when uploading compressed blobs #1493

Merged
merged 5 commits into master from committed-size-fix
Jan 28, 2022

Conversation

bduffany
Member

Context: bazelbuild/bazel#14654


Version bump: Patch

@bduffany bduffany marked this pull request as ready for review January 27, 2022 23:54
@bduffany bduffany changed the title Add workaround for Bazel committed_size check Add workaround for Bazel committed_size check when uploading compressed blobs Jan 27, 2022
Comment on lines 328 to 331
streamState, err = s.initStreamState(ctx, req)
if status.IsAlreadyExistsError(err) {
return handleAlreadyExists(stream, req)
}
Member

does this jibe with the uploadTracker stuff below, or does this mean we're missing data?

Member Author

@bduffany bduffany Jan 28, 2022

For compressed uploads that span more than one message, this does mean we may now let the client upload a significant amount of data that is not tracked as an upload ("cache write" in the UI).

Fixed by creating a separate UploadTracker in handleAlreadyExists, but only in the case where we have to read the remaining stream.

Note that it still tracks the uncompressed size; we have an existing issue to track the raw number of bytes sent by the client as opposed to uncompressed bytes.

Side note (probably not worth worrying about too much, since these short-circuiting cases should be relatively rare): I think we can do better in the case where the client sends a single message containing the entire payload and we short-circuit because the object already exists in cache. Currently (and also before this PR) we don't track those payloads at all, so we might present an inaccurate view of upload rate, since these payloads are excluded from the total uploaded bytes. We also probably need to decide whether to present them as "cache writes" to the user.
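
For illustration, a minimal sketch of the fix described above, assuming a hypothetical hitTracker.TrackUpload API plus the recvRemainingUploadSize helper sketched further down in this thread; the real BuildBuddy interfaces may differ (bspb and repb are the usual ByteStream and remote-execution proto aliases):

// drainWithTracking reads the rest of an already-exists upload while
// recording it as an upload ("cache write"), so the drained bytes still
// show up in the UI. hitTracker.TrackUpload is an assumed API.
func drainWithTracking(stream bspb.ByteStream_WriteServer, d *repb.Digest) (int64, error) {
	tracker := hitTracker.TrackUpload(d) // hypothetical; records the digest (uncompressed) size
	n, err := recvRemainingUploadSize(stream)
	if err != nil {
		return 0, err
	}
	tracker.Close() // hypothetical: finalizes the tracked upload
	return n, nil
}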

Contributor

We also probably need to decide whether to present those as "cache writes" to the user

If the user uploaded, even if it's a duplicate, I feel we should show it. It's confusing to see a number lower than what they'd see in the gRPC log. Similarly, when it's a read-only key, I feel we should show the number of items uploaded. Maybe we split those in two in the UI, though: uploaded versus written to cache?

Member Author

+1 for separating upload vs. written -- I will file an issue for this.

Contributor

Also, I think showing a warning when a user uploads but nothing is written because of a read-only key would remove a lot of confusion.

return stream.SendAndClose(&bspb.WriteResponse{CommittedSize: committedSize})
}

func remainingUploadSize(stream bspb.ByteStream_WriteServer) (int64, error) {
Member

nit: comment what this function does?

Member Author

Done, and renamed it to make it clearer that it does IO.
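
A sketch of what the renamed helper might look like (the exact name in the PR may differ); it loops over stream.Recv(), which is why the new name should signal that it performs IO:

import (
	"io"

	bspb "google.golang.org/genproto/googleapis/bytestream"
)

// recvRemainingUploadSize reads the rest of the client's Write stream,
// discarding payloads and returning the total number of data bytes
// received. Each iteration performs a network read.
func recvRemainingUploadSize(stream bspb.ByteStream_WriteServer) (int64, error) {
	size := int64(0)
	for {
		req, err := stream.Recv()
		if err == io.EOF {
			return size, nil
		}
		if err != nil {
			return 0, err
		}
		size += int64(len(req.Data))
		if req.FinishWrite {
			return size, nil
		}
	}
}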

//
// There are two cases where we can short-circuit, though:
//
// - If this is an uncompressed stream, we assume that the committed_size will
Member

probably worth commenting on what's happening in the biggest part of the function below, i.e. when this is a compressed stream?

Member Author

Reworded and shortened -- hopefully it's clearer now.
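
To make the short-circuit cases concrete, here is a hedged sketch of the committed-size logic the comment describes; committedSizeForExistingBlob is an illustrative name, and the real function may be structured differently:

// committedSizeForExistingBlob computes the committed_size to report when
// the blob already exists in cache. Illustrative only; names and structure
// are assumptions, not copied from the PR.
func committedSizeForExistingBlob(stream bspb.ByteStream_WriteServer, req *bspb.WriteRequest, compressed bool, digestSize int64) (int64, error) {
	if !compressed {
		// Uncompressed stream: Bazel expects committed_size to equal the
		// digest size, which we already know.
		return digestSize, nil
	}
	if req.FinishWrite {
		// Compressed stream, single message: the entire compressed payload
		// arrived in the first request, so its size is already known.
		return req.WriteOffset + int64(len(req.Data)), nil
	}
	// Compressed stream, multiple messages: the compressed size can't be
	// known up front, so read out the rest of the stream and report the
	// number of bytes the client actually sent.
	remaining, err := recvRemainingUploadSize(stream)
	if err != nil {
		return 0, err
	}
	return req.WriteOffset + int64(len(req.Data)) + remaining, nil
}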

Member

@tylerwilliams tylerwilliams left a comment

LGTM

nice job coming up with a workaround for this on the spot :)

@bduffany bduffany merged commit 1cf409d into master Jan 28, 2022
@bduffany bduffany deleted the committed-size-fix branch January 28, 2022 18:06