-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
storage: send ReplicaState in streaming snapshots #10931
storage: send ReplicaState in streaming snapshots #10931
Conversation
Needs more commenting and some additional performance testing. In my informal testing this speeds up |
Nice! Reviewed 8 of 8 files at r1. pkg/storage/raft.proto, line 132 at r1 (raw file):
pkg/storage/replica_raftstorage.go, line 639 at r1 (raw file):
The next beta is going to require a stop-the-world upgrade anyway, so if this goes in in time I don't think we need this backwards-compatibility mode at all. Comments from Reviewable |
38fc36d
to
cafc2b8
Compare
Added a benchmark and a basic test of write-only batches. I think this is good to go now. PTAL. Review status: 3 of 13 files reviewed at latest revision, 2 unresolved discussions, some commit checks pending. pkg/storage/raft.proto, line 132 at r1 (raw file): Previously, bdarnell (Ben Darnell) wrote…> `ReplicaState` contains a `RangeDescriptor`, which we're already sending in the header. Should we move this into the header and remove the range descriptor field there?Comments from Reviewable |
@tamird Can you take a look at the teamcity test failures? The new benchmarks are failing but they are following the same pattern as existing benchmarks. |
I think you might need to rebase to pick up #10923, #10927, and #10933. Review status: 3 of 13 files reviewed at latest revision, 2 unresolved discussions, some commit checks failed. Comments from Reviewable |
cafc2b8
to
d2c55f2
Compare
I haven't looked in detail, but looks like a hardcoded file name so multiple runs of the benchmark under stress are conflicting with one another. Review status: 3 of 13 files reviewed at latest revision, 2 unresolved discussions, some commit checks failed. Comments from Reviewable |
Why are we trying to run benchmarks under stress? |
Perhaps we shouldn't, but sadly our benchmarks frequently rot, so I On Tue, Nov 22, 2016 at 2:44 PM, Peter Mattis notifications@github.com
|
d2c55f2
to
506b995
Compare
Ok. Fixed these benchmarks to use an in-mem RocksDB as they aren't actually writing anything to disk. |
506b995
to
b56e3c7
Compare
Reviewed 7 of 10 files at r2, 3 of 3 files at r3. pkg/storage/raft.proto, line 113 at r3 (raw file):
Ah, that's subtle. It also seems incorrect to me - we shouldn't leak out the post-add range descriptor until after the transaction is committed. I looked at all the places where Header.RangeDescriptor is used and it looks like we only use the start/end key and range ID; nothing ever looks at the replica set there. So it should be safe to just drop the range_descriptor field and use the one in the state instead. (we could do this in a separate PR, but I'd be inclined to group them together just to minimize the number of separate incompatible changes here) Comments from Reviewable |
LG again, although I'm realizing that we could use some better documentation about write-only and distinct batches. Should write-only imply distinct? Review status: all files reviewed at latest revision, 2 unresolved discussions, all commit checks successful. Comments from Reviewable |
Review status: all files reviewed at latest revision, 2 unresolved discussions, all commit checks successful. pkg/storage/raft.proto, line 113 at r3 (raw file): Previously, bdarnell (Ben Darnell) wrote…> Ah, that's subtle. It also seems incorrect to me - we shouldn't leak out the post-add range descriptor until after the transaction is committed. I looked at all the places where Header.RangeDescriptor is used and it looks like we only use the start/end key and range ID; nothing ever looks at the replica set there. So it should be safe to just drop the range_descriptor field and use the one in the state instead. (we could do this in a separate PR, but I'd be inclined to group them together just to minimize the number of separate incompatible changes here)Comments from Reviewable |
b56e3c7
to
07705f2
Compare
Review status: 12 of 13 files reviewed at latest revision, 2 unresolved discussions, all commit checks successful. pkg/storage/raft.proto, line 113 at r3 (raw file): Previously, petermattis (Peter Mattis) wrote…> We call `Replica.setDescWithProcessUpdate` at the end of `Replica.applySnapshot` using `Header.RangeDescriptor`. Is it correct to change this code to use `ReplicaState.Desc`? Doing so will mean the replica's RangeDescriptor won't contain itself for a brief period of time. Is that ok with the replica GC queue?Comments from Reviewable |
07705f2
to
2dc93bf
Compare
Review status: 7 of 15 files reviewed at latest revision, 2 unresolved discussions. pkg/storage/raft.proto, line 113 at r3 (raw file): Previously, bdarnell (Ben Darnell) wrote…
Ok, I've removed Comments from Reviewable |
Reviewed 3 of 8 files at r1, 5 of 10 files at r2, 1 of 3 files at r3, 9 of 9 files at r4. pkg/storage/replica_raftstorage.go, line 394 at r4 (raw file):
is there any reason this is a pointer? looks like it made sense in an earlier version when the proto field was nullable. pkg/storage/replica_raftstorage.go, line 581 at r4 (raw file):
should this include the expected one? pkg/storage/engine/bench_test.go, line 548 at r4 (raw file):
merp this can just be pkg/storage/engine/engine.go, line 177 at r4 (raw file):
is it possible to have a new, narrower interface for this thing, rather than a bunch of runtime panics? Comments from Reviewable |
2dc93bf
to
b47f17c
Compare
Review status: all files reviewed at latest revision, 6 unresolved discussions, all commit checks successful. pkg/storage/replica_raftstorage.go, line 394 at r4 (raw file): Previously, tamird (Tamir Duberstein) wrote…
No, there is no strong reason for it to be a pointer. There is also no strong reason for it not to be a pointer. pkg/storage/replica_raftstorage.go, line 581 at r4 (raw file): Previously, tamird (Tamir Duberstein) wrote…
The expected range ID is in the context. pkg/storage/engine/bench_test.go, line 548 at r4 (raw file): Previously, tamird (Tamir Duberstein) wrote…
Done. pkg/storage/engine/engine.go, line 177 at r4 (raw file): Previously, tamird (Tamir Duberstein) wrote…
Not easily. I just explored that and it requires a whole bunch of untangling. In particular, even defining a Comments from Reviewable |
Reviewed 1 of 1 files at r5. pkg/storage/replica_raftstorage.go, line 394 at r4 (raw file): Previously, petermattis (Peter Mattis) wrote…
In this case, I think there is - a pointer creates ambiguity because it is possible for it to be nil, and this field should never be nil. pkg/storage/engine/engine.go, line 177 at r4 (raw file): Previously, petermattis (Peter Mattis) wrote…
About what I expected. It'd be good to file an issue, given the number of external contributors we've been enjoying. Comments from Reviewable |
Review status: all files reviewed at latest revision, 4 unresolved discussions, some commit checks pending. pkg/storage/replica_raftstorage.go, line 394 at r4 (raw file): Previously, tamird (Tamir Duberstein) wrote…
Using a value requires copying which is cheap, but not free. It also consumes stack space in this instance. I think we need a better justification than documenting non-nilness. pkg/storage/engine/engine.go, line 177 at r4 (raw file): Previously, tamird (Tamir Duberstein) wrote…
This is low-priority but invasive and possibly destabilizing. Not a good combination. Comments from Reviewable |
b47f17c
to
4ea70c7
Compare
Review status: 14 of 15 files reviewed at latest revision, 4 unresolved discussions, some commit checks pending. pkg/storage/replica_raftstorage.go, line 394 at r4 (raw file): Previously, petermattis (Peter Mattis) wrote…
Isn't this exactly the opposite of the usual wisdom? Do the thing that is clearest to human readers unless benchmarks justify doing otherwise? Comments from Reviewable |
4ea70c7
to
1d36778
Compare
Review status: 14 of 15 files reviewed at latest revision, 4 unresolved discussions, some commit checks pending. pkg/storage/replica_raftstorage.go, line 394 at r4 (raw file): Previously, tamird (Tamir Duberstein) wrote…
#11616 was needed because benchmarks did show that growing the stack was problematic. And the reason our stacks are so large is our pervasive use of values instead of pointers: creating values and passing them on the stack even when a pointer was readily available. This is exactly an instance of that. The caller has a pointer readily available, we should pass it. Comments from Reviewable |
I'm not saying that growing the stack is never problematic, but without
demonstrating that it is problematic in this instance, that argument does
not hold water.
This thread is unlikely to change your mind, so please don't hold off on
proceeding with this change on account of it.
…On Mon, Nov 28, 2016 at 12:13 PM, Peter Mattis ***@***.***> wrote:
Review status: 14 of 15 files reviewed at latest revision, 4 unresolved
discussions, some commit checks pending.
------------------------------
*pkg/storage/replica_raftstorage.go, line 394 at r4
<https://reviewable.io:443/reviews/cockroachdb/cockroach/10931#-KXfiZaLPTP5aS4tp0vn:-KXfumcoMSQOfVY7XmUe:b41wqw5>
(raw file
<https://github.com/cockroachdb/cockroach/blob/2dc93bfb88f6693442e57e42e426a2ebe5df2987/pkg/storage/replica_raftstorage.go#L394>):*
*Previously, tamird (Tamir Duberstein) wrote…*
Isn't this exactly the opposite of the usual wisdom? Do the thing that is
clearest to human readers unless benchmarks justify doing otherwise?
#11616 <#11616> was needed
because benchmarks did show that growing the stack was problematic. And the
reason our stacks are so large is our pervasive use of values instead of
pointers: creating values and passing them on the stack even when a pointer
was readily available. This is exactly an instance of that. The caller has
a pointer readily available, we should pass it.
------------------------------
*Comments from Reviewable
<https://reviewable.io:443/reviews/cockroachdb/cockroach/10931>*
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#10931 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABdsPGtHD95kXCmUXHKfbk2AAD4M4B6tks5rCwvCgaJpZM4K45h->
.
|
Justifying every use of pointer vs value via a benchmark will ensure that we'll almost always use values. |
...that's not an argument - either we make evidence-based decisions, or we
do not.
…On Mon, Nov 28, 2016 at 12:29 PM, Peter Mattis ***@***.***> wrote:
I'm not saying that growing the stack is never problematic, but without
demonstrating that it is problematic in this instance, that argument does
not hold water.
Justifying every use of pointer vs value via a benchmark will ensure that
we'll almost always use values.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#10931 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABdsPG_iR1asMb-0d0vtOoU4Dx1hD8peks5rCw99gaJpZM4K45h->
.
|
I don't think decision making is as black and white as that. We use evidence to make some decisions and then fold that evidence and those decisions into an internal model and heuristics that we use to make other decisions. We certainly don't base all of our decisions on evidence. I'm curious what evidence you would use to justify changing |
On Mon, Nov 28, 2016 at 1:06 PM, Peter Mattis ***@***.***> wrote:
I'm curious what evidence you would use to justify changing
IncomingSnapshot.State from a pointer to a value.
My argument is based on the assumption that the "default" is a to use a
value. At least, it is my recollection that this is the best practice
documented in some document authored by the Go maintainers. It's a wise
default because it more strongly encodes programmer intent than does a
never-nil pointer.
If we agree on the above, then we need justification to deviate from that
default, and if we do not, then we need to adjust that default. The problem
with making decisions based on internal models and heuristics is that they
create FUD - someone will come along at some point and wonder why that
thing is a pointer, and for them to discover that it was because of this
stack growth problem (or worse yet, suspicion thereof) will be completely
impossible.
|
You're probably thinking of https://github.com/golang/go/wiki/CodeReviewComments#pass-values:
I'd be interested if there is other wisdom from the Go maintainers you're thinking of. Notice that last sentence. It is an important caveat. |
Yeah, I re-read this thing when we were discussing pointer-vs-value in
Bram's recent PR. I recall another document that specifically mentioned
using a value for clarity, but I couldn't find it the other day.
In any case, if you choose to leave this as-is, the contract (never nil)
and the intent (saving stack space) should be clearly documented.
…On Mon, Nov 28, 2016 at 1:17 PM, Peter Mattis ***@***.***> wrote:
My argument is based on the assumption that the "default" is a to use a
value. At least, it is my recollection that this is the best practice
documented in some document authored by the Go maintainers. It's a wise
default because it more strongly encodes programmer intent than does a
never-nil pointer.
You're probably thinking of https://github.com/golang/go/
wiki/CodeReviewComments#pass-values:
Don't pass pointers as function arguments just to save a few bytes. If a
function refers to its argument x only as *x throughout, then the argument
shouldn't be a pointer. Common instances of this include passing a pointer
to a string (*string) or a pointer to an interface value (*io.Reader). In
both cases the value itself is a fixed size and can be passed directly.
This advice does not apply to large structs, or even small structs that
might grow.
I'd be interested if there is other wisdom from the Go maintainers you're
thinking of. Notice that last sentence. It is an important caveat.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#10931 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABdsPHm3HW3R7t02rXqAdAmopoRKUmIlks5rCxrXgaJpZM4K45h->
.
|
Reviewed 8 of 9 files at r4, 1 of 1 files at r5, 1 of 1 files at r6. pkg/storage/raft.proto, line 107 at r6 (raw file):
Nit: we don't generally leave comments about what reserved tags were formerly used for. Comments from Reviewable |
1d36778
to
520b7ed
Compare
Review status: 11 of 15 files reviewed at latest revision, 5 unresolved discussions, some commit checks pending. pkg/storage/raft.proto, line 107 at r6 (raw file): Previously, bdarnell (Ben Darnell) wrote…
Removed. Comments from Reviewable |
Reviewed 1 of 1 files at r6, 4 of 4 files at r7. pkg/storage/replica_raftstorage.go, line 394 at r7 (raw file):
Comments from Reviewable |
This removes the need to read ReplicaState from the batch when applying a snapshot, allowing the use of a write-only RocksDB batch. The write-only RocksDB batch does not index keys on insertion/deletion which is only necessary if those keys are read. This change requires a stop-the-world upgrade: new nodes will not be able to received snapshots generated by old nodes. name time/op BatchApplyBatchRepr-8 200ms ± 1% WriteOnlyBatchApplyBatchRepr-8 99.1ms ± 1% name speed BatchApplyBatchRepr-8 180MB/s ± 1% WriteOnlyBatchApplyBatchRepr-8 363MB/s ± 1% Fixes cockroachdb#10783
520b7ed
to
85f492b
Compare
Reviewed 1 of 1 files at r8. Comments from Reviewable |
This removes the need to read ReplicaState from the batch when applying
a snapshot, allowing the use of a write-only RocksDB batch. The
write-only RocksDB batch does not index keys on insertion/deletion which
is only necessary if those keys are read.
Fixes #10783
This change is