
Batch API in storage extension #4145

Merged (10 commits) on Aug 9, 2021

Conversation

@pmm-sumo (Contributor) commented Jul 9, 2021

Description:

Implementation of batch storage API, as discussed in open-telemetry/opentelemetry-collector#3274

Link to tracking Issue: N/A

Testing: Unit tests added

Documentation: README.md updated

Signed-off-by: Przemyslaw Maciolek pmaciolek@sumologic.com

@pmm-sumo (Contributor, Author) commented Jul 9, 2021

@djaglowski @tigrannajaryan here is the initial proposal of the batch API changes. I included it only in contrib for now, as this made it easier to see the scope of the changes.

@pmm-sumo pmm-sumo force-pushed the storage-batch-api branch 3 times, most recently from 320a01d to 5859873 on July 12, 2021 12:39
@tigrannajaryan (Member)

Thanks @pmm-sumo

A couple questions:

  1. Do we need to keep the non-batch versions of Get/Set/Delete, or can we just have GetBatch/SetBatch/DeleteBatch and let the caller pass a single key/value if that's what they want?
  2. Do we think these batch operations are sufficient, and there is no need for batching different types of operations? If I remember correctly, one of the codepaths in the persistent queue needs to execute Set and Delete, which this PR's enhancement does not help with.

@pmm-sumo (Contributor, Author) commented Jul 12, 2021

Thank you @tigrannajaryan for the good questions!

  1. Do we need to keep the non-batch versions of Get/Set/Delete, or can we just have GetBatch/SetBatch/DeleteBatch and let the caller pass a single key/value if that's what they want?

I left them in for convenience, though since they call GetBatch/SetBatch/DeleteBatch underneath, perhaps they could be removed and replaced with the Batch alternatives.

  2. Do we think these batch operations are sufficient, and there is no need for batching different types of operations? If I remember correctly, one of the codepaths in the persistent queue needs to execute Set and Delete, which this PR's enhancement does not help with.

Set and Delete can actually both be achieved using SetBatch currently (setting a value to nil removes the key). This might be somewhat controversial, and I wondered if BatchEntry should be extended to specify the operation type explicitly. However, when using Get/GetBatch it's impossible to tell whether a nil value is due to a non-existent key or because the value was set to nil (which I think is OK).

Also, I was wondering whether this could be made more generic, with closures (or similar) passed in to allow for more complex operations within a single transaction. However, I think this would make things much more complex for relatively little benefit.
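
For reference, here is a rough sketch of the batch-oriented interface being discussed; the BatchEntry type and exact signatures are illustrative assumptions, not the final storage API:

package sketch

import "context"

// BatchEntry and the method set below are illustrative assumptions based on
// the discussion, not the merged interface.
type BatchEntry struct {
	Key   string
	Value []byte // a nil Value would mean "delete this key"
}

type Client interface {
	// Single-item helpers kept for convenience; they can delegate to the
	// batch variants underneath.
	Get(ctx context.Context, key string) ([]byte, error)
	Set(ctx context.Context, key string, value []byte) error
	Delete(ctx context.Context, key string) error

	// Batch variants, executed as a single transaction where the backend
	// (e.g. bbolt) supports it.
	GetBatch(ctx context.Context, keys []string) ([][]byte, error)
	SetBatch(ctx context.Context, entries []BatchEntry) error
	DeleteBatch(ctx context.Context, keys []string) error
}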

@pmm-sumo (Contributor, Author)

Also, I am wondering if this method:

GetBatch(context.Context, []string) ([][]byte, error)

should instead be changed to:

GetBatch(context.Context, []string) ([]BatchEntry, error)

so that it would perhaps be easier to use the results.

@djaglowski (Member)

This generally looks good to me, but I'm a little unsure about the usage of BatchEntry. At a high level, the storage.Client interface provides access to a key/value store. []BatchEntry would seem to allow for collisions between keys. Do you think that map[string][]byte might more naturally fit the model that the interface supports?

If we do stick with []BatchEntry, then I think we need to clarify the expected behavior when there is a collision between keys, and we have to accept that it is up to each implementation to adhere to this clarification.
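
As a rough illustration, the map-based alternative might look like this (the simplified Client interface and signature here are assumptions, not the merged API):

package sketch

import "context"

// Sketch of the map-based variant; the interface and signature are
// assumptions for illustration only.
type Client interface {
	SetBatch(ctx context.Context, entries map[string][]byte) error
}

// storeTwo shows a call site: a map cannot hold duplicate keys, so any key
// collision is resolved before the storage client ever sees the batch.
func storeTwo(ctx context.Context, c Client) error {
	return c.SetBatch(ctx, map[string][]byte{
		"spool/0": []byte("payload-0"),
		"spool/1": []byte("payload-1"),
	})
}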

@pmm-sumo (Contributor, Author)

That's a good argument, @djaglowski, for using map[string][]byte; let me update the PR.

@pmm-sumo (Contributor, Author) commented Jul 12, 2021

I updated the implementation. There seems to be a slight penalty when using the map, though.

UPDATE: I checked, and the difference was due to the old benchmark incorrectly setting the same key in a batch. After fixing this, the difference was gone.

Using BatchEntry:

BenchmarkClientSet100
BenchmarkClientSet100-16    	   37966	     27505 ns/op

UPDATE: Using BatchEntry after benchmark fix:

BenchmarkClientSet100
BenchmarkClientSet100-16    	   19496	     52518 ns/op

Using map[string][]byte

BenchmarkClientSet100
BenchmarkClientSet100-16    	   22572	     54249 ns/op

@djaglowski (Member)

Do we need to keep the non-batch versions of Get/Set/Delete, or can we just have GetBatch/SetBatch/DeleteBatch and let the caller pass a single key/value if that's what they want?

I am slightly in favor of keeping both sets:

  1. Convenience / readability. If I want to set one value, Set(ctx, key, val) is a lot nicer than SetBatch(ctx, map[string][]byte{ key: val })
  2. Some storage mechanisms may be better aligned with one set or the other. The author of an implementation can document whether there is a performance benefit to batching, or some other notable advantage one way or the other. And it's easy enough for the author to call one from the other as appropriate, as in the sketch below.
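
A minimal sketch of that delegation, assuming the map-based SetBatch signature discussed above (illustrative only, not code from this PR):

package sketch

import "context"

// client is a stand-in storage client that implements the convenience
// single-item Set by delegating to the batch variant.
type client struct{ /* backend handle elided */ }

func (c *client) SetBatch(ctx context.Context, entries map[string][]byte) error {
	// ...write all entries in one transaction...
	return nil
}

// Set keeps the nicer call site while reusing the batch path underneath.
func (c *client) Set(ctx context.Context, key string, value []byte) error {
	return c.SetBatch(ctx, map[string][]byte{key: value})
}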

@pmm-sumo (Contributor, Author)

Actually, I see a need to be able to do Get and Delete at the same time. Perhaps the simplest API (without introducing new structs) would be something like:

GetSetBatch(context.Context, []string, map[string][]byte) ([][]byte, error)

@tigrannajaryan (Member)

Set and Delete can actually both be achieved using SetBatch currently (setting a value to nil removes the key).

Can we make this a formal requirement for the SetBatch API? It does not seem to create a major difficulty for storage implementers. If we do this, then I think we will have covered all the use cases seen in open-telemetry/opentelemetry-collector#3274.

The rest looks good to me.

@tigrannajaryan (Member)

Actually, I see a need to be able to do Get and Delete at the same time.

Can you describe a use case where this is needed? Do we have one in the persistent queue implementation?

Perhaps the simplest API (without introducing new structs) would be something like:

GetSetBatch(context.Context, []string, map[string][]byte) ([][]byte, error)

What is the ordering of operations here? Will this first do all Gets and then all Sets/Deletes?

@djaglowski (Member)

@pmm-sumo You mentioned Get and Delete, but the signature says GetSet. Also, Tigran previously mentioned Set and Delete, so it seems we've named all three possible pairs of Get/Set/Delete.

Would you mind clarifying what we're trying to support?

@pmm-sumo (Contributor, Author) commented Jul 14, 2021

When an item is pulled from the queue, it is retrieved and deleted at the same time (so Get and Delete operations). Since SetBatch already allows doing Set/Delete, and we need Get as well, I thought we could maybe name it GetSetBatch.

However, now that I think about it, I believe we could replace GetBatch, SetBatch and DeleteBatch with something like:

Batch(context.Context, []string, map[string][]byte) ([][]byte, error)

This could handle all three types of operations. The order would follow the arguments (gets first, sets second). For bbolt, this would be executed as a single transaction. In the above example, delete is achieved by setting a key's value to nil. I think having an explicit separate argument for delete might bring some confusion: it is unclear what to do when a key is both set and deleted, and there is a general risk of messing things up (not that delete-through-set-to-nil avoids that entirely, but I find it less confusing).

An alternative approach might be creating a struct, like:

type Operation struct {
  Key string
  Value []byte // get result would be put here
  Type int // 0: get, 1: set, 2: delete
}

And then:

ExecuteBatch(ctx context.Context, ops []Operation) ([]Operation, error)
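
To make this concrete, here is a hypothetical call site for the persistent-queue case mentioned above; the type constants, Client interface, and popItem helper are illustrative assumptions, not code from this PR:

package sketch

import "context"

// Illustrative only: these constants and types follow the proposal above.
const (
	opGet = iota
	opSet
	opDelete
)

type Operation struct {
	Key   string
	Value []byte // filled in by the storage client for get operations
	Type  int    // 0: get, 1: set, 2: delete
}

type Client interface {
	ExecuteBatch(ctx context.Context, ops []Operation) ([]Operation, error)
}

// popItem sketches the persistent-queue case: read a key and delete it in one
// batch (and, for bbolt, one transaction).
func popItem(ctx context.Context, c Client, key string) ([]byte, error) {
	ops := []Operation{
		{Key: key, Type: opGet},
		{Key: key, Type: opDelete},
	}
	results, err := c.ExecuteBatch(ctx, ops)
	if err != nil {
		return nil, err
	}
	return results[0].Value, nil
}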

@djaglowski (Member) left a comment:

The Batch method is almost too clever to be easily understood. It's nice that all of this can be executed in a single transaction, but I wonder if it's not complicating things a bit more than necessary.

I think that one of two things must be true if we stick with the unified Batch method:

  1. A rule is provided that explains the expected order of operations.
  2. The order of operations is ambiguous.

As I understand it, you've suggested that Get will be performed first, and then Set/Delete. This is clear enough, but then there is actually not a lot of difference between this and just calling GetBatch and SetDeleteBatch consecutively. Of course, it would require independent transactions, but I wonder if this is really a problem. If not, I think the two-method solution is a lot more intuitive than the unified one.

If we do need to support all in one transaction, then I think the Operation struct would be a lot more clear, though perhaps less performant.

extension/storage/README.md (two resolved review comments)
@pmm-sumo (Contributor, Author)

The Batch method is almost too clever to be easily understood. It's nice that all of this can be executed in a single transaction, but I wonder if it's not complicating things a bit more than necessary.

Yeah, this is worrying me too

If we do need to support all in one transaction, then I think the Operation struct would be a lot more clear, though perhaps less performant.

Actually, I ran quick tests and it seems the impact is not noticeable (I ran the benchmarks a couple of times and the results were always in the same ballpark).

BenchmarkClientGet100
BenchmarkClientGet100-16              	   66300	     16842 ns/op
BenchmarkClientGet100_Operation
BenchmarkClientGet100_Operation-16    	   68199	     15859 ns/op
BenchmarkClientSet100
BenchmarkClientSet100-16              	   22588	     54377 ns/op
BenchmarkClientSet100_Operation
BenchmarkClientSet100_Operation-16    	   23744	     49903 ns/op

Let me update the PR with Operation, since I think it will be the clearest option (and the order will be specified by the position in the array).

@pmm-sumo pmm-sumo force-pushed the storage-batch-api branch from 0a09ae6 to e064f6e on July 19, 2021 11:34
@pmm-sumo (Contributor, Author)

@djaglowski @tigrannajaryan I have updated this PR with the new approach for Batch operations. I have followed up with an update to the Persistent Storage PR.

@djaglowski (Member)

@pmm-sumo, This looks good to me.

I want to articulate one possible drawback of this design, but I'll preface this by saying that I can't find any concrete examples of the concern I have, so I'm not sure we need to put much weight behind it. Curious what others think though.

The bbolt implementation looks simple enough, but I believe this may be due to the fact that bbolt supports arbitrary code execution in transactions.

I wonder if a more restricted API might struggle to take advantage of batching when given an arbitrary sequence of operations. For example, suppose we need to execute the sequence [Get, Set, Delete, Get, Set, Delete] using an API that looks very much like our initial design (BatchGet, BatchSet, and BatchDelete). In that case, the implementation would need to make a call for each operation, which means it cannot really benefit from this batching API.

Again, I think the tradeoff is probably worth it, but I want to make sure we are considering this point before locking in the interface.

extension/storage/storage.go
Type int
}

type Operation *operation
Contributor commented:
Is there a specific reason to make this a "pointer type"? Pointers are "less to move around", but they create indirection (and obviously put pressure on the GC), and I wonder if this was benchmarked against using a plain (non-pointer) type?

@pmm-sumo (Contributor, Author) commented Jul 21, 2021:

This is because for `Get`, the Value is updated in place. The version with pointers actually had slightly better throughput when I benchmarked it locally.

Member commented:

Nit: can the update-in-place be achieved by making just the Value field a pointer *[]byte while passing Operation by value (and thus reducing GC pressure for Set and Delete cases)?

@pmm-sumo (Contributor, Author) commented:

Nit: can the update-in-place be achieved by making just the Value field a pointer *[]byte while passing Operation by value (and thus reducing GC pressure for Set and Delete cases)?

Perhaps Go experts could provide a more accurate explanation, but my understanding is that it would still not work due to the way Operation is passed. The update would only be present in the local context (within the function where the update was made). I actually tried this, and that was the outcome.

If the collection of operations were sent as a slice (rather than as variadic function arguments) and the result were then taken from that slice, it would work, but I think it would also make things more complex.
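
A toy illustration of that copy behaviour (made-up names, not the actual storage code):

package sketch

// op stands in for the Operation struct; this only shows why passing structs
// by value through a variadic parameter hides the update from the caller.
type op struct {
	Key   string
	Value []byte
}

// fillByValue mutates copies: each argument is copied into the variadic
// slice, so the caller's original struct is left untouched.
func fillByValue(ops ...op) {
	for i := range ops {
		ops[i].Value = []byte("result")
	}
}

// fillByPointer mutates the caller's structs through the pointers.
func fillByPointer(ops ...*op) {
	for _, o := range ops {
		o.Value = []byte("result")
	}
}

func example() {
	a := op{Key: "k"}
	fillByValue(a)    // a.Value is still nil afterwards
	fillByPointer(&a) // a.Value is now []byte("result")
}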

Comment on lines 70 to 73
// Key specifies key which is going to be get/set/deleted
Key string
Contributor commented:

Food for thought:

Making this a string will force copying the string in many places, whereas making it a []byte would not. I'm talking about places like these:

...
bucket.Get([]byte(op.Key))
...

This won't be a huge difference but might save some allocations (it depends on the priority: interface ease of use or performance).

In the end, bbolt's implementation accepts []byte as the key.

Doing an ad hoc benchmark on my machine, the baseline:

goos: darwin
goarch: amd64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/extension/storage/filestorage
cpu: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
BenchmarkClientGet
BenchmarkClientGet-16             254341              9822 ns/op            5433 B/op         29 allocs/op
BenchmarkClientGet100
BenchmarkClientGet100-16          136087             15871 ns/op            7816 B/op        128 allocs/op
BenchmarkClientSet
BenchmarkClientSet-16             203229             11949 ns/op            6276 B/op         42 allocs/op
BenchmarkClientSet100
BenchmarkClientSet100-16           47556             47913 ns/op           18340 B/op        343 allocs/op
BenchmarkClientDelete
BenchmarkClientDelete-16          254775              9744 ns/op            5433 B/op         29 allocs/op
PASS
ok      github.com/open-telemetry/opentelemetry-collector-contrib/extension/storage/filestorage 13.301s

version with ... key []byte, ...

goos: darwin
goarch: amd64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/extension/storage/filestorage
cpu: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
BenchmarkClientGet
BenchmarkClientGet-16             237909              9203 ns/op            5434 B/op         29 allocs/op
BenchmarkClientGet100
BenchmarkClientGet100-16          158772             15487 ns/op            7816 B/op        128 allocs/op
BenchmarkClientSet
BenchmarkClientSet-16             194180             12017 ns/op            6268 B/op         41 allocs/op
BenchmarkClientSet100
BenchmarkClientSet100-16           49833             45893 ns/op           16739 B/op        243 allocs/op
BenchmarkClientDelete
BenchmarkClientDelete-16          252370              9527 ns/op            5433 B/op         29 allocs/op
PASS
ok      github.com/open-telemetry/opentelemetry-collector-contrib/extension/storage/filestorage 17.786s

which gives the following results:

benchstat -delta-test none old.txt new.txt
name             old time/op    new time/op    delta
ClientGet-16       9.82µs ± 0%    9.20µs ± 0%   -6.30%
ClientGet100-16    15.9µs ± 0%    15.5µs ± 0%   -2.42%
ClientSet-16       11.9µs ± 0%    12.0µs ± 0%   +0.57%
ClientSet100-16    47.9µs ± 0%    45.9µs ± 0%   -4.22%
ClientDelete-16    9.74µs ± 0%    9.53µs ± 0%   -2.23%

name             old alloc/op   new alloc/op   delta
ClientGet-16       5.43kB ± 0%    5.43kB ± 0%   +0.02%
ClientGet100-16    7.82kB ± 0%    7.82kB ± 0%    0.00%
ClientSet-16       6.28kB ± 0%    6.27kB ± 0%   -0.13%
ClientSet100-16    18.3kB ± 0%    16.7kB ± 0%   -8.73%
ClientDelete-16    5.43kB ± 0%    5.43kB ± 0%    0.00%

name             old allocs/op  new allocs/op  delta
ClientGet-16         29.0 ± 0%      29.0 ± 0%    0.00%
ClientGet100-16       128 ± 0%       128 ± 0%    0.00%
ClientSet-16         42.0 ± 0%      41.0 ± 0%   -2.38%
ClientSet100-16       343 ± 0%       243 ± 0%  -29.15%
ClientDelete-16      29.0 ± 0%      29.0 ± 0%    0.00%

ref: https://stackoverflow.com/a/43470344
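
For illustration, the two variants being compared look roughly like this; the helper functions are made up, only bbolt's Bucket.Get is real:

package sketch

import bolt "go.etcd.io/bbolt"

// getWithStringKey needs a string -> []byte conversion, which may allocate
// and copy the key on every call.
func getWithStringKey(bucket *bolt.Bucket, key string) []byte {
	return bucket.Get([]byte(key))
}

// getWithByteKey passes the key straight through to bbolt, avoiding the
// conversion allocation.
func getWithByteKey(bucket *bolt.Bucket, key []byte) []byte {
	return bucket.Get(key)
}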

@pmm-sumo (Contributor, Author) commented:

That's a good find. I think it's largely a question of what the API should accept. Keeping keys as strings simplifies a lot of things, though making them []byte would open up new possibilities.

@tigrannajaryan (Member) left a comment:

LGTM. I think the API this introduces strikes a good balance between ease of use, performance, and ease of implementing a storage backend.

extension/storage/filestorage/client.go (resolved review comment)
@pmm-sumo pmm-sumo force-pushed the storage-batch-api branch 3 times, most recently from 3b3e4ff to 43fab71 on August 4, 2021 16:58
@pmm-sumo pmm-sumo force-pushed the storage-batch-api branch from 43fab71 to 0f87604 on August 9, 2021 08:32
@tigrannajaryan (Member)

@pmm-sumo do you plan to update this to use the new core storage API?

@pmm-sumo (Contributor, Author) commented Aug 9, 2021

@pmm-sumo do you plan to update this to use the new core storage API?

@tigrannajaryan Sure, I have kept working on the same codebase and was waiting for the core changes to be migrated to contrib; I have just rebased the PR.

@pmm-sumo pmm-sumo marked this pull request as ready for review August 9, 2021 09:19
@pmm-sumo pmm-sumo requested review from a team and dmitryax August 9, 2021 09:19
@tigrannajaryan (Member) left a comment:

LGTM

@tigrannajaryan tigrannajaryan merged commit cfb95d0 into open-telemetry:main Aug 9, 2021