Batched write support #1644
Conversation
Most definitely a work in progress. It compiles, but that's about it; it is not complete. The corresponding "apply" commands have yet to be coded.
I have also not performed any testing beyond the new unit tests.
Force-pushed from 0304aee to 404b221.
if err != nil {
	return err
}
db.addSeriesToIndex(cm.Name, series)
@benbjohnson @pauldix -- this is in the proposed "apply batched series and field creation" code. This has a race such that if an addition of a series fails to be committed to the Metastore, any changes to the in-memory indexes, due to the calls to db.addSeriesToIndex(), are not rolled back. Pre-existing code also suffered from this issue in relation to fields.
This means that, in a failure scenario, we could end up with different data in memory versus in the Metastore. We can avoid this by cloning the in-RAM indexes for the purposes of the loop, and then swapping the modified clone and the in-RAM authoritative copy at the very end of the function, assuming no errors.
Make sense? Is there another approach? Do we care?
mustUpdate() should cause the server to crash hard, so that when it comes back up it'll rebuild its in-memory index.
Ah yes, indeed it will. Nice.
Another thing to do is to put the update of the in-memory structures after the metastore update, but before it releases the database schema lock.
Force-pushed from 85628c2 to b445457.
This patch can be summarized by two key changes:
Best case is a batch comes in that results in a single buffer, i.e. all points in the batch are destined for the same shard. This would mean a single batch is routed around the cluster in a single message, and that single batch is written to Bolt in a single transaction.
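The routing step this describes can be sketched as grouping points by target shard; in the best case the map has a single entry. The `point` type, `shardFor`, and `groupByShard` are illustrative stand-ins, and the time-based shard lookup is simplified (real shard-group lookup is by time range and retention policy); the single Bolt transaction per shard batch is elided.

```go
package main

import "fmt"

// Illustrative point: series key plus timestamp.
type point struct {
	key  string
	time int64
}

// shardFor is a simplified stand-in for shard-group lookup by time.
func shardFor(p point, shardDuration int64) int64 {
	return p.time / shardDuration
}

// groupByShard buckets a batch by destination shard. One bucket means
// one message around the cluster and one Bolt transaction on disk.
func groupByShard(points []point, shardDuration int64) map[int64][]point {
	batches := make(map[int64][]point)
	for _, p := range points {
		id := shardFor(p, shardDuration)
		batches[id] = append(batches[id], p)
	}
	return batches
}

func main() {
	pts := []point{{"cpu", 10}, {"mem", 20}, {"cpu", 30}}
	batches := groupByShard(pts, 100) // all within one shard window
	fmt.Println(len(batches))         // 1: single buffer, single write
}
```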
@otoolep the changes lgtm. We'll need to handle batching coming from different clients as well in the future, but we can do that in a separate PR. 🚢
for _, p := range points {
	// Local function makes lock management foolproof.
	measurement, series, err := func() (*Measurement, *Series, error) {
		s.mu.RLock()
This lock should be outside the points loop.
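The suggestion is to hoist the lock so it is acquired once per batch rather than once per point. A minimal sketch, with an illustrative `server` type and `lookupAll` helper standing in for the real per-point work:

```go
package main

import (
	"fmt"
	"sync"
)

type server struct {
	mu    sync.RWMutex
	index map[string]int
}

// lookupAll takes the read lock once, outside the points loop, instead
// of locking and unlocking around every point.
func (s *server) lookupAll(keys []string) []int {
	s.mu.RLock() // acquired once for the whole batch
	defer s.mu.RUnlock()

	out := make([]int, 0, len(keys))
	for _, k := range keys {
		out = append(out, s.index[k])
	}
	return out
}

func main() {
	s := &server{index: map[string]int{"cpu": 1, "mem": 2}}
	fmt.Println(s.lookupAll([]string{"cpu", "mem"})) // [1 2]
}
```

Note the trade-off: a batch-wide RLock blocks writers for longer, but avoids repeated lock acquisition overhead and gives the batch a consistent view of the index.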
A few comments, but otherwise looks good.
Full unit tests added for happy paths.
This test passes, but only because it is checking for the wrong results. Once batching is implemented this test will fail (as long as it is unaltered).
Unit tests need updating since some tests are no longer valid.
Using maps was resulting in unpredictable ordering of columns and tags.
Its presence is making Bolt-level batching quite awkward, and since it's not used, just remove it.
In addition, memoize the Field codecs.
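Memoizing the field codecs amounts to building each measurement's codec once and reusing it on subsequent writes, instead of reconstructing it per write. A hedged sketch, where `FieldCodec` and `codecCache` are illustrative stand-ins for the real types:

```go
package main

import "fmt"

// FieldCodec stands in for the real per-measurement field codec.
type FieldCodec struct{ measurement string }

// codecCache memoizes codecs by measurement name.
type codecCache struct {
	codecs map[string]*FieldCodec
	builds int // counts actual constructions, for demonstration
}

// codec returns the cached codec, constructing it only on first use.
func (c *codecCache) codec(measurement string) *FieldCodec {
	if fc, ok := c.codecs[measurement]; ok {
		return fc
	}
	c.builds++
	fc := &FieldCodec{measurement: measurement}
	c.codecs[measurement] = fc
	return fc
}

func main() {
	cache := &codecCache{codecs: make(map[string]*FieldCodec)}
	cache.codec("cpu")
	cache.codec("cpu")
	cache.codec("mem")
	fmt.Println(cache.builds) // 2: "cpu" built once, "mem" once
}
```

In a concurrent server the cache lookup would sit under the same schema lock discussed earlier in the thread, or use its own synchronization.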
Force-pushed from d39db6a to 9c4174a.