models: tolerate empty field names when unpacking binary points #5716

jonseymour · 2016-02-17T06:13:12Z

Prior to #5697 influx crashed with a panic if a binary point contained an empty field name. After #5697, influx entered an infinite loop in the same circumstance. With this commit, we silently elide any field with an empty field name when unmarshaling a binary point. A related PR (#5714) prevents SELECT INTO trying to create binary points with empty field names.

jonseymour · 2016-02-17T06:13:26Z

/cc @jwilder @e-dard

jwilder · 2016-02-17T06:36:58Z

This isn't quite the right way to fix this. A point should never contain fields without a name.

NewPoint should ensure that fields passed in do not have any empty field names and return an error if they do.
AddField should be removed since points should be created fully via NewPoint.
select queries with math that produces columns without a name should not happen. It needs to generate a column name if an alias is not used.

jonseymour · 2016-02-17T06:41:23Z

Ok, np. I'll fix accordingly.

jonseymour · 2016-02-17T06:50:40Z

@jwilder - should I trim the field name of leading and trailing whitespace before testing for equality with the empty string?

jonseymour · 2016-02-17T07:08:45Z

@jwilder - I have implemented the first two suggestions with this series. I'll investigate what's required to fix the queries and may submit that as a separate series. I am going to leave 930fc42 in the series because the existing code didn't handle the case gracefully (it should have scanned to the end of the value buffer before skipping to the next field).

I'll check the result of the CI build later in the night.

e-dard · 2016-02-17T11:51:12Z

@jonseymour thanks for this. I'm down with the following changes: (1) check for empty fields in NewPoint; (2) removing AddFields.

Can you explain why 930fc42 needs to be accepted though? newFieldsFromBinary is only called when retrieving the fields of a Point created via NewPoint, and NewPoint will now check for empty fields.

jonseymour · 2016-02-17T11:59:48Z

@e-dard - the algorithm is just wrong - if the data is corrupt, the existing algorithm an infinite loop is guaranteed, as illustrated by the test cases that begin the series.. With the version I propose, it is guaranteed that the loop will terminate.

updated: e.g. if a corrupt point is read from disk with points.NewPointFromBytes then it might read a data structure which has an empty field name. It is true that the API may not create any new points that are corrupt in this way but it possible that either a) old versions of influx wrote corrupt structures to disk or b) some kind of corruption to disk files might result in an invalid structure. It doesn't cost much to write an algorithm that is guaranteed to terminate, so it seems worth doing when the alternative is one that some times enters an infinite loop due to factors outside of programmer control.

if necessary, I can write a test case that shows how points.NewPointFromBytes can be used to create an infinite loop inside influxd.

e-dard · 2016-02-17T12:03:50Z

@jonseymour which test case are you referring to? Could you throw a link my way? Cheers!

jonseymour · 2016-02-17T12:14:57Z

This commit 23be938. A transcript of executing this test, merged into both #5697 (88937ab) and the commit immediately preceding (b8bb32e) is as follows:

greywedge2:influxdb jonseymour (1)$ cd models
greywedge2:models jonseymour (1)$ git checkout 88937ab0f
HEAD is now at 88937ab... Fixes #5664
greywedge2:models jonseymour (1)$ git merge --no-commit 23be938
Automatic merge went well; stopped before committing as requested
greywedge2:models jonseymour (1)$ go test ./...
--- FAIL: TestAddFieldWithEmptyName (1.00s)
    points_test.go:1790: failed: probable infite loop
FAIL
FAIL    github.com/influxdb/influxdb/models 1.021s
greywedge2:models jonseymour (1)$ git reset --hard HEAD
HEAD is now at 88937ab Fixes #5664
greywedge2:models jonseymour (1)$ git checkout 88937ab0f^1
Previous HEAD position was 88937ab... Fixes #5664
HEAD is now at b8bb32e... Merge pull request #5470 from influxdata/ross-packaging-updates
greywedge2:models jonseymour (1)$ git merge --no-commit 23be938
Updating b8bb32e..23be938
Fast-forward
 models/points_test.go | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)
greywedge2:models jonseymour (1)$ go test ./...
panic: runtime error: index out of range

goroutine 77 [running]:
github.com/influxdb/influxdb/models.scanTo(0x82049e930, 0x8, 0x10, 0x0, 0x82049663d, 0x0, 0x0, 0x82049e920, 0x1)
    /Users/jonseymour/ninja/go/src/github.com/influxdb/influxdb/models/points.go:892 +0xa7
github.com/influxdb/influxdb/models.newFieldsFromBinary(0x82049e930, 0x8, 0x10, 0x0)
    /Users/jonseymour/ninja/go/src/github.com/influxdb/influxdb/models/points.go:1396 +0x108
github.com/influxdb/influxdb/models.(*point).unmarshalBinary(0x82049cb40, 0xffffffffffffffff)
    /Users/jonseymour/ninja/go/src/github.com/influxdb/influxdb/models/points.go:1298 +0x40
github.com/influxdb/influxdb/models.(*point).Fields(0x82049cb40, 0x82049e930)
    /Users/jonseymour/ninja/go/src/github.com/influxdb/influxdb/models/points.go:1211 +0x3a
github.com/influxdb/influxdb/models.(*point).AddField(0x82049cb40, 0x1c7d60, 0x1, 0x15a540, 0x82049e940)
    /Users/jonseymour/ninja/go/src/github.com/influxdb/influxdb/models/points.go:1217 +0x25
github.com/influxdb/influxdb/models_test.TestAddFieldWithEmptyName.func1(0x82049cab0, 0x82049a3c0)
    /Users/jonseymour/ninja/go/src/github.com/influxdb/influxdb/models/points_test.go:1784 +0x35b
created by github.com/influxdb/influxdb/models_test.TestAddFieldWithEmptyName
    /Users/jonseymour/ninja/go/src/github.com/influxdb/influxdb/models/points_test.go:1786 +0x6c

goroutine 1 [chan receive]:
testing.RunTests(0x216df8, 0x2b3980, 0x48, 0x48, 0x101)
    /usr/local/Cellar/go/1.5.3/libexec/src/testing/testing.go:562 +0x8ad
testing.(*M).Run(0x820355ef8, 0x820392280)
    /usr/local/Cellar/go/1.5.3/libexec/src/testing/testing.go:494 +0x70
main.main()
    github.com/influxdb/influxdb/models/_test/_testmain.go:218 +0x116

goroutine 76 [select]:
github.com/influxdb/influxdb/models_test.TestAddFieldWithEmptyName(0x82049cab0)
    /Users/jonseymour/ninja/go/src/github.com/influxdb/influxdb/models/points_test.go:1787 +0x16a
testing.tRunner(0x82049cab0, 0x2b4028)
    /usr/local/Cellar/go/1.5.3/libexec/src/testing/testing.go:456 +0x98
created by testing.RunTests
    /usr/local/Cellar/go/1.5.3/libexec/src/testing/testing.go:561 +0x86d
FAIL    github.com/influxdb/influxdb/models 0.017s

e-dard · 2016-02-17T12:22:41Z

OK cool, I see. But that test is now irrelevant because you're removing AddFields and also doing the check for empty field names inside of NewPoint.

With the changes you have made to NewPoint it should no longer be possible to get to newFieldsFromBinary with an empty field name.

I think 930fc42 can be dropped from the PR. Hope that makes sense?

jonseymour · 2016-02-17T12:29:05Z

As I said, i can create a test case with points.NewPointFromBytes (which does not involve any calls to the now deleted points.AddField) method which will force influxd into an infinite loop if you really want me to prove that the algorithm prior to 930fc42 is vulnerable.

jonseymour · 2016-02-17T14:53:53Z

@e-dard I've created a test case which shows that points.NewPointFromBytes is vulnerable to an infinite loop if corrupt data is read from the binary encoding of the point.

Given that it costs so little to protect the system from an infinite loop caused by a data corruption, there seems to be no sound reason not to accept (8490c6e) (which replaces 930fc42).

jonseymour · 2016-02-17T14:56:40Z

To see the vulnerablity, checkout 9b56025 which does not include the fix and then run the tests in the model package.

e-dard · 2016-02-17T15:29:24Z

models/points_test.go

+
+func TestNewPointsRejectsEmptyFieldNames(t *testing.T) {
+	if _, err := models.NewPoint("foo", nil, models.Fields{"": 1}, time.Now()); err == nil {
+		t.Fatalf("expected error if field with empty field name is created - empty string")


nit: the failure message could be something like got nil, expected error, i.e., it's preferable to write what you got and then what you expected.

e-dard · 2016-02-17T15:29:50Z

@jonseymour Now I see. It wasn't clear to me that this issue relied on a successful call to NewPointFromBytes and then a call to p.Fields...

OK, just a small nit on the test case then squash all these down?

AddField doesn't allow us to report errors, but we need to validate some field names. Since we have to change the interface anyway, we simply insist that all callers use NewPoint, rather than AddField. Per advice from Jason Wilder, see influxdata#5716 (comment) Signed-off-by: Jon Seymour <jon@wildducktheories.com>

jonseymour · 2016-02-19T07:08:16Z

@e-dard I have squashed most of the commits in this series. There is one commit for the tests showing the problems that needed to be fixed and another for the fixes. The RHS of the merge commit merges cleanly with the 0.10.0 branch (which is in the form I need it for my production system). I have added a CHANGELOG update.

jonseymour · 2016-02-19T17:07:10Z

I've removed the CHANGELOG.md update until I get a +1.

jwilder · 2016-02-19T21:49:19Z

models/points.go

@@ -1079,6 +1081,9 @@ func NewPoint(name string, tags Tags, fields Fields, time time.Time) (Point, err
 				return nil, fmt.Errorf("NaN is an unsupported value for field %s", key)
 			}
 		}
+		if len(key) == 0 {
+			return nil, fmt.Errorf("All fields must have non-empty names")


This error text should start be all lowercase.

jwilder · 2016-02-19T21:51:46Z

One small thing, but 👍 otherwise.

…fields Influx does not support fields with empty names or points with no fields. NewPoint is changed to validate that all field names are non-empty. AddField is removed because we now require that all fields are specified on construction. NewPointFromByte is changed to return an error if a unmarshaled binary point does not have any fields. newFieldsFromBinary is changed to prevent an infinite loop that can arise while attempting to parse corrupt binary point data. TestNewPointsWithBytesWithCorruptData is changed to reflect the change in the behaviour of NewPointFromByte. Signed-off-by: Jon Seymour <jon@wildducktheories.com>

RHS merges cleanly with 0.10.0 maintenance branch. Signed-off-by: Jon Seymour <jon@wildducktheories.com>

jonseymour · 2016-02-20T11:26:48Z

@jwilder - done. Thanks for the review!

Note that this series also includes cherry-pick of influxdata#5697. Signed-off-by: Jon Seymour <jon@wildducktheories.com>

e-dard · 2016-02-20T17:08:08Z

LGTM. I'll merge this Monday morning @jonseymour

jonseymour · 2016-02-20T17:11:02Z

Thank you, @e-dard

models: tolerate empty field names when unpacking binary points

Signed-off-by: Jon Seymour <jon@wildducktheories.com>

…d-names-0.10.0 [0.10.2] Backport #5716 to 0.10.0

Fixes influxdata#5664

88937ab

This was referenced Feb 17, 2016

This test illustrates a problem with Points.newFieldsFromBinary. #5712

Closed

Fixes #5664 #5697

Closed

jonseymour force-pushed the js-tolerate-empty-field-names branch from d350ed9 to 8229b6f Compare February 17, 2016 07:04

jonseymour mentioned this pull request Feb 17, 2016

tsdb: prevent fields with empty names being written by SELECT INTO #5714

Closed

jonseymour force-pushed the js-tolerate-empty-field-names branch from f297c5a to 116c704 Compare February 17, 2016 11:42

jonseymour force-pushed the js-tolerate-empty-field-names branch from 116c704 to f92a899 Compare February 17, 2016 14:37

jonseymour force-pushed the js-tolerate-empty-field-names branch from f92a899 to 0186085 Compare February 17, 2016 14:48

jonseymour force-pushed the js-tolerate-empty-field-names branch from 0186085 to 8490c6e Compare February 17, 2016 14:49

e-dard reviewed Feb 17, 2016
View reviewed changes

jonseymour force-pushed the js-tolerate-empty-field-names branch from 8490c6e to d6bf5fc Compare February 17, 2016 16:05

jonseymour force-pushed the js-tolerate-empty-field-names branch from 9e10d35 to 8879d1d Compare February 19, 2016 17:06

jwilder reviewed Feb 19, 2016
View reviewed changes

jwilder added this to the 0.11.0 milestone Feb 19, 2016

jonseymour added 2 commits February 20, 2016 22:22

Merge influxdata#5716

d46e040

RHS merges cleanly with 0.10.0 maintenance branch. Signed-off-by: Jon Seymour <jon@wildducktheories.com>

jonseymour force-pushed the js-tolerate-empty-field-names branch from 8879d1d to 095a35b Compare February 20, 2016 11:26

Update CHANGELOG for influxdata#5716, influxdata#5664

a8877ba

Note that this series also includes cherry-pick of influxdata#5697. Signed-off-by: Jon Seymour <jon@wildducktheories.com>

jonseymour force-pushed the js-tolerate-empty-field-names branch from 095a35b to a8877ba Compare February 20, 2016 16:57

e-dard added a commit that referenced this pull request Feb 22, 2016

Merge pull request #5716 from jonseymour/js-tolerate-empty-field-names

10b9bef

models: tolerate empty field names when unpacking binary points

e-dard merged commit 10b9bef into influxdata:master Feb 22, 2016

This was referenced Mar 4, 2016

v0.10.0 PANIC #5908

Closed

[0.10.2] Backport #5716 to 0.10.0 #5911

Merged

jonseymour added a commit to jonseymour/influxdb that referenced this pull request Mar 8, 2016

Merge influxdata#5716 into 0.10.0

c0e9a45

jonseymour added a commit to jonseymour/influxdb that referenced this pull request Mar 9, 2016

Update CHANGELOG for influxdata#5716 (in 0.10.0 branch)

ffa5f89

Signed-off-by: Jon Seymour <jon@wildducktheories.com>

jwilder added a commit that referenced this pull request Mar 9, 2016

Merge pull request #5911 from jonseymour/jss-5716-tolerate-empty-fiel…

b6f91ea

…d-names-0.10.0 [0.10.2] Backport #5716 to 0.10.0

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

models: tolerate empty field names when unpacking binary points #5716

models: tolerate empty field names when unpacking binary points #5716

jonseymour commented Feb 17, 2016

jonseymour commented Feb 17, 2016

jwilder commented Feb 17, 2016

jonseymour commented Feb 17, 2016

jonseymour commented Feb 17, 2016

jonseymour commented Feb 17, 2016

e-dard commented Feb 17, 2016

jonseymour commented Feb 17, 2016

e-dard commented Feb 17, 2016

jonseymour commented Feb 17, 2016

e-dard commented Feb 17, 2016

jonseymour commented Feb 17, 2016

jonseymour commented Feb 17, 2016

jonseymour commented Feb 17, 2016

e-dard Feb 17, 2016

e-dard commented Feb 17, 2016

jonseymour commented Feb 19, 2016

jonseymour commented Feb 19, 2016

jwilder Feb 19, 2016

jwilder commented Feb 19, 2016

jonseymour commented Feb 20, 2016

e-dard commented Feb 20, 2016

jonseymour commented Feb 20, 2016

models: tolerate empty field names when unpacking binary points #5716

models: tolerate empty field names when unpacking binary points #5716

Conversation

jonseymour commented Feb 17, 2016

jonseymour commented Feb 17, 2016

jwilder commented Feb 17, 2016

jonseymour commented Feb 17, 2016

jonseymour commented Feb 17, 2016

jonseymour commented Feb 17, 2016

e-dard commented Feb 17, 2016

jonseymour commented Feb 17, 2016

e-dard commented Feb 17, 2016

jonseymour commented Feb 17, 2016

e-dard commented Feb 17, 2016

jonseymour commented Feb 17, 2016

jonseymour commented Feb 17, 2016

jonseymour commented Feb 17, 2016

e-dard Feb 17, 2016

Choose a reason for hiding this comment

e-dard commented Feb 17, 2016

jonseymour commented Feb 19, 2016

jonseymour commented Feb 19, 2016

jwilder Feb 19, 2016

Choose a reason for hiding this comment

jwilder commented Feb 19, 2016

jonseymour commented Feb 20, 2016

e-dard commented Feb 20, 2016

jonseymour commented Feb 20, 2016