Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

models: tolerate empty field names when unpacking binary points #5716

Merged
merged 5 commits into from
Feb 22, 2016

Conversation

jonseymour
Copy link
Contributor

Prior to #5697 influx crashed with a panic if a binary point contained an empty field name. After #5697, influx entered an infinite loop in the same circumstance. With this commit, we silently elide any field with an empty field name when unmarshaling a binary point. A related PR (#5714) prevents SELECT INTO trying to create binary points with empty field names.

@jonseymour
Copy link
Contributor Author

/cc @jwilder @e-dard

@jwilder
Copy link
Contributor

jwilder commented Feb 17, 2016

This isn't quite the right way to fix this. A point should never contain fields without a name.

  • NewPoint should ensure that fields passed in do not have any empty field names and return an error if they do.
  • AddField should be removed since points should be created fully via NewPoint.
  • select queries with math that produces columns without a name should not happen. It needs to generate a column name if an alias is not used.

@jonseymour
Copy link
Contributor Author

Ok, np. I'll fix accordingly.

@jonseymour
Copy link
Contributor Author

@jwilder - should I trim the field name of leading and trailing whitespace before testing for equality with the empty string?

@jonseymour jonseymour force-pushed the js-tolerate-empty-field-names branch from d350ed9 to 8229b6f Compare February 17, 2016 07:04
@jonseymour
Copy link
Contributor Author

@jwilder - I have implemented the first two suggestions with this series. I'll investigate what's required to fix the queries and may submit that as a separate series. I am going to leave 930fc42 in the series because the existing code didn't handle the case gracefully (it should have scanned to the end of the value buffer before skipping to the next field).

I'll check the result of the CI build later in the night.

@e-dard
Copy link
Contributor

e-dard commented Feb 17, 2016

@jonseymour thanks for this. I'm down with the following changes: (1) check for empty fields in NewPoint; (2) removing AddFields.

Can you explain why 930fc42 needs to be accepted though? newFieldsFromBinary is only called when retrieving the fields of a Point created via NewPoint, and NewPoint will now check for empty fields.

@jonseymour
Copy link
Contributor Author

@e-dard - the algorithm is just wrong - if the data is corrupt, the existing algorithm an infinite loop is guaranteed, as illustrated by the test cases that begin the series.. With the version I propose, it is guaranteed that the loop will terminate.

updated: e.g. if a corrupt point is read from disk with points.NewPointFromBytes then it might read a data structure which has an empty field name. It is true that the API may not create any new points that are corrupt in this way but it possible that either a) old versions of influx wrote corrupt structures to disk or b) some kind of corruption to disk files might result in an invalid structure. It doesn't cost much to write an algorithm that is guaranteed to terminate, so it seems worth doing when the alternative is one that some times enters an infinite loop due to factors outside of programmer control.

if necessary, I can write a test case that shows how points.NewPointFromBytes can be used to create an infinite loop inside influxd.

@e-dard
Copy link
Contributor

e-dard commented Feb 17, 2016

@jonseymour which test case are you referring to? Could you throw a link my way? Cheers!

@jonseymour
Copy link
Contributor Author

This commit 23be938. A transcript of executing this test, merged into both #5697 (88937ab) and the commit immediately preceding (b8bb32e) is as follows:

greywedge2:influxdb jonseymour (1)$ cd models
greywedge2:models jonseymour (1)$ git checkout 88937ab0f
HEAD is now at 88937ab... Fixes #5664
greywedge2:models jonseymour (1)$ git merge --no-commit 23be938
Automatic merge went well; stopped before committing as requested
greywedge2:models jonseymour (1)$ go test ./...
--- FAIL: TestAddFieldWithEmptyName (1.00s)
    points_test.go:1790: failed: probable infite loop
FAIL
FAIL    github.com/influxdb/influxdb/models 1.021s
greywedge2:models jonseymour (1)$ git reset --hard HEAD
HEAD is now at 88937ab Fixes #5664
greywedge2:models jonseymour (1)$ git checkout 88937ab0f^1
Previous HEAD position was 88937ab... Fixes #5664
HEAD is now at b8bb32e... Merge pull request #5470 from influxdata/ross-packaging-updates
greywedge2:models jonseymour (1)$ git merge --no-commit 23be938
Updating b8bb32e..23be938
Fast-forward
 models/points_test.go | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)
greywedge2:models jonseymour (1)$ go test ./...
panic: runtime error: index out of range

goroutine 77 [running]:
github.com/influxdb/influxdb/models.scanTo(0x82049e930, 0x8, 0x10, 0x0, 0x82049663d, 0x0, 0x0, 0x82049e920, 0x1)
    /Users/jonseymour/ninja/go/src/github.com/influxdb/influxdb/models/points.go:892 +0xa7
github.com/influxdb/influxdb/models.newFieldsFromBinary(0x82049e930, 0x8, 0x10, 0x0)
    /Users/jonseymour/ninja/go/src/github.com/influxdb/influxdb/models/points.go:1396 +0x108
github.com/influxdb/influxdb/models.(*point).unmarshalBinary(0x82049cb40, 0xffffffffffffffff)
    /Users/jonseymour/ninja/go/src/github.com/influxdb/influxdb/models/points.go:1298 +0x40
github.com/influxdb/influxdb/models.(*point).Fields(0x82049cb40, 0x82049e930)
    /Users/jonseymour/ninja/go/src/github.com/influxdb/influxdb/models/points.go:1211 +0x3a
github.com/influxdb/influxdb/models.(*point).AddField(0x82049cb40, 0x1c7d60, 0x1, 0x15a540, 0x82049e940)
    /Users/jonseymour/ninja/go/src/github.com/influxdb/influxdb/models/points.go:1217 +0x25
github.com/influxdb/influxdb/models_test.TestAddFieldWithEmptyName.func1(0x82049cab0, 0x82049a3c0)
    /Users/jonseymour/ninja/go/src/github.com/influxdb/influxdb/models/points_test.go:1784 +0x35b
created by github.com/influxdb/influxdb/models_test.TestAddFieldWithEmptyName
    /Users/jonseymour/ninja/go/src/github.com/influxdb/influxdb/models/points_test.go:1786 +0x6c

goroutine 1 [chan receive]:
testing.RunTests(0x216df8, 0x2b3980, 0x48, 0x48, 0x101)
    /usr/local/Cellar/go/1.5.3/libexec/src/testing/testing.go:562 +0x8ad
testing.(*M).Run(0x820355ef8, 0x820392280)
    /usr/local/Cellar/go/1.5.3/libexec/src/testing/testing.go:494 +0x70
main.main()
    github.com/influxdb/influxdb/models/_test/_testmain.go:218 +0x116

goroutine 76 [select]:
github.com/influxdb/influxdb/models_test.TestAddFieldWithEmptyName(0x82049cab0)
    /Users/jonseymour/ninja/go/src/github.com/influxdb/influxdb/models/points_test.go:1787 +0x16a
testing.tRunner(0x82049cab0, 0x2b4028)
    /usr/local/Cellar/go/1.5.3/libexec/src/testing/testing.go:456 +0x98
created by testing.RunTests
    /usr/local/Cellar/go/1.5.3/libexec/src/testing/testing.go:561 +0x86d
FAIL    github.com/influxdb/influxdb/models 0.017s

@e-dard
Copy link
Contributor

e-dard commented Feb 17, 2016

OK cool, I see. But that test is now irrelevant because you're removing AddFields and also doing the check for empty field names inside of NewPoint.

With the changes you have made to NewPoint it should no longer be possible to get to newFieldsFromBinary with an empty field name.

I think 930fc42 can be dropped from the PR. Hope that makes sense?

@jonseymour
Copy link
Contributor Author

As I said, i can create a test case with points.NewPointFromBytes (which does not involve any calls to the now deleted points.AddField) method which will force influxd into an infinite loop if you really want me to prove that the algorithm prior to 930fc42 is vulnerable.

@jonseymour jonseymour force-pushed the js-tolerate-empty-field-names branch from 116c704 to f92a899 Compare February 17, 2016 14:37
@jonseymour jonseymour force-pushed the js-tolerate-empty-field-names branch from f92a899 to 0186085 Compare February 17, 2016 14:48
@jonseymour jonseymour force-pushed the js-tolerate-empty-field-names branch from 0186085 to 8490c6e Compare February 17, 2016 14:49
@jonseymour
Copy link
Contributor Author

@e-dard I've created a test case which shows that points.NewPointFromBytes is vulnerable to an infinite loop if corrupt data is read from the binary encoding of the point.

Given that it costs so little to protect the system from an infinite loop caused by a data corruption, there seems to be no sound reason not to accept (8490c6e) (which replaces 930fc42).

@jonseymour
Copy link
Contributor Author

To see the vulnerablity, checkout 9b56025 which does not include the fix and then run the tests in the model package.


func TestNewPointsRejectsEmptyFieldNames(t *testing.T) {
if _, err := models.NewPoint("foo", nil, models.Fields{"": 1}, time.Now()); err == nil {
t.Fatalf("expected error if field with empty field name is created - empty string")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the failure message could be something like got nil, expected error, i.e., it's preferable to write what you got and then what you expected.

@e-dard
Copy link
Contributor

e-dard commented Feb 17, 2016

@jonseymour Now I see. It wasn't clear to me that this issue relied on a successful call to NewPointFromBytes and then a call to p.Fields...

OK, just a small nit on the test case then squash all these down?

@jonseymour jonseymour force-pushed the js-tolerate-empty-field-names branch from 8490c6e to d6bf5fc Compare February 17, 2016 16:05
jonseymour added a commit to jonseymour/influxdb that referenced this pull request Feb 17, 2016
AddField doesn't allow us to report errors, but we need to validate
some field names. Since we have to change the interface anyway,
we simply insist that all callers use NewPoint, rather than AddField.

Per advice from Jason Wilder, see

influxdata#5716 (comment)

Signed-off-by: Jon Seymour <jon@wildducktheories.com>
@jonseymour
Copy link
Contributor Author

@e-dard I have squashed most of the commits in this series. There is one commit for the tests showing the problems that needed to be fixed and another for the fixes. The RHS of the merge commit merges cleanly with the 0.10.0 branch (which is in the form I need it for my production system). I have added a CHANGELOG update.

@jonseymour jonseymour force-pushed the js-tolerate-empty-field-names branch from 9e10d35 to 8879d1d Compare February 19, 2016 17:06
@jonseymour
Copy link
Contributor Author

I've removed the CHANGELOG.md update until I get a +1.

@@ -1079,6 +1081,9 @@ func NewPoint(name string, tags Tags, fields Fields, time time.Time) (Point, err
return nil, fmt.Errorf("NaN is an unsupported value for field %s", key)
}
}
if len(key) == 0 {
return nil, fmt.Errorf("All fields must have non-empty names")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error text should start be all lowercase.

@jwilder jwilder added this to the 0.11.0 milestone Feb 19, 2016
@jwilder
Copy link
Contributor

jwilder commented Feb 19, 2016

One small thing, but 👍 otherwise.

…fields

Influx does not support fields with empty names or points
with no fields.

NewPoint is changed to validate that all field names are non-empty.

AddField is removed because we now require that all fields are
specified on construction.

NewPointFromByte is changed to return an error if a unmarshaled
binary point does not have any fields.

newFieldsFromBinary is changed to prevent an infinite loop that
can arise while attempting to parse corrupt binary point data.

TestNewPointsWithBytesWithCorruptData is changed to reflect the
change in the behaviour of NewPointFromByte.

Signed-off-by: Jon Seymour <jon@wildducktheories.com>
RHS merges cleanly with 0.10.0 maintenance branch.

Signed-off-by: Jon Seymour <jon@wildducktheories.com>
@jonseymour jonseymour force-pushed the js-tolerate-empty-field-names branch from 8879d1d to 095a35b Compare February 20, 2016 11:26
@jonseymour
Copy link
Contributor Author

@jwilder - done. Thanks for the review!

Note that this series also includes cherry-pick of influxdata#5697.

Signed-off-by: Jon Seymour <jon@wildducktheories.com>
@jonseymour jonseymour force-pushed the js-tolerate-empty-field-names branch from 095a35b to a8877ba Compare February 20, 2016 16:57
@e-dard
Copy link
Contributor

e-dard commented Feb 20, 2016

LGTM. I'll merge this Monday morning @jonseymour

@jonseymour
Copy link
Contributor Author

Thank you, @e-dard

e-dard added a commit that referenced this pull request Feb 22, 2016
models: tolerate empty field names when unpacking binary points
@e-dard e-dard merged commit 10b9bef into influxdata:master Feb 22, 2016
This was referenced Mar 4, 2016
jonseymour added a commit to jonseymour/influxdb that referenced this pull request Mar 8, 2016
jonseymour added a commit to jonseymour/influxdb that referenced this pull request Mar 9, 2016
Signed-off-by: Jon Seymour <jon@wildducktheories.com>
jwilder added a commit that referenced this pull request Mar 9, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants