Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow NaN as a valid value on the graphite service #4846

Merged
merged 1 commit into from
Nov 20, 2015
Merged

Allow NaN as a valid value on the graphite service #4846

merged 1 commit into from
Nov 20, 2015

Conversation

jsternberg
Copy link
Contributor

The canonical graphite implementation will read and discard NaN values
instead of throwing an error when reading on the line receiver protocol.
Since this is the default behavior for graphite, InfluxDB should have
the same behavior for compatibility.

Previously, a NaN value would result in an error printed to the console.
When you have a large number of NaN values being sent every minute, this
results in the log file filling with useless messages.

- [x] CHANGELOG.md updated
- [x] Rebased/mergable
- [x] Tests pass
- [x] Sign [CLA](http://influxdb.com/community/cla.html) (if not already signed)

@@ -117,7 +117,8 @@ func (p *Parser) Parse(line string) (models.Point, error) {
}

if math.IsNaN(v) || math.IsInf(v, 0) {
return nil, fmt.Errorf(`field "%s" value: '%v" is unsupported`, fields[0], v)
// Graphite ignores NaN values with no error
return nil, nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should return a specific error type and be handled (and ignored) higher up. Returning a nil error here is problematic with the standard go if err != nil {} idiom. You now have to check both err and point not being nil which is not what a caller would typically expect or do. This func should either return a Point or an error.

@jsternberg
Copy link
Contributor Author

I've updated this to continue returning an error, but it creates an UnsupportedValueError. It then catches that error in the service and only ignores the error when the value is NaN. This way, an inf value will still be an error as graphite does not appear to suppress those.

I wasn't sure if the error value should be exported or not. I also tried to add a test, but unfortunately I didn't find a way to access the error reporting mechanism from the service through a public API.


import "fmt"

type UnsupportedValueError struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use a naming convention with an Err prefix. So this should be ErrUnsupportedValue.

@jwilder
Copy link
Contributor

jwilder commented Nov 19, 2015

@jsternberg Yes. This error should be exported. Not sure I understand the test issue you ran into though.

@jsternberg
Copy link
Contributor Author

@jwilder all I meant was if you remove my change, the tests won't fail. Here's an example of running the tests without my change:

[graphite] 2015/11/19 17:17:53 Starting graphite service, batch size 5000, batch timeout 1s
[graphite] 2015/11/19 17:17:53 Starting graphite service, batch size 5000, batch timeout 1s
[graphite] 2015/11/19 17:17:53 Listening on UDP: [::]:10000
[graphite] 2015/11/19 17:17:53 Listening on TCP: [::]:59406
[graphite] 2015/11/19 17:17:53 unable to parse line: memory NaN 1447971473: error value
PASS
ok      github.com/influxdb/influxdb/services/graphite  1.021s

Here's with my change:

[graphite] 2015/11/19 17:18:16 Starting graphite service, batch size 5000, batch timeout 1s
[graphite] 2015/11/19 17:18:16 Starting graphite service, batch size 5000, batch timeout 1s
[graphite] 2015/11/19 17:18:16 Listening on UDP: [::]:10000
[graphite] 2015/11/19 17:18:16 Listening on TCP: [::]:59427
PASS
ok      github.com/influxdb/influxdb/services/graphite  1.018s

It's fixed, but if it gets messed up in some way, the tests won't fail. Unfortunately, the error only appears to be reported into an internal structure (the statMap) which I don't seem to be able to check during the test.

If there's a better way to check for the error condition, I'll include it in this PR.

@jwilder
Copy link
Contributor

jwilder commented Nov 19, 2015

I see. I would add a test to parser_test.go that tests that error is returned for now and keep what you have in server_test.go. It would require restructuring the graphite service code a bit to test/assert this case more in that code. I don't think that is necessary though.

@jsternberg
Copy link
Contributor Author

Updated the PR.

switch err := err.(type) {
case *ErrUnsupportedValue:
// Graphite ignores NaN values with no error.
if math.IsNaN(err.Value) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's count these, so we have visibility into how many points InfluxDB is dropping due to IsNan values. I suggest statPointsNanFail, with the text pointsNanFail.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added it as statPointsNaNFail (as NaN is usually capitalized that way instead of Nan). If you want the other capitalization, I can amend the commit again.

@otoolep
Copy link
Contributor

otoolep commented Nov 20, 2015

Makes sense to me, +1.

@jwilder ?

@jsternberg
Copy link
Contributor Author

For reference, this is the behavior I'm attempting to replicate: https://github.com/graphite-project/carbon/blob/d3822aa44d637b55839b734f5dcd2de2ce10bef8/lib/carbon/protocols.py#L65

@@ -58,6 +58,7 @@ There are breaking changes in this release:
- [#4681](https://github.com/influxdb/influxdb/pull/4681): Increase default buffer size for collectd and graphite listeners
- [#4659](https://github.com/influxdb/influxdb/pull/4659): Support IF EXISTS for DROP DATABASE
- [#4685](https://github.com/influxdb/influxdb/pull/4685): Automatically promote node to raft peer if drop server results in removing a raft peer.
- [#4846](https://github.com/influxdb/influxdb/pull/4846): Allow NaN as a valid value on the graphite service; discard these points silently (graphite compatibilty).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Be sure to give yourself credit here, by thanking yourself. :-) Add "thanks @jsternberg" to the end of this line.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

compatibilty -> compatibility

@jwilder
Copy link
Contributor

jwilder commented Nov 20, 2015

👍

@otoolep
Copy link
Contributor

otoolep commented Nov 20, 2015

Thanks @jwilder

@jsternberg -- just give yourself credit and we will merge this -- thanks again.

The canonical graphite implementation will read and discard NaN values
instead of throwing an error when reading on the line receiver protocol.
Since this is the default behavior for graphite, InfluxDB should have
the same behavior for compatibility.

Previously, a NaN value would result in an error printed to the console.
When you have a large number of NaN values being sent every minute, this
results in the log file filling with useless messages.
@jsternberg
Copy link
Contributor Author

Done!

otoolep added a commit that referenced this pull request Nov 20, 2015
…e-service

Allow NaN as a valid value on the graphite service
@otoolep otoolep merged commit b1c5a36 into influxdata:master Nov 20, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants