-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevent NaN float values from being stored #4587
Conversation
Thanks, this is a great fix, @jwilder. Regarding "HTTP line protocol handler will now write all successfully parsed points instead of failing the entire batch if one failed. When some points fail to parse and the remaining are written successfully, the endpoint returns a partial write error with the failed lines in the response." this sounds like a global change to the system, yes? We need to update the docs to reflect this, I believe. |
Not currently supported.
@beckettsean For the HTTP line protocol handler, previously you would get an error and not points written. With this change you would still get an error, but part of you batch would be written. We already return this error in some cluster failure scenarios (e.g. replication of 3 but only 1 succeeded would return a partial write error depending on the write consistency level). |
Float values are not supported in the existing engine and the tsm1 engines. This changes NewPoint to return an error if a field value contains a NaN field. It also allows us to validate fields to prevent other unsupported types from sneaking in through other input plugins.
Mainly for debugging as since this should not happen going forward. Since there may be points with NaN already stored in the WAL, this is helpful for troubleshooting panics.
This changes the HTTP line protocol handler to behave similar to the other handler in that they will write as many points as possible. Previously, we would fail the entire batch if one point failed. This can happen more frequently now with NaN being more explicitly unsupported. Now it will write as many points that parse successfully and return a "partial write" error to the client with the lines that failed to parse.
name, tags, map[string]interface{}{"value": value}, timestamp, | ||
)) | ||
) | ||
if err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you consider changing the return value of this function to support sending back an error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. I didn't. This func is only called in the tests.
@jwilder does the update break the client? Possible to just do it on the v2 client? |
} | ||
|
||
func (s *FloatEncoder) Finish() { | ||
if !s.finished { | ||
// write an end-of-stream record | ||
s.finished = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this change? Did it matter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. The Finish
call writes a NaN
as sentinel value using Push
. I need to mark the stream as finished so I could allow the NaN
value to be written via Push
. See https://github.com/influxdb/influxdb/pull/4587/files#diff-4fcee93b376ee05d583de8af541239b8R71
All generally makes sense, some minor feedback and questions. +1 from me. |
@pauldix Should not affect the current client. The only change is https://github.com/influxdb/influxdb/pull/4587/files#diff-12d7ec39292af18c588a073414f3e853R477 which would return a commented out point with an error message as the comment. That will parse on the server and be ignored. I didn't want to change the signature or panic since either of those would break existing clients. |
The v2 client will have a compile breakage when updating as they will need to handle the error returned by |
@jwilder sounds good, +1 |
Prevent NaN float values from being stored
Wonderful !! I will try this tomorrow with next nighlty. Thanks a lot guys ! |
This PR has a few changes to how points are accepted and stored.
NaN
values were not supported, but were able to be written through the line protocol, collectd, and other inputs. These values caused problem during query time as well as with the thetsm1
engine since they were not intended to be stored. This change returns parse error forNaN
float values as well as preventsNewPoint
from returning a valid point if given aNaN
float value. Thetsm1
float encoding will now also return an error if aNaN
value is attempted to be encoded.NewPoint
returning an error propagated up to the client packages which will require upgrades to users that are not controlling the version of the client that they are using. They will need to handle the error and determine the best thing to do with points that can't be stored.NaN
can be stored, the inputs have been updated to drop and log (when appropriate) points with invalid field values. Graphite, UDP, Collectd, and OpenTSDB inputs already dropped points so this change makes the behavior consistent across all inputs.partial write
error with the failed lines in the response. This makes the behavior more consistent with the other inputs.NaN
values cannot be parsed correctly now, TSM WAL entries will fail to parse and the WAL will abort startup. TSM data will need to be removed, but we'll log the point data that failed to parse to help with troubleshooting.Fixes #4521 #4513