Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: hard limit on field size while parsing line protocol #21843

Merged
merged 1 commit into from
Jul 14, 2021

Conversation

lesam
Copy link
Contributor

@lesam lesam commented Jul 13, 2021

Per https://docs.influxdata.com/enterprise_influxdb/v1.9/write_protocols/line_protocol_reference/
we only support 64KB, but 1MB is a more realistic practical limit. Before this commit there was
no enforcement of field value size.

Closes #21841

Describe your proposed changes here.

@lesam
Copy link
Contributor Author

lesam commented Jul 13, 2021

We don't catch everywhere that can build a point, but this covers the write ingress which is the most important one.

@lesam lesam requested a review from gwossum July 13, 2021 21:35
@lesam lesam force-pushed the limit-field-size branch from 87a385e to 83b1f0e Compare July 13, 2021 21:37
gwossum
gwossum previously approved these changes Jul 13, 2021
Copy link
Member

@gwossum gwossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation and test cases looks good, have test cases for maximum possible field size and 1 greater.

It'd be cool if the test checked the error message in the HTTP body since there's multiple possible errors, but none of the other tests in server_test.go seem to check error messages. I won't hold that against this test.

@lesam
Copy link
Contributor Author

lesam commented Jul 14, 2021

Closing as we want to make this configurable

Per https://docs.influxdata.com/enterprise_influxdb/v1.9/write_protocols/line_protocol_reference/
we only support 64KB, but 1MB is a more realistic practical limit. Before this commit there was
no enforcement of field value size.

Closes influxdata#21841
@lesam lesam force-pushed the limit-field-size branch from 370656a to efb073a Compare July 14, 2021 15:44
@@ -186,9 +186,8 @@ type EngineOptions struct {
// nil will allow all combinations to pass.
ShardFilter func(database, rp string, id uint64) bool

Config Config
SeriesIDSets SeriesIDSets
FieldValidator FieldValidator
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eliminating Engine.FieldValidator seems reasonable, since I can't find any use other than the default validator. If we get rid of Engine.FieldValidator, should we also get rid of EngineOptions.FieldValidator? There might be confusion in the future if someone thinks they can set that to change the field validation behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is in EngineOptions - I don't see any FieldValidator remaining in the code?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why I thought there was an Engine.FieldValidator. I thought there was a Shard.FieldValidator, typed Engine.FieldValidator, but it was really setting the value in EngineOptions.FieldValidator. My mistake, carry on!

Comment on lines +17 to +24
pointSize := point.StringSize()
iter := point.FieldIterator()
for iter.Next() {
if !skipSizeValidation {
// Check for size of field too large. Note it is much cheaper to check the whole point size
// than checking the StringValue size (StringValue potentially takes an allocation if it must
// unescape the string, and must at least parse the string)
if pointSize > MaxFieldValueLength && iter.Type() == models.String {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A possible optimization here is to compare pointSize to MaxFieldValueLength before the loop, and set skipSizeValidation = true before starting the loop if pointSize <= MaxFieldValueLength. It's possible the compiler is already performing this loop invariant hoisting, but we can make sure it is. Even if the compiler is already doing the work, this might improve readability. It would look something like:

    pointSize := point.StringSize()
    if pointSize <= MaxFieldValueLength {
        skipSizeValidation = true
    }
    iter := point.FieldIterator()
    for iter.Next() {
        if !skipSizeValidation {
            if iter.Type() == models.String {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the code as written is simpler, and I don't want to make it less simple for a single comparison - I doubt that would show up in any profiling.

It is debatable whether I should even be doing the pointSize optimization, but I think it is a clear enough win to not re-allocate a ton of strings during StringValue that it is ok to add the check.

Copy link
Member

@gwossum gwossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Test cases for both sides of boundary condition.

@lesam lesam merged commit 6d22e69 into influxdata:master-1.x Jul 14, 2021
lesam added a commit to lesam/influxdb that referenced this pull request Aug 6, 2021
…#21843)

Per https://docs.influxdata.com/enterprise_influxdb/v1.9/write_protocols/line_protocol_reference/
we only support 64KB, but 1MB is a more realistic practical limit. Before this commit there was
no enforcement of field value size.

Closes influxdata#22094

(cherry picked from commit 6d22e69)
lesam added a commit to lesam/influxdb that referenced this pull request Aug 6, 2021
…#21843)

Per https://docs.influxdata.com/enterprise_influxdb/v1.9/write_protocols/line_protocol_reference/
we only support 64KB, but 1MB is a more realistic practical limit. Before this commit there was
no enforcement of field value size.

Closes influxdata#22094

(cherry picked from commit 6d22e69)
lesam added a commit that referenced this pull request Aug 6, 2021
…22095)

Per https://docs.influxdata.com/enterprise_influxdb/v1.9/write_protocols/line_protocol_reference/
we only support 64KB, but 1MB is a more realistic practical limit. Before this commit there was
no enforcement of field value size.

Closes #22094

(cherry picked from commit 6d22e69)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants