[exporter/elasticsearch] Handle errors for 413 Entity Too Large #36163
Comments
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself.
It appears that there is a similar, more general proposal available at open-telemetry/opentelemetry-collector#8272. Allowing the batcher config to have a max size in bytes seems to be a more straightforward solution.
I have opened #36396 to reduce the chance of a 413 for the sync bulk indexer (used when the exporterhelper batcher is active), instead of handling the error, because there is no good way to handle such an error other than to send the documents in appropriately sized chunks in the first place.
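For illustration, a minimal sketch of that chunking idea in Go; the `flushBytes` budget, the `docs` slice, and the `send` callback are stand-ins and not the actual go-docappender API:

```go
package bulkchunk

import "bytes"

// flushInChunks accumulates documents into a buffer and flushes whenever adding
// the next document would push the uncompressed size past flushBytes, so no
// single bulk request can grow to the point of triggering a 413.
// send stands in for the actual bulk-indexer flush.
func flushInChunks(docs [][]byte, flushBytes int, send func([]byte) error) error {
	var buf bytes.Buffer
	for _, doc := range docs {
		if buf.Len() > 0 && buf.Len()+len(doc) > flushBytes {
			if err := send(buf.Bytes()); err != nil {
				return err
			}
			buf.Reset()
		}
		buf.Write(doc)
	}
	if buf.Len() > 0 {
		return send(buf.Bytes())
	}
	return nil
}
```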
/label -needs-triage
…Too Large (open-telemetry#36396)

#### Description
Limit the bulk request size to roughly `flush::bytes` for the sync bulk indexer. The sync bulk indexer is used when `batcher::enabled` is either true or false; in other words, it is not used when the batcher config is undefined.

Change `flush::bytes` to always measure in uncompressed bytes.

Change the default `batcher::max_size_items` to `0`, as the bulk request size limit is now more effectively enforced by `flush::bytes`.

#### Link to tracking issue
Fixes open-telemetry#36163

#### Testing
Modified BenchmarkExporter to run with `{name: "xxlarge_batch", batchSize: 1000000},`, removed `batcher::max_size_items`, and added a log line for the compressed and uncompressed buffer size to reproduce the error.

```
logger.go:146: 2024-11-19T17:16:40.060Z ERROR Flush {"s.bi.Len": 10382932, "s.bi.UncompressedLen": 532777786}
logger.go:146: 2024-11-19T17:16:40.312Z ERROR bulk indexer flush error {"error": "flush failed (413): [413 Request Entity Too Large] "}
```

With this PR, every flush logs the buffer sizes and there is no error:

```
logger.go:146: 2024-11-19T17:23:52.574Z ERROR Flush {"s.bi.Len": 99148, "s.bi.UncompressedLen": 5000007}
```

#### Documentation

Co-authored-by: Christos Markou <chrismarkou92@gmail.com>
Component(s)
exporter/elasticsearch
Is your feature request related to a problem? Please describe.
The exporter has experimental batching support with the synchronous bulk indexer, which can work around the request size limitation from Elasticsearch. Unfortunately, if the batch is within the configured range but the documents are quite large (for example, logs containing stack traces), we can still hit `http.max_content_length`, which is 100MB by default. The underlying `go-docappender` library used by the exporter does not seem to handle this case, leaving it to the client to handle. I have opened an upstream issue to see if the library can do anything to help here. Another problem is that the 413 error is not returned as permanent, so the client will be instructed to retry it to no effect.
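On the retry point, a sketch of how a 413 could be surfaced as non-retryable; `consumererror.NewPermanent` is the collector's existing wrapper for permanent errors, while `wrapFlushError` and its parameters are invented for this example:

```go
package bulkerr

import (
	"fmt"
	"net/http"

	"go.opentelemetry.io/collector/consumer/consumererror"
)

// wrapFlushError converts a bulk-flush failure into an error the exporterhelper
// understands: a 413 will never succeed on retry with the same payload, so it
// is marked permanent instead of being retried indefinitely.
func wrapFlushError(statusCode int, body string) error {
	err := fmt.Errorf("flush failed (%d): %s", statusCode, body)
	if statusCode == http.StatusRequestEntityTooLarge {
		return consumererror.NewPermanent(err)
	}
	return err
}
```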
Describe the solution you'd like
I believe the exporter should make a best effort to avoid the 413, for example by splitting the batch again internally so it fits within the limit and then retrying, if possible.
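A rough sketch of what that best-effort splitting could look like, assuming a hypothetical `sendBatch` callback and a stand-in `errTooLarge` sentinel for the 413 response; this is not the exporter's actual code:

```go
package bulksplit

import "errors"

// errTooLarge is a stand-in for however the client surfaces a 413 response.
var errTooLarge = errors.New("413 Request Entity Too Large")

// sendWithSplit retries a rejected batch by bisecting it: if the backend
// answers 413, each half is sent separately, recursing until the request fits
// or a single document is still too large and the error is returned as-is.
func sendWithSplit(docs [][]byte, sendBatch func([][]byte) error) error {
	err := sendBatch(docs)
	if err == nil || !errors.Is(err, errTooLarge) || len(docs) <= 1 {
		return err
	}
	mid := len(docs) / 2
	if err := sendWithSplit(docs[:mid], sendBatch); err != nil {
		return err
	}
	return sendWithSplit(docs[mid:], sendBatch)
}
```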
Describe alternatives you've considered
Extend the exporterhelper batcher to support batch splitting or a max_size config.
Another option is to provide a sentinel error for Entity Too Large, allowing the receiver/processor to split the data and retry. Currently, if a 413 happens, the returned error is:
This error is specific to Elasticsearch; I believe we should have a more universal error that other exporters can return when facing a similar failure.
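To make the idea concrete, a sketch of such a shared sentinel; `ErrContentTooLarge` and the surrounding helpers are invented names, only the `errors.Is`/`%w` wrapping is standard Go:

```go
package sentinel

import (
	"errors"
	"fmt"
)

// ErrContentTooLarge is an invented name for a sentinel that a shared package
// could export; any exporter hitting a payload-size rejection would wrap its
// backend-specific error with it.
var ErrContentTooLarge = errors.New("request content too large")

// reportTooLarge shows how an exporter could wrap its own error so callers can
// detect the condition without knowing about Elasticsearch.
func reportTooLarge(backendErr error) error {
	return fmt.Errorf("%w: %s", ErrContentTooLarge, backendErr)
}

// callerCanSplit shows how a receiver or processor would react: split the
// batch and retry only when the sentinel matches.
func callerCanSplit(err error) bool {
	return errors.Is(err, ErrContentTooLarge)
}
```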
Additional context
Relates to https://github.com/elastic/ingest-dev/issues/3677