-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[pkg/stanza] Make batching behavior configurable #21184
[pkg/stanza] Make batching behavior configurable #21184
Conversation
a517f3e
to
50cdb15
Compare
Introduce new configuration options to control the batching behavior of the stanza receivers: - `batch.max_batch_size`: The maximum number of spans to batch together. - `batch.timeout`: The maximum amount of time to wait before sending a batch. Providing configuration options for batching in the receivers makes it possible to replace the batch processor and prevent data loss when an exporter returns a retryable error.
50cdb15
to
da09176
Compare
These setting were removed prior to declaring the filelog receiver beta, because it was unclear whether we should be batching at all. (context) I think before we re-add these parameters, we should resolve the question of whether pkg/stanza at all. It's not clear to me that there are any real benefits, but it certainly complicates the codebase quite a bit. |
One more reason we may want to remove the complicated conversion logic altogether: #21740 (comment) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to make sure we do not expose configuration for batching until we determine whether or not we are keeping the behavior. See #21184 (comment).
I will open an issue to propose that we remove batching from the stanza adapter. If that is rejected, then we will have to add configuration.
| `encoding` | `utf-8` | The encoding of the file being read. See the list of supported encodings below for available options | | ||
| `operators` | [] | An array of [operators](../../pkg/stanza/docs/operators/README.md#what-operators-are-available). See below for more details | | ||
| `batch.max_batch_size` | `100` | Maximum number of log records per emitted batch. | | ||
| `batch.timeout` | `100 milliseconds` | Maximum duration to wait before sending out the log records batch. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| `batch.timeout` | `100 milliseconds` | Maximum duration to wait before sending out the log records batch. | | |
| `batch.timeout` | `100ms` | Maximum duration to wait before sending out the log records batch. | |
My motivation for this was to establish a pipeline that cannot drop any logs. After #20511, the only part of a typical pipeline dropping data is the batch processor. Removing it drops the batch size to 100 (hardcoded in stanza receivers), which significantly slows down the whole pipeline. This change would bring me a short-term solution. The long-term one would be open-telemetry/opentelemetry-collector#4646. I'm good with rejecting this PR in favor of the long-term solution. |
@dmitryax, I've done some preliminary benchmarking in #21929 to evaluate the impact of the batch size. While I tentatively agree we will need to add batching controls, my intuition is that the adapter codebase can be simplified and that doing so may have performance benefits or impacts on the eventual configuration. To that end, I think it would be reasonable to add the new parameters behind a feature gate for now, but I'd like to spend a little more time evaluating this possibility before we commit to adding configuration that may later be changed. If you're able to review #21928 and #21929, this would help me move forward on the refactoring I'd like to try. |
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
I'm still intending to evaluate ways to simplify the adapter codebase, but I've been sidetracked by some more urgent issues. In the meantime, I'd like to reiterate that adding these settings behind a feature gate seems like a reasonable way to move forward if these controls are presently urgent for anyone. |
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
Was there any update on these changes @dmitryax , @djaglowski ? |
Unfortunately, I have not found time for this. The package remains a high priority for me but my priority lately has been on making the |
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
@djaglowski any updates? |
@orloffv, I wanted to get this in to remove the batch processor and establish a resilient logs pipeline. The default batching of the filelog receiver is not as performant as the batch processor. This would be a short-term solution. Now, I'm working on a long-term solution that will be applicable to all data types open-telemetry/opentelemetry-collector#8122. So I'm not interested in this PR anymore |
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
Closed as inactive. Feel free to reopen if this PR is still being worked on. |
Expose configuration options to control the batching behavior of the stanza receivers:
batch.max_batch_size
: The maximum number of spans to batch together.batch.timeout
: The maximum amount of time to wait before sending a batch.Providing configuration options for batching in the receivers makes it possible to replace the batch processor and prevent data loss when an exporter returns a retriable error.