Make UDP-receiver/operator asynchronous & concurrent #27613

hzahav · 2023-10-11T05:55:28Z

Component(s)

pkg/stanza, receiver/udplog

Is your feature request related to a problem? Please describe.

TL;DR: In high-scale scenarios, the UDP-receiver has bursts of data-loss because it works synchronously and is single-threaded.

In high-scale scenarios, it's easy to lose UDP packets. If the receiving side slows down even for a very short time (for example, the otel-exporter sends data to an endpoint that briefly takes longer to respond), the sender doesn't get any indication of it (unlike with TCP) and keeps sending data at the same rate.
During that time, the receiver's network buffer fills up and you get data-loss. A short burst of more data than usual also causes data-loss, for the same reason.
This happens in part because the current UDP receiver works synchronously. If the exporter slows down for even a short time, there's data-loss in high-scale scenarios.

Describe the solution you'd like

The UDP-receiver (more accurately, the udp input operator in stanza [stanza\operator\input\udp]) needs to process logs in an asynchronous manner to reduce data-loss and increase processing rate. That's important for high-rate scenarios.
Code is already ready for PR, btw.

Our stress tests indicate that changing the UDP stanza input operator to use 2 go-routines solved the continuous data-loss issues (and also increased the processing rate of the otel collector).
1. 1st go routine ('reader') only reads from UDP and puts the data into a channel - no processing is done there at all (no splitting, adding attributes, etc.).
2. 2nd go routine ('processor') reads from that channel, performs the processing offered by the UDP-operator, and pushes into the next otel step (in our case, it would be a batch processor).
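
The two routines above can be sketched roughly as follows. This is a minimal illustration of the idea, not the code from the PR: `readLoop`, `processLoop`, the `emit` callback, and the channel capacity of 1024 are all hypothetical names and values chosen for the example.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// readLoop is the 'reader': it only copies datagrams off the socket into out.
// It does no processing, and it stops (closing out) on the first read error,
// which includes the socket being closed.
func readLoop(conn net.PacketConn, out chan<- []byte) {
	defer close(out)
	buf := make([]byte, 64*1024) // large enough for any UDP datagram
	for {
		n, _, err := conn.ReadFrom(buf)
		if err != nil {
			return
		}
		pkt := make([]byte, n)
		copy(pkt, buf[:n]) // copy, because buf is reused by the next read
		out <- pkt         // blocks only when the in-memory channel is full
	}
}

// processLoop is the 'processor': it drains the channel and hands each
// payload to emit, a hypothetical stand-in for splitting, adding attributes,
// and pushing to the next step.
func processLoop(in <-chan []byte, emit func(string)) {
	for pkt := range in {
		emit(string(pkt))
	}
}

func main() {
	conn, err := net.ListenPacket("udp", "127.0.0.1:0")
	if err != nil {
		fmt.Println("listen:", err)
		return
	}
	packets := make(chan []byte, 1024) // assumed capacity; this is the memory being "paid"
	received := make(chan string, 3)
	done := make(chan struct{})

	go readLoop(conn, packets)
	go func() {
		defer close(done)
		processLoop(packets, func(s string) { received <- s })
	}()

	sender, err := net.Dial("udp", conn.LocalAddr().String())
	if err != nil {
		fmt.Println("dial:", err)
		conn.Close()
		return
	}
	for i := 0; i < 3; i++ {
		fmt.Fprintf(sender, "log line %d", i) // one write, one datagram
	}
	sender.Close()

	timeout := time.After(time.Second)
wait:
	for i := 0; i < 3; i++ {
		select {
		case s := <-received:
			fmt.Println("processed:", s)
		case <-timeout:
			fmt.Println("timed out waiting for datagrams")
			break wait
		}
	}
	conn.Close() // unblocks ReadFrom, so readLoop closes the channel
	<-done       // processLoop has drained whatever was left
}
```

The point of the split is that a slow `emit` no longer stalls the socket read directly: the reader keeps emptying the kernel buffer into the channel until the channel itself fills up.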

It's better to add concurrency to the mix (for example, allow the 'processor' to run with 5 go routines), since our tests indicate it improved the processing rate even further. The internal processing in the udp receiver may be a bit complicated, since it involves splitting and adding attributes. It might help some consumers to have multiple such 'processor' routines working concurrently before sending the data downstream.
This would require a graceful shutdown mechanism that allows the receiver to finish handling the items already read and pushed to the channel, so they can be pushed downstream during shutdown (while stopping the 'reader' routine from reading more items from the UDP port).

The suggested feature allows the customer to "pay" with available memory (which can be much bigger than the max size you can set the network buffer to) to reduce the risk of data-loss due to these issues. Of course, this won't help if our otel collector can handle X EPS but consistently gets 1.1X EPS. The intention is only to prevent data-loss in scenarios where the otel-collector gets a data rate it's usually able to handle, but has short-term latency.
Our tests indicate that using more go-routines here (2+) didn't have a major effect on CPU usage overall (but there was a small one, obviously). Again, it should be the consumer's choice to "pay" with more CPU to reduce the risk of data-loss.

Describe alternatives you've considered

Reduce the scale of data being sent to each instance - we ran stress tests to find the limits of the otel-collector in our environment, so we know not to send too much data to each instance. Let's say the limit is X. Even when we send 0.8X or 0.7X, we still get data-loss from time to time. Sometimes up to 3-4% of the data is just not received by the UDP-receiver. Our metrics indicate that the network buffer on those nodes got full and, as a result, dropped the data. If we dramatically reduce the data being sent to 0.4X, we get data-loss much more rarely, but this is a significant waste of resources and not reasonable in very high scale scenarios.
Increase batch-processor max items - didn't help. We're already using a pretty big number.
Add concurrency to the custom otel-exporter - didn't solve the issue. We're already running with 15+ go routines there and seem to have reached max optimization.
Increase the node's network buffer - we increased (x100) all the relevant kernel buffers (netdev_max_backlog, rmem_max, rmem_default, etc.). While it did somewhat reduce data-loss, it didn't solve the issue, and increasing them further didn't help. Note that there's a limit to increasing those variables, as these are kernel buffers, so we can't treat them like "regular" memory.
Persistent-queue helper (https://github.com/open-telemetry/opentelemetry-collector/blob/main/exporter/exporterhelper/README.md#persistent-queue) - this wouldn't help here. If the endpoint is still accepting data from our exporter but just does it a bit more slowly for a few seconds, that alone will cause data-loss in high scale scenarios (as our tests indicate).
General tips for high-rate UDP processing - we tried most of the steps described here (https://blog.cloudflare.com/how-to-receive-a-million-packets/); they didn't solve the problem completely.
Add more pods running otel-collector instances on the same node (the network buffer is per node, not per pod) - didn't solve the problem, sometimes worsened it.
Have 2 separate pipelines (both reading from the same UDP port) on the same otel-collector instance - not possible, since otel blocks that option: 2 receivers can't be configured to read from the same port.
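
For context on the kernel-buffer alternative: in Go, a process can request a larger per-socket receive buffer with `(*net.UDPConn).SetReadBuffer`. The sketch below is only an illustration; `listenWithBuffer` is a hypothetical helper, and this issue does not say whether the collector makes such a request. On Linux the kernel limits the requested size to `rmem_max`, which is one reason this kind of buffer can't be grown like regular memory.

```go
package main

import (
	"fmt"
	"net"
)

// listenWithBuffer opens a UDP socket on addr and asks the OS for a receive
// buffer of the given size in bytes. The OS may grant less than requested.
func listenWithBuffer(addr string, bytes int) (*net.UDPConn, error) {
	udpAddr, err := net.ResolveUDPAddr("udp", addr)
	if err != nil {
		return nil, err
	}
	conn, err := net.ListenUDP("udp", udpAddr)
	if err != nil {
		return nil, err
	}
	if err := conn.SetReadBuffer(bytes); err != nil {
		conn.Close()
		return nil, err
	}
	return conn, nil
}

func main() {
	// 8 MiB is an arbitrary example value.
	conn, err := listenWithBuffer("127.0.0.1:0", 8<<20)
	if err != nil {
		fmt.Println("listen:", err)
		return
	}
	defer conn.Close()
	fmt.Println("listening on", conn.LocalAddr())
}
```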

Additional context

We have a scenario that requires our otel collector to process high-scale data that's read from a UDP port.
Along with the UDP-receiver we have an otel-batch-processor, and our otel-exporter sends the logs over the network (after being compressed). Our custom otel-exporter is maximally optimized (including using lots of concurrent channels, putting as much data as possible in each network request, compressing, etc.).

github-actions · 2023-10-11T05:55:47Z

Pinging code owners:

pkg/stanza: @djaglowski
receiver/udplog: @djaglowski

See Adding Labels via Comments if you do not have permissions to add labels yourself.

hovavza · 2023-10-11T13:14:04Z

Created PR - #27620

hovavza · 2023-10-12T09:18:15Z

Created PR - #27620

Closed previous PR - will add the changes gradually.
1st step is in following PR - #27647

github-actions · 2023-12-18T03:29:16Z

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

receiver/udplog: @djaglowski
pkg/stanza: @djaglowski

See Adding Labels via Comments if you do not have permissions to add labels yourself.

djaglowski · 2024-01-23T19:42:59Z

Closing as resolved by #27647 and #28901

hzahav added the enhancement and needs triage labels Oct 11, 2023

github-actions bot added pkg/stanza receiver/udplog labels Oct 11, 2023

atoulme mentioned this issue Oct 12, 2023

Add async/concurrency to udp receiver (stanza udp input operator) #27620

Closed

Frapschen removed the needs triage label Oct 16, 2023

Frapschen assigned hovavza Oct 16, 2023

github-actions bot mentioned this issue Oct 17, 2023

Weekly Report: 2023-10-10 - 2023-10-17 #27791

Closed

github-actions bot added the Stale label Dec 18, 2023

djaglowski closed this as completed Jan 23, 2024
