
[exporter/loadbalancing] Support consistency between scale-out events #33959

Open
jamesmoessis opened this issue Jul 9, 2024 · 4 comments

Component(s)

exporter/loadbalancing

Is your feature request related to a problem? Please describe.

When a scale-out event occurs, the loadbalancing exporter goes from having n endpoints to n+1 endpoints, so the data is divided differently once the scaling event is complete.

Consider the case where a trace has 2 spans, and the load balancing exporter is configured to route by trace ID. Span (a) arrives and is routed to a given host. The scaling event occurs, and now span (b) arrives and is routed to a different host.

We need a way to scale out our backend without these inconsistencies, while maintaining performance.
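
As a tiny illustration of why the split happens, assuming only that routing is some deterministic function of the trace ID and the current endpoint list (the real exporter uses a consistent-hash ring, discussed further down; the function below is made up for this issue):

```go
package loadbalancer

import "hash/fnv"

// routeNaive shows the effect being described: the chosen endpoint is a pure
// function of the trace ID and the *current* endpoint list, so growing the
// list can change the answer for a trace that is still in flight.
func routeNaive(traceID string, endpoints []string) string {
	h := fnv.New32a()
	h.Write([]byte(traceID))
	return endpoints[h.Sum32()%uint32(len(endpoints))]
}
```

`routeNaive("abc", []string{"a", "b", "c"})` and `routeNaive("abc", []string{"a", "b", "c", "d"})` can disagree, which is exactly span (a) and span (b) landing on different hosts.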

This is a separate problem from a scale-in event, which in my opinion presents a different set of problems and requires the terminating node to flush any data it is statefully holding. It may be worth discussing here, but I want to focus on the simpler case of a scale-out event.

Describe the solution you'd like

I don't know exactly what the solution should be yet, and I'm hoping this thread will provide discussion so we can reach a solution that works.

Essentially every solution I've seen discussed for this problem includes some kind of cache that holds the trace ID as the key and the backend as the value. My question to the community is: how should this cache be implemented?
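
To make the question concrete, here is a minimal sketch in Go of one possible shape for such a cache, purely as a discussion aid: nothing below exists in the component, and names like `stickyCache` are made up. The idea is that the first routing decision for a trace ID is remembered for a TTL, so spans arriving after a scale-out keep going to the original backend.

```go
package loadbalancer

import (
	"sync"
	"time"
)

// stickyCache is a hypothetical trace ID -> endpoint cache. Once a trace ID
// has been routed, later spans of the same trace reuse the cached endpoint,
// even if the set of endpoints has changed in the meantime.
type stickyCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	entries map[[16]byte]stickyEntry
}

type stickyEntry struct {
	endpoint string
	expires  time.Time
}

func newStickyCache(ttl time.Duration) *stickyCache {
	return &stickyCache{ttl: ttl, entries: make(map[[16]byte]stickyEntry)}
}

// endpointFor returns the cached endpoint for traceID, or calls route once,
// remembers its answer, and returns it.
func (c *stickyCache) endpointFor(traceID [16]byte, route func() string) string {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.entries[traceID]; ok && time.Now().Before(e.expires) {
		return e.endpoint
	}
	ep := route()
	c.entries[traceID] = stickyEntry{endpoint: ep, expires: time.Now().Add(c.ttl)}
	return ep
}
```

Even a sketch like this raises the questions the thread needs to answer: how the cache is bounded and evicted (expired entries above are overwritten but never proactively cleaned up), whether it lives in each collector instance or in a shared store, and what happens on a miss for a trace that started before the scale-out.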

Describe alternatives you've considered

No response

Additional context

No response

jamesmoessis added the enhancement (New feature or request) and needs triage (New item requiring triage) labels Jul 9, 2024

github-actions bot commented Jul 9, 2024

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.


jpkrohling commented Jul 9, 2024

This is a problem inherent to distributed systems: there's only so much we can do before documenting the use cases we won't support. When designing the load-balancing exporter, the trade-off was: either scaling events aren't frequent and we need fewer sync stages (the periodic intervals for resolvers can be longer), or scaling events are frequent and we need more frequent, shorter refresh intervals.

This still requires no coordination between the nodes, acknowledging that they might end up making a different decision for the same trace ID during the moments when one load balancer is out of sync with the others. Hopefully this is a short period of time, but it will happen. To alleviate some of the pain, I chose an algorithm that is a bit more expensive than the alternatives but brings some stability with it: my recollection from the paper was that changes to the circle would affect only about a third of the hashes. I think it was (and still is, for most cases) a good compromise.
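
For anyone following along, here is a rough, self-contained illustration of the consistent-hashing idea being described; it is not the exporter's actual hashring code, and the names below are made up. Adding an endpoint only claims some arcs of the circle, so only the trace IDs that hash into those arcs change owner, unlike the naive modulo scheme shown earlier where almost every ID moves.

```go
package loadbalancer

import (
	"fmt"
	"hash/fnv"
	"sort"
)

const virtualNodes = 100 // points per endpoint on the circle

// ring is a toy consistent-hash ring over trace IDs.
type ring struct {
	points []uint32          // sorted positions on the circle
	owner  map[uint32]string // position -> backend endpoint
}

func newRing(endpoints []string) *ring {
	r := &ring{owner: map[uint32]string{}}
	for _, ep := range endpoints {
		for v := 0; v < virtualNodes; v++ {
			p := hash32(fmt.Sprintf("%s#%d", ep, v))
			r.points = append(r.points, p)
			r.owner[p] = ep
		}
	}
	sort.Slice(r.points, func(i, j int) bool { return r.points[i] < r.points[j] })
	return r
}

// pick returns the endpoint owning the first position at or after the trace ID's hash.
func (r *ring) pick(traceID string) string {
	h := hash32(traceID)
	i := sort.Search(len(r.points), func(i int) bool { return r.points[i] >= h })
	if i == len(r.points) {
		i = 0 // wrap around the circle
	}
	return r.owner[r.points[i]]
}

func hash32(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}
```

Comparing `pick` over a large sample of trace IDs against a three-endpoint ring and then against the same ring plus a fourth endpoint remaps only a minority of the IDs (on the order of 1/(n+1) in this toy version); the rest keep their previous backend, which is the stability being traded against the extra cost of the ring.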

If we want to make it even better in terms of consistency, I considered a few alternatives in the past, which should be doable as different implementations of the resolver interface. My favorite is to use a distributed key/value store, like etcd or ZooKeeper, which would allow all nodes to get the same data at the same time, including updates. Consensus would be handled there, but we'd still need to handle split-brain scenarios (which is what I was trying to avoid in the first place). Another thing I considered in the past was to implement a gossip extension and allow load-balancer instances to communicate with each other, with the consensus algorithm implemented by us (likely Raft, using an external library?). Again, split-brain would probably be something we'd have to handle ourselves.
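
As an illustration of the key/value-store direction only: the resolver shape below is an assumption about what such an implementation would need to provide (an initial list plus change notifications), the etcd key layout is made up, and none of this exists in the component. Each backend would register itself under a shared prefix, and every load balancer instance would converge on the same endpoint list by watching that prefix.

```go
package loadbalancer

import (
	"context"
	"sort"
	"sync"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

// etcdResolver is a sketch of a resolver backed by a shared etcd prefix.
// Each backend registers itself under e.g. "otel/lb/endpoints/<host:port>".
type etcdResolver struct {
	cli      *clientv3.Client
	prefix   string
	mu       sync.RWMutex
	current  []string
	onChange func([]string)
}

func newEtcdResolver(etcdEndpoints []string, prefix string, onChange func([]string)) (*etcdResolver, error) {
	cli, err := clientv3.New(clientv3.Config{Endpoints: etcdEndpoints, DialTimeout: 5 * time.Second})
	if err != nil {
		return nil, err
	}
	return &etcdResolver{cli: cli, prefix: prefix, onChange: onChange}, nil
}

// start loads the initial endpoint list and then follows updates via a watch,
// so every instance converges on the same view without polling DNS.
func (r *etcdResolver) start(ctx context.Context) error {
	if err := r.reload(ctx); err != nil {
		return err
	}
	go func() {
		for resp := range r.cli.Watch(ctx, r.prefix, clientv3.WithPrefix()) {
			if resp.Err() == nil {
				_ = r.reload(ctx) // re-read the full prefix on any change
			}
		}
	}()
	return nil
}

func (r *etcdResolver) reload(ctx context.Context) error {
	resp, err := r.cli.Get(ctx, r.prefix, clientv3.WithPrefix())
	if err != nil {
		return err
	}
	eps := make([]string, 0, len(resp.Kvs))
	for _, kv := range resp.Kvs {
		eps = append(eps, string(kv.Value))
	}
	sort.Strings(eps)
	r.mu.Lock()
	r.current = eps
	r.mu.Unlock()
	if r.onChange != nil {
		r.onChange(eps)
	}
	return nil
}

// endpoints returns the most recently observed endpoint list.
func (r *etcdResolver) endpoints() []string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return append([]string(nil), r.current...)
}
```

This removes DNS-propagation skew between instances, but the hard cases mentioned above remain: instances still race against the moment the watch fires, and a partitioned etcd cluster is exactly the split-brain situation this does not solve.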

If we want to have consistency even on split-brain scenarios... well, then I don't know :-) I'm not ready to think about a solution for that if we don't have that problem yet.

Anyway: thank you for opening this issue! I finally got those things out of my head :-)


github-actions bot commented Sep 9, 2024

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions bot added the Stale label Sep 9, 2024
atoulme removed the needs triage (New item requiring triage) label Oct 12, 2024
github-actions bot removed the Stale label Oct 13, 2024
jpkrohling pushed a commit that referenced this issue Nov 27, 2024
…e and timeout settings (#36094)

#### Description
##### Problem statement
The `loadbalancing` exporter is actually a wrapper that creates and manages a set of actual `otlp` exporters.
Those `otlp` exporters share the same configuration parameters, which are defined at the `loadbalancing` exporter level, including the `sending_queue` configuration. The only difference is the `endpoint` parameter, which is substituted by the `loadbalancing` exporter itself.
This means that the `sending_queue`, `retry_on_failure` and `timeout` settings can be defined only on the `otlp` sub-exporters, while the top-level `loadbalancing` exporter is missing all of those settings.
This configuration approach produces several issues that have already been reported by users:
* It is impossible to use a persistent queue in the `loadbalancing` exporter (see #16826). This happens because the `otlp` sub-exporters share the same configuration, including the configuration of the queue, i.e. they would all use the same `storage` instance at the same time, which is not possible at the moment.
* Data can be lost even when using the `sending_queue` configuration (see #35378). This happens because the queue is defined at the level of the `otlp` sub-exporters, and if such an exporter cannot flush data from its queue (for example, because the endpoint is not available anymore), there is no option other than discarding the data from the queue, i.e. there is no higher-level queue or persistent storage to which the data can be returned in case of a permanent failure.

There may be other potential issues that have already been tracked and are related to the current configuration approach.

##### Proposed solution
The easiest way to solve the issues above is to use the standard approach for queue, retry and timeout configuration via `exporterhelper`.
This brings the queue, retry and timeout functionality to the top level of the `loadbalancing` exporter, instead of the `otlp` sub-exporters.
With respect to the issues mentioned above, it brings:
* A single persistent queue, used by all `otlp` sub-exporters (not directly, of course)
* The queue is not discarded/destroyed if any (or all) of the endpoints become unreachable; the top-level queue keeps the data until new endpoints become available (see the sketch after this list)
* Scale-up and scale-down events for the next layer of OpenTelemetry Collectors in K8s environments become more predictable and no longer involve data loss (potential fix for #33959). There is still a significant chance of inconsistency, where some data is sent to an incorrect endpoint, but it is already a better state than what we have right now
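
To make the behavioural difference concrete, the fragment below is a deliberately simplified sketch in plain Go of the idea only, not the actual `exporterhelper` machinery or its API: with a single queue above the per-endpoint senders, an item that fails against a vanished endpoint goes back into the shared queue and is re-routed against whatever endpoints exist at the next attempt, instead of being dropped along with its sub-exporter's queue.

```go
package loadbalancer

import (
	"context"
	"time"
)

// item is one batch of telemetry waiting to be exported.
type item struct {
	traceID string
	payload []byte
}

// topLevelQueue sketches why a single queue above the per-endpoint senders
// avoids data loss on endpoint churn. This is conceptual only; the real
// change delegates queueing, retry and timeout to exporterhelper.
type topLevelQueue struct {
	ch    chan item
	route func(traceID string) string                               // e.g. a hash-ring lookup over the current endpoints
	send  func(ctx context.Context, endpoint string, it item) error // per-endpoint sender
}

func (q *topLevelQueue) run(ctx context.Context) {
	for {
		select {
		case <-ctx.Done():
			return
		case it := <-q.ch:
			endpoint := q.route(it.traceID) // routing happens at send time, against the current endpoint set
			if err := q.send(ctx, endpoint, it); err != nil {
				// The endpoint may have scaled away; requeue instead of dropping,
				// so the next attempt can pick a live endpoint.
				select {
				case q.ch <- it:
				default:
					// Queue full: a real implementation would apply backpressure
					// or spill to persistent storage rather than drop silently.
				}
				time.Sleep(100 * time.Millisecond) // crude backoff for the sketch
			}
		}
	}
}
```

With the persistent queue configured at this level, the requeued data also survives collector restarts, which is what makes the data loss described in #35378 recoverable.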

##### Noticeable changes
* The top-level `loadbalancing` exporter now uses `exporterhelper` with all the functionality it supports
* `sending_queue` is automatically disabled on the `otlp` sub-exporters when it is already present on the top-level `loadbalancing` exporter. This change is made to prevent data loss on the `otlp` exporters, because the queue there doesn't provide the expected result; it also prevents potential misconfiguration on the user's side and, as a result, irrelevant reported issues
* The `exporter` attribute for metrics generated from the `otlp` sub-exporters now includes the endpoint, for better visibility and to segregate them from the top-level `loadbalancing` exporter: it was `"exporter": "loadbalancing"`, now it is `"exporter": "loadbalancing/127.0.0.1:4317"`
* Logs generated by the `otlp` sub-exporters now include an additional `endpoint` attribute with the endpoint value, for the same reasons as for the metrics

#### Link to tracking issue
Fixes #35378
Fixes #16826

#### Testing
The proposed changes were heavily tested in a large K8s environment with a set of different scale-up/scale-down events using the persistent queue configuration: no data loss was detected, and everything works as expected.

#### Documentation
`README.md` was updated to reflect the new configuration parameters available. The sample `config.yaml` was updated as well.
shivanthzen pushed a commit to shivanthzen/opentelemetry-collector-contrib that referenced this issue Dec 5, 2024
…e and timeout settings (open-telemetry#36094)

ZenoCC-Peng pushed a commit to ZenoCC-Peng/opentelemetry-collector-contrib that referenced this issue Dec 6, 2024
…e and timeout settings (open-telemetry#36094)

github-actions bot commented Dec 13, 2024

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions bot added the Stale label Dec 13, 2024
sbylica-splunk pushed a commit to sbylica-splunk/opentelemetry-collector-contrib that referenced this issue Dec 17, 2024
…e and timeout settings (open-telemetry#36094)
