Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Producer] Ensure that delivery callbacks are processed in timely manner (requires some discovery) #31

Closed
timmc-edx opened this issue Aug 25, 2022 · 3 comments
Assignees
Labels
event-bus Work related to the Event Bus.

Comments

@timmc-edx
Copy link
Contributor

timmc-edx commented Aug 25, 2022

We should ensure that polling is happening at a consistent enough rate on the producer.

Currently, we call poll(0) every time we send an event, which means we will eventually call it for almost every ack that comes back. But there are two issues we suspect with this method:

  • If events are sent at a slow or variable rate, acks may take a long time to be processed as callbacks/stats/???
  • The last event sent before server shutdown is guaranteed not to have its ack processed

The other recommended method for handling producer polling is to call poll(0) every N seconds in a separate thread. This would decouple acks/error reporting reliability from signal production rate and ensure that the last events have their acks processed (unless server shutdown happens too soon after; see #11 for details there.)

A/C:

  • Discovery:
    • Understand the implications of:
      • Failing to process some callbacks (at end of server lifecycle)
      • Delayed processing (due to infrequent events)
      • Just not calling poll at all
        • Answer: Messages are kept in the queue until their delivery report callback is run, so failing to call poll at all will eventually produce errors:

          To avoid ERR__QUEUE_FULL errors, make sure to call poll regularily.

    • Pick a method, or decide we don't care about acks and delivery callbacks
  • Implementation:
    • Implement the decision, along with an ADR

References:

@timmc-edx timmc-edx changed the title Ensure that delivery callbacks are processed in timely manner (requires some discovery) [Producer] Ensure that delivery callbacks are processed in timely manner (requires some discovery) Aug 25, 2022
@timmc-edx timmc-edx added the event-bus Work related to the Event Bus. label Aug 25, 2022
@timmc-edx timmc-edx moved this to Todo in Arch-BOM Aug 25, 2022
@timmc-edx timmc-edx self-assigned this Aug 25, 2022
@timmc-edx timmc-edx moved this from Todo to In Progress in Arch-BOM Aug 25, 2022
timmc-edx added a commit that referenced this issue Aug 26, 2022
Purpose:

- Revisit #16 since I
  finally figured out a clean way to have a single producer.
- Reduce the burden on future code that will need to adjust how polling
  is done (#31) and
  maybe handle shutdown (#11)
- Prepare for configurable implementation loading, which will need a
  singleton and getter: openedx/openedx-events#87
- Get rid of the `sync` argument (which didn't fit the abstraction) and
  move it to a dedicated method.

Relying code should now call `get_producer().send(...)` rather than
`send_to_event_bus(...)`. The return value is an object that wraps a
`Producer` instance (not a `SerializingProducer`) and that handles the
serialization itself.

Serialization logic is moved to a cached `get_serializers(...)` that
expands upon the previous `get_serializer` function; it now returns a pair
of key and value serializers. This also acts as a patch point for mocking.

`send_to_event_bus` gets a shorter name (now it's just a `send` method)
and loses the `sync` keyword argument; there is instead now a
`pre_shutdown` method.
timmc-edx added a commit that referenced this issue Aug 26, 2022
Purpose:

- Revisit #16 since I
  finally figured out a clean way to have a single producer.
- Reduce the burden on future code that will need to adjust how polling
  is done (#31) and
  maybe handle shutdown (#11)
- Prepare for configurable implementation loading, which will need a
  singleton and getter: openedx/openedx-events#87
- Get rid of the `sync` argument (which didn't fit the abstraction) and
  move it to a dedicated method.

Relying code should now call `get_producer().send(...)` rather than
`send_to_event_bus(...)`. The return value is an object that wraps a
`Producer` instance (not a `SerializingProducer`) and that handles the
serialization itself.

Serialization logic is moved to a cached `get_serializers(...)` that
expands upon the previous `get_serializer` function; it now returns a pair
of key and value serializers. This also acts as a patch point for mocking.

`send_to_event_bus` gets a shorter name (now it's just a `send` method)
and loses the `sync` keyword argument; there is instead now a
`pre_shutdown` method.
timmc-edx added a commit that referenced this issue Aug 26, 2022
Purpose:

- Revisit #16 since I
  finally figured out a clean way to have a single producer.
- Reduce the burden on future code that will need to adjust how polling
  is done (#31) and
  maybe handle shutdown (#11)
- Prepare for configurable implementation loading, which will need a
  singleton and getter: openedx/openedx-events#87
- Get rid of the `sync` argument (which didn't fit the abstraction) and
  move it to a dedicated method.

Relying code should now call `get_producer().send(...)` rather than
`send_to_event_bus(...)`. The return value is an object that wraps a
`Producer` instance (not a `SerializingProducer`) and that handles the
serialization itself.

Serialization logic is moved to a cached `get_serializers(...)` that
expands upon the previous `get_serializer` function; it now returns a pair
of key and value serializers. This also acts as a patch point for mocking.

`send_to_event_bus` gets a shorter name (now it's just a `send` method)
and loses the `sync` keyword argument; there is instead now a
`pre_shutdown` method.
timmc-edx added a commit that referenced this issue Aug 26, 2022
Purpose:

- Revisit #16 since I
  finally figured out a clean way to have a single producer.
- Reduce the burden on future code that will need to adjust how polling
  is done (#31) and
  maybe handle shutdown (#11)
- Prepare for configurable implementation loading, which will need a
  singleton and getter: openedx/openedx-events#87
- Get rid of the `sync` argument (which didn't fit the abstraction) and
  move it to a dedicated method.

Relying code should now call `get_producer().send(...)` rather than
`send_to_event_bus(...)`. The return value is an object that wraps a
`Producer` instance (not a `SerializingProducer`) and that handles the
serialization itself.

Serialization logic is moved to a cached `get_serializers(...)` that
expands upon the previous `get_serializer` function; it now returns a pair
of key and value serializers. This also acts as a patch point for
mocking. I'd like to test the serializers themselves, but they want to
talk to a server.

`send_to_event_bus` gets a shorter name (now it's just a `send` method)
and loses the `sync` keyword argument; there is instead now a
`pre_shutdown` method.
timmc-edx added a commit that referenced this issue Aug 31, 2022
Purpose:

- Revisit #16 since I
  finally figured out a clean way to have a single producer.
- Reduce the burden on future code that will need to adjust how polling
  is done (#31) and
  maybe handle shutdown (#11)
- Prepare for configurable implementation loading, which will need a
  singleton and getter: openedx/openedx-events#87
- Get rid of the `sync` argument (which didn't fit the abstraction) and
  move it to a dedicated method.

Relying code should now call `get_producer().send(...)` rather than
`send_to_event_bus(...)`. The return value is an object that wraps a
`Producer` instance (not a `SerializingProducer`) and that handles the
serialization itself.

Serialization logic is moved to a cached `get_serializers(...)` that
expands upon the previous `get_serializer` function; it now returns a pair
of key and value serializers. This also acts as a patch point for
mocking. I'd like to test the serializers themselves, but they want to
talk to a server.

`send_to_event_bus` gets a shorter name (now it's just a `send` method)
and loses the `sync` keyword argument; there is instead now a
`prepare_for_shutdown` method.

Other refactoring:

- Cache `create_schema_registry_client` and rename to `get_...`
- Lift producer test data to be instance variables
@timmc-edx
Copy link
Contributor Author

Rather than an ADR, I ended up with a docstring on the polling loop: https://github.com/openedx/event-bus-kafka/pull/39/files#diff-3134bce95bcf050a2ac1747f3bceb1acad0b4f6ac393bea3d2bf9b133241dcdeR228

Not sure it's worth a full ADR since it's a pretty localized decision.

@timmc-edx
Copy link
Contributor Author

(But we also didn't establish exactly what the consequences would be of not having this polling loop, and decided it would be easier to include than to decide whether to include it, which might be ADR-worthy...)

timmc-edx added a commit that referenced this issue Sep 7, 2022
@timmc-edx
Copy link
Contributor Author

Completed work with:

Repository owner moved this from In Progress to Done in Arch-BOM Sep 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
event-bus Work related to the Event Bus.
Projects
None yet
Development

No branches or pull requests

1 participant