Make backfill resilient #45

Merged: 5 commits merged into main from backfill on Sep 22, 2022

Conversation

duncanjbrown (Contributor)

Objectives:

  • don't overflow queues
  • don't hit BQ byte limits

Hopefully this isn't too complicated.

Still to do: find out the actual byte limit and set it

@misaka (Contributor) commented Sep 7, 2022

Where is this 10k limit coming from?

This approach seems reasonable if there really is a limit of roughly 10k, but if that figure is only a guess, what is the actual limitation?

@duncanjbrown (Contributor, Author)

Where is this 10k limit coming from?

I made it up! See description. I want to experiment to confirm the actual limit — docs haven't been helpful.

@misaka (Contributor) commented Sep 7, 2022

What errors are we seeing?

@duncanjbrown (Contributor, Author) commented Sep 16, 2022

OK, I've simplified this on the basis of https://cloud.google.com/bigquery/quotas#streaming_inserts

New routine:

  • break everything up into 500-row chunks, as suggested by Google
  • measure the payload size before sending, and if it goes over 10MB, split the batch (see the sketch below)
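
A minimal sketch of that routine, assuming the Rails context the gem already runs in. The 500-row chunking and 10MB ceiling come from the linked quotas page; the method name and constants here are illustrative rather than the gem's actual code.

BQ_BATCH_ROWS      = 500                 # batch size recommended by Google
BQ_BATCH_MAX_BYTES = 10 * 1024 * 1024    # 10MB streaming-insert request limit

def send_in_batches(events)
  events.each_slice(BQ_BATCH_ROWS) do |batch|
    payload_byte_size = batch.to_json.bytesize

    if payload_byte_size > BQ_BATCH_MAX_BYTES && batch.size > 1
      # Over the request limit: halve the batch and retry each half.
      batch.each_slice((batch.size / 2.0).round) do |half_batch|
        send_in_batches(half_batch)
      end
    else
      DfE::Analytics::SendEvents.perform_now(batch.as_json)
    end
  end
end

The batch.size > 1 guard is there so a single oversized event can't cause endless re-splitting; it simply gets sent (and rejected) on its own.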

@duncanjbrown duncanjbrown marked this pull request as ready for review September 16, 2022 16:13
if payload_byte_size > BQ_BATCH_MAX_BYTES
  events.each_slice((events.size / 2.0).round).to_a.each do |half_batch|
    Rails.logger.info "Halving batch of size #{payload_byte_size} for #{model_class.name}"
    DfE::Analytics::SendEvents.perform_now(half_batch.as_json)
  end
end

Contributor

What if instead of sending these half-batches immediately, we re-queued them? This would give us a little bit of extra insurance in case some entities are truly HHUUUGGGEEE!

Suggested change:

- DfE::Analytics::SendEvents.perform_now(half_batch.as_json)
+ DfE::Analytics::LoadEntityBatch.perform_later(model_class.to_s, half_batch.map(&:id))

This means we don't take up queue space when we're processing backfills
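
A rough sketch of how that re-queued path could work. DfE::Analytics::LoadEntityBatch and DfE::Analytics::SendEvents appear in the suggestion above; the job body here, the ApplicationJob base class, and the reuse of BQ_BATCH_MAX_BYTES are illustrative assumptions rather than the gem's actual implementation.

class DfE::Analytics::LoadEntityBatch < ApplicationJob
  def perform(model_class_name, ids)
    model_class = model_class_name.constantize
    events      = model_class.where(id: ids).as_json  # stand-in for however the gem builds its events

    if events.to_json.bytesize > BQ_BATCH_MAX_BYTES && ids.size > 1
      # Still too big: split the IDs and go back through the queue,
      # so even enormous entities eventually fit in one request.
      ids.each_slice((ids.size / 2.0).round) do |half|
        self.class.perform_later(model_class_name, half)
      end
    else
      DfE::Analytics::SendEvents.perform_now(events)
    end
  end
end

Because the job arguments are only a class name and record IDs, the enqueued payloads stay small even when the rows themselves are large, and a batch that is still oversized simply splits again on its next pass through the queue.
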
BQ accepts max 10MB per request and recommends batches of 500.

This allows 20KB per event.

Batch everything in 500s. If a given batch payload exceeds 10MB, split it before sending.
When the block threw an exception (as in a spec for SendEvents), this method didn't get a chance to clean up, and we got stuck in webmock mode.
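
The shape of that fix, using an illustrative helper name (the gem's actual method may differ): run the cleanup in an ensure block so it happens even when the yielded block raises.

require "webmock"

# Illustrative helper name; the point is that cleanup lives in `ensure`.
def with_fake_bigquery
  WebMock.enable!
  # ... stub the BigQuery endpoints here ...
  yield
ensure
  WebMock.disable!  # runs even if the block raises, so specs don't get stuck in webmock mode
end
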
@duncanjbrown duncanjbrown merged commit 8a181ff into main Sep 22, 2022
@duncanjbrown duncanjbrown deleted the backfill branch September 22, 2022 09:42