Skip to content
This repository has been archived by the owner on Jun 19, 2022. It is now read-only.

Drop non-event messages #1212

Merged
merged 7 commits into from
Jun 5, 2020
Merged

Conversation

liu-cong
Copy link
Contributor

@liu-cong liu-cong commented Jun 4, 2020

Fixes #1178

Proposed Changes

  • If a message is not an event, we drop it, otherwise we will keep retrying indefinitely.
  • For now we just log and drop it, we could consider moving it to a DLQ or some other persistence storage.

Release Note

Drop non-event messages.

Docs

@liu-cong liu-cong requested a review from yolocs June 4, 2020 16:41
@googlebot googlebot added the cla: yes (override cla status due to multiple authors bug) label Jun 4, 2020
@liu-cong
Copy link
Contributor Author

liu-cong commented Jun 4, 2020

Need to wait for #1208 to be merged so the unit test failure will fix itself

Copy link
Contributor

@grantr grantr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this isn't ready for review quite yet, but I have a couple comments anyway :)

/hold

// The following errors can be returned by ToEvent and are not retryable.
// TODO Should ToEvent consolidate them and return the genric ErrCannotConvertToEvent?
if errors.Is(err, binding.ErrCannotConvertToEvent) || errors.Is(err, binding.ErrNotStructured) || errors.Is(err, binding.ErrUnknownEncoding) || errors.Is(err, binding.ErrNotBinary) {
logging.FromContext(ctx).Error("failed to convert received message to an event, check the msg format", zap.Any("message", msg), zap.Error(err))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Max message data size is 10MB. Worst case, if all messages in the topic are the wrong format and very large we use a lot of resources logging them. Maybe print the message id and first N bytes of data?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe log both error and debug. Error has the shorter message and debug has the full message?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was debating whether to log the full msg. This should really not happen in normal conditions. Since we don't have a DLQ, without logging it's difficult for the user to debug.

Maybe picking a big N such as 5000 is good enough to cover most cases?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Log full msg in debug, and truncate the Data field to 2000 chars in error (other fields are either fixed size or a map of attributes)

pkg/broker/handler/handler.go Outdated Show resolved Hide resolved
pkg/broker/handler/handler.go Outdated Show resolved Hide resolved
pkg/broker/handler/handler_test.go Outdated Show resolved Hide resolved
pkg/broker/handler/handler.go Outdated Show resolved Hide resolved
// The following errors can be returned by ToEvent and are not retryable.
// TODO Should ToEvent consolidate them and return the genric ErrCannotConvertToEvent?
if errors.Is(err, binding.ErrCannotConvertToEvent) || errors.Is(err, binding.ErrNotStructured) || errors.Is(err, binding.ErrUnknownEncoding) || errors.Is(err, binding.ErrNotBinary) {
logging.FromContext(ctx).Error("failed to convert received message to an event, check the msg format", zap.Any("message", msg), zap.Error(err))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe log both error and debug. Error has the shorter message and debug has the full message?

pkg/broker/handler/handler.go Outdated Show resolved Hide resolved
@liu-cong
Copy link
Contributor Author

liu-cong commented Jun 4, 2020

/unhold as #1208 is merged

@grantr
Copy link
Contributor

grantr commented Jun 4, 2020

/hold cancel

Copy link
Member

@yolocs yolocs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

Copy link
Contributor

@grantr grantr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/hold
Holding for optional nit

pkg/broker/handler/handler.go Outdated Show resolved Hide resolved
@knative-prow-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: grantr, liu-cong

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-metrics-robot
Copy link

The following is the coverage report on the affected files.
Say /test pull-google-knative-gcp-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/broker/handler/handler.go 86.4% 82.9% -3.5

Copy link
Contributor

@grantr grantr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@liu-cong
Copy link
Contributor Author

liu-cong commented Jun 4, 2020

/retest

@liu-cong
Copy link
Contributor Author

liu-cong commented Jun 4, 2020

/unhold

@knative-test-reporter-robot

The following jobs failed:

Test name Triggers Retries
pull-google-knative-gcp-wi-tests 0/3

Failed non-flaky tests preventing automatic retry of pull-google-knative-gcp-wi-tests:

github.com/google/knative-gcp/test/e2e.TestSmokeCloudSchedulerSource
github.com/google/knative-gcp/test/e2e.TestCloudPubSubSourceWithGCPBroker

@liu-cong
Copy link
Contributor Author

liu-cong commented Jun 5, 2020

/retest

@knative-prow-robot knative-prow-robot merged commit 3541d2d into google:master Jun 5, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
approved cla: yes (override cla status due to multiple authors bug) lgtm size/M
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Non-event message in pubsub causes fanout/retry to panic (SDK bug)
7 participants