Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove all overrides of upstream errors #558

Merged
merged 1 commit into from
Aug 25, 2021

Conversation

robertknight
Copy link
Member

@robertknight robertknight commented Aug 24, 2021

Overriding upstream 4xx and 5xx responses with generic 400 responses made it difficult to debug interactions between Via and upstreams in production because all information about the original upstream response was lost.

This commit completes the change started in #557 by removing all such overrides.

We will find another way to solve the problem described in the comments where proxied 4xx or 5xx responses from upstream could confuse monitoring tools about the health of the service.

One possibility that I haven't implemented here would be to return a fixed status code for an upstream 4xx or 5xx error but include the original response status in an HTTP header or body. I don't know if this is possible to do in nginx.


On a broader note, this leads me to ask whether we should consider replacing nginx as the proxying system entirely with something Python-based. That will give us much easier control over all details of the response and error logging. It would also enable us to use New Relic's Python integration for more advanced alerting (eg. observing baseline deviations in the error response rate).

Overriding upstream 4xx and 5xx responses with generic 400 responses made it
difficult to debug interactions between Via and upstreams in production because
all information about the original upstream response was lost.

This commit completes the change started in
#557 by removing all such overrides.

We will find another way to solve the problem described in the comments where
proxied 4xx or 5xx responses from upstream could confuse monitoring tools about
the health of the service.
Copy link
Contributor

@seanh seanh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's merge this.

I think we're gonna need to re-implement Via's non-proxying of third-party error statuses so that users proxying broken third-party sites doesn't trigger Pager Duty false alarms. But it needs to be done in a way that preserves the third-party response's status and body somewhere for debugging. I don't think we'll be reviving the implementation that this PR deletes.

@robertknight robertknight merged commit 31dabac into master Aug 25, 2021
@robertknight robertknight deleted the enable-upstream-error-proxy-v2 branch August 25, 2021 12:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants