Remove all overrides of upstream errors #558
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Overriding upstream 4xx and 5xx responses with generic 400 responses made it difficult to debug interactions between Via and upstreams in production because all information about the original upstream response was lost.
This commit completes the change started in #557 by removing all such overrides.
We will find another way to solve the problem described in the comments where proxied 4xx or 5xx responses from upstream could confuse monitoring tools about the health of the service.
One possibility that I haven't implemented here would be to return a fixed status code for an upstream 4xx or 5xx error but include the original response status in an HTTP header or body. I don't know if this is possible to do in nginx.
On a broader note, this leads me to ask whether we should consider replacing nginx as the proxying system entirely with something Python-based. That will give us much easier control over all details of the response and error logging. It would also enable us to use New Relic's Python integration for more advanced alerting (eg. observing baseline deviations in the error response rate).