Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

400 status codes should not be considered errors in the context telemetry #16

Closed
wants to merge 1 commit into from

Conversation

tj-cisco
Copy link

@tj-cisco tj-cisco commented May 1, 2024

All codes in the 200-499 range should be considered a success in the context of telemetry.

@tj-cisco
Copy link
Author

tj-cisco commented May 2, 2024

From the Dancer plugin thread, it seems like these should be errors according to the spec. I wonder if we can/should make a config option to set 'expected' error responses in the client that don't get treated as errors. The following REST example the 404 is not an error its expected behavior:

  1. Client GET /resource/
  2. Server 404 - Does not exist
  3. Client PUT /resource/
  4. Server 201 - Created

Maybe add a config option to allow 400's to be accepted as ok?

@jjatria
Copy link
Owner

jjatria commented May 2, 2024

I'm not sure you are correct here. Again, from the specification (emphasis mine):

Span Status MUST be left unset if HTTP status code was in the 1xx, 2xx or 3xx ranges, unless there was another error (e.g., network error receiving the response body; or 3xx codes with max redirects exceeded), in which case status MUST be set to Error.

For HTTP status codes in the 4xx range span status MUST be left unset in case of SpanKind.SERVER and MUST be set to Error in case of SpanKind.CLIENT.

Source: https://github.com/open-telemetry/semantic-conventions/blob/main/docs/http/http-spans.md#status

While you were right in the case of the Dancer2 plugin, because those were SERVER spans, these are CLIENT spans (see here for example). So I don't think this applies in this case.

@jjatria jjatria closed this May 2, 2024
@tj-cisco
Copy link
Author

tj-cisco commented May 3, 2024

Would you accept a PR to make this behavior configurable, including a warning in the documentation that this is none standard? I have an API that is generating millions of 404's a day, none of which are failures, all of which will muddy the telemetry data and help hide legitimate errors.

@jjatria
Copy link
Owner

jjatria commented May 3, 2024

I'd be happy to accept a PR if I could see some documentation showing that this was "non-standard". Because at least as far as the OpenTelemetry specification quoted above, this is precisely what the standard mandates: 4XX errors must be set as errors in client spans, and must be left unset in server spans.

Without that, then I do not think this should be configurable because the spec is very clear: there'd be nothing to configure.

@tj-cisco
Copy link
Author

tj-cisco commented May 3, 2024

I have opened a ticket to get MUST changed to SHOULD in the spec, this change would match the elastic.co APM specification for http client spans.

open-telemetry/semantic-conventions#1003

@jjatria
Copy link
Owner

jjatria commented May 3, 2024

Sweet. Let me know if that gets changed and we can think of how to allow it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants