Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DefaultExecutionRequestObservationConvention does not produce INTERNAL_ERROR outcomes #1058

Closed
lukasGemela opened this issue Sep 19, 2024 · 1 comment
Assignees
Labels
in: core Issues related to config and core support status: backported An issue that has been backported to maintenance branches type: bug A general bug
Milestone

Comments

@lukasGemela
Copy link

lukasGemela commented Sep 19, 2024

Project details:
Spring Boot project, having:

org.springframework.boot' version '3.3.2'
spring-graphql:1.3.2

According to this doc: https://docs.spring.io/spring-graphql/reference/observability.html#observability.server.request, we should get graphql.outcome INTERNAL_ERROR if no valid GraphQL response could be produced. It's not really specified what exactly that means. When I first read the docu I understood that's when there is unchecked exception thrown and not being handled by exception handler.

from the code I see that's actually in the case when there is general throwable error set in observable or there are no data to produce.

Main "throwable error" from ExecutionRequestObservationContext never gets set in case there was an unchecked exception thrown from application code. In that scenario the error is handled via ExceptionResolversExceptionHandler and translated to 'regular error' from executionResult:

ExecutionResultImpl{errors=[INTERNAL_ERROR for 2ee7db16-1015-ae28-d84e-768be03a8e6b], data={calculateProductionCost=null}, dataPresent=true, extensions=null}

Though it is reported as part of GraphQlObservationInstrumentation as "graphql.request.INTERNAL_SERVER_ERROR" counter metric.

As a result we can see this behaviour in datadog:

graphql.request.count metric which has two tags: graphql.outcome either SUCCESS or REQUEST_ERROR, which is consisting of all possible error responses, even for internal server errors. OUTCOME_INTERNAL_ERROR Is never really reported in our case.

graphql.request.INTERNAL_SERVER_ERROR metric reported as count for all INTERNAL_SERVER_ERRORs.

I'm finding this really confusing. It would be much more useful to produce INTERNAL_ERROR observation outcome in case the actual server internal error (aka uncatched exception, not covered by user-defined exception handlers) happens.

It would be also more inline with HTTP world and status codes (I know gql always respond with http 200, it's just an inspiration for metric outcomes):
2xx codes -> outcome success
4xx codes -> outcome request error
5xx codes -> outcome internal/server error

If that's not the desired behaviour, I'd make it very explicit in documentation as current description is IMHO misleading.

EDIT://

for now, we found a workaround by doing this:

  @Bean
  public ExecutionRequestObservationConvention executionConvention() {
    return new DefaultExecutionRequestObservationConvention() {

      @Override
      @NonNull
      protected KeyValue outcome(@NonNull ExecutionRequestObservationContext context) {

        final var executionResult = context.getExecutionResult();
        final var errors = executionResult != null ? executionResult.getErrors() : null;
        final boolean isInternalServerError =
            errors != null
                && context.getExecutionResult().getErrors().stream()
                    .anyMatch(error -> error.getErrorType().equals(ErrorType.INTERNAL_ERROR));

        if (isInternalServerError) {
          return KeyValue.of(ExecutionRequestLowCardinalityKeyNames.OUTCOME, "INTERNAL_ERROR");
        }
        return super.outcome(context);
      }
    };
  }
@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged label Sep 19, 2024
@spring-projects spring-projects deleted a comment Sep 19, 2024
@lukasGemela lukasGemela changed the title DefaultExecutionRequestObservationConvention behaviuor for INTERNAL_SERVER_ERRORs DefaultExecutionRequestObservationConvention behavior for INTERNAL_SERVER_ERRORs Sep 19, 2024
@bclozel bclozel changed the title DefaultExecutionRequestObservationConvention behavior for INTERNAL_SERVER_ERRORs DefaultExecutionRequestObservationConvention does not produce INTERNAL_ERROR outcomes Oct 9, 2024
@bclozel bclozel self-assigned this Oct 9, 2024
@bclozel bclozel added type: bug A general bug in: core Issues related to config and core support and removed status: waiting-for-triage An issue we've not yet triaged labels Oct 9, 2024
@bclozel bclozel modified the milestones: 1.2.9, 1.3.3 Oct 9, 2024
@bclozel bclozel added the for: backport-to-1.2.x Marks an issue as a candidate for backport to 1.2.x label Oct 9, 2024
@github-actions github-actions bot added status: backported An issue that has been backported to maintenance branches and removed for: backport-to-1.2.x Marks an issue as a candidate for backport to 1.2.x labels Oct 9, 2024
@bclozel bclozel closed this as completed in d7ed850 Oct 9, 2024
@bclozel
Copy link
Member

bclozel commented Oct 9, 2024

Thanks for this report @lukasGemela - this is now fixed in 1.2.x and 1.3.x SNAPSHOTs. I have also clarified the documentation and that change should be live soon here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
in: core Issues related to config and core support status: backported An issue that has been backported to maintenance branches type: bug A general bug
Projects
None yet
Development

No branches or pull requests

3 participants