Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Override default response failed error response in logs from TraceLayer in tower-http #1146

Closed
josecelano opened this issue Dec 19, 2024 · 0 comments · Fixed by #1151
Closed
Assignees
Labels

Comments

@josecelano
Copy link
Member

Relates to: #1069

I'm working on issue #1069

We have some errors in the logs automatically added by the TraceLayer we use in Axum. Errors loke like this:

  2024-11-01T09:27:25.749262Z ERROR tower_http::trace::on_failure: response failed, classification: Status code: 500 Internal Server Error, latency: 0 ms
    at /home/josecelano/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tower-http-0.6.1/src/trace/on_failure.rs:93

As I explained on the other issue the error comes from using the TraceLayer in the router definition:

    router
        .layer(middleware::from_fn_with_state(state, v1::middlewares::auth::auth))
        .route(&format!("{api_url_prefix}/health_check"), get(health_check_handler))
        .layer(CompressionLayer::new())
        .layer(SetRequestIdLayer::x_request_id(MakeRequestUuid))
        .layer(PropagateHeaderLayer::new(HeaderName::from_static("x-request-id")))
        .layer(
            TraceLayer::new_for_http()
                .make_span_with(DefaultMakeSpan::new().level(Level::INFO))
                .on_request(|request: &Request<axum::body::Body>, _span: &Span| {
                    let method = request.method().to_string();
                    let uri = request.uri().to_string();
                    let request_id = request
                        .headers()
                        .get("x-request-id")
                        .map(|v| v.to_str().unwrap_or_default())
                        .unwrap_or_default();

                    tracing::span!(
                        target: API_LOG_TARGET,
                        tracing::Level::INFO, "request", method = %method, uri = %uri, request_id = %request_id);
                })
                .on_response(|response: &Response, latency: Duration, _span: &Span| {
                    let status_code = response.status();
                    let request_id = response
                        .headers()
                        .get("x-request-id")
                        .map(|v| v.to_str().unwrap_or_default())
                        .unwrap_or_default();
                    let latency_ms = latency.as_millis();

                    tracing::span!(
                        target: API_LOG_TARGET,
                        tracing::Level::INFO, "response", latency = %latency_ms, status = %status_code, request_id = %request_id);
                }),
        )
        .layer(SetRequestIdLayer::x_request_id(MakeRequestUuid))
        .layer(
            ServiceBuilder::new()
                // this middleware goes above `TimeoutLayer` because it will receive
                // errors returned by `TimeoutLayer`
                .layer(HandleErrorLayer::new(|_: BoxError| async { StatusCode::REQUEST_TIMEOUT }))
                .layer(TimeoutLayer::new(DEFAULT_TIMEOUT)),
        )

TraceLayer uses a default "on failure" function that prints that error when the response is a 500 response.

We have the possibility to override the error message calling the on_failure on the router TraceLayer builder:

.on_failure(
    |failure_classification: ServerErrorsFailureClass, _latency: Duration, _span: &Span| {
        tracing::error!(target: HTTP_TRACKER_LOG_TARGET, "response failed, classification: {failure_classification}");
    },
),

I would do it because:

  1. We can control what's written into the logs. We need that to write assertions in tests. See Remove tracing errors from tests execution output #1142. If we don't control the error message, the test could be broken by third-party changes.
  2. It's easier to find the code that is writing the message. It's our code, not external code.
  3. We migth need to add some extra fields to the log line to identify the request/response in order to implement this.
@josecelano josecelano self-assigned this Dec 19, 2024
@josecelano josecelano changed the title Override default response failed error response from TraceLayer in tower-http Override default response failed error response in logs from TraceLayer in tower-http Dec 19, 2024
josecelano added a commit to josecelano/torrust-tracker that referenced this issue Dec 23, 2024
From:

```
2024-12-23T15:54:25.842837Z ERROR tower_http::trace::on_failure: response failed classification=Status code: 500 Internal Server Error latency=0 ms
```

To:

```
2024-12-23T16:06:53.553023Z ERROR API: response failed classification=Status code: 500 Internal Server Error latency=0 ms
```

The target has been changed:

```
2024-12-23T15:54:25.842837Z ERROR tower_http::trace::on_failure: response failed classification=Status code: 500 Internal Server Error latency=0 ms
2024-12-23T16:06:53.553023Z ERROR API:                           response failed classification=Status code: 500 Internal Server Error latency=0 ms
```

It was changed to:

- Easily identify the origin of the error in our code.
- Allow to insert more fields in the future, for example, to write
  assertions about logs.

2024-12-23T16:06:53.553023Z ERROR API: response failed classification=Status code: 500 Internal Server Error latency=0 ms
@josecelano josecelano linked a pull request Dec 23, 2024 that will close this issue
josecelano added a commit to josecelano/torrust-tracker that referenced this issue Dec 23, 2024
From:

```
2024-12-23T15:54:25.842837Z ERROR tower_http::trace::on_failure: response failed classification=Status code: 500 Internal Server Error latency=0 ms
```

To:

```
2024-12-23T16:06:53.553023Z ERROR API: response failed classification=Status code: 500 Internal Server Error latency=0 ms
```

The target has been changed:

```
2024-12-23T15:54:25.842837Z ERROR tower_http::trace::on_failure: response failed classification=Status code: 500 Internal Server Error latency=0 ms
2024-12-23T16:06:53.553023Z ERROR API:                           response failed classification=Status code: 500 Internal Server Error latency=0 ms
```

It was changed to:

- Easily identify the origin of the error in our code.
- Allow to insert more fields in the future, for example, to write
  assertions about logs.
josecelano added a commit that referenced this issue Dec 23, 2024
9ac676c feat: [#1146] override tower-http tracing error (Jose Celano)

Pull request description:

  This overrides the default log errors when a 500 response is sent to the client.

  From:

  ```
  2024-12-23T15:54:25.842837Z ERROR tower_http::trace::on_failure: response failed classification=Status code: 500 Internal Server Error latency=0 ms
  ```

  To:

  ```
  2024-12-23T16:06:53.553023Z ERROR API: response failed classification=Status code: 500 Internal Server Error latency=0 ms
  ```

  The target has been changed:

  ```
  2024-12-23T15:54:25.842837Z ERROR tower_http::trace::on_failure: response failed classification=Status code: 500 Internal Server Error latency=0 ms
  2024-12-23T16:06:53.553023Z ERROR API:                           response failed classification=Status code: 500 Internal Server Error latency=0 ms
  ```

  It was changed to:

  - Easily identify the origin of the error in our code.
  - Allow for the inserting of more fields in the future, for example, to write assertions about logs.

ACKs for top commit:
  josecelano:
    ACK 9ac676c

Tree-SHA512: 4e2eb9057ea15bf28f2a5ab6a8d8becc072520d428fe6718ea2275a7ccd028e21d9907fb7d9626b8beba80ccfd72a60c040bdae902c1ec4b41fa31a8e8014974
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant