Add a fix to mitigate the changes from the opentelemetry crate (#3471)
When producing Prometheus statistics, the otel crate (0.19.0) now
automatically appends "_total" to metric names, so names that already
end in "_total" are exported with a doubled suffix.

This fix removes the duplicated "_total_total" from our statistics.

fixes: #3443
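
As a minimal sketch of the rewrite this fix applies to the encoded Prometheus text (the helper name `dedupe_total_suffix` is hypothetical; the commit performs the same replacement inline in `prometheus.rs`):

```rust
/// Collapse a doubled `_total_total` suffix in Prometheus text output.
///
/// Hypothetical helper for illustration only; it mirrors the
/// `replace` call made inline by the commit.
fn dedupe_total_suffix(stats: &str) -> String {
    // The pattern ends in `{`, so only names that are immediately
    // followed by a label set are rewritten.
    stats.replace("_total_total{", "_total{")
}
```

Because the pattern includes the opening brace, only sample lines that carry a label set are rewritten. With this pattern, `# HELP` and `# TYPE` lines, and any sample without labels, pass through unchanged.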

<!-- start metadata -->

**Checklist**

Complete the checklist (and note appropriate exceptions) before a final
PR is raised.

- [x] Changes are compatible[^1]
- [x] Documentation[^2] completed
- [x] Performance impact assessed and acceptable
- Tests added and passing[^3]
    - [ ] Unit Tests
    - [x] Integration Tests
    - [ ] Manual Tests

**Exceptions**

*Note any exceptions here*

**Notes**

[^1]. It may be appropriate to bring upcoming changes to the attention
of other (impacted) groups. Please endeavour to do this before seeking
PR approval. The mechanism for doing this will vary considerably, so use
your judgement as to how and when to do this.
[^2]. Configuration is an important part of many changes. Where
applicable please try to document configuration examples.
[^3]. Tick whichever testing boxes are applicable. If you are adding
Manual Tests:
- please document the manual testing (extensively) in the Exceptions.
- please raise a separate issue to automate the test and label it (or
ask for it to be labeled) as `manual test`
garypen authored Jul 19, 2023
1 parent 3d0911e commit 77499cc
Showing 4 changed files with 30 additions and 1 deletion.
7 changes: 7 additions & 0 deletions .changesets/fix_garypen_3443_fix_prom.md
@@ -0,0 +1,7 @@
### Fix Prometheus statistics issues with _total_total names ([Issue #3443](https://github.com/apollographql/router/issues/3443))

When producing Prometheus statistics, the otel crate (0.19.0) now automatically appends "_total" to metric names, so names that already end in "_total" are exported with a doubled suffix.

This fix removes the duplicated "_total_total" from our statistics.

By [@garypen](https://github.com/garypen) in https://github.com/apollographql/router/pull/3471
6 changes: 5 additions & 1 deletion apollo-router/src/plugins/telemetry/metrics/prometheus.rs
@@ -155,11 +155,15 @@ impl Service<router::Request> for PrometheusService {
             let encoder = TextEncoder::new();
             let mut result = Vec::new();
             encoder.encode(&metric_families, &mut result)?;
+            // otel 0.19.0 started adding "_total" onto various statistics.
+            // Let's remove any problems they may have created for us.
+            let stats = String::from_utf8_lossy(&result);
+            let modified_stats = stats.replace("_total_total{", "_total{");
             Ok(router::Response {
                 response: http::Response::builder()
                     .status(StatusCode::OK)
                     .header(http::header::CONTENT_TYPE, "text/plain; version=0.0.4")
-                    .body::<hyper::Body>(result.into())
+                    .body::<hyper::Body>(modified_stats.into())
                     .map_err(BoxError::from)?,
                 context: req.context,
             })
15 changes: 15 additions & 0 deletions apollo-router/tests/common.rs
@@ -517,6 +517,21 @@ impl IntegrationTest {
         panic!("'{text}' not detected in metrics\n{last_metrics}");
     }

+    #[allow(dead_code)]
+    pub async fn assert_metrics_does_not_contain(&self, text: &str) {
+        if let Ok(metrics) = self
+            .get_metrics_response()
+            .await
+            .expect("failed to fetch metrics")
+            .text()
+            .await
+        {
+            if metrics.contains(text) {
+                panic!("'{text}' detected in metrics\n{metrics}");
+            }
+        }
+    }
+
     #[allow(dead_code)]
     pub async fn assert_shutdown(&mut self) {
         let router = self.router.as_mut().expect("router must have been started");
3 changes: 3 additions & 0 deletions apollo-router/tests/metrics_tests.rs
@@ -58,6 +58,9 @@ async fn test_metrics_reloading() -> Result<(), BoxError> {
     router
         .assert_metrics_contains(r#"custom_header="test_custom""#, None)
         .await;
+    router
+        .assert_metrics_does_not_contain(r#"_total_total{"#)
+        .await;

     if std::env::var("APOLLO_KEY").is_ok() && std::env::var("APOLLO_GRAPH_REF").is_ok() {
         router.assert_metrics_contains(r#"apollo_router_uplink_fetch_duration_seconds_count{kind="unchanged",query="License",service_name="apollo-router",url="https://uplink.api.apollographql.com/",otel_scope_name="apollo/router",otel_scope_version=""}"#, Some(Duration::from_secs(120))).await;
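
The negative check added to the integration test can be sketched without a running router. The function below is a hypothetical, synchronous stand-in (the name `find_offending_line` is not part of the commit): it returns the first metrics line containing the unwanted text, whereas the commit's helper fetches the metrics over HTTP and panics with the full response body.

```rust
/// Return the first line of `metrics` that contains `needle`, if any.
///
/// Hypothetical helper for illustration; the commit's
/// `assert_metrics_does_not_contain` panics instead of returning a value.
fn find_offending_line<'a>(metrics: &'a str, needle: &str) -> Option<&'a str> {
    // Scan line by line so a failure can point at the exact sample.
    metrics.lines().find(|line| line.contains(needle))
}
```

A caller could assert that `find_offending_line(metrics, "_total_total{")` is `None`, which is the same condition the new test checks.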