Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve connection state logging for Jaeger exporter #2239

Conversation

jpkrohling
Copy link
Member

Description:
Adds a connection state reporter to the Jaeger exporter sending state changes to the log and as a metric.

Link to tracking Issue: Closes #1861, as it's the best we can do at this exporter level. On a more general level, the collector could provide more verbose info about the underlying gRPC connections. See #2237 for more information.

Testing: unit and manual tests.

Documentation: None.

Here are the complete logs for an OpenTelemetry Collector that got started without a Jaeger backend available. Note that the config.yaml contains two exporters, one (jaeger) with the insecure: true option, and a second one (jaeger/2) without this option. A Jaeger all-in-one is then started, without TLS support. This ends with the first exporter eventually transitioning to "READY", but the second won't.

2020-12-02T11:17:26.116+0100	INFO	service/service.go:409	Starting OpenTelemetry Collector...	{"Version": "latest", "GitHash": "<NOT PROPERLY GENERATED>", "NumCPU": 12}
2020-12-02T11:17:26.116+0100	INFO	service/service.go:253	Setting up own telemetry...
2020-12-02T11:17:26.130+0100	INFO	service/telemetry.go:101	Serving Prometheus metrics	{"address": "localhost:8888", "level": 0, "service.instance.id": "b4f3bc7c-2593-48e5-9ef9-8b585bbcf4fc"}
2020-12-02T11:17:26.131+0100	INFO	service/service.go:290	Loading configuration...
2020-12-02T11:17:26.131+0100	INFO	service/service.go:301	Applying configuration...
2020-12-02T11:17:26.131+0100	INFO	service/service.go:322	Starting extensions...
2020-12-02T11:17:26.132+0100	INFO	builder/exporters_builder.go:306	Exporter is enabled.	{"component_kind": "exporter", "exporter": "jaeger"}
2020-12-02T11:17:26.132+0100	INFO	builder/exporters_builder.go:306	Exporter is enabled.	{"component_kind": "exporter", "exporter": "jaeger/2"}
2020-12-02T11:17:26.132+0100	INFO	service/service.go:337	Starting exporters...
2020-12-02T11:17:26.132+0100	INFO	builder/exporters_builder.go:92	Exporter is starting...	{"component_kind": "exporter", "component_type": "jaeger", "component_name": "jaeger"}
2020-12-02T11:17:26.132+0100	INFO	jaegerexporter/exporter.go:183	State of the connection with the Jaeger Collector backend	{"component_kind": "exporter", "component_type": "jaeger", "component_name": "jaeger", "state": "CONNECTING"}
2020-12-02T11:17:26.132+0100	INFO	builder/exporters_builder.go:97	Exporter started.	{"component_kind": "exporter", "component_type": "jaeger", "component_name": "jaeger"}
2020-12-02T11:17:26.132+0100	INFO	builder/exporters_builder.go:92	Exporter is starting...	{"component_kind": "exporter", "component_type": "jaeger", "component_name": "jaeger/2"}
2020-12-02T11:17:26.132+0100	INFO	jaegerexporter/exporter.go:183	State of the connection with the Jaeger Collector backend	{"component_kind": "exporter", "component_type": "jaeger", "component_name": "jaeger/2", "state": "CONNECTING"}
2020-12-02T11:17:26.132+0100	INFO	builder/exporters_builder.go:97	Exporter started.	{"component_kind": "exporter", "component_type": "jaeger", "component_name": "jaeger/2"}
2020-12-02T11:17:26.132+0100	INFO	builder/pipelines_builder.go:207	Pipeline is enabled.	{"pipeline_name": "traces", "pipeline_datatype": "traces"}
2020-12-02T11:17:26.132+0100	INFO	service/service.go:350	Starting processors...
2020-12-02T11:17:26.132+0100	INFO	builder/pipelines_builder.go:51	Pipeline is starting...	{"pipeline_name": "traces", "pipeline_datatype": "traces"}
2020-12-02T11:17:26.132+0100	INFO	builder/pipelines_builder.go:61	Pipeline is started.	{"pipeline_name": "traces", "pipeline_datatype": "traces"}
2020-12-02T11:17:26.132+0100	INFO	builder/receivers_builder.go:235	Receiver is enabled.	{"component_kind": "receiver", "component_type": "otlp", "component_name": "otlp", "datatype": "traces"}
2020-12-02T11:17:26.132+0100	INFO	service/service.go:362	Starting receivers...
2020-12-02T11:17:26.132+0100	INFO	builder/receivers_builder.go:70	Receiver is starting...	{"component_kind": "receiver", "component_type": "otlp", "component_name": "otlp"}
2020-12-02T11:17:26.132+0100	INFO	otlpreceiver/otlp.go:93	Starting GRPC server on endpoint 0.0.0.0:4317	{"component_kind": "receiver", "component_type": "otlp", "component_name": "otlp"}
2020-12-02T11:17:26.132+0100	INFO	otlpreceiver/otlp.go:130	Setting up a second GRPC listener on legacy endpoint 0.0.0.0:55680	{"component_kind": "receiver", "component_type": "otlp", "component_name": "otlp"}
2020-12-02T11:17:26.132+0100	INFO	otlpreceiver/otlp.go:93	Starting GRPC server on endpoint 0.0.0.0:55680	{"component_kind": "receiver", "component_type": "otlp", "component_name": "otlp"}
2020-12-02T11:17:26.132+0100	INFO	builder/receivers_builder.go:75	Receiver started.	{"component_kind": "receiver", "component_type": "otlp", "component_name": "otlp"}
2020-12-02T11:17:26.132+0100	INFO	service/service.go:265	Everything is ready. Begin running and processing data.
2020-12-02T11:17:27.132+0100	INFO	jaegerexporter/exporter.go:183	State of the connection with the Jaeger Collector backend	{"component_kind": "exporter", "component_type": "jaeger", "component_name": "jaeger/2", "state": "TRANSIENT_FAILURE"}
2020-12-02T11:17:28.132+0100	INFO	jaegerexporter/exporter.go:183	State of the connection with the Jaeger Collector backend	{"component_kind": "exporter", "component_type": "jaeger", "component_name": "jaeger", "state": "TRANSIENT_FAILURE"}
2020-12-02T11:17:43.132+0100	INFO	jaegerexporter/exporter.go:183	State of the connection with the Jaeger Collector backend	{"component_kind": "exporter", "component_type": "jaeger", "component_name": "jaeger", "state": "READY"}

And here are the metrics for the final state:

# HELP otelcol_jaegerexporter_conn_state Last connection state: 0 = Idle, 1 = Connecting, 2 = Ready, 3 = TransientFailure, 4 = Shutdown
# TYPE otelcol_jaegerexporter_conn_state gauge
otelcol_jaegerexporter_conn_state{exporter_name="jaeger",service_instance_id="341f6179-0c34-4ad1-b2e5-19b2bed4c9d1"} 2
otelcol_jaegerexporter_conn_state{exporter_name="jaeger/2",service_instance_id="341f6179-0c34-4ad1-b2e5-19b2bed4c9d1"} 3

Signed-off-by: Juraci Paixão Kröhling juraci@kroehling.de

@jpkrohling jpkrohling requested a review from a team December 2, 2020 10:24
@codecov
Copy link

codecov bot commented Dec 2, 2020

Codecov Report

Merging #2239 (1632b15) into master (96b226a) will increase coverage by 0.02%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2239      +/-   ##
==========================================
+ Coverage   91.90%   91.92%   +0.02%     
==========================================
  Files         269      270       +1     
  Lines       15239    15295      +56     
==========================================
+ Hits        14005    14060      +55     
- Misses        855      856       +1     
  Partials      379      379              
Impacted Files Coverage Δ
exporter/jaegerexporter/exporter.go 94.80% <100.00%> (+6.56%) ⬆️
exporter/jaegerexporter/metrics.go 100.00% <100.00%> (ø)
service/telemetry.go 85.71% <100.00%> (+0.25%) ⬆️
internal/processor/filterlog/filterlog.go 89.47% <0.00%> (+1.97%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 96b226a...1632b15. Read the comment docs.

@jpkrohling jpkrohling force-pushed the jpkrohling/1861-improve-connection-state-logging-for-jaeger-exporter branch from de4901c to e4d2278 Compare December 2, 2020 10:50
@jpkrohling jpkrohling force-pushed the jpkrohling/1861-improve-connection-state-logging-for-jaeger-exporter branch from e4d2278 to a83f04e Compare December 2, 2020 13:37
@jpkrohling jpkrohling force-pushed the jpkrohling/1861-improve-connection-state-logging-for-jaeger-exporter branch 5 times, most recently from e9a0d08 to f33f064 Compare December 3, 2020 08:07
@jpkrohling
Copy link
Member Author

@bogdandrutu would you please review it again?

@tigrannajaryan
Copy link
Member

It would be great to make this work for all gRPC exporters. Not necessarily in this PR, but in the future.

@jpkrohling
Copy link
Member Author

Sure, let's get this one merged and see the feedback from users. I'll then refactor the main logic from this PR into a generic facility, to be used by gRPC-based exporters.

@jpkrohling
Copy link
Member Author

ping @bogdandrutu

1 similar comment
@jpkrohling
Copy link
Member Author

ping @bogdandrutu

@jpkrohling
Copy link
Member Author

@bogdandrutu @tigrannajaryan ping?

}
s.AddStateChangeCallback(s.onStateChange)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there ever more than one callback needed? Also, perhaps initialize it in the above struct initialization. It is not clear why this gets a special treatment while other fields are initialized directly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there ever more than one callback needed?

No, perhaps in tests only. In any case, I think it's a good general pattern to allow multiple callbacks to exist. I'm happy to make it a simple property if you prefer.

It is not clear why this gets a special treatment while other fields are initialized directly.

It's because it needs to refer the instance s.

exporter/jaegerexporter/exporter.go Outdated Show resolved Hide resolved
exporter/jaegerexporter/exporter.go Outdated Show resolved Hide resolved
@jpkrohling
Copy link
Member Author

I won't be able to work on the review comments until I'm back from the holiday break (Jan 10).

@github-actions
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions
Copy link
Contributor

github-actions bot commented Jan 5, 2021

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Jan 14, 2021
@jpkrohling jpkrohling force-pushed the jpkrohling/1861-improve-connection-state-logging-for-jaeger-exporter branch from f33f064 to dea3b2f Compare January 14, 2021 15:46
@jpkrohling
Copy link
Member Author

PR updated. There's only one pending decision, about the callbacks.

@github-actions github-actions bot removed the Stale label Jan 15, 2021
@jpkrohling
Copy link
Member Author

@tigrannajaryan do you know what's wrong with the CI? Can you start it again?

@jpkrohling jpkrohling force-pushed the jpkrohling/1861-improve-connection-state-logging-for-jaeger-exporter branch 2 times, most recently from 745f47c to a12ca44 Compare January 21, 2021 12:06
Signed-off-by: Juraci Paixão Kröhling <juraci@kroehling.de>
@jpkrohling jpkrohling force-pushed the jpkrohling/1861-improve-connection-state-logging-for-jaeger-exporter branch from a12ca44 to 1632b15 Compare January 21, 2021 12:32
@jpkrohling
Copy link
Member Author

CI is failing with an unrelated issue:

FAIL	github.com/open-telemetry/opentelemetry-collector-contrib/processor/spanmetricsprocessor [setup failed]
config_test.go:31:2: module go.opentelemetry.io/collector@latest found (v0.18.0, replaced by /__w/opentelemetry-collector/opentelemetry-collector), but does not contain package go.opentelemetry.io/collector/processor/queuedprocessor

@jpkrohling
Copy link
Member Author

ping @bogdandrutu , @tigrannajaryan

@tigrannajaryan tigrannajaryan merged commit bfca1eb into open-telemetry:master Jan 22, 2021
MovieStoreGuy pushed a commit to atlassian-forks/opentelemetry-collector that referenced this pull request Nov 11, 2021
hughesjj pushed a commit to hughesjj/opentelemetry-collector that referenced this pull request Apr 27, 2023
Bumps [lycheeverse/lychee-action](https://github.com/lycheeverse/lychee-action) from 1.5.1 to 1.5.2.
- [Release notes](https://github.com/lycheeverse/lychee-action/releases)
- [Commits](lycheeverse/lychee-action@v1.5.1...v1.5.2)

---
updated-dependencies:
- dependency-name: lycheeverse/lychee-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Jaeger exporter causes Jaeger to throw "received bogus greeting"
4 participants