Expose metric for zipkin spans that could not be parsed #2264
Comments
Thinking about it more, this might be a bug, since this counter should really be incremented:
|
This metric can only be used once the span data has been correctly parsed, as it needs to identify the service. So either a new metric should be defined, or the service label would need a specific value to indicate the failure occurred on deserialization. |
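To illustrate the second option above (a reserved value for the service label), here is a minimal stdlib-only sketch. It is not Jaeger's code: `zipkinSpan`, `serviceLabel`, `recordSpan` and the `parseFailureLabel` value are hypothetical names, and the struct models only the two Zipkin v2 JSON fields needed for the example.

```go
package main

import "encoding/json"

// parseFailureLabel is a hypothetical reserved value for the service label,
// used when the payload cannot be deserialized and the real service is unknown.
const parseFailureLabel = "unparseable"

// zipkinSpan models only the fields this sketch needs. Tags are declared as
// string values, so a non-string tag makes deserialization fail.
type zipkinSpan struct {
	LocalEndpoint struct {
		ServiceName string `json:"serviceName"`
	} `json:"localEndpoint"`
	Tags map[string]string `json:"tags"`
}

// serviceLabel returns the service name when the payload parses, and the
// reserved label otherwise.
func serviceLabel(payload []byte) string {
	var span zipkinSpan
	if err := json.Unmarshal(payload, &span); err != nil {
		return parseFailureLabel
	}
	if span.LocalEndpoint.ServiceName == "" {
		return parseFailureLabel
	}
	return span.LocalEndpoint.ServiceName
}

// recordSpan increments a per-service counter; a plain map stands in for a
// real labeled counter.
func recordSpan(counts map[string]int, payload []byte) {
	counts[serviceLabel(payload)]++
}
```

With this shape, a span carrying a numeric tag value is counted under the reserved label instead of being dropped silently. The alternative is a separate metric with no service label at all.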
Ahhh good point and makes sense, thanks for clarifying |
I'd like to try doing this one. Is it still relevant, @yurishkuro, @Stono? I managed to have a quick look but will need some guidance since I haven't worked on Jaeger before:
|
If we do not have any metrics on the HTTP endpoints, this is where I would start, i.e. emit a classic RED set where E(rrors) are labeled by the HTTP status code. The OP does not ask for metrics by the originating service, and I think that's OK. |
Thanks. OP wrote just about the collector's endpoints. Should I add these onto the query & agent's endpoints as well?
|
The query service and the agent do not have HTTP endpoints that receive spans. The UDP endpoint in the agent already emits metrics. |
Hello! At the moment, spans that contain invalid data are rejected with a 400 Bad Request, but there is no associated metric we can monitor and alert on. We're simply looking for a basic metric that tells us bad data is arriving, so we can then debug further. |
Requirement - what kind of business use case are you trying to solve?
I would like to use Prometheus metrics to detect spans that are rejected with a 400 Bad Request (see istio/istio#24177).
There is currently nothing on :14269/metrics that captures such a failure (in this case, a span tag that was not a string).
Problem - what in Jaeger blocks you from solving the requirement?
We build a platform on which our product teams use Jaeger for debugging, and we would like to detect problems before they get raised to us as missing spans in the UI.
Proposal - what do you suggest to solve the problem or improve the existing situation?
Expose a Prometheus metric that tracks failed Zipkin span reports, so that we can alert on it.
Any open questions to address