-
Notifications
You must be signed in to change notification settings - Fork 897
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Document limitations on span recording #3152
Closed
Closed
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it impossible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just asked the same question 😆 #3152 (review)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Watching spans in real time. When you have a long running task, and you can see its span and parent span before they are finished. See them the middle of the run. To trace long running jobs.
EDIT: Also to trace long running jobs that were in the end terminated by timeout or in any other way where end of span was not reached, not recorded, not sent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like @reyang said "it is just the current exporter and protocol which don't support the concept of having separate data that represents start and stop events". Protocol doesn't support sending span with no end attribute. That's why it is impossible to trace spans that were not completed using this protocol.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First, separate concepts for start and stop events are not required for that. I feel like the only limitation is the restrictions placed on the
end_time_unix_nano
field, i.e. if it's allowed to be 0 to indicate an incomplete span, then no other changes in the protocol are required to support the use case you describe.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, thanks @abitrolly! Maybe this can be a solution https://github.com/open-telemetry/opentelemetry-specification/blob/main/experimental/trace/zpages.md?
I believe spans are not designed for long running operations (e.g. I don't feel span is the right tool to track a batch job which runs for 5 hours).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yurishkuro maybe there is a better page to document the OpenTelementry limitations? I thought that protocol encompasses all interactions between tools. I would actually prefer to define actual solution rather than document limitations. For me the good solution is that makes end time optional, not relying on specific value to be set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@reyang zPages is a bad alternative, because requires polling over HTTP. Not all long running processes that needs to be traced expose web server. Think CI/CD pipelines for example.
If OpenTelemetry is a replacement for all other tracing protocols, it should support this tracing scenario firsthand (#2930).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@abitrolly I am not sure documenting limitations is a particularly productive exercise, especially in this case where I don't even see it as a limitation but rather a missing feature that simply has not been high on the priority list. I think there is an opportunity here to spec a new feature to support long-running processes better. I am not saying that this is the only way to implement such feature (i.e. you could go with a completely different, event-based protocol, aka streaming implementation of the OTEL API), but in practice sticking with the existing protocol is a much easier path. For example, Jaeger already has ability to receive multiple instances under the same span ID and merge them at query time, but it's probably not completely sufficient for this use case, I would prefer a better definition of the merge semantics and clear spec of the protocol that indicates partial spans.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yurishkuro if OLTP is going to be labelled stable, this feature won't find its place in the spec, and spec limitations need to be described.