-
Notifications
You must be signed in to change notification settings - Fork 77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Inter-operability of tracing systems operating with shorter trace-id #349
Comments
Thank you @SergeyKanzhelev, this looks good. I think the thing we've learned from recent experience is that providing clear guidance on interoperability with On recommendation: for clarity, I suggest we focus solely on the legacy |
@tedsuo your comment is that problem 5 in the list is very important. Not a new problem, correct? Thank you! |
Are we planning on addressing this prior to the in-person meeting, or should we save this as a topic for discussion then? |
@mtwo if we do, people that have a stake in that should be present so that we can reach a broad consent. Alternatively, I would hope that @adriancole @tylerbenson @jcchavezs @yurishkuro @nicmunroe @tedsuo @reyang join the weekly call next week. As much I like the process and others weighing in, I find it a bit unfortunate that this normative change will now delay the standard. So having everyone at the table and sorting it out for good would be a goal. |
fyi I'm not planning on joining any call. I don't think there is new information to present that I've not presented over the last years and also recently. |
@adriancole can I ask you to still do a final read-through after the change to make sure that that we |
please have an end user who's piped up read through. nic did a great job last time. you don't need me if you have more end users involved. |
I think we should attempt fixing it in small targeted PRs before the call. Unfortunately so far only @tedsuo confirmed that this list seems to be complete list of problems. Also @tedsuo suggests that this is in fact important problem to address in spec and (as I read it) we shouldn't just remove this paragraph. @nicmunroe @tylerbenson can you please comment on the completeness of the list. |
@danielkhan one option to approach the problem of normative change vs. completeness of a doc is to move the entire paragraph in non-normative section. This way recommendation will still be in a spec, but we can get into more details on how systems operate without contradicting "MUST" language of propagating the entire |
@SergeyKanzhelev the list items and general descriptions here look pretty good. I largely agree with the list. I am confused on a few sentences though. • In item 2 ("Backfilling on left vs. right"):
I'm not sure how it follows that the backfill logic must do right-backfill, ever. On a It's possible I'm missing something here, so assuming there are reasons that the conflict in item 2 forces backfilling right, then ultimately I think the proper way to solve it is to avoid the contradiction by deterministically backfilling left with zeros, not randomness. If we have to do surprising, confusing, and even more broken things to meet the spec requirement of left-side randomness, then let's change that spec requirement. • In item 3 ("Left padding of randomness"):
I don't understand what this means. When would having constant rightmost bytes ever be a problem, even if the leftmost bytes are random instead of zeros? I may need a concrete example to understand this sentence. But other than those two things ☝️ , I think I'm on board with how this is framed. And I agree with @tedsuo and item 5 that there should be a focus on the I'm not going to be able to be heavily involved in this unfortunately, but I don't think I really need to be. IMO the best way to resolve the major sticking points is to have the spec recommend that |
I think @nicmunroe summarized things very well and concur with his response above. (Nice job @nicmunroe!) |
Agreed. The solution of left-padding zeros will work well for us at New Relic.
|
I'd recommend proceeding with this and basically integrating Nic's
comments. There's no need for more ceremony than that.
…On Fri, Nov 1, 2019 at 7:18 AM Justin Foote ***@***.***> wrote:
Agreed. The solution of left-padding zeros will work well for us at New
Relic.
I fully agree with this comment from @nicmunroe
<https://github.com/nicmunroe>:
the best way to resolve the major sticking points is to have the spec
recommend that 64bit-only systems SHOULD backfill left with zeros, for both
ID generation and propagation.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#349?email_source=notifications&email_token=AAAPVVZO5EPOTPMFWPRGDC3QRNRU5A5CNFSM4JGAR6G2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOECZRBAY#issuecomment-548606083>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAAPVV2HJFEMJAAV7GSRAEDQRNRU5ANCNFSM4JGAR6GQ>
.
|
Sorry for the radio silence, I'll join the call today to discuss. I don't believe item 5 (prescribe the behavior of a 64bit-only system) is complex; we should simply specify how 64-bit IDs should be padded and truncated; and avoid confusion about random-backfill. |
I'm just finding this discussion now. To make sure I understand the current status, I'll try to summarize it; please let me know where I've misunderstood:
Edit: However, this is not what the latest Open Telemetry PR proposes. It instead recommends truncation of the trace id. Truncation is fine if the left-most bits are all 0, but otherwise I think it should instead proceed as if it's encountered an invalid trace id, which would mean to drop it rather than truncate it. This seems like a viable adoption step; was there a discussion thread I missed on truncating? What's the rationale for that? |
@morrisonlevi this comment is important for the understanding and getting on the same page: #344 (comment)
If you are using OpenTelemetry - propagation of full tarce-id will be implemented. However you can be using only right most part as an ID. Basically you operate as 128on64 system. Truncation is for truly Please check out the PR addressing these concerns: #356 |
This issue is tracking an overall resolution of inter-operability of tracing systems operating with shorter trace-id.
@bogdandrutu @adriancole @tylerbenson @jcchavezs @yurishkuro @nicmunroe @tedsuo @reyang @danielkhan (people who was involved in PR discussion) and others, please review this issue. It is written based on my understanding of the problems identified in a spec. Let's agree that this is the list of problems we have to address and then split this single issue into smaller issues to discuss how to resolve them.
Please comment/react on this issue to indicate that it reflects your understanding of a problems identified in spec.
There are existing tracing systems that currently operate with the 64bit
trace-id
, but still want to use the Trace Context headers for cross-process communication. The spec attempts to give an advice on how 128-bit, compliant to the spec tracing systems may cooperate wit the 64bit systems.Let's call the logic of generating 128bit
trace-id
by 64bit system a backfill logic. While addressing concern of adding more clarity into the spec regardingtrace-id
backfill logic (see #337), and whether this backfill logic must apply to new traces or traces "in transition", more problems with the current spec was highlighted. See discussion in PR #344.Here are the issues that were identified.
1. Split trace problem
If app A calls two apps, B and C, with a 64-bit trace, and both B and C each backfill using different algorithm, then the trace become split. It now has two separate trace IDs.
This is an original problem raised in #337. Spec must be very clear on how to backfill
trace-id
in case of newtrace-id
generation comparing to the backfill logic ontrace-id
propagation.2. Backfilling on left vs. right
Many existing systems dealing with the 64bit to 128bit transition already. These systems typically have a logic to backfill
trace-id
and look up traces given the subset of atrace-id
bytes. It is common that these systems implementtrace-id
look ups based on right-most characters as a typical backfill logic puts zeroes in the first bytes oftrace-id
.This backfill logic contradict with the other paragraph of a spec asking to put the randomness to the left side of the
trace-id
. So backfill logic, in order to follow this requirement, must backfill the rightmost bytes.This creates confusion and breaks inter-operability with the existing 64bit tracing systems.
3. Left padding of randomness.
Left-padding of randomness requirement creates another challenge for existing 64bit tracing systems. Keeping rightmost bytes constant will break compatibility with the tracing systems that operates on these rightmost characters.
This requirement must be either rewritten to make sure it works with 64bit systems or removed from the specification.
4. Zero vs. random backfill
We can differentiate tracing systems into three types:
tracestate
and 64 extra bits oftrace-id
Backfill logic that is currently described in a spec may work for 128on64bit systems. But it will not work for the 64bit systems. Even on a first trace-id generation, 64bit systems has no capabilities to preserve extra bytes and re-use them for subsequent outgoing calls from a single incoming call with the newly generated
trace-id
. Thus the requirements for 64bit systems may be different. Or the current requirement can be changed to simple zero backfill.5. Spec should prescribe the behavior of a 64bit-only system
64bit systems (as oppose to 128on64) are quite harmful for inter-operability of tracing systems. Not only these systems fail to propagate an entire
trace-id
, they also fail to propagatetracestate
.Specification currently declares these systems non-compliant - uses the "MUST" language in
trace-id
propagation. However inter-operability with these tracing systems is quite important. So the note in specification on how 64bit systems should operate will go a long way of ensuring a better interoperability of various tracing systems.The spec will benefit from being more explicit on how 64bit tracing systems must operate.
The text was updated successfully, but these errors were encountered: