Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[For 1.0] Mark TraceIdRatioBased experimental. #1412

Conversation

Oberon00
Copy link
Member

@Oberon00 Oberon00 commented Feb 9, 2021

Required for #1373.

See #1372 (comment)

TraceIdRatioBased is only vaguely specified and has a TODO in it. We should not mark it stable now.

Alternative temporary solutions:

  1. Declare it experimental (this PR)
  2. Completely remove that sampler for now (from spec)
  3. Document that only the algorithm is unspecified and may break for language SDKs, but the interface for configuration/creation is stable --> [For 1.0] Add WARNING for TraceIdRatioBased algorithm stability. #1414
  4. Do nothing -- the TODO might already be clear enough

Alternative permanent solution:

  1. (@jkwatson) Quickly specify it before 1.0 [For 1.0] Mark TraceIdRatioBased experimental. #1412 (comment),
  2. Just remove the TODO, leave algorithm unspecified forever [For 1.0] Mark TraceIdRatioBased experimental. #1412 (comment)

Follow-up to find some proper solution is in #1413.

@Oberon00 Oberon00 requested review from a team February 9, 2021 14:13
@Oberon00 Oberon00 mentioned this pull request Feb 9, 2021
@Oberon00 Oberon00 added area:sampling Related to trace sampling area:sdk Related to the SDK release:required-for-ga Must be resolved before GA release, or nice to have before GA spec:trace Related to the specification/trace directory labels Feb 9, 2021
@jkwatson
Copy link
Contributor

jkwatson commented Feb 9, 2021

This will mean that any language that has implemented this as a part of their SDK will need to pull it out of the SDK and move out to an experimental package before they can release 1.0.

@Oberon00
Copy link
Member Author

Oberon00 commented Feb 9, 2021

I think that would be the cleanest solution, yes. But it might be impractical / not worth it at this point already (especially since I don't think there will be fitting existing packages for most SIGs). I don't have a strong opinion on which option would be best, which is why also suggested the alternatives incl "do nothing".

@dyladan
Copy link
Member

dyladan commented Feb 9, 2021

This will mean that any language that has implemented this as a part of their SDK will need to pull it out of the SDK and move out to an experimental package before they can release 1.0.

We already don't know if the languages work together as advertised with the ratio based sampler. At the very least it needs to be documented.

@arminru
Copy link
Member

arminru commented Feb 9, 2021

I would prefer alternative 2 given how close we are to the release. It should be clearly documented for the reason @dyladan pointed out. Ideally it would not only say so in the spec but also in the doc for each language implementation ("configuration/creation interface is stable but underlying algorithm is unspecified and subject to change").

@carlosalberto
Copy link
Contributor

Let's go with 2) definitely.

@Oberon00
Copy link
Member Author

Oberon00 commented Feb 9, 2021

I created #1413 for after-ga follow-up.

@jsuereth You upvoted @arminru but approved this PR. This PR is not doing alternative 2. I will put up a new PR for that.

@Oberon00
Copy link
Member Author

Oberon00 commented Feb 9, 2021

#1414 should implement alternative 2.

@jkwatson
Copy link
Contributor

jkwatson commented Feb 9, 2021

How about
4) Specify this quickly so that languages can have it implemented for 1.0.0

@Oberon00
Copy link
Member Author

Oberon00 commented Feb 9, 2021

There was already a PR for this (#331) that was rejected long ago. Sounds fine for me.

@Oberon00
Copy link
Member Author

Oberon00 commented Feb 9, 2021

Another alternative: Leave the algorithm unspecified forever. The name TraceIdRatioBased is then a bit useless but that is a minor point.

Copy link
Member

@SergeyKanzhelev SergeyKanzhelev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this PR is a fair representation of a current state of a spec. I'd also vote for having a defined algorithm for better interoperability. But definitely a separate discussion.

@yurishkuro
Copy link
Member

Completely remove that sampler for now (from spec)

+1

If there is no spec for the algorithm, then there is no point in mentioning this sampler in the spec, since SDKs either won't know how to implement it or do it inconsistently.

@bogdandrutu
Copy link
Member

I am fine to declare as experimental, there were PRs to define the algorithm, I will make sure to do that this week (proposed that PR). So I think we can leave it and be experimental

@jsuereth
Copy link
Contributor

jsuereth commented Feb 9, 2021

@Oberon00 Sorry, I was reading 2 PRs at the same time, got confused which one I was reviewing.

I think Option #2 is the right one. A few notes I didn't have time to say in the Specification meeting (had a computer crash due to memory thrashing right in the middle of the discussion).

  • SemVar does NOT specify that behavior cannot change. Indeed, the behavior HAS to be abel to change for any bug fix to occur.
  • The algorithm is unspecified but all the requirements of the algorithm ARE specified for this sampler.
  • We have existing implementations of this sampler, but no one asked the simple question of whether they meet the specified requirements, or possibly use the same algorithm (i.e. the copy-paste from each other).

I'm still in favor of removing the TODO on the algorithm. I think you need to NO MORE here, as the requirements for an algorithm are clearly stated and any adjustments to existing implementations can be considered bug-fixes to those requirements.

Side Note: Is there an efficient way to ask SDKs for inputs measure their compliance to the requirements around ratio sampling between upstream downstream processes?

Copy link
Contributor

@jsuereth jsuereth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing my approval. Sorry, my vote is for #2 which was a separate PR I was looking at.

@Oberon00
Copy link
Member Author

Oberon00 commented Feb 9, 2021

but all the requirements of the algorithm ARE specified for this sampler

There are still several implementations possible that are inconsistent with each other (e.g. Python might sample a particular trace ID while Java might not at the same ratio). Whether we want this or not is another question, but probably not a trivial one, see #1413

@yurishkuro
Copy link
Member

but all the requirements of the algorithm ARE specified for this sampler

Requirements are not specification. "all wifi devices MUST interop" is a requirement, but it's not helpful to implementors. For this sampler to be in the spec, the spec needs to clearly describe how it is supposed to work, e.g. what hashing algorithm all languages must use, etc.

An "Experimental" label to me means: here's a specification of how a thing should work, try to implement it, ok if you don't, it's not mandatory. The spec part is still required even for experimental, otherwise there's absolutely no guarantee that implementations across languages will be compatible with the stated requirements.

@jsuereth
Copy link
Contributor

jsuereth commented Feb 9, 2021

@Oberon00

but all the requirements of the algorithm ARE specified for this sampler

There are still several implementations possible that are inconsistent with each other (e.g. Python might sample a particular trace ID while Java might not at the same ratio). Whether we want this or not is another question, but probably not a trivial one, see #1413

Actually not. The implementations between servers need to be consistent per the requirement, so any inconsistent in SDKs would NOT be according to specification.

It's a nuance that's not worth it I guess, but I think while there's MANY possible implementations specified, only ONE can be implemented across all SDKs to be compliant. I.e. we already specify everyone uses the same algorithm, just not exactly which one.

I'd love to understand if there ARE inconsistencies today or if we can just remove the TODO, and then write up the specification of what everyone implemented in the next spec release rather than churn the experimental tag here.

@bogdandrutu
Copy link
Member

@jsuereth and everyone else we went with #1414 to allow implementation to move forward without big problems.

@bogdandrutu bogdandrutu closed this Feb 9, 2021
@Oberon00 Oberon00 deleted the bugfix/experimental-traceidratio branch February 9, 2021 22:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:sampling Related to trace sampling area:sdk Related to the SDK release:required-for-ga Must be resolved before GA release, or nice to have before GA spec:trace Related to the specification/trace directory
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants