Skip to content
This repository has been archived by the owner on May 23, 2023. It is now read-only.

Implement ScopeManager for in-process propagation (updated) #64

Merged

Conversation

carlosalberto
Copy link
Contributor

@carlosalberto carlosalberto commented Feb 2, 2018

This is an updated PR for supporting Span propagation, sitting on top of @palazzem's effort (#63), and updated with the feedback received there, as well as updated with the latest Java API.

Summary of changes:

  • Tracer.start_manual was removed.
  • Tracer.start_span stays, without being deprecated, but now by default sets the active Span as the parent one, if any. This is a breaking change, and passing ignore_active_span=True will get the previous behavior.
  • Tracer.start_active became Tracer.start_active_span (longer, but better, more complete name overall).
  • finish_on_close became non-optional, being only left as optional in start_active_span as False (as all its parameters are optional, even operation_name).
  • Added Scope.manager, similar to Span.tracer (will also avoid duplicating code in the implementations).
  • Used properties for ScopeManager.active, Scope.span and others, to mimic what we have right now in the API (Span.context, Span.tracer, etc).

Decided to iterate on a different PR as @palazzem seems to be very busy himself, and we will most likely iterate faster over here ;)

Also, PR of the actual implementation code: opentracing/basictracer-python#25

Update 1: Mention that start_span() will implicitly use the active Span as the parent, if any.

cc @yurishkuro @clutchski @wkiser @itsderek23

Emanuele Palazzetti and others added 14 commits November 13, 2017 09:53
Make our start_* calls behave like the ones
from the Java API:

* start_span() stays untouched.
* start_manual() is gone (no need to deprecate
  as master does not have it yet).
* start_active() becomes start_active_span() (for
  better naming), and finish_on_close becomes 'False'
  by default.
* ignore_active_scope parameter becomes ignore_active_span,
  just as the Java api has it.
* ScopeManager and Scope have now properties
  instead of methods for exposing their
  members.
* Scope now also exposes finish_on_close.
* ScopeManager.activate() *requires* the
  finish_on_close parameter.
This is a little bit of an implementation attribute,
so lets leave it out of the main API.
It's also possible to finish the `Span` when the `Scope` context
expires:

with tracer.start_active_span('...',
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason the finish_on_close=True is not the default case? because that is what 99% of the use case will look like. The default value should be based on the default use case for active spans.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

totally agree, I think that if you create an active span it should finish automatically. In that way snippets like the following works out of the box:

with tracer.start_active_span('web.request'):
  # do something

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There a discussion about that during the Java API design: opentracing/opentracing-java#189 (comment) (pretty much the long debate + eventual PR was between instrumentation vs user code). The general opinion was going in the direction of finishSpanOnClose=false till we ended up removing the default value for it ;)

This (sadly) can't be done for the Python API here, as start_active_span takes all parameters as optional (even operation_name, as stated in the description).

That being said, I'd like to hear what other people think, and change to to True if needed ;)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the sake of keeping momentum, I'd be for:

  1. Keeping things as they are (finishSpanOnClose=false)
  2. Adding some comments to the method to help implementers make a decision on finishSpanOnClose.

IMO, this is most likely to bite end-users that are custom instrumenting their own code and aren't doing tracing as their day job. It's less likely to impact contrib repos as those are more likely to have a set of eyes on them. With OT early in its lifecycle, it feels like more of the work is on contrib than end users anyway, so this may be the type of decision that's OK to punt on for a while.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@itsderek23 Hey, thanks for the feedback ;) So, yes, one of the arguments in favor of finishSpanOnClose=false was the fact most people using this will be writing library instrumentation, rather than application code (and usually, OT is used there through a middleware-alike layer, so often a Span is not automatically closed, but activated and passed around).

Of course, there's also the argument that @beberlei and @palazzem do... so lets see and wait for more feedback. Nevertheless, I'd expect this moving (momentum, etc) right this week ;)

@itsderek23
Copy link

Thanks @carlosalberto for jumping in! 👏

@carlosalberto
Copy link
Contributor Author

Hey everybody - the latest iteration involves making finish_on_close a REQUIRED parameter, just as it happens in the Java API (as we still don't want to make assumptions of whether True or False should be the default, and which we could provide in the future, as long as we get more evidence of which one is a better value).

So right now start_active_span() requires TWO parameters: operation_name and finish_on_close, and the rest of the parameters stay just as start_span has them.

I'm wondering how good this overall feels, as, for a start, start_span has ALL of its parameters optional, including operation_name (which I'd love to NOT have as optional at all).

Let me know of your opinion on this and we will be adjusting this PR (towards a Release Candidate in pip ;) )

`Span` / `Scope` (via `ScopeManager#active`).
"""
def __init__(self):
# TODO: `tracer` should not be None, but we don't have a reference;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is just a nitpick and not really important to the discussion (so we can tackle even in another cleanup PR), but the idea was to have a noop module (or something similar) where we initialize all noop_* objects so we can use them everywhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and I think we can leave it for another PR IHMO, so we decide on how to handle it cleanly (I tried to work on it myself, but there's no clear approach here, so I decided to leave it out for now).


def start_active_span(self,
operation_name,
finish_on_close,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up for: #64 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the operation_name must be mandatory, since I don't see a legit span where the name is None. A span should always represent a unit of work, and it must have a proper name I think.

For the finish_on_close, this is probably the most "safe" solution since we delegate implementers and developers to make their choice when instrumenting their code. The advantage of having a default, is that we cover the most common case (that I think is people using OpenTracing and not exactly Tracer implementers since they usually go with the advanced usage in any case).

In the case of having a default, anyone should have an opinion on that. I think that the most common case is a developer that may not be interested in having such fine-grained control in their code, while implementers have and that's why they can switch to False easily. We may check opentracing-contrib to have a sense of the usage, though this is still the implementer point of view and not the developers one.


:return: a `Scope`, already registered via the `ScopeManager`.
"""
return self._noop_scope

def start_span(self,
operation_name=None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should keep this API in sync with the other one? In that case also this operation_name must be a not optional parameter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd love to do this - but decided to leave it out as a separated PR (so we can focus on the Scope side) (in case people get too worried about us breaking the API). Then again, we will probably want to include this in the next RC candidate...

@@ -37,16 +39,71 @@ class Tracer(object):

_supported_formats = [Format.TEXT_MAP, Format.BINARY, Format.HTTP_HEADERS]

def __init__(self):
def __init__(self, scope_manager=ScopeManager()):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should have scope_manager=None and then initialize it if not set. If we use ScopeManager() as default value the object will be shared across multiple calls. Here a code example of the result that we may don't want:

class ScopeManager(object):
    def __init__(self):
        self.a = 0

    def increment(self):
        self.a += 1
        return self.a


def init(scope_manager=ScopeManager()):
    return scope_manager.increment()


print(init())
print(init())

# output:
# 1
# 2 <-- it should be '1' since we may expect it as a new ScopeManager instance

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will update :) Even though its supposed to be a 'noop' ScopeManager, I'd rather be on the safe side, as you suggest ;)

We want to be on the safe side: now by default we let
scope_manager=None, and upon checking it within __init__(),
create a new ScopeManager if needed, so instances of Tracer don't share it.
@carlosalberto
Copy link
Contributor Author

carlosalberto commented Feb 6, 2018

Hey everybody - added a comment on the fact that Tracer.start_span() will include a breaking change, as now by default the created Span will use the active one as parent, if any. This is also what we did for Java, as we want to really have Span propagation out-of-the box for the most used methods.

This is of course a breaking change, and users need to know that ;)

cc @palazzem

@carlosalberto
Copy link
Contributor Author

CCing @bhs who could also be interested ;)

@yurishkuro
Copy link
Member

@carlosalberto did you mean that start_span() will automatically use the current active span as a parent? The new span itself won't actually become active unless created with start_active_span()?

@carlosalberto
Copy link
Contributor Author

did you mean that start_span() will automatically use the current active span as a parent?

This one ;) Fixed the wording - sorry, I left a 'be' out there my mistake. Hope it makes sense now :)

@carlosalberto carlosalberto force-pushed the span_propagation_updated branch from 9b46dda to dbf5f8e Compare February 12, 2018 13:36
@carlosalberto carlosalberto force-pushed the span_propagation_updated branch from dbf5f8e to 4db96b3 Compare February 12, 2018 13:39
@carlosalberto
Copy link
Contributor Author

carlosalberto commented Feb 12, 2018

Hey all - just updated the docs prior to do a Release Candidate so we can validate (or invalidate) the current approach. That way we will get more attention from the community and hopefully more feedback (specially for the default-case for start_active_span and related calls).

Any feedback on the docs is appreciated - so far I based the README changes on the Java API ones, adapting it to our Python world. Let me know!

@carlosalberto carlosalberto changed the base branch from master to v2.0.0 February 15, 2018 18:16
@carlosalberto
Copy link
Contributor Author

Re-targeting to the v2.0.0 branch, to keep the upstream effort there and have release candidates from it, as we did for the opentracing-java.

@carlosalberto carlosalberto merged commit 28f2f43 into opentracing:v2.0.0 Feb 16, 2018
@carlosalberto
Copy link
Contributor Author

Merging to v2.0.0 for further iteration and releasing candidate packages to PyPi. For further comments/feedback see #65 .

cc @beberlei @palazzem @itsderek23 @yurishkuro

"""ScopeManager accessor"""
return self._scope_manager

def start_active_span(self,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@carlosalberto I find it odd that start_span returns Span but start_active_span returns Scope. What's the reasoning behind this method name? Would it make more sense to have something like start_active_scope or am I missing something here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @indrekj

So at this time we are using the same naming as the Java 0.31 API - I do remember, though, there was debate on this same item when testing out such API. IRC, it was something like:

tracer.buildSpan("one").startActive();
tracer.buildSpan("two").startScope();

There was also some discussion about it in the Python side, and, all good and bad things considered, we decided to go with the start_active_span() name again. Hope that explains it ;)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

start_span_active might be less confusing. In Java the word span is not in the method name, so startActive and startScope were not significantly different, but here it does sound odd vs return type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get the feeling start_span_active could still also lead to some confusion...

Opinion on this? @palazzem @pglombardo

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In ruby, it's span = OpenTracing.start_active_span('operation_name').

I think there's a longer discussion about how to make scopes "hidden" from application developers, but still available to framework instrumentors via the ScopeManager. But I'd prefer we do that across-language, and not have a skewed API between languages at this stage.

Copy link

@indrekj indrekj Jun 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tedsuo it also returns a scope, not span. See https://github.com/opentracing/opentracing-ruby/blob/master/lib/opentracing/tracer.rb#L48

But I wouldn't take ruby as an example because we pretty much copied the python interface one-to-one. I started implementing scopes for ruby zipkin tracer and then noticed this method return value. That's why I was curios why it was chosen here. I'm not aware of any ruby tracer implementations that are using scopes yet so might still be a good time to change it if needed. /cc ruby side @mwear @itsderek23

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack sorry, I meant scope.

If we're going to change it, we should change it in both places. If we can change it in Ruby without breaking anyone, then perhaps there's a window of opportunity.

If we change the return value to span, the scope can still be accessed via the scope manager. This feels cleaner to me. Essentially, the Tracer has start_active_span and active_span as convenience methods. Applications developers, in general, do not need to think about Scopes. For developers writing tricky instrumentation for frameworks, etc, they can use the ScopeManager directly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI it's not just Ruby, it's also Java, PHP, and C#. So there's a broader decision to make here. I'd prefer we think about how to address this in a cross language manner, and not have python (or python and ruby) be slightly different than the rest, even if it is slightly better.

This probably means adding a new method to these other languages, and deprecating the current one. But let's check in with the Cross-Language Working Group before changing the API here, as it would essentially demand that the other languages update theirs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some discussion in gitter, we'd like to go with start_active_scope for now. There's a broader discussion for how to give application developers a simpler interface for the common case, when they are creating active child spans without any fancy scope juggling. But that is for a future release, not this one!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for joining late the thread. About this specific case, I'd say to keep the approach in sync between Ruby and Python (saw pending PRs) so we can proceed with start_active_scope(). For consistency and future releases, I'd say to write down an RFC that can recap the expected behavior / API name, so that we can have some days to think about it and review it across languages. Then we may end-up with something stable for years without introducing more deprecations / future breaking changes.

Thanks for reporting this concern!

carlosalberto added a commit that referenced this pull request Jul 10, 2018
* Implement ScopeManager for in-process propagation (updated) (#64)
* Updating docstrings and tests
* Testbed import (#80)
* Scope managers integration (#83)
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants