Skip to content
This repository has been archived by the owner on May 23, 2023. It is now read-only.

Introduce ActiveSpan, ActiveSpan.Continuation, and ActiveSpanSource #115

Merged
merged 48 commits into from
May 11, 2017
Merged
Show file tree
Hide file tree
Changes from 35 commits
Commits
Show all changes
48 commits
Select commit Hold shift + click to select a range
000a8c3
Introduce ActiveSpanHolder ASH.Continuation
bhs Mar 28, 2017
6e24189
Improve README and adjust to improve fluency
bhs Mar 28, 2017
5c6b992
Factor out Handle and move to pure interfaces
bhs Apr 1, 2017
cc71c24
Separate ActiveSpan and ActiveSpanSource
bhs Apr 4, 2017
2058413
Demonstrate the ActiveSpanHolder interfaces
bhs Mar 28, 2017
d533226
Adjust to rebase
bhs Mar 28, 2017
6b10c9b
Actually write to MDC
bhs Mar 28, 2017
f63d20b
Adjust to core changes
bhs Apr 1, 2017
e88b0a9
Update to follow suit with core changes
bhs Apr 4, 2017
26b23f3
Separate ActiveSpan and ActiveSpanSource
bhs Apr 4, 2017
e48d475
Improve README and fix small issues
bhs Apr 5, 2017
79880a4
Deal with rebase fallout
bhs Apr 7, 2017
e3ee448
ActiveSpanSource -> ActiveSpanProvider
bhs Apr 7, 2017
56d01f9
Un-rename
bhs Apr 7, 2017
54e6166
Make Tracer inherit from ActiveSpanSource
bhs Apr 7, 2017
24178b9
Deprecate start() in favor of startManual()
bhs Apr 7, 2017
281e5dc
Clean up comments
bhs Apr 7, 2017
4707b23
Truth in advertising
bhs Apr 7, 2017
5b6f17f
Update GlobalTracer
bhs Apr 8, 2017
619fcd9
Add ThreadLocalActiveSpan* tests
bhs Apr 8, 2017
b26fe7c
Update license headers
bhs Apr 8, 2017
d87c1c7
Make javadoc work and fix some documentation bugs
bhs Apr 10, 2017
b1468f7
Perform a `%s/defer/capture/g`
bhs Apr 10, 2017
4087c3e
Fix IntelliJ mangling
bhs Apr 10, 2017
b39fe6d
Update README
bhs Apr 14, 2017
28223dd
s/adopt/makeActive/g
bhs Apr 14, 2017
a8e1a1d
Fix asRoot naming
bhs Apr 14, 2017
9799fd4
Remove MDC demo (preparing for release)
bhs Apr 15, 2017
9dd7be6
Fix some minor rebase issues
bhs Apr 16, 2017
49a3548
Respond to PR comments
bhs Apr 16, 2017
1c250e9
Respond to PR comments
bhs Apr 17, 2017
8776bff
Respond to PR comments
bhs Apr 18, 2017
cd16c62
Respond to PR comments
bhs Apr 20, 2017
07ad79a
Fix NoopSource's SpanContext
bhs Apr 20, 2017
089c904
Review responses
bhs Apr 21, 2017
574a7d2
Fix typo
bhs Apr 24, 2017
c28c44c
Add licenses to files the maven plugin missed
bhs Apr 24, 2017
4718dfb
Respond to PR comments
bhs Apr 25, 2017
bc351dc
Minimize public surface of NoopActiveSpanSource
bhs Apr 28, 2017
6539f4a
Add BaseSpan to clean up ActiveSpan/Span mechanics
bhs Apr 29, 2017
36bdb45
Move ThreadLocalActiveSpan to .util
bhs Apr 29, 2017
4d03cc5
Add license headers
bhs Apr 29, 2017
b5ddc97
Fix tests and a comment
bhs Apr 30, 2017
bf4606c
Clean up more Span/BaseSpan code
bhs Apr 30, 2017
ae36949
Make BaseSpan sub-interface chaining work
bhs May 1, 2017
20a3c67
Self-review with a fine-toothed comb
bhs May 8, 2017
9856c90
Do a full clarification pass on the README
bhs May 9, 2017
f1e5aa7
Respond to PR comments
bhs May 9, 2017
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
129 changes: 129 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,135 @@ to facilitate unit-testing of OpenTracing Java instrumentation.

Packages are deployed to Maven Central under the `io.opentracing` group.

## Usage

### Initialization

Initialization is OpenTracing-implementation-specific. Generally speaking, the pattern is to initialize a `Tracer` once for the entire process and to use that `Tracer` for the remainder of the process lifetime. The [GlobalTracer](https://github.com/opentracing/opentracing-java/blob/master/opentracing-util/src/main/java/io/opentracing/util/GlobalTracer.java) provides a helper for singleton access to the `Tracer`.

### `ActiveSpan`s, `Continuation`s, and within-process prapagation

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/prapagation/propagation/

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

other than this trivial typo, no further comments from me, thx.


For any Thread, at most one `Span` may be "active". Of course there may be many other `Spans` involved with the Thread which are (a) started, (b) not finished, and yet (c) not "active": perhaps they are waiting for I/O, blocked on a child Span, or otherwise off of the critical path.

It's inconvenient to pass an active `Span` from function to function manually, so OpenTracing provides an `ActiveSpanSource` abstraction to grant access to an `ActiveSpan` and to pin and `capture()` it for re-activation in another Thread (e.g., in an async callback).

Access to the active `Span` is straightforward:

```
io.opentracing.Tracer tracer = ...;
...
ActiveSpan span = tracer.activeSpan();
if (span != null) {
span.log("...");
}
```

### Starting a new Span

The simplest case – which does not take advantage of `ActiveSpanSource` – looks like this:

```
io.opentracing.Tracer tracer = ...;
...
Span span = tracer.buildSpan("someWork").startManual();
try {
// (do things / record data to `span`)
} finally {
span.finish();
}
```

Or, to take advantage of `ActiveSpanSource` and automatic intra-process propagation, like this:

```
io.opentracing.Tracer tracer = ...;
...
try (ActiveSpan activeSpan = tracer.buildSpan("someWork").startActive()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be good to give the full API example first, i.e.

ActiveSpan activeSpan = tracer.buildSpan("someWork").startActive()
try {
} finally {
  activeSpan.deactivate();
}

and then the same with using try-with-resource, as a shorthand. The try-finally example better explains the API imo, without "magic".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did something...

// Do things.
//
// If we create async work, `activeSpan.capture()` allows us to pass the `ActiveSpan` along as well.
}
```

The above is semantically equivalent to the more explicit try-finally version:

```
io.opentracing.Tracer tracer = ...;
...
ActiveSpan activeSpan = tracer.buildSpan("someWork").startActive();
try {
// Do things.
} finally {
activeSpan.deactivate();
}
```


**If there is an `ActiveSpan`, it will act as the parent to any newly started `Span`** unless the programmer invokes `ignoreActiveSpan()` at `buildSpan()` time, like so:

```
io.opentracing.Tracer tracer = ...;
...
ActiveSpan span = tracer.buildSpan("someWork").ignoreActiveSpan().startActive();
```

### Deferring asynchronous work

Consider the case where a `Span`'s lifetime logically starts in one thread and ends in another. For instance, the intra-Span timing breakdown might look like this:

```
[ ServiceHandlerSpan ]
|-FunctionA-|-----waiting on an RPC------|-FunctionB-|

------------------------------------------------> time
```

The `"ServiceHandlerSpan"` is _active_ when it's running FunctionA and FunctionB, and inactive while it's waiting on an RPC (presumably modelled as its own Span, though that's not the concern here).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it a bit confusing referring to functions as execution contexts. Maybe mention that functionB is running in a different thread?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not really what I meant... I was more referring to things like the MDC state that doesn't have to do with OT. I was also (inadvertently) using the term to refer to a thread a few lines up, though, and I've changed that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's one other occurance in L29


**The `ActiveSpanSource` makes it easy to `capture()` the Span and execution context in `FunctionA` and re-activate it in `FunctionB`.** Note that every `Tracer` must also implement `ActiveSpanSource`. These are the steps:

1. Start the `ActiveSpan` via `Tracer.startActive()` rather than via `Tracer.startManual()`; or, if the `Span` was already started manually via `startManual()`, call `ActiveSpanSource#makeActive(span)`. Either route will yield an `ActiveSpan` instance that encapsulates the `Span`.
2. In the method/function that *allocates* the closure/`Runnable`/`Future`/etc, call `ActiveSpan#capture()` to obtain an `ActiveSpan.Continuation`
3. In the closure/`Runnable`/`Future`/etc itself, invoke `ActiveSpan.Continuation#activate` to re-activate the `ActiveSpan`, then `deactivate()` it when the Span is no longer active (or use try-with-resources for less typing).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So many verbs in a small bit of text... start, activate, adopt, allocate, defer, re-activate and deactivate ...
By the way, I think they're all much better chosen than previous iterations, but I think a reader trying to discover what OpenTracing is will get absolutely overwhelmed by these couple of bullets 😉

For example:

```
io.opentracing.Tracer tracer = ...;
...
// STEP 1 ABOVE: start the ActiveSpan
try (ActiveSpan serviceSpan = tracer.buildSpan("ServiceHandlerSpan").startActive()) {
...

// STEP 2 ABOVE: capture the ActiveSpan
final ActiveSpan.Continuation cont = serviceSpan.capture();
doAsyncWork(new Runnable() {
@Override
public void run() {

// STEP 3 ABOVE: use the Continuation to reactivate the Span in the callback.
try (ActiveSpan activeSpan = cont.activate()) {
...

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the case where this Runnable exists solely/primarily to make a remote call to some other service it seems inconvenient to have to tracer.inject(cont.activate().context(), ...). I'm tempted to ask if a Continuation can be "serializable" for inter-process calls, at which point Continuation becomes the vehicle for context propagation intra- and inter-process?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting idea.

Typically the RPC instrumentation creates a child span just before making the RPC call, and in the same thread, even if the response is handled in the other thread. So the span and its context is usually available, without needing to use the Continuation to inject(...). And in case of synchronous calls forcing the creation of Continuation just to inject(...) is unnecessary overhead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it's an interesting idea... that said, I really like that the inject machinery only depends upon immutable SpanContexts... in various edge cases that's important.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a stupid question because I don't know much about Java. But in this example, if I have access to the outer cont variable in the Runnable (it's not passed as a parameter), why couldn't I just access the serviceSpan directly instead and write something like this - completely without the need of active spans, continuations etc.:

someMethod() {
    final Span serviceSpan = tracer.buildSpan("ServiceHandlerSpan").start();
    
    serviceSpan.setTag("key1", "value");

    doAsyncWork(new Runnable() {
        @Override
        public void run() {
            serviceSpan.setTag("key3", "value");
            serviceSpan.finish();
        }
    }

    serviceSpan.setTag("key2", "value");
}

Would this not be possible?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cwe1ss there are two problems. The larger one is that the Tracer.activeSpan() wouldn't be set up properly whene someMethod finishes (i.e., it wouldn't revert to whatever was active prior to the start() call). The smaller one is that – when forking and joining work – it is not always easy to know which callback finishes last, and the refcounting can help with that. But I consider the former issue to be way more important (and honestly the latter comes with a bunch of new failure modes, so it's not all gravy).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh yes - I didn't think about the first issue. thx for the explanation! I wonder if Java will get async/await as well one day. It makes it sooo much easier!

}
}
});
}
```

In practice, all of this is most fluently accomplished through the use of an OpenTracing-aware `ExecutorService` and/or `Runnable`/`Callable` adapter; they can factor most of the typing.

#### Automatic `finish()`ing via `ActiveSpan` reference counts
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with ThreadLocalActiveSpan being moved to opentracing-util, but this section still suggests that automatic finishing via ref counting will happen all the time. IMHO this section should be explicit on whether all tracers must implement ref counting (via ThreadLocalActiveSpan or any other means) or if it is a optional behavior that the tracer implementations can decide to support or not.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the behavior of active span is different between implementations, then frameworks wouldn't know what the right way is to instrument their code. So the spec must be unambiguous that de-activating the last instance of the active span always finishes the span. Is there another way to achieve that without ref counting?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whether there's an AtomicInteger is an implementation detail, but I don't see how this spec could be implemented without reference counting of some sort.


When an `ActiveSpan` is created (either via `Tracer.SpanBuilder#startActive` or `ActiveSpanSource#makeActive(Span)`), the reference count associated with the `ActiveSpan` is `1`.

- When an `ActiveSpan.Continuation` is created via `ActiveSpan#capture`, the reference count **increments**
- When an `ActiveSpan.Continuation` is `ActiveSpan.Continuation#activate()`d and thus transformed back into an `ActiveSpan`, the reference count **is unchanged**
- When an `ActiveSpan` is `ActiveSpan#deactivate()`d, the reference count **decrements**

When the reference count decrements to zero, **the `Span`'s `finish()` method is invoked automatically.**

When used as designed, the programmer should not need to invoke `Span#finish()` manually, but rather should let `ActiveSpan` and `ActiveSpan.Continuation` invoke `finish()` as soon as the last active or deferred `ActiveSpan` is deactivated.

# Development

This is a maven project, and provides a wrapper, `./mvnw` to pin a consistent
Expand Down
96 changes: 96 additions & 0 deletions opentracing-api/src/main/java/io/opentracing/ActiveSpan.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,96 @@
/**
* Copyright 2016-2017 The OpenTracing Authors
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except
* in compliance with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software distributed under the License
* is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
* or implied. See the License for the specific language governing permissions and limitations under
* the License.
*/
package io.opentracing;

import java.io.Closeable;

/**
* In any execution context (or any thread, etc), there is at most one "active" {@link Span}/{@link ActiveSpan}
* primarily responsible for the work accomplished by the surrounding application code. That {@link ActiveSpan} may be
* accessed via the {@link ActiveSpanSource#activeSpan()} method. If the application needs to capture work that should
* be part of the same Span, the Source provides a {@link ActiveSpan#capture} method that returns a
* {@link Continuation}; this continuation may be used to re-activate and continue the {@link Span} in that other
* asynchronous executor and/or thread.
*
* <p>
* {@link ActiveSpan}s are created via {@link Tracer.SpanBuilder#startActive()} or {@link ActiveSpanSource#makeActive}.
* They can be {@link ActiveSpan#capture()}ed as {@link ActiveSpan.Continuation}s, then
* re-{@link Continuation#activate()}d later.
*
* <p>
* NOTE: {@link ActiveSpan} extends {@link Closeable} rather than {@code AutoCloseable} in order to keep support
* for JDK1.6.
*
* @see ActiveSpanSource
*/
public interface ActiveSpan extends Closeable, Span {
Copy link
Contributor

@sjoerdtalsma sjoerdtalsma Apr 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe explicitly override the close() method to:

  • Not declaring IOException from close() makes it more useable in my opinion.
  • Document that close is intended as a try-with-resources usable alias for deactivate().

(the first is more important than the second bullit)

/**
* Mark the end of the active period for the {@link Span} pinned by this {@link ActiveSpan}. When the last
* {@link ActiveSpan} is deactivated for a given {@link Span}, it is automatically {@link Span#finish()}ed.
* <p>
* NOTE: Calling {@link #deactivate} more than once on a single {@link ActiveSpan} instance is undefined behavior.
*
* @see Closeable#close() {@link ActiveSpan}s are auto-closeable and may be used in try-with-resources blocks
*/
void deactivate();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it also restore the previous active span that was current before this one was activated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's certainly what an obvious impl does, but is that part of the spec? E.g., imagine an ActiveSpanSource that plugs into a totally-synchronous HTTP serving framework... it might define the next active Span whenever the next request gets serviced, and not before.

Copy link
Member

@pavolloffay pavolloffay Apr 7, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still a little bit worried about situations when there are multiple ActiveSpans associated with the same span. So deactivation causes multiple finishes.

The second potential issue is that finish does not imply deactivation. (maybe I am trying to find infallible solution to all undefined use cases, Actually, I don't think these use cases are undefined, I just trying to protect naive use of OT).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what to say about this... I definitely see your point and agree that there is a risk here. Personally, I'm ok with it, but it's unfortunately just a judgement call and I have no data to suggest that programmers won't trip over this.

I basically want to get people out of the habit of calling finish and get in the habit of using try {} for most straightforward Span lifetimes.

So, to be clear: you would be happier if we wrote something like this in the spec/docs:

If `Span.finish` is called while `ActiveSpan`s are still extant, the effect is to turn all future `deactivate` calls into noops.

Correct? I am not convinced I want to put that sort of bookkeeping responsibility upon implementations (esp since it still creates a weird ambiguity about the tagging/logging features of the Span after finish has been called), but maybe we can agree on what the hypothetical change would be.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not very sure about this situations either..

About: try-with-resources it is a nice straight forward construct, but for instance in framework integrations start and finish usually run in separate threads...

see this: https://github.com/redhat-helloworld-msa/ola/blob/82aeeec525bef2102098cd5602d1f13af07a124f/src/main/java/com/redhat/developers/msa/ola/OlaController.java#L118

If somebody would call deactivate span would be finished at a wrong place.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to be a total ignorant but I am still thinking about various use cases. I have something contra ref count approach:

imagine you call Continuation ActiveSpan.capture() to prepare for some execution in a different thread, but there is some exception thrown and following code won't be called:

ActiveSpan activeSpan = Continuation.active()`
spactiveSpan.deactivate()

and span won't be finished. In general an instrumented callable is created but between its execution an error can happen and callable won't be called.

shouldn't be refCount increment moved to activate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in framework integrations start and finish usually run in separate threads...

Of course you are correct about this. IMO it means that the framework would need an actual (internal-to-framework) code change to make tracing sensible, at least when a threadlocal is involved. Not a great answer, but I don't see how we solve this with any approach that's optimized for threadlocals.

As for the refcounting: I am pretty certain that the increment needs to happen at capture()-time... But maybe we should change the contract for finish() to make it explicitly safe (but a noop) to call finish() twice on a single Span instance... that way the exception handler in your example could finish() the Span (and tag it with an error, etc) even if there are Continuations outstanding. Another important point would be to have something like Continuation.cancel() which would be a synonym for Continuation.activate().deactivate(). (I.e., it decrements the refcount)

Thoughts??

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is ActiveSpan.deactivate exposed as a separate method? Can't this be done in finish so that the usage is transparent for the user? Would it be valid to call finish() on an ActiveSpan without deactivating it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just read the explanation below (regarding reference counts etc), but isn't this still something that could be done within finish? It would just decrease the reference counts until it reached 0 and then call into wrapped.finish();

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cwe1ss I would be interested what you think about #116.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cwe1ss this is the downside of having ActiveSpan inherit from Span... it saves typing, but then these two conceptually similar (but sadly not equivalent) things – deactivate and finish – coexist in the ActiveSpan interface.

Just thinking out loud, another idea (??) would be to have a sort of BaseSpan (naming TBD) that has the tagging and logging but not the finish-ing... ActiveSpan would inherit from BaseSpan, as would a new class called, say, ManualSpan which would have the explicit finish() method.


/**
* A synonym for {@link #deactivate()} that can be used in try-with-resources blocks.
*/
@Override
void close();

/**
* "Capture" a new {@link Continuation} associated with this {@link ActiveSpan} and {@link Span}, as well as any
* 3rd-party execution context of interest.
*
* <p>
* <em>IMPORTANT:</em> the caller MUST {@link Continuation#activate()} and {@link ActiveSpan#deactivate()} the
* returned {@link Continuation} or the pinned {@link Span} will never automatically {@link Span#finish()}. That is,
* calling {@link #capture()} increments a refcount that must be decremented somewhere else.
*
* <p>
* The associated {@link Span} will not {@link Span#finish()} while a {@link Continuation} is outstanding; in
* this way, it provides a reference/pin just like an {@link ActiveSpan} does.
*
* @return a new {@link Continuation} to {@link Continuation#activate()} at the appropriate time.
*/
Continuation capture();

/**
* A {@link Continuation} can be used *once* to activate a Span along with any non-OpenTracing execution context
* (e.g., MDC), then deactivate when processing activity moves on to another Span. (In practice, this active
* period typically extends for the length of a deferred async closure invocation.)
*
* <p>
* Most users do not directly interact with {@link Continuation}, {@link Continuation#activate()} or
* {@link ActiveSpan#deactivate()}, but rather use {@link ActiveSpanSource}-aware Runnables/Callables/Executors.
* Those higher-level primitives need not be defined within the OpenTracing core API, and so they are not.
*
* @see ActiveSpanSource#makeActive(Span)
*/
interface Continuation {
/**
* Make the Span (and other execution context) encapsulated by this Continuation active and return it.
*
* <p>
* NOTE: It is an error to call activate() more than once on a single Continuation instance.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what should happen, an exception?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like throwing exceptions from monitoring code... maybe just explain that it's the dreaded "undocumented behavior"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are only three options how the implementation can handle it

  1. return a functional active span
  2. return null
  3. throw exception

Is there a reasonable / workable implementation for the first option? If not, then the second option is practically equivalent to the third, since NullPointerException will happen anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I guess there's also

  1. Return a noop ActiveSpan

?

Copy link
Contributor

@sjoerdtalsma sjoerdtalsma Apr 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How important is it to forbid this?
Say I capture the active span and want to start two new threads.
What's objectionable to reactivating the same active span (continuation) in both threads (from a logical perspective)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What benefits in particular would have NoSpan in contrast to null?

Maybe there is yet another option:
if called # > 1 it would increment refcount.. but then the semantic representation of Continuation is not very clear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pavolloffay the benefit is pretty simple: callers would not have to check for null. E.g., compare

void someMethod(String color) {
    GlobalTracer.activeSpan().setTag("color", color);
    ...
}

vs.

void someMethod(String color) {
    Span active = GlobalTracer.activeSpan();
    if (active != null) {
        active.setTag("color", color);
    }
    ...
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bhs yes this is obvious. I was thinking about other benefits.

This is definitely an interesting topic. By using NoSpan caller can loose some tracing data without noticing. How would be handled situation when NoSpan is passed as a reference to SpanBuilder?

Btw. there are two places for NoSpan/null

  1. ActiveSpan: ActiveSpanSource.active()
  2. ActiveSpan: Continuation.activate()

for 1. if it returns no/null caller would probably start a new trace in almost all situations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK re "this is obvious" – just wanted to point out the main usability benefit.

It's more NoSpanContext than NoSpan, but the Tracer.extract()-when-there's-nothing-there return value feels similar to this, too.

Copy link
Contributor

@mabn mabn Apr 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. ActiveSpan: ActiveSpanSource.active()
  2. ActiveSpan: Continuation.activate()

These are different cases.

  1. The code invoking Continuation.activate() will typically be some kind of a Runnable, some "entry point" into another context of execution. If will be passed Continuation explicitly and because of that it can be assumed that the Continuation is properly prepared - that activate() was not called.

  2. ActiveSpanSource.active() on the other hand will be used by various libraries in methods executed in various contexts. It's perfectly legal for active span to be missing. If it can return null then every call to active() will be followed by a null check.
    I'd suggest to return something not-null - preferably a new span.

Another option would be adding a separate method, e.g. activeOrNew(), to make this explicit. Or to make this the default (as the more popular variant) and call the second one activeOrNull, or activeOrElse(defaultValue) similarly to Optional.orElse(param).

There's one more case where this comes into play - creating new span.
The thing is that right now it's illegal to call builder.childOf(null_span), so this:

tracer.buildSpan("span1")
  .withTag("x", "y")
  .childOf(source.activeSpan())
  .start()

is illegal. It has to be:

SpanBuilder builder = tracer.buildSpan("span1")
  .withTag("x", "y");
Span active = source.activeSpan();
if(active) {
  builder.childOf(active)
}
builder.start()

It does not look very good. But in this case returning new span from activeSpan does not seem helpful - when creating a span when there is no active one I wouldn't want to create implicitly an active span and then instantly start a child one.

*
* @see ActiveSpanSource#makeActive(Span)
* @return a handle to the newly-activated Span
*/
ActiveSpan activate();
}

}
39 changes: 39 additions & 0 deletions opentracing-api/src/main/java/io/opentracing/ActiveSpanSource.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
/**
* Copyright 2016-2017 The OpenTracing Authors
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except
* in compliance with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software distributed under the License
* is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
* or implied. See the License for the specific language governing permissions and limitations under
* the License.
*/
package io.opentracing;

/**
* {@link ActiveSpanSource} allows an existing (possibly thread-local-aware) execution context provider to act as a
* source for an actively-scheduled OpenTracing Span.
*
* @see ActiveSpan
*/
public interface ActiveSpanSource {

/**
* @return the {@link ActiveSpan active span}, or null if none could be found. This does not affect the reference
* count for the {@link ActiveSpan}.
*/
ActiveSpan activeSpan();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this has already been discussed, but from a user's standpoint I'd prefer returning a NoopSpan over returning null so that they don't have to do null checks just to add a simple tag.

If the tag is expensive to compute, it would however be beneficial if the user could differ between proper and noop spans. But rather than checking for nulls or a global NOOP constant I propose to introduce a isSampled() method on BaseSpan. From the user's perspective, it does not really matter wether there is no active span or the active span is not sampled. They only want to avoid computing a expensive tag if not necessary.

I think this is something we should decide on before merging, or at least before releasing a new version as this would be a somewhat breaking change (although remaining binary compatibility) to introduce later (check for null vs check for isSampled()).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there no valid use-cases where you need the null value? E.g. only create a span if there's no active one etc?

the general isSampled() functionality is tracked by opentracing/specification#8

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably there are use cases for that. But I think the default use case is to set simple tags or add logs. IMHO an API should be optimized for the default use case. So maybe add ActiveSpanSource#hasActiveSpan() and let #activeSpan() return noops?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@felixbarny this has already been discussed, yes. Unfortunately the github thread here is so convoluted that I can't even find the relevant comment. :-(

I am 💯% happy to add an issue about this (and block release of a new version of OT-Java accordingly), but IMO it can be decoupled from this PR. Just to prep you for my point-of-view: I generally think instrumentation will either (a) create a new Span, in which case there's no need for a null check regardless, or (b) be at the layer of a logging library or similar where the verbosity of a null check is fine.

If we actually think that it's going to be commonplace to use the OT interface as the logging API (rather than a registrant of the logging API), then I would be in favor of what you propose. That seems like the essential question.

(I'm less convinced about arguments involving tags: they are almost always set at start()-time in my experience)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for the separate issue

I think this would actually be the main api end users would use.

The use case that I have in mind is when ot is used by an agent, users should still be able to add custon tags. For example, I've implemented a search analytics Kibana dashboard on top of stagemonitor and OT for a customer. It works by adding meta info about the search via tags like no of hits and the search term. For these kinds of things it would be nice if null checks could be avoided.


/**
* Wrap and "makeActive" a {@link Span} by encapsulating it – and any active state (e.g., MDC state) in the
* execution context – in a new {@link ActiveSpan}.
*
* @param span the Span just started
* @return an {@link ActiveSpan} that encapsulates the given Span and any other Source-specific context (e.g.,
* MDC data)
*/
ActiveSpan makeActive(Span span);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly asked before: why makeActive and not activate ?
It feels like makeActive needs the argument in between the two words.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sjoerdtalsma @ivantopo I don't want to overload that word since it's used by Continuation.activate(). This method is really more about "converting" a manually-started Span to an ActiveSpan.

It does bring up an interesting point / problem, though, about the ActiveSpan extends Span decision: you could call makeActive(...) on an ActiveSpan, which would sort of "wrap it twice". The APIs are constructed to discourage this, but it bugs me that the compiler wouldn't prevent it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... not to say that we couldn't rename this method. I previously called it adopt() which was deemed too indirect. Another idea would be convertToActive().

Note that this method will be used (very) rarely... As such, I think it's ok for it to be lengthy.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 agreed on not overloading the term

}
39 changes: 19 additions & 20 deletions opentracing-api/src/main/java/io/opentracing/Span.java
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
/**
* Copyright 2016 The OpenTracing Authors
* Copyright 2016-2017 The OpenTracing Authors
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except
* in compliance with the License. You may obtain a copy of the License at
Expand All @@ -21,7 +21,7 @@
*
* <p>Spans are created by the {@link Tracer#buildSpan} interface.
*/
public interface Span extends Closeable {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NB: Span is no longer Closeable to make semantics clearer around try-with-resources for ActiveSpan (which would become the normal/preferred way of actually using a Span).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see Span#close method being removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I had thought (incorrectly) that the close() contract was simply forced on Span implementations implicitly. Great catch.

public interface Span {
/**
* Retrieve the associated SpanContext.
*
Expand All @@ -34,8 +34,9 @@ public interface Span extends Closeable {
/**
* Sets the end timestamp to now and records the span.
*
* <p>With the exception of calls to Span.context(), this should be the last call made to the span instance, and to
* do otherwise leads to undefined behavior.
* <p>With the exception of calls to {@link #context}, this should be the last call made to the span instance.
* Future calls to {@link #finish} are defined as noops, and future calls to methods other than {@link #context}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this comment text to define multiple finish() calls as noops... that way users can finish() manually and have the ActiveSpan try-with-resources scope end afterwards (without causing undefined behavior).

* lead to undefined behavior.
*
* @see Span#context()
*/
Expand All @@ -53,8 +54,6 @@ public interface Span extends Closeable {
*/
void finish(long finishMicros);

void close();

/**
* Set a key:value tag on the Span.
*/
Expand All @@ -74,14 +73,14 @@ public interface Span extends Closeable {
* Caveat emptor.
*
* <p>A contrived example (using Guava, which is not required):
* <pre>{@code
span.log(
ImmutableMap.Builder<String, Object>()
.put("event", "soft error")
.put("type", "cache timeout")
.put("waited.millis", 1500)
.build());
}</pre>
* <pre><code>
span.log(
ImmutableMap.Builder<String, Object>()
.put("event", "soft error")
.put("type", "cache timeout")
.put("waited.millis", 1500)
.build());
</code></pre>
*
* @param fields key:value log fields. Tracer implementations should support String, numeric, and boolean values;
* some may also support arbitrary Objects.
Expand Down Expand Up @@ -110,9 +109,9 @@ public interface Span extends Closeable {
*
* Shorthand for
*
* <pre>{@code
span.log(Collections.singletonMap("event", event));
}</pre>
* <pre><code>
span.log(Collections.singletonMap("event", event));
</code></pre>
*
* @param event the event value; often a stable identifier for a moment in the Span lifecycle
* @return the Span, for chaining
Expand All @@ -124,9 +123,9 @@ public interface Span extends Closeable {
*
* Shorthand for
*
* <pre>{@code
span.log(timestampMicroseconds, Collections.singletonMap("event", event));
}</pre>
* <pre><code>
span.log(timestampMicroseconds, Collections.singletonMap("event", event));
</code></pre>
*
* @param timestampMicroseconds The explicit timestamp for the log record. Must be greater than or equal to the
* Span's start timestamp.
Expand Down
Loading