Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[On Hold] Improve behavior of un-ended ReadableSpan #693

Original file line number Diff line number Diff line change
Expand Up @@ -101,6 +101,7 @@ public void export_ReturnCode() {

static SpanData makeBasicSpan() {
return SpanData.newBuilder()
.setHasEnded(true)
.setTraceId(io.opentelemetry.trace.TraceId.getInvalid())
.setSpanId(io.opentelemetry.trace.SpanId.getInvalid())
.setName("span")
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -218,6 +218,7 @@ public void testStatusNotOk() {
long endMs = startMs + 900;
SpanData span =
SpanData.newBuilder()
.setHasEnded(true)
.setTraceId(TraceId.fromLowerBase16(TRACE_ID, 0))
.setSpanId(SpanId.fromLowerBase16(SPAN_ID, 0))
.setName("GET /api/endpoint")
Expand All @@ -244,6 +245,7 @@ private static SpanData getSpanData(long startMs, long endMs) {
Link link = SpanData.Link.create(createSpanContext(LINK_TRACE_ID, LINK_SPAN_ID), attributes);

return SpanData.newBuilder()
.setHasEnded(true)
.setTraceId(TraceId.fromLowerBase16(TRACE_ID, 0))
.setSpanId(SpanId.fromLowerBase16(SPAN_ID, 0))
.setParentSpanId(SpanId.fromLowerBase16(PARENT_SPAN_ID, 0))
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -79,6 +79,7 @@ public void testExport() throws Exception {
long endMs = startMs + duration;
SpanData span =
SpanData.newBuilder()
.setHasEnded(true)
.setTraceId(TraceId.fromLowerBase16(TRACE_ID, 0))
.setSpanId(SpanId.fromLowerBase16(SPAN_ID, 0))
.setName("GET /api/endpoint")
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,7 @@ public void returnCode() {
long epochNanos = TimeUnit.MILLISECONDS.toNanos(System.currentTimeMillis());
SpanData spanData =
SpanData.newBuilder()
.setHasEnded(true)
.setTraceId(new TraceId(1234L, 6789L))
.setSpanId(new SpanId(9876L))
.setStartEpochNanos(epochNanos)
Expand Down
17 changes: 17 additions & 0 deletions sdk/src/main/java/io/opentelemetry/sdk/trace/ReadableSpan.java
Original file line number Diff line number Diff line change
Expand Up @@ -60,4 +60,21 @@ public interface ReadableSpan {
* library
*/
InstrumentationLibraryInfo getInstrumentationLibraryInfo();

/**
* Returns whether this Span has already been ended.
*
* @return {@code true} if the span has already been ended, {@code false} if not.
* @since 0.4.0
*/
boolean hasEnded();

/**
* Returns the latency of the {@code Span} in nanos. If still active then returns now() - start
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be nice in this javadoc to define "latency" in this context, because it is a word that some may interpret as meaning "network latency", which seems like an odd thing to have on a generic Span.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I think it might be better to just rename this to currentDuration. Latency, while technically correct (see open-telemetry/opentelemetry-specification#330), is probably confusing to most users.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe "elapsedTime" ?

* time.
*
* @return the latency of the {@code Span} in nanos.
* @since 0.4.0
*/
long getLatencyNanos();
}
Original file line number Diff line number Diff line change
Expand Up @@ -98,7 +98,7 @@ final class RecordEventsReadableSpan implements ReadableSpan, Span {
private long endEpochNanos;
// True if the span is ended.
@GuardedBy("lock")
private boolean hasBeenEnded;
private boolean hasEnded;

/**
* Creates and starts a span with the given configuration.
Expand Down Expand Up @@ -169,13 +169,15 @@ public SpanData toSpanData() {
.setAttributes(getAttributes())
.setStartEpochNanos(startEpochNanos)
.setEndEpochNanos(getEndEpochNanos())
.setLatencyNanos(getLatencyNanos())
.setKind(kind)
.setLinks(getLinks())
.setParentSpanId(parentSpanId)
.setHasRemoteParent(hasRemoteParent)
.setResource(resource)
.setStatus(getStatus())
.setTimedEvents(adaptTimedEvents())
.setHasEnded(hasEnded())
.build();
}

Expand All @@ -190,6 +192,13 @@ private List<SpanData.TimedEvent> adaptTimedEvents() {
return result;
}

@Override
public boolean hasEnded() {
synchronized (lock) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this proliferation of synchronization really makes me want to implement this with AtomicBoolean, since it's lock-free.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could remove the lock here then, but all the checks for hasEnded within span still need to hold the lock if they want to ensure that they don't set properties after the span has ended.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah. IMO, the whole thread-safety of this class is pretty janky. We should think about separating the mutable parts from the immutable and having a simpler synchronization model.

return hasEnded;
}
}

@Override
public SpanContext getSpanContext() {
return getContext();
Expand Down Expand Up @@ -220,14 +229,14 @@ public InstrumentationLibraryInfo getInstrumentationLibraryInfo() {
}

/**
* Returns the end nano time (see {@link System#nanoTime()}). If the current {@code Span} is not
* ended then returns {@link Clock#nanoTime()}.
* Returns the end nano time (see {@link System#nanoTime()}) or zero if the current {@code Span}
* is not ended.
*
* @return the end nano time.
*/
private long getEndEpochNanos() {
synchronized (lock) {
return getEndNanoTimeInternal();
return endEpochNanos;
}
}

Expand Down Expand Up @@ -297,18 +306,13 @@ Map<String, AttributeValue> getAttributes() {
*
* @return the latency of the {@code Span} in nanos.
*/
long getLatencyNs() {
Oberon00 marked this conversation as resolved.
Show resolved Hide resolved
@Override
public long getLatencyNanos() {
synchronized (lock) {
return getEndNanoTimeInternal() - startEpochNanos;
return (hasEnded ? endEpochNanos : clock.now()) - startEpochNanos;
}
}

// Use getEndNanoTimeInternal to avoid over-locking.
@GuardedBy("lock")
private long getEndNanoTimeInternal() {
return hasBeenEnded ? endEpochNanos : clock.now();
}

/**
* Returns the kind of this {@code Span}.
*
Expand Down Expand Up @@ -363,7 +367,7 @@ public void setAttribute(String key, AttributeValue value) {
Preconditions.checkNotNull(key, "key");
Preconditions.checkNotNull(value, "value");
synchronized (lock) {
if (hasBeenEnded) {
if (hasEnded) {
logger.log(Level.FINE, "Calling setAttribute() on an ended Span.");
return;
}
Expand Down Expand Up @@ -403,7 +407,7 @@ public void addEvent(Event event, long timestamp) {

private void addTimedEvent(TimedEvent timedEvent) {
synchronized (lock) {
if (hasBeenEnded) {
if (hasEnded) {
logger.log(Level.FINE, "Calling addEvent() on an ended Span.");
return;
}
Expand All @@ -416,7 +420,7 @@ private void addTimedEvent(TimedEvent timedEvent) {
public void setStatus(Status status) {
Preconditions.checkNotNull(status, "status");
synchronized (lock) {
if (hasBeenEnded) {
if (hasEnded) {
logger.log(Level.FINE, "Calling setStatus() on an ended Span.");
return;
}
Expand All @@ -428,7 +432,7 @@ public void setStatus(Status status) {
public void updateName(String name) {
Preconditions.checkNotNull(name, "name");
synchronized (lock) {
if (hasBeenEnded) {
if (hasEnded) {
logger.log(Level.FINE, "Calling updateName() on an ended Span.");
return;
}
Expand All @@ -449,12 +453,12 @@ public void end(EndSpanOptions endOptions) {

private void endInternal(long endEpochNanos) {
synchronized (lock) {
if (hasBeenEnded) {
if (hasEnded) {
logger.log(Level.FINE, "Calling end() on an ended Span.");
return;
}
this.endEpochNanos = endEpochNanos;
hasBeenEnded = true;
hasEnded = true;
}
spanProcessor.onEnd(this);
}
Expand All @@ -471,7 +475,7 @@ public boolean isRecording() {

void addChild() {
synchronized (lock) {
if (hasBeenEnded) {
if (hasEnded) {
logger.log(Level.FINE, "Calling end() on an ended Span.");
return;
}
Expand Down Expand Up @@ -511,7 +515,7 @@ private RecordEventsReadableSpan(
this.kind = kind;
this.spanProcessor = spanProcessor;
this.resource = resource;
this.hasBeenEnded = false;
this.hasEnded = false;
this.numberOfChildren = 0;
this.clock = clock;
this.startEpochNanos = startEpochNanos;
Expand All @@ -523,7 +527,7 @@ private RecordEventsReadableSpan(
@Override
protected void finalize() throws Throwable {
synchronized (lock) {
if (!hasBeenEnded) {
if (!hasEnded) {
logger.log(Level.SEVERE, "Span " + name + " is GC'ed without being ended.");
}
}
Expand Down
50 changes: 50 additions & 0 deletions sdk/src/main/java/io/opentelemetry/sdk/trace/SpanData.java
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@
package io.opentelemetry.sdk.trace;

import com.google.auto.value.AutoValue;
import com.google.common.base.Optional;
import io.opentelemetry.sdk.common.InstrumentationLibraryInfo;
import io.opentelemetry.sdk.resources.Resource;
import io.opentelemetry.trace.AttributeValue;
Expand Down Expand Up @@ -172,6 +173,23 @@ public abstract class SpanData {
*/
public abstract boolean getHasRemoteParent();

/**
* Returns whether this Span has already been ended.
*
* @return {@code true} if the span has already been ended, {@code false} if not.
* @since 0.4.0
*/
public abstract boolean getHasEnded();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend just "hasEnded" without the extraneous get

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, AutoValue seems to require a get-prefix here (see also getHasRemoteParent).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yet another reason for me to hate autovalue

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to decide if isEnded is better, or just as ugly

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isFinished ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think isFinished sounds like it would be true if the span is exported (finished passing through the span pipeline). Maybe a name for the inverse could be used, like isRunning.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that idea!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isActive? just to throw out another option

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Being "active" (on a scope) is already another concept in OpenTelemetry, unfortunately.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like isRunning


/**
* Returns the latency of the {@code Span} in nanos. If still active then returns now() - start
* time.
*
* @return the latency of the {@code Span} in nanos.
* @since 0.4.0
*/
public abstract long getLatencyNanos();

/**
* An immutable implementation of {@link Link}.
*
Expand Down Expand Up @@ -280,6 +298,12 @@ public abstract static class Builder {

abstract List<io.opentelemetry.trace.Link> getLinks();

abstract Optional<Long> getStartEpochNanos();

abstract Optional<Long> getEndEpochNanos();

abstract Optional<Long> getLatencyNanos();

/**
* Create a new SpanData instance from the data in this.
*
Expand All @@ -291,6 +315,14 @@ public SpanData build() {
setAttributes(Collections.unmodifiableMap(new HashMap<>(getAttributes())));
setTimedEvents(Collections.unmodifiableList(new ArrayList<>(getTimedEvents())));
setLinks(Collections.unmodifiableList(new ArrayList<>(getLinks())));

// Calculate latency from start & end if possible.
if (!getLatencyNanos().isPresent()
&& getStartEpochNanos().isPresent()
&& getEndEpochNanos().isPresent()
&& getEndEpochNanos().get() >= getStartEpochNanos().get()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is really ugly block of code. Can we at least move it into it's own method?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this is only required for tests. The best option from my perspective would be to remove the latency from SpanData.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fine with removing latency from the SpanData. For the EndEpoch what is the behavior you want?

setLatencyNanos(getEndEpochNanos().get() - getStartEpochNanos().get());
}
return autoBuild();
}

Expand Down Expand Up @@ -442,5 +474,23 @@ public abstract Builder setInstrumentationLibraryInfo(
* @since 0.3.0
*/
public abstract Builder setHasRemoteParent(boolean hasRemoteParent);

/**
* Sets to true if the span has been ended.
*
* @param hasEnded A boolean indicating if the span has been ended.
* @return this
* @since 0.4.0
*/
public abstract Builder setHasEnded(boolean hasEnded);

/**
* Sets to true if the span has been ended.
*
* @param latencyNanos A boolean indicating if the span has been ended.
* @return this
* @since 0.4.0
*/
public abstract Builder setLatencyNanos(long latencyNanos);
}
}
Loading