Span api feels unintuitive and underdocumented #1792

dennisjac · 2021-01-06T16:24:20Z

This only affects the JavaScript OpenTelemetry library
This may affect other libraries, but I would like to get opinions here first

For the last couple of days I have tried (and mostly succeeded) to implement some basic Opentelemetry instrumentation but I also ran into some problems which might be related to issues with documentation and my understanding of how the api is supposed to work.

The documentation is rather ... minimal. Looking at https://opentelemetry.io/docs/js/instrumentation/ there only seems to be rather incomplete documentation about how the instrumentation is supposed to work. Topics like context (and its propagation), withSpan(), status and kind of spans are not mentioned at all. The moment somebody will start to work with nested spans the use if startSpan()+span.end() will not work anymore (as context is ignored) and there is no easy way for users to actually figure out what is going on as there is no warning about the lack of context.
Some of the chosen defaults seem odd now that I have worked with Opentelemetry instrumentation a bit:
2.1 When creating a new span with startSpan() one has to explicitly specify the parent span. From my understanding you almost always want to want to be the currently active span to be the parent of the new span e.g.:
tracer.startSpan(name, { parent: tracer.getCurrentSpan() } );
I currently cannot come up with an example where I would not want the specify a parent or use a different one. Shouldn't this be the default behavior then with the ability to override this in cases where it is necessary?
2.2 startSpan() does not "activate" the new span and in order to do that withSpan() needs to be called afterwards. Again given that you probably want to do this 99% of the time shouldn't this be the default?
Combining all of the above I feel the startSpan() and withSpan() functions are too low-level and make the adding of instrumentation unnecessarily cumbersome. Would it be possible to add a utility function that combines the effects mentioned above to provide an API that is less flexible but covers the most likely use-case better?

When looking at examples around the web I see the following usage as idiomatic:

span = tracer.startSpan('name', { parent: tracer.getCurrentSpan() })
tracer.withSpan(span, () => {
  doStuff()
  span.end()
})

Would it be possible to add a function that looks like this:

tracer.doSpan('name',[{options}], () => {
  doStuff()
})

This function would:

Create a new span 'name'
Implicitly set the current span of the surrounding context as the parent for the new span
run doStuff() with the proper expected context
Implicitly call span.end() after running doStuff()

To test this I created the following (naive and conceptional) implementation:

function doSpan(name, fn) {
    const span = tracer.startSpan(name, { parent: tracer.getCurrentSpan() } );
    const boundfn = tracer.bind(fn, span);
    return boundfn().then((res) => { span.end(); return res; });
}

This is an instrumented function without doSpan():

app.get('/testwith', async (req, res) => {
    const span1 = tracer.startSpan('span 1', { parent: tracer.getCurrentSpan() } );
    await tracer.withSpan(span1, async () => {
        const span2 = tracer.startSpan('span 2', { parent: span1 } );
        await tracer.withSpan(span2, async () => {
            const span3 = tracer.startSpan('span 3', { parent: span2 } );
            await tracer.withSpan(span3, async () => {
                await new Promise((resolve, reject) => { setTimeout(resolve, 1000); });
                span3.setStatus({code: opentelemetry.StatusCode.OK});
                span3.end();
            });
            span2.setStatus({code: opentelemetry.StatusCode.OK});
            span2.end();
        });
        span1.setStatus({code: opentelemetry.StatusCode.OK});
        span1.end();
  });
  res.send('done');
});

and this is a version with the example doSpan() implementation:

app.get('/testdo', async (req, res) => {
  await doSpan('span 1', async () => {
    await doSpan('span 2', async () => {
      await doSpan('span 3', async () => {
        await new Promise((resolve, reject) => { setTimeout(resolve, 1000); });
        tracer.getCurrentSpan().setStatus({code: opentelemetry.StatusCode.OK});
      });
      tracer.getCurrentSpan().setStatus({code: opentelemetry.StatusCode.OK});
      console.log(`result = ${result}`);
    });
    tracer.getCurrentSpan().setStatus({code: opentelemetry.StatusCode.OK});
  });
  res.send('done');
});

In a better implementation the callback function for doSpan() could receive the newly created span as an argument so that
tracer.getCurrentSpan().setStatus({code: opentelemetry.StatusCode.OK});
can become
span.setStatus({code: opentelemetry.StatusCode.OK});

I feel this version is easier on the eye but also probably does what most people would expect by handling the parenting and context and always end()-ing the span (which the example in the documentation seems to forget). If you then need specifically different behavior regarding parenting or context you can fall back to startSpan() and withSpan().

The text was updated successfully, but these errors were encountered:

dyladan · 2021-01-06T17:05:31Z

The documentation is rather ... minimal. Looking at opentelemetry.io/docs/js/instrumentation there only seems to be rather incomplete documentation about how the instrumentation is supposed to work. Topics like context (and its propagation), withSpan(), status and kind of spans are not mentioned at all. The moment somebody will start to work with nested spans the use if startSpan()+span.end() will not work anymore (as context is ignored) and there is no easy way for users to actually figure out what is going on as there is no warning about the lack of context.

We're aware the documentation is a little lacking. If you're willing to help us out there it would be greatly appreciated. This is something I will be focusing on in the next week or so because I want the docs to be more approachable when we launch API 1.0

2.1 When creating a new span with startSpan() one has to explicitly specify the parent span. From my understanding you almost always want to want to be the currently active span to be the parent of the new span e.g.:
tracer.startSpan(name, { parent: tracer.getCurrentSpan() } );
I currently cannot come up with an example where I would not want the specify a parent or use a different one. Shouldn't this be the default behavior then with the ability to override this in cases where it is necessary?

For starters, the currently active span is the default. Not sure what version of the API you're using, but the parent option was actually removed from startSpan and the parent is extracted from the context (which defaults to the current active context).

2.2 startSpan() does not "activate" the new span and in order to do that withSpan() needs to be called afterwards. Again given that you probably want to do this 99% of the time shouldn't this be the default?

The spec explicitly states this should be the default behavior.

Span creation MUST NOT set the newly created Span as the currently active Span by default, but this functionality MAY be offered additionally as a separate operation.

https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/api.md#span-creation

Would it be possible to add a function that looks like this:
tracer.doSpan('name',[{options}], () => {
  doStuff()
})

Yes this would definitely be possible according to the wording of the spec. I would probably modify it slightly to something like this (wondering why you had options in an array?).

tracer.startActiveSpan('name', options, context.active(), (span) => {
	doStuff();
	span.end(); // the span may not always end at the end of the function, so it's important that the user has the option to end it whenever it makes sense.
});

jonaskello · 2021-01-24T11:45:49Z

I agree this is very confusing. I have been trying to create a child span for a long time now without any success. At least my attempted child spans to not show up as nested in jaeger. They do participate in the same trace which is nice but they don't get nested. Everywhere I search I find the example mentioned in the original post:

span = tracer.startSpan('name', { parent: tracer.getCurrentSpan() })
tracer.withSpan(span, () => {
  doStuff()
  span.end()
})

However that does not work at all since:

There is no parent option anymore, that was explained above thanks, now I can stop trying that :-)
There seems to be no tracer.withSpan() anymore. Not sure what to do...

This is my latest iteration of trying many permutations to replace the above non-working example:

export function createChildSpan(operation: string): Span {
  const tracer = opentelemetry.trace.getTracer("default");
  const span = tracer.startSpan(operation);
  opentelemetry.setSpan(opentelemetry.context.active(), span);
  return span;
}

If there is a way to create child spans that works I would really like to know how :-). Perferably creating child spans that participate in the automatic spans using NodeTracerProvider.

EDIT:

I have now come to the conclusion that there is no longer any way to directly set parent-child relationships between spans. Because you cannot say what the parent span is when creating a child span. Instead we must indirectly set the parent-child relationship using context. So when starting a new span you can specify a context and in that context there might be a parent span and if so that becomes the parent of the newly started span. Is that a correct conclusion?

Previously when using opentracing I was sending along the parent span as parameter when making function calls and so I could create child spans in child functions by specifying the parent. I guess now if I want to be explicit and pass a parameter in my functions that parameter needs to be the context instead since that is what is holding the parent. So instead of keeping track of the parent and passing it along, I must now keep track of the parent (to be able to close it) and also the context that the parent was set in. Is that correct? Because I find no way of getting the context from the parent span. It seems only the SpanContext is available on a span and that is not the same as Context and therefore can not be used when creating child spans.

Flarna · 2021-01-25T07:30:10Z

So when starting a new span you can specify a context and in that context there might be a parent span and if so that becomes the parent of the newly started span. Is that a correct conclusion?

Yes, that's correct. See OTel spec for more details https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/api.md#context-interaction

Please note that OTel specification did the move from specifying a parent span to using the context everywhere. While this is for sure more complicated an not that intuitive it's more flexible. Anyhow, this ship has sailed as spec for tracing 1.0 is frozen meanwhile and implementations like OTel Js adapt and follow the spec which required breaking changes.
Clearly this such changes are not nice for consumers but the version starts with 0.x which indicates that it is not yet stable and breaking changes happen.

jonaskello · 2021-01-25T11:46:03Z

Thanks for the clarification. I guess the reason this change was made is because having explicit parents does not work with automated tracing. You can pass down parent within your own code but not to the auto tracing. So I can see the benefit in having a single way to specify parent that also works together with auto-tracing.

Btw, I found api.context.with() which I guess replaces tracer.withSpan?

Flarna · 2021-01-25T11:56:56Z

I was not that involved in the spec discussions regarding this. As far as I know the main reason was that ontext is more generic and it can be used to transfer more concerns and not just a parent Span/SpanContext.
Once context was in place and usable for linking spans it didn't make that much sense to keep an additional option for parent spans.

Btw, I found api.context.with() which I guess replaces tracer.withSpan?

yes, see also https://github.com/open-telemetry/opentelemetry-js#0140-to-0150

delaneyj · 2021-05-30T20:28:56Z

Not sure if this is the right place for this but https://opentelemetry.io/docs/js/instrumentation/ still references parent. I do feel like the parent made sense, but ended up wrapping the 0.19 version for personal clarity.

I don't understand why tracer is separate from context. When would you be creating spans that hit multiple tracers? If you are splitting up wouldn't this be at the collector level? Probably missing something because in my head the initial const span = tracer.startSpan(...) knows the origin tracer and some kind of span.child('foo', ...) wouldn't involve explicit tracer at all.

I ended up wrapping tracer+span together to make nested spans simpler.

export class TracerSpan {
constructor(public readonly tracer: Tracer, public readonly span: Span) {}

child(name: string): TracerSpan {
    const spanCtx = setSpan(context.active(), this.span)
    const span = this.tracer.startSpan(name, {}, spanCtx)
    return new TracerSpan(this.tracer, span)
}
}

and passing down to functions then using ala

const { span } = tracerSpan.child('loadingFoo')

which seems to work fine, but given the removal of parent makes me think I'm missing some understanding on how active context is setup.

Any clarity would really help, on a greenfield project and want to set the right idioms early on.

dyladan · 2021-06-01T19:48:41Z

Not sure if this is the right place for this but opentelemetry.io/docs/js/instrumentation still references parent. I do feel like the parent made sense, but ended up wrapping the 0.19 version for personal clarity.

Thanks for bringing this up. For now you can see an updated doc in the API repo here.

I don't understand why tracer is separate from context. When would you be creating spans that hit multiple tracers? If you are splitting up wouldn't this be at the collector level? Probably missing something because in my head the initial const span = tracer.startSpan(...) knows the origin tracer and some kind of span.child('foo', ...) wouldn't involve explicit tracer at all.

Tracer is separate from context because it may propagate additional information like baggage, or in the future metrics/logs related information. There is a lot of context and history for this decision, too much for a single github comment really, but a great place to start here if you really want to understand the reasoning behind the decisions would be the really well written OTEP 66.

I ended up wrapping tracer+span together to make nested spans simpler.

export class TracerSpan {
constructor(public readonly tracer: Tracer, public readonly span: Span) {}

child(name: string): TracerSpan {
    const spanCtx = setSpan(context.active(), this.span)
    const span = this.tracer.startSpan(name, {}, spanCtx)
    return new TracerSpan(this.tracer, span)
}
}

An interesting idea, and I'm glad it worked for your usecase, but you have completely lost the context created by setSpan in this case. I'm sure the function signature could be worked over a little to not lose it. You may be interested in the newly added tracer.startActiveSpan method. We're likely going to be adding more helper functions like this soon. Maybe a tracer.withParent(span, callback) context helper would be helpful for you?

delaneyj · 2021-06-01T22:40:53Z

Maybe a tracer.withParent(span, callback) context helper would be helpful for you?

YES! Coming from Go where it's ctx = ctx.WithValue(...) all the way down I basically want the same here. The setSpan felt wrong but didn't know if something like tracer.withParent already existed and I just wasn't seeing it. I open to really anything similar to the context story from Go. The context object didn't seem to match as its a singleton, not something that's sent down the call chain, but i'm probably missing something.

cartermp · 2022-09-10T21:56:40Z

I think we can probably consider this resolved, at least the major pain points, with the website docs (https://opentelemetry.io/docs/instrumentation/js/instrumentation/) getting an overhaul recently. Nearly all major concepts are now documented (with the easiest way suggested) and have a code sample you can copy/paste. I'm sure there's still some gaps though, so anything someone finds, please file an issue over at https://github.com/open-telemetry/opentelemetry.io and I'll see what I can do

open-telemetry#1792) Better align with the other exports. Follow-up to open-telemetry#1772.

legendecas added the api:traces Issues and PRs related to the Traces API label Sep 13, 2022

hi-ogawa mentioned this issue Nov 28, 2022

chore: setup OpenTelemetry hi-ogawa/youtube-dl-web-v2#38

Merged

13 tasks

pichlermarc pushed a commit to dynatrace-oss-contrib/opentelemetry-js that referenced this issue Dec 15, 2023

refactor(auto-instrumentations-node): Name export getResourceDetectors (

24ffb3b

open-telemetry#1792) Better align with the other exports. Follow-up to open-telemetry#1772.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Span api feels unintuitive and underdocumented #1792

Span api feels unintuitive and underdocumented #1792

dennisjac commented Jan 6, 2021

dyladan commented Jan 6, 2021

jonaskello commented Jan 24, 2021 •

edited

Loading

Flarna commented Jan 25, 2021

jonaskello commented Jan 25, 2021

Flarna commented Jan 25, 2021

delaneyj commented May 30, 2021 •

edited

Loading

dyladan commented Jun 1, 2021

delaneyj commented Jun 1, 2021

cartermp commented Sep 10, 2022

Span api feels unintuitive and underdocumented #1792

Span api feels unintuitive and underdocumented #1792

Comments

dennisjac commented Jan 6, 2021

dyladan commented Jan 6, 2021

jonaskello commented Jan 24, 2021 • edited Loading

Flarna commented Jan 25, 2021

jonaskello commented Jan 25, 2021

Flarna commented Jan 25, 2021

delaneyj commented May 30, 2021 • edited Loading

dyladan commented Jun 1, 2021

delaneyj commented Jun 1, 2021

cartermp commented Sep 10, 2022

jonaskello commented Jan 24, 2021 •

edited

Loading

delaneyj commented May 30, 2021 •

edited

Loading