Skip to content
This repository has been archived by the owner on May 23, 2023. It is now read-only.

Tracer.Inject and Tracer.Extract are too abstract #127

Open
rakyll opened this issue Nov 29, 2016 · 19 comments
Open

Tracer.Inject and Tracer.Extract are too abstract #127

rakyll opened this issue Nov 29, 2016 · 19 comments

Comments

@rakyll
Copy link

rakyll commented Nov 29, 2016

These two functions exist because it exists as a terminology in the spec but the Go implementation may not need to enforce any types to represent these concepts.

The Tracer interface requires users to implement:

 Inject(sp SpanContext, format interface{}, carrier interface{}) error

which is an enforcement but the interface says nothing about the format and carrier. If both the input and output are of interface{}, even though users are enforced to implement an injector inside the block, the APIs are not helping them.

Letting them freely inject and extract depending on their transport layer and custom requirements would be more beneficial. Then the OpenTracing package can export injector and extractor for common requirements, such as:

req = opentracing.InjectIntoHTTPRequest(ctx, req)
@bhs
Copy link
Contributor

bhs commented Nov 29, 2016

@rakyll your issue title is inaccurate; Inject and Extract are designed for extensibility (and have already been extended as intended). What you're suggesting would require that OpenTracing know about every transport/carrier/whatever-you-want-to-call-it that its users want to use. This would be a regression from a functionality standpoint.

I do understand the appeal of more strongly-typed methods, by the way. But the direction you're suggesting would ultimately lead to a proliferation of (strongly-typed) Inject/Extract methods.

Meta-note / PS: it seems like you have a lot of concerns about the OpenTracing API in general. I'm happy to schedule a hangout / skype / whatever to discuss them in more detail.

@yurishkuro
Copy link
Member

@rakyll The "too abstract" is actually more of a fault of Go language. For example, in languages that support generics (Java, C#), the OpenTracing API actually enforces the types.

And to add to what Ben said, introducing transport specific methods like InjectIntoHTTPRequest does not eliminate the need for interface{} based methods because we do not know which other transports people might be using in the future. Current API allows building support for more transports without changing the API.

@rakyll
Copy link
Author

rakyll commented Nov 29, 2016

What you're suggesting would require that OpenTracing know about every transport/carrier/whatever-you-want-to-call-it that its users want to use.

That's not what I am suggesting. Each carrier may implement their own injector and extractor without depending on a common interface. Think about it as the encoders/decoders in Go. There is not a single Serializer interface in the standard library but encoding/json and encoding/xml are providing their own encoders/decoders.

An (in interface{}, out interface{}) is not an abstraction. It may help users to see the concept of injecting and extracting listed on godoc but both concepts could have been communicated with docs, examples.

Meta-note / PS: it seems like you have a lot of concerns about the OpenTracing API in general. I'm happy to schedule a hangout / skype / whatever to discuss them in more detail.

Sure, this sounds great. These are mostly naming/organization related items that I can file as an issue here but we can chat if it is productive.

The "too abstract" is actually more of a fault of Go language. For example, in languages that support generics (Java, C#), the OpenTracing API actually enforces the types.

Exactly. This is a limitation of the language. Hence, Go APIs often lack central generic big interfaces.

Current API allows building support for more transports without changing the API.

I saw your point clearly now. I think I never truly cared about this problem because it is similar to the json vs xml case. This is the reality of the Go programmers:

var t T
switch (contentType) {
case "application/json":
  json.Decode(body, &t)
case "application/xml":
  xml.Decode(body, &t)
}

None of these very significant problems will ever fix without significant changes to the language. My point is rather than providing magic, we should limit the APIs at the level of what language provides. Many Go projects such as Martini failed for similar reasons and this pattern is so unloved by the Go community and its creators that it has its own item on the Go proverbs, "interface{} says nothing".

@bhs
Copy link
Contributor

bhs commented Nov 29, 2016

@rakyll let's schedule the hangout :)

Will be much faster for all parties. Let me know what works at bhs@lightstep.com.

@kelseyhightower
Copy link

@rakyll Thanks for reporting this issue. I had a really hard time working with this package because of the interfaces. I had to lean heavily on the examples because, as you've mentioned, the interfaces are a bit to abstract to guide me in the right direction.

@bensigelman @yurishkuro Thanks for the hard work on this package and providing support for OpenTracing to the Go community. The feedback from @rakyll is gold and would help improve the adoption of this package, especially for experienced Go programmers who are accustomed to "the Go way of doing things". I do recognize that I'm biased after working so long with the Go standard library and other popular Go packages.

@bhs
Copy link
Contributor

bhs commented Dec 3, 2016

@kelseyhightower thanks for chiming in.

@rakyll and I have a VC scheduled for Monday afternoon where I'm hoping to discuss this as well as some much higher-level stuff around how OpenTracing in general, opentracing-go specifically, and whatever golang-team is planning around builtin tracing functionality can fit together. If you or anyone else wants to join in, please let me know and I'm happy to add to the cal invite.

Regarding this specific issue...

I am sympathetic to the concerns. In fact, if you go way back in the history of this repository (when we were truly just prototyping) you'll see that things were more strongly typed (e.g., the TraceContextMarshaller of yesteryear).

The reason the API moved towards interface{} is why most anything moves towards interface{}: because it seemed like the only way to satisfy some real-world requirement. I would point towards rpc.Call in the Go stdlib as an analogous solutio to an analogous predicament.

To state the requirement as concisely as I can: let's imagine that your company needs to bind OpenTracing to the esoteric RPC system you're using. Let's further imagine that we can't afford the overhead of encoding the SpanContext twice (which was what was happening in TraceContextMarshaller way back when). Now, without adding a dependency from opentracing-go on your campany's esoteric RPC system (which is, IMO, a total non-starter from a maintenance standpoint), how do we make a type-safe Inject variant that satisfies the above? (@yurishkuro please chime in if you can think of a more concise way of describing the design constraints here... I am trying to summarize some of what was going on for you with Uber's internal systems... And I have absolutely seen the above pattern through LightStep's work with well-respected technology companies.)

There's more about this topic here:

http://opentracing.io/documentation/pages/api/cross-process-tracing.html#custom-injectextract-carrier-formats

Last but not least...

To step back, the idea is that vanilla application code never even sees Inject and/or Extract: they are intended to be used by an RPC/IPC layer and hopefully nowhere else. (E.g., if you use GRPC and Go, as an application programmer you never care about these functions as they are dealt with by the GRPC-OT bindings) So let's keep that in mind, too.

@yurishkuro
Copy link
Member

@bensigelman I can join the call Monday afternoon

For the reasons described (i.e. to support yet unknown RPC protocols), I think the tracer implementations do need to deal with inject/extract that are based on interface{}. However, we can make things easier for the end users of the API via utility packages. E.g. there can be strongly typed static functions like this:

package http

func Inject(tracer opentracing.Tracer, spanContext opentracing.SpanContext, req http.Request) {
    tracer.Inject(
        spanContext, 
        Format.HTTP_HEADERS, 
        opentracing.HTTPHeadersCarrier(req.Header))
}

Another, perhaps even better option, is to combine format and carrier into a single struct, e.g.

type Carrier struct {
    format opentracing.Format
    carrier interface{}
}

func HTTPHeaders(req http.Request) Carrier {
    return Carrier{
        format: Format.HTTP_HEADERS, 
        carrier: opentracing.HTTPHeadersCarrier(req.Header),
    }
}

type Tracer interface {
    func Inject(spanContext SpanContext, carrier Carrier)
}

This makes the user side of the API strongly typed, but the tracer still gets interface{} to work with and therefore can support more formats via add-on codecs.

@rakyll
Copy link
Author

rakyll commented Dec 5, 2016

please chime in if you can think of a more concise way of describing the design constraints here... I am trying to summarize some of what was going on for you with Uber's internal systems... And I have absolutely seen the above pattern through LightStep's work with well-respected technology companies.)

I think there are two problems.

  • How to inject/extract span info
  • How to intercept and auto inject/extract span info

OpenTracing currently provides a generic solution to automatically provide the injection/extracting whereas the point I am leaning towards is about enabling this manually first and making the automatic mechanism optional.

Callsites know about the carrier. This is not primarily tracing mechanism's responsibility. By bundling this functionality to the tracing mechanism, we are making it opinionated to work against one type of carrier. This is such a big advantage for companies with unified transport layer such as Google, but I wonder if this approach will really scale the companies with variety of different transport layers. They will end up implementing gigantic injector/extractors or will have to pass different tracers for each carrier.

(Let's talk more about this during the GVC. Just wanted to dump some ideas to give more context about what I think.)

@jmacd
Copy link
Contributor

jmacd commented Dec 5, 2016

It's worth examining where this problem comes from. We have a situation in which (all for good reason):

(1) Tracer implementations hide the structural details of their SpanContext
(2) Carrier/Format operate on foreign data types

(@bensigelman Re: "overhead of encoding the SpanContext twice (which was what was happening in TraceContextMarshaller way back when)". I had to spell this out for myself: IOW you've rejected the solution in which we specify a generic SpanContext representation and thereby nicely separate handling of the SpanContext-hidden details from the foreign-data-type-hidden details.)

The Inject/Extract APIs are troublesome because we're trying to build a bridge between two unspecified structures. It is inevitable that we will use a switch statement in one place or another, and one of these types has to know about the other's concrete structure. I would argue that this OT issue is the same for all OT libraries, but we only think to complain about it for strongly typed languages.

For Go, you could imagine replacing these interface{} types by non-empty interfaces, but there isn't much of use to put in those interfaces (e.g., "Description() string"). I'm inclined to suggest something closer to @yurishkuro's example that binds the two parameters (carrier type and data type) together in a strongly-typed way. Under the covers, nothing much changes: the Tracer implementation will handle the well-known carrier formats and data types, while external carrier / data types will be handled by carriers that switch on known Tracer implementations.

type CarrierInjector interface {
  Inject(Tracer tracer, SpanContext)
}

Tracer.Inject calls through to the injector:

func (t *someTracerImpl) Inject(sc ot.SpanContext, ci ot.CarrierInjector) {
  ci.Inject(t, sc)
}

A user would write: tracer.Inject(span, SuperArcaneTransportCarrier(req)):

func SuperArcaneTransportCarrier(req *SATRequest) CarrierInjector {
  return satCarrier{req}
}

type satCarrier struct {
  req *SATRequest
}

func (sc satCarrier) Inject(tracer ot.Tracer, ot.SpanContext) {
  switch t := tracer.(type) {
    ...
  }
}

Some additional machinery would be needed for the builtin carrier formats and data types defined at the OT level, since those APIs cannot depend on the Tracer implementation directly.

@yurishkuro
Copy link
Member

@rakyll said

Callsites know about the carrier. This is not primarily tracing mechanism's responsibility. By bundling this functionality to the tracing mechanism, we are making it opinionated to work against one type of carrier.

I think there's a subtle distinction here. Callsites know about the transport and how to encode certain types of data into that transport. They have no clue what data the tracer wants to encode. That's why we chose the two most general data Formats to be (a) a binary blob and (b) a set of key/value strings. If the transport can support one of those directly, then we're done. If it needs those formats encoded in some way, then it's the callsite responsibility to do so.

In theory, having these two Formats should be sufficient for any transport. But then we have HTTP, which plays fast and loose with preserving the data. Given it's the most prevalent transport, we gave it its own Format and required tracers to support it, meaning it's "like string key/values" but with caveats.

A Format describes the "encoding" of the data. A Carrier describes the structure and the API for accessing that data (the same Carrier can be used with different encodings, e.g. both TextMap and HTTP formats use string key/value carrier). The semantics of the data is private to the tracer implementation.

@bhs
Copy link
Contributor

bhs commented Dec 6, 2016

This is not going to be a long-enough message, but several of us just had a nice google hangout... we discussed the issue here as well as a number of other related but higher-level things.

@rakyll kindly offered to write up some more specific examples of how some of this Inject/Extract machinery might fit together and we can take things from there.

(I was trying to pay close attention so didn't take detailed notes from the call... if someone else did, feel free to dump them here or wherever)

@yurishkuro
Copy link
Member

@rakyll @bensigelman as a follow-up to the call, here's an example of how inject/extract API is used in TChannel:

Inject: https://github.com/uber/tchannel-go/blob/dev/tracing.go#L172

  • tchannel supports application-level headers as unrestricted key/value strings, so we can use a plain Format.TEXT_MAP encoding
  • however, to separate tracing headers from other application headers, we use a special carrier tracingHeadersCarrier that prefixes each key with $tracing$

Extract: https://github.com/uber/tchannel-go/blob/dev/tracing.go#L241

  • similar to inject, plain Format.TEXT_MAP with a carrier that only gives the tracer access to headers that start with $tracing$, after stripping away the prefix

And another example from yarpc, which uses Thrift-encoded envelope for RPC messages and stores span context as a binary blob: https://github.com/yarpc/yarpc-go/blob/dev/serialize/serialize.go#L111

@rakyll
Copy link
Author

rakyll commented Dec 7, 2016

Pardon me for the late response. I wanted to take the time to read more and understand the decoupling between Tracer's inject/extract, formats and carriers. I was able to understand finally why interface{} ended up as the only solution.

To summarize,

  • Carrier is where you carry the trace information.
  • Format is how you represent the trace information.
  • Tracer is the only one who knows how to apply formatted information into the carrier.

Dependencies among these actors:

  • Tracer needs formatted information and the carrier object.
  • Format needs the span context.
  • Carrier needs no one.

Additional to that:

  • Format and carrier are coupled because you are limited by the format a carrier can carry.
  • Tracer and carrier are too loosely coupled because a tracing backend only understands/works with certain carrier types.

Given we cannot determine user’s format and carrier, there is no way to strictly type them. But, I still don’t understand why format and carrier has to be totally independent. Would it be possible to inspect the the format from the carrier type to reduce the number of unknowns? For builtin types, there is already enforcement for the carrier type.

Then

Inject(sm SpanContext, any interface{}) error
Extract(any interface{}) (SpanContext, error)

And they can document any can be io.Writer/Reader, TextMapWriter/Reader and HTTPHeadersCarrier.

Ross took notes from the meeting btw, publishing them now: https://gist.github.com/rakyll/da8ac20922dbea68466b8fafc885711f

@yurishkuro
Copy link
Member

@rakyll I think it's essentially what I proposed in the second part here

@jmacd
Copy link
Contributor

jmacd commented Dec 8, 2016 via email

@bhs
Copy link
Contributor

bhs commented Dec 8, 2016

Yes, the decoupled format and carrier has theoretical advantages, but in practice I don't see it being worthwhile... custom formats and carriers are usually 1:1.

@jmacd
Copy link
Contributor

jmacd commented Dec 8, 2016 via email

@yurishkuro
Copy link
Member

@jmacd I may be misunderstanding your proposal. It does not seem to address the interface{} issue with the actual Tracer.Inject method, but rather to suggest an implementation strategy for tracers to support custom injectors, which incidentally many already do, just without a standardized API.

What I am suggesting still keeps format and carrier conceptually de-coupled and still relies on interface{} for the actual carrier, but in practice that complexity is only exposed to the Tracer implementation, while the end user is exposed to strongly typed signatures for standard formats.

@jmacd
Copy link
Contributor

jmacd commented Jan 18, 2017

@yurishkuro Related discussion in a C++ PR here: opentracing/opentracing-cpp#6

I never answered your question--why I'd introduce a CarrierInjector type. I was thinking of whether I could derive a new kind of IPC mechanism from some existing (supported) IPC mechanism (hiding the implementation), use a carrier format that all the tracers knew about, while not exposing the concrete carrier-format type in my API. I would provide a corresponding new kind of carrier injector that would dispatch to a known carrier-format type. Yes, at some level, this is just asking to standardize the mechanism for supporting custom injectors. OTOH, I think there's a stronger statement to be made -- tracer implementors cannot support custom injectors until we standardize something like this at the OT level.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants