Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch Jaeger receiver to internal data model #844

Conversation

dmitryax
Copy link
Member

@dmitryax dmitryax commented Apr 18, 2020

Conversion were replicated from existing jaeger -> OC -> internal.

One concern:
We are getting ParentSpanID from Jaeger span links and writing the link to internal Span.Links slice as well. I'm not sure if it's desired behavior. I'd remove parent span link from the translated internal span and keep that information only ParentSpanID field. Keeping the link copy in Span.Links makes it even more confusing because, with the links conversion, we lose RefType information (like Span_Link_PARENT_LINKED_SPAN or Span_Link_TYPE_UNSPECIFIED), OTLP doesn't have that field.
@tigrannajaryan @pjanotti what do you think about it ^

@codecov-io
Copy link

codecov-io commented Apr 18, 2020

Codecov Report

Merging #844 into master will decrease coverage by 0.61%.
The diff coverage is 59.09%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #844      +/-   ##
==========================================
- Coverage   85.93%   85.32%   -0.62%     
==========================================
  Files         157      159       +2     
  Lines       11841    12115     +274     
==========================================
+ Hits        10176    10337     +161     
- Misses       1297     1396      +99     
- Partials      368      382      +14     
Impacted Files Coverage Δ
translator/trace/jaeger/jaegerproto_to_traces.go 54.03% <54.03%> (ø)
translator/trace/jaeger/jaegerthrift_to_traces.go 62.60% <62.60%> (ø)
receiver/jaegerreceiver/factory.go 93.22% <100.00%> (+0.05%) ⬆️
receiver/jaegerreceiver/trace_receiver.go 85.23% <100.00%> (-0.21%) ⬇️
service/builder/receivers_builder.go 78.37% <0.00%> (+1.35%) ⬆️
translator/internaldata/resource_to_oc.go 76.47% <0.00%> (+2.94%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6358a1a...66c0d38. Read the comment docs.

@dmitryax dmitryax force-pushed the jaeger-receiver-to-internal-data branch from 1ccb464 to 7765622 Compare April 18, 2020 20:54
@@ -33,7 +33,7 @@ const header = `// Copyright 2020 OpenTelemetry Authors
// Code generated by "internal/data_generator/main.go". DO NOT EDIT.
// To regenerate this file run "go run internal/data_generator/main.go".

package data`
package pdata`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you split all changes to the data_generator and generated files in a separate PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done #850

}

// CreateMetricsReceiver creates a metrics receiver based on provided config.
func (f *Factory) CreateMetricsReceiver(
logger *zap.Logger,
ctx context.Context,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For unused arguments use _

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

span0.SetStartTime(pdata.TimestampUnixNano(uint64(t1.UnixNano())))
span0.SetEndTime(pdata.TimestampUnixNano(uint64(t2.UnixNano())))
span0.Status().InitEmpty()
span0.Status().SetCode(pdata.StatusCode(otlptrace.Status_NotFound))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add an issue to declare all statuses as constants in the pdata.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done #851

@@ -0,0 +1,272 @@
// Copyright 2019, OpenTelemetry Authors
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2020?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed

Comment on lines 31 to 46
// Number of special fields in the Node. See the code below that deals with special fields.
const specialNodeAttrCount = 7

// Number of special fields in the Resource.
const specialResourceAttrCount = 1

// Calculate maximum total number of attributes. It is OK if we are a bit higher than
// the exact number since this is only needed for capacity reservation.
maxTotalAttrCount := 0
if ocNode != nil {
maxTotalAttrCount += len(ocNode.Attributes) + specialNodeAttrCount
}
if ocResource != nil {
maxTotalAttrCount += len(ocResource.Labels) + specialResourceAttrCount
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just dead code

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separate PR then :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done #852

@dmitryax dmitryax force-pushed the jaeger-receiver-to-internal-data branch 4 times, most recently from 94a8259 to 66c0d38 Compare April 20, 2020 16:37
Copy link
Member

@tigrannajaryan tigrannajaryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conversion were replicated from existing jaeger -> OC -> internal.

One concern:
We are getting ParentSpanID from Jaeger span links and writing the link to internal Span.Links slice as well. I'm not sure if it's desired behavior. I'd remove parent span link from the translated internal span and keep that information only ParentSpanID field. Keeping the link copy in Span.Links makes it even more confusing because, with the links conversion, we lose RefType information (like Span_Link_PARENT_LINKED_SPAN or Span_Link_TYPE_UNSPECIFIED), OTLP doesn't have that field.
@tigrannajaryan @pjanotti what do you think about it ^

@dmitryax I do not quite understand. Can you point to the old code that does this? You probably also need to check this with backend/UI team to make sure we don't break them if we change the behavior.

Comment on lines +136 to +155
attrs := jTagsToInternalAttributes(span.Tags)
setInternalSpanStatus(attrs, dest.Status())
if spanKindAttr, ok := attrs[tracetranslator.TagSpanKind]; ok {
dest.SetKind(jSpanKindToInternal(spanKindAttr.StringVal()))
}
dest.Attributes().InitFromMap(attrs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We convert a slice to map here, then convert the map to slice again. I suspect this may be slow. Let's get a perf profile once this is merged and see what it shows. If it is slow we can probably just do a one pass and convert the source Tags slice directly to dest attributes.
This is OK for now, but let's verify it.

@dmitryax
Copy link
Member Author

@tigrannajaryan
| Can you point to the old code that does this?
Here we set ParentSpanID https://github.com/open-telemetry/opentelemetry-collector/blob/master/translator/trace/jaeger/jaegerproto_to_protospan.go#L117
And here we duplicate the already used parentSpanID reference it in SpanLinks https://github.com/open-telemetry/opentelemetry-collector/blob/master/translator/trace/jaeger/jaegerproto_to_protospan.go#L152-L178 , which looks redundant. But this is not a big concern, I'm fine with that duplication.

| You probably also need to check this with backend/UI team to make sure we don't break them if we change the behavior.
Backend reads CHILD_OF span reference, which will be recreated back from parentSpanID field with internal->jaeger exporter translation so should be fine.

@dmitryax
Copy link
Member Author

@tigrannajaryan
Since jaeger has only two reference types: FOLLOWS_FROM and CHILD_OF removing that duplication allows us to reconstruct Jaeger reference type on exporter:

  • any existing Span link will get FOLLOWS_FROM ref type
  • ParentSpanId will be converted to CHILD_OF reference added back to the jaeger SpanRefs collection.

@dmitryax
Copy link
Member Author

In addition to the comment above, looks like we are not checking for this duplicate in OC SpanLinks during OC->Jaeger translation

jReferences, err := ocLinksToJaegerReferencesProto(ocSpan.Links)
if err != nil {
return nil, fmt.Errorf("error converting OC links to Jaeger references: %v", err)
}
// OC ParentSpanId can be nil/empty: only attempt conversion if not nil/empty.
if len(ocSpan.ParentSpanId) != 0 {
uParentSpanID, err := tracetranslator.BytesToUInt64SpanID(ocSpan.ParentSpanId)
if err != nil {
return nil, fmt.Errorf("OC incorrect parent span ID: %v", err)
}
jReferences = append(jReferences, jaeger.SpanRef{
TraceID: traceID,
SpanID: jaeger.SpanID(uParentSpanID),
RefType: jaeger.SpanRefType_CHILD_OF,
})
}

It means that Jaeger -> OC -> Jaeger translation most of the times adds a duplicate in Span.References for Parent span Id

@tigrannajaryan
Copy link
Member

@dmitryax I don't know if the current parent span translation logic is intentional or a bug, I am not familiar with this part of the codebase. I am OK if you want to clean it up. We need to test the Collector fully anyway. Maybe @pjanotti knows more.

@dmitryax
Copy link
Member Author

@pjanotti could you please take a look at the PR since you are the most familiar with original translation?

@dmitryax dmitryax force-pushed the jaeger-receiver-to-internal-data branch 2 times, most recently from 3253c7d to e1374a6 Compare April 21, 2020 20:10
@dmitryax
Copy link
Member Author

@pjanotti @tigrannajaryan I've applied the change to the span links translation as described above.

@dmitryax
Copy link
Member Author

Appeared that HTTP Status code -> OC Status code translation is different between jaeger thrift -> OC and jaeger proto -> OC.

@owais @pjanotti do you know if it's intentional?

In this PR I've applied the translation in both thrift and proto versions.

@dmitryax dmitryax force-pushed the jaeger-receiver-to-internal-data branch from e1374a6 to ca5aa7d Compare April 21, 2020 20:48
Translation was replicated from existing jaeger -> OC -> internal logic.
@dmitryax dmitryax force-pushed the jaeger-receiver-to-internal-data branch from ca5aa7d to 54e252e Compare April 21, 2020 23:20
@tigrannajaryan
Copy link
Member

Appeared that HTTP Status code -> OC Status code translation is different between jaeger thrift -> OC and jaeger proto -> OC.

* We have the http codes translation in thrift https://github.com/open-telemetry/opentelemetry-collector/blob/master/translator/trace/jaeger/jaegerthrift_to_protospan.go#L222

* And don't have it in proto translation https://github.com/open-telemetry/opentelemetry-collector/blob/master/translator/trace/jaeger/jaegerproto_to_protospan.go#L224

@owais @pjanotti do you know if it's intentional?

In this PR I've applied the translation in both thrift and proto versions.

@dmitryax I know that jaeger proto version is also used by SAPM, so I would take the proto version of the translation as the canonical one and modify thirft to match it.

@tigrannajaryan
Copy link
Member

@dmitryax actually, ignore that, I am not sure what I said is correct. Can you check with the backend team?

@tigrannajaryan
Copy link
Member

We discussed the translation and all looks good. We will test it one more time when this goes to production.

@tigrannajaryan tigrannajaryan merged commit e44607a into open-telemetry:master Apr 22, 2020
wyTrivail pushed a commit to mxiamxia/opentelemetry-collector that referenced this pull request Jul 13, 2020
Translation was replicated from existing jaeger -> OC -> internal logic.
MovieStoreGuy pushed a commit to atlassian-forks/opentelemetry-collector that referenced this pull request Nov 11, 2021
Troels51 pushed a commit to Troels51/opentelemetry-collector that referenced this pull request Jul 5, 2024
swiatekm pushed a commit to swiatekm/opentelemetry-collector that referenced this pull request Oct 9, 2024
…open-telemetry#844)

* fix pipeline order when enabling k8sattributes

* bump collector and document breakage

* kubernetesAttributes example

* regenerate examples

* Update charts/opentelemetry-collector/UPGRADING.md

Co-authored-by: Tyler Helmuth <12352919+TylerHelmuth@users.noreply.github.com>

---------

Co-authored-by: Tyler Helmuth <12352919+TylerHelmuth@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants