Invalid attributes on Spans, Events, Links, and Resources #203

mwear · 2020-02-29T21:23:29Z

We've discussed this before and temporarily settled on the idea of a strict mode, but that was never finished, and the exception handling spec discourages an OTel implementation from raising unhandled exceptions. See: #14 and #126.

We are in a situation where invalid attributes on spans, links, events, and resources can make it into the export pipeline, where they currently break things. To free the export pipeline from having to validate all data, we should ensure that attributes that make it to the pipeline are valid.

I'm proposing the following strategy for attribute validation:

When possible, validate attributes only on telemetry that is going to be exported
Avoid excessive runtime checks on data that won't be exported
Valid attributes should be kept, while invalid attributes should be dropped, optionally with a logged message.
Invalid attributes have the potential to generate lots of logging data, so we should be conservative in logging if we go that route.
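
The drop-and-log strategy above could be sketched roughly as follows. This is a minimal illustration, not the SDK's implementation: `AttributeFilter`, `MAX_WARNINGS`, and the validity rule itself (non-empty String keys; String, Integer, Float, or boolean values) are assumptions made for the example.

```ruby
require 'logger'

# Hypothetical helper for illustration; not part of opentelemetry-ruby.
module AttributeFilter
  # Assumed validity rule: only scalar values of these types are accepted.
  VALID_VALUE_TYPES = [String, Integer, Float, TrueClass, FalseClass].freeze
  # Assumed cap on warnings so invalid attributes cannot flood the log.
  MAX_WARNINGS = 5

  class << self
    attr_writer :logger

    def logger
      @logger ||= Logger.new($stderr)
    end

    def valid?(key, value)
      key.is_a?(String) && !key.empty? &&
        VALID_VALUE_TYPES.any? { |type| value.is_a?(type) }
    end

    # Returns a new Hash holding only the valid attributes.
    def filter(attributes)
      (attributes || {}).select do |key, value|
        next true if valid?(key, value)

        warn_dropped(key)
        false
      end
    end

    private

    # Logs at most MAX_WARNINGS messages over the life of the process.
    def warn_dropped(key)
      @warnings = (@warnings || 0) + 1
      return if @warnings > MAX_WARNINGS

      logger.warn("Dropping invalid attribute #{key.inspect}")
    end
  end
end

# The symbol key and the Object value are dropped; 'http.method' is kept.
kept = AttributeFilter.filter({ 'http.method' => 'GET', :bad_key => 1, 'obj' => Object.new })
p kept
```

Capping the warning count is only one way to be conservative with logging; sampling or logging once per key would serve the same goal.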

fbogsany · 2020-03-02T16:55:31Z

The general approach here sounds good to me.

fhwang · 2020-03-05T17:37:41Z

Assign this to me plz :)

fhwang · 2020-03-10T14:36:42Z

Any thoughts on the right place in the SDK to insert this validation logic? I was thinking it could be done right before calling the exporter's export method, but that seems to be referenced from multiple places in the SDK. I could put the checks in multiple places (and/or refactor to de-dupe), but I just want to check beforehand to make sure I'm not missing anything obvious.

[opentelemetry-ruby (validate-attrs-on-export*)]$ gg "\.export"|grep -wv test|grep -v "io.opentelemetry.sdk.trace.export.MultiSpanExporter"
sdk/lib/opentelemetry/sdk/trace/export/batch_span_processor.rb:            Timeout.timeout(@exporter_timeout_seconds) { @exporter.export(batch) }
sdk/lib/opentelemetry/sdk/trace/export/batch_span_processor.rb:              result_code = @exporter.export(batch)
sdk/lib/opentelemetry/sdk/trace/export/multi_span_exporter.rb:                merge_result_code(result_code, span_exporter.export(spans))
sdk/lib/opentelemetry/sdk/trace/export/simple_span_processor.rb:            @span_exporter&.export([span.to_span_data])

fbogsany · 2020-03-10T18:07:36Z

A couple of options:

In OpenTelemetry::SDK::Trace::Span#to_span_data. This is called from a SpanProcessor right before passing the data to an exporter. The downside of this is that processors may have to handle invalid data, but data will be valid for exporters (and in general, we expect more exporters to be plugged in than processors).
In a SpanProcessor. This can then be installed or removed as part of the export pipeline, and is relatively self-contained. This would likely be installed as the first child of a MultiSpanProcessor. The default pipeline might be something like: MultiSpanProcessor { ValidatingProcessor, BatchSpanProcessor { exporter = ConsoleSpanExporter } }.
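
To make the second option concrete, here is a rough, self-contained sketch of a validating processor placed first in a fan-out pipeline. Everything in it is a stand-in written for this example: `SketchSpan`, `FanOutProcessor`, `RecordingProcessor`, and the validity rule are assumptions, and the real SDK span and processor classes differ.

```ruby
# Stand-ins so the sketch runs on its own; these are not the SDK classes.
SketchSpan = Struct.new(:name, :attributes)

# Hypothetical processor that strips invalid attributes when a span finishes.
class ValidatingProcessor
  # Assumed validity rule: String keys with scalar values.
  VALID_VALUE_TYPES = [String, Integer, Float, TrueClass, FalseClass].freeze

  def on_start(_span); end

  def on_finish(span)
    span.attributes.select! do |key, value|
      key.is_a?(String) && VALID_VALUE_TYPES.any? { |type| value.is_a?(type) }
    end
  end
end

# Plays the role of a downstream processor such as a batching one.
class RecordingProcessor
  attr_reader :seen

  def initialize
    @seen = []
  end

  def on_start(_span); end

  def on_finish(span)
    @seen << span.attributes.dup
  end
end

# Plays the role of a multi-processor: calls children in order.
class FanOutProcessor
  def initialize(processors)
    @processors = processors
  end

  def on_start(span)
    @processors.each { |processor| processor.on_start(span) }
  end

  def on_finish(span)
    @processors.each { |processor| processor.on_finish(span) }
  end
end

recorder = RecordingProcessor.new
pipeline = FanOutProcessor.new([ValidatingProcessor.new, recorder])
span = SketchSpan.new('GET /', { 'http.status_code' => 200, 'payload' => Object.new })
pipeline.on_finish(span)
p recorder.seen.first # the 'payload' entry has been dropped by this point
```

Because the validating processor runs first, later processors in the list only see cleaned attributes; any processor placed ahead of it would still see the raw data.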

mwear · 2020-03-10T18:46:05Z

Since SDK spans should only be created if they're going to be sampled, would it be reasonable to eagerly drop invalid attributes on span creation and also drop in Span#set_attribute?

I think we could probably do the same on Event and Link creation. Since they are immutable it requires making a copy to drop invalid data, which is somewhat cumbersome. It's harder to ensure that they will be sampled before doing the work, but it might be ok to eagerly drop attributes on initialization?

The upside to this approach is that processors will only see valid data; the slight downside is that events and links might be processed but not sampled.
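
A minimal sketch of the eager approach, assuming a simple validity rule: the span drops invalid attributes on creation and in `set_attribute`, while the immutable event builds a filtered, frozen copy at initialization. `EagerRules`, `EagerSpan`, and `EagerEvent` are hypothetical names used only for this example, not SDK classes.

```ruby
# Hypothetical validity rule shared by the sketch classes below.
module EagerRules
  VALID_VALUE_TYPES = [String, Integer, Float, TrueClass, FalseClass].freeze

  def self.valid?(key, value)
    key.is_a?(String) && VALID_VALUE_TYPES.any? { |type| value.is_a?(type) }
  end
end

# Mutable span stand-in: invalid attributes never enter the hash.
class EagerSpan
  attr_reader :attributes

  def initialize(attributes: nil)
    @attributes = {}
    attributes&.each { |key, value| set_attribute(key, value) }
  end

  def set_attribute(key, value)
    @attributes[key] = value if EagerRules.valid?(key, value)
    self
  end
end

# Immutable event stand-in: filtering requires copying the caller's hash.
class EagerEvent
  attr_reader :name, :attributes

  def initialize(name:, attributes: nil)
    @name = name
    # Hash#select returns a new Hash, so the caller's hash is left untouched.
    @attributes = (attributes || {}).select { |k, v| EagerRules.valid?(k, v) }.freeze
    freeze
  end
end

span = EagerSpan.new(attributes: { 'peer.service' => 'db', 'bad' => nil })
span.set_attribute('retries', 2).set_attribute(:symbol_key, 'x')
p span.attributes
```

The copy made in `EagerEvent#initialize` is the "somewhat cumbersome" part: the work is done whether or not the event ends up on a sampled span.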

fhwang · 2020-03-11T16:37:38Z

Cool, I'm exploring this option. A few other questions:

SDK::Span#initialize and SDK::Span#trim_span_attributes do not currently test for validity of attribute keys. Can I assume that's just an oversight and fix that? (I'm thinking of centralizing some of SDK::Span's attribute processing into a single convenience method.)
What's the nature of Resource attribute validation I should be looking at? I'm not sure where this is described in the spec.

fbogsany · 2020-03-11T17:44:44Z

The downside of any eager validation is that more work is required even if attributes are dropped for other reasons (e.g. max links, events, or attributes is exceeded).

Everything span-attribute-related should go through SDK::Span#trim_span_attributes, so if we eagerly validate, it should be done there. The fact it isn't is certainly an oversight.

fhwang · 2020-03-11T18:04:44Z

Would it be safe to defer all attribute validation (key correctness, value correctness, attributes hash size) until SDK::Span#finish is called? Is it reasonable to assume that no exporter will ever send data before that call?

fhwang · 2020-03-11T18:44:40Z

Actually, disregard that; I suppose #to_span_data is a better place for it.

fbogsany · 2020-03-11T19:03:32Z

SDK::Span#finish is the last point to do it before SpanProcessors see the data, so if we want valid data in the processors, it has to be done by then. #to_span_data is the last point to do it before the exporters see the data.

The most flexible approach is to perform validation in an optional (but enabled by default) processor, but that implies that processors earlier in the pipeline may see invalid data. There are scenarios where that could be just fine (for example, processors that rewrite or drop spans).

The main reason to eagerly limit attribute hash size is to limit the heap size kept live by data in the pipeline.
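
The heap-size point can be illustrated with a small sketch of eager trimming: the attribute hash never grows past a fixed limit, so excess entries become collectable immediately instead of staying live until export. `BoundedAttributes`, its limit, and the choice to evict the oldest entry first are assumptions for this example rather than a description of the SDK's behavior.

```ruby
# Hypothetical container that enforces an attribute count limit on every write.
class BoundedAttributes
  def initialize(max_count)
    @max_count = max_count
    @attributes = {}
  end

  def set(key, value)
    @attributes[key] = value
    trim
    self
  end

  def to_h
    @attributes.dup
  end

  private

  # Ruby hashes keep insertion order, so Hash#shift removes the oldest entry.
  def trim
    excess = @attributes.size - @max_count
    excess.times { @attributes.shift } if excess.positive?
  end
end

attrs = BoundedAttributes.new(2)
attrs.set('a', 1).set('b', 2).set('c', 3)
p attrs.to_h # 'a' has been evicted; 'b' and 'c' remain
```

Deferring the same trimming to `finish` or `to_span_data` would produce the same final hash for a finished span, but every intermediate entry would stay reachable in the meantime.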

fbogsany · 2020-04-03T15:29:14Z

Bumped this to Alpha 0.4 so we can proceed with the 0.3 release (as discussed on the SIG call).

mwear added the help wanted label Feb 29, 2020

mwear added this to the Alpha v0.3 milestone Feb 29, 2020

mwear changed the title ~~Invalid attributes on Span, Events, Links, and Resources~~ Invalid attributes on Spans, Events, Links, and Resources Feb 29, 2020

mwear assigned fhwang Mar 5, 2020

fhwang mentioned this issue Mar 11, 2020

SDK::Trace::Span defers attr validation until finish is called. #207

Closed

fbogsany modified the milestones: Alpha v0.3, Alpha v0.4 Apr 3, 2020

fbogsany unassigned fhwang Jun 11, 2020

fbogsany modified the milestones: Alpha v0.5, Alpha v0.6 Jun 29, 2020

fbogsany modified the milestones: Alpha v0.6, Beta Aug 27, 2020

fbogsany modified the milestones: Beta v0.7, Beta v0.8 Sep 18, 2020

fbogsany mentioned this issue Nov 2, 2020

SDK Exporter/BatchProcessor should not hard-fail in an initializer #464

Closed

fbogsany mentioned this issue Dec 17, 2020

feat: Structured error handling #521

Merged

2 tasks

fbogsany closed this as completed in #521 Dec 21, 2020

mwear mentioned this issue Aug 15, 2024

feat: coerce OpenTelemetry::Trace::Status.description to String #1657

Closed
