Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Standardize environment variables between SDK implementations #572

Closed
mwear opened this issue Apr 21, 2020 · 15 comments · Fixed by #666
Closed

Standardize environment variables between SDK implementations #572

mwear opened this issue Apr 21, 2020 · 15 comments · Fixed by #666
Labels
area:sdk Related to the SDK enhancement New feature or request priority:p2 Medium priority level release:required-for-ga Must be resolved before GA release, or nice to have before GA
Milestone

Comments

@mwear
Copy link
Member

mwear commented Apr 21, 2020

Currently environment variables are being defined individually by SIGs on a one off basis. We should coordinate this effort and come up with a list of standardized environment variable names, values, and behavior.

As a starting point, we should document the environment variables that already exist, and move on to additional definitions from there.

Here are some environment variable definitions that have been discussed in spec issues. If SIGs can add environment variables that they are using, or would like to define in the comments, I can update this list.

Name Description Language Link
OTEL_RESOURCE_LABELS Key values pairs to be used as resource labels Javascript issue
OTEL_RESOURCE_ATTRIBUTES Key values pairs to be used as resource labels Java issue
OTEL_SAMPLING_RATE Sampling rate to be used by the probability sampler Javascript issue
OTEL_LOG_LEVEL Log level used by the SDK logger Javascript issue

Another important consideration we need to address, is the value format. Here is a related issue: #505.

@ocelotl
Copy link
Contributor

ocelotl commented Apr 21, 2020

Just to update everybody on the Python side:

The Python implementation currently reads any environment variable prefixed with OPENTELEMETRY_PYTHON_. The idea behind this is to avoid conflicting with other OpenTelemetry implementations.

The API currently uses OPENTELEMETRY_PYTHON_TRACER_PROVIDER and OPENTELEMETRY_PYTHON_METER_PROVIDER. The configuration object is made to read any other OPENTELEMETRY_PYTHON_ prefixed environment variables so that they can be accesses by any other OpenTelemetry Python components, like instrumentations for third party libraries or frameworks.

@tsloughter
Copy link
Member

Erlang is also using OTEL_RESOURCE_LABELS for resource labels.

@naseemkullah
Copy link
Member

naseemkullah commented May 1, 2020

FYI OTEL_SAMPLING_RATE has not been implemented in JS yet (nor has OTEL_LOG_LEVEL for that matter, but that name is pretty solid).

re: OTEL_SAMPLING_RATE, seeing that it sets the probability of the probability sampler, would OTEL_SAMPLING_PROBABILITY be a better name, or are rate and probability considered synonymous (enough) ?

@naseemkullah
Copy link
Member

naseemkullah commented May 1, 2020

JAEGER_AGENT_HOST in JS is used for the Jaeger Exporter's host option same way as it is used today in the Jaeger clients.

JAEGER_AGENT_HOST defines hostname for reporting spans over UDP/Thrift.

related PR: open-telemetry/opentelemetry-js#924

worth noting: in code configuration of the exporter's host value is more authoritative than the env var.

Not sure if this is what we want for all env vars or not.

We would probably want consistent authority (in code > env var or vice-versa) across env vars.

@toumorokoshi
Copy link
Member

If I recall correctly, we didn't want to name things "OTEL" because there was already another technical concept that was called that.

That said, a google search for "otel" pops up nothing, so maybe it's ok?

@thisthat
Copy link
Member

thisthat commented May 11, 2020

Java uses the following environment variables:

Name Description Language Link
OTEL_OTLP_METRIC_TIMEOUT OpenTelemetry Exporter: max waiting time for the collector to process each metric batch Java Link
OTEL_OTLP_SPAN_TIMEOUT OpenTelemetry Exporter: max waiting time for the collector to process each span batch Java Link
OTEL_IMR_EXPORT_INTERVAL SDK: export interval between pushes to the exporter Java Link
OTEL_SSP_EXPORT_SAMPLED SimpleSpanProcessor: only sampled spans should be exported Java Link
OTEL_BSP_SCHEDULE_DELAY BatchSpanProcessor: delay interval between two consecutive exports Java Link
OTEL_BSP_MAX_QUEUE BatchSpanProcessor: maximum queue size Java Link
OTEL_BSP_MAX_EXPORT_BATCH BatchSpanProcessor: maximum batch size Java Link
OTEL_BSP_EXPORT_TIMEOUT BatchSpanProcessor: maximum allowed time to export data Java Link
OTEL_BSP_EXPORT_SAMPLED BatchSpanProcessor: only sampled spans should be exported Java Link

@toumorokoshi
Copy link
Member

@mwear do you have enough information to pick up a proposal? We are rapidly adding functionality to Python's auto-instrumentation which heavily utilizes environment variables.

@carlosalberto carlosalberto added this to the v0.5 milestone May 18, 2020
@mwear
Copy link
Member Author

mwear commented May 18, 2020

@toumorokoshi I'll draft up a specification this week. The goal will be to capture the environment variables that make sense to be standardized between languages. For language specific environment variables, it might be a good idea of us to decide on conventions each language can use, rather than specific environment variable names. We can have that conversation as part of this spec as well.

@bogdandrutu
Copy link
Member

@mwear please make sure #567 is also covered :)

@trask
Copy link
Member

trask commented May 20, 2020

In no way tied to these 😄, just reporting the current state for Java auto-instrumentation. Looking forward to using the standardized env vars.

environment variable default value
OTA_EXPORTER_OTLP_ENDPOINT <exception on missing>
OTA_EXPORTER_JAEGER_ENDPOINT "localhost:14250"
OTA_EXPORTER_JAEGER_SERVICE_NAME "(unknown service)"
OTA_EXPORTER_ZIPKIN_ENDPOINT "http://localhost:9411/api/v2/spans"
OTA_EXPORTER_ZIPKIN_SERVICE_NAME "(unknown service)"

@trask
Copy link
Member

trask commented May 20, 2020

Oh, also:

environment variable default value notes
OTA_PROPAGATORS "tracecontext" comma-separated list, currently supporting "tracecontext" and "b3"

@toumorokoshi
Copy link
Member

@toumorokoshi I'll draft up a specification this week. The goal will be to capture the environment variables that make sense to be standardized between languages. For language specific environment variables, it might be a good idea of us to decide on conventions each language can use, rather than specific environment variable names. We can have that conversation as part of this spec as well.

+1 to standardize the namespacing for language-specific environment variables.

@jmacd
Copy link
Contributor

jmacd commented Jun 3, 2020

FYI OTEL_SAMPLING_RATE has not been implemented in JS yet (nor has OTEL_LOG_LEVEL for that matter, but that name is pretty solid).

re: OTEL_SAMPLING_RATE, seeing that it sets the probability of the probability sampler, would OTEL_SAMPLING_PROBABILITY be a better name, or are rate and probability considered synonymous (enough) ?

@naseemkullah No, I believe that rate and probability are inverses of each other. I believe the reason we use "rate" to describe inverse probability is that if you see one sampled item using probability p over a unit of time, then you should count that as representing 1/p individuals over the same unit of time. Thus, rate is a multiplier >= 1 while probability is <= 1.

@naseemkullah
Copy link
Member

naseemkullah commented Jun 4, 2020

FYI OTEL_SAMPLING_RATE has not been implemented in JS yet (nor has OTEL_LOG_LEVEL for that matter, but that name is pretty solid).
re: OTEL_SAMPLING_RATE, seeing that it sets the probability of the probability sampler, would OTEL_SAMPLING_PROBABILITY be a better name, or are rate and probability considered synonymous (enough) ?

@naseemkullah No, I believe that rate and probability are inverses of each other. I believe the reason we use "rate" to describe inverse probability is that if you see one sampled item using probability p over a unit of time, then you should count that as representing 1/p individuals over the same unit of time. Thus, rate is a multiplier >= 1 while probability is <= 1.

🤯 Thanks @jmacd , while I don't quite get why probability and rate are inverses of each other, I am leaning more towards OTEL_SAMPLING_PROBABILITY being the correct name for setting the probability sampler's sampling probability.

@dyladan
Copy link
Member

dyladan commented Jun 4, 2020

rate and probability are not actually inverses of each other. rate describes a ratio between an independent and a dependent variable, while probability is the chance of some occurrence.

You could technically say "sample at a rate of 0.1 sampled traces per trace" (sampled traces dependent on total traces) and that would seem to equal a 10% probability. But, if you sampled every 10th trace, that would still be a 1 in 10 rate. In that scenario, each trace would not have a 10% probability of being sampled.

@reyang reyang added release:required-for-ga Must be resolved before GA release, or nice to have before GA spec:miscellaneous For issues that don't match any other spec label labels Jul 10, 2020
@carlosalberto carlosalberto added release:required-for-ga Must be resolved before GA release, or nice to have before GA and removed release:required-for-ga Must be resolved before GA release, or nice to have before GA spec:miscellaneous For issues that don't match any other spec label labels Jul 10, 2020
@reyang reyang mentioned this issue Jul 10, 2020
@bogdandrutu bogdandrutu added the priority:p2 Medium priority level label Jul 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:sdk Related to the SDK enhancement New feature or request priority:p2 Medium priority level release:required-for-ga Must be resolved before GA release, or nice to have before GA
Projects
None yet
Development

Successfully merging a pull request may close this issue.