Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add forceFlush to API #1287

Closed
anuraaga opened this issue Dec 13, 2020 · 8 comments
Closed

Add forceFlush to API #1287

anuraaga opened this issue Dec 13, 2020 · 8 comments
Labels
needs discussion Need more information before all suitable labels can be applied spec:trace Related to the specification/trace directory

Comments

@anuraaga
Copy link
Contributor

anuraaga commented Dec 13, 2020

Currently forceFlush is only defined for the SDK. However, it's needed mostly in instrumentation, it's not a configuration or initialization concern like other SDK methods. In particular, it's generally needed when instrumenting a serverless runtime that automatically freezes, such as AWS lambda. But by being in the SDK, it breaks our convention that instrumentation only depends on the API. Indeed the Java instrumentation currently depends on the SDK, and if the user's app doesn't include it, will crash.

https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/master/instrumentation/aws-lambda-1.0/library/aws-lambda-1.0-library.gradle#L19

Since forceFlush is needed in logic, not initialization / configuration, it seems to also make sense in the API conceptually. We don't want logic writers to be touching the SDK as much as possible, e.g., randomly shutting it down during a request.

@anuraaga anuraaga added the spec:trace Related to the specification/trace directory label Dec 13, 2020
@carlosalberto
Copy link
Contributor

cc @tedsuo

@jkwatson
Copy link
Contributor

I'm super hesitant to add this sort of call to the API. I understand the use-case, but this will encourage people who don't need it to call it as well (much like people invoke the Java garbage collector explicitly in their code when, at best, it doesn't actually do anything).

Is there a way that the instrumentation can guard against the SDK not being loaded at runtime, rather than crashing the application?

@anuraaga
Copy link
Contributor Author

@jkwatson It's possible, but it seems complex and also ties the instrumentation to the SDK against our normal designs. I also don't know if such a trick would be possible in all languages. We would need to add a note to our wording that "instrumentation only depends on API" with a caveat that it can depend on it in a compile-only type of way to provide forceFlush. Instead, we could have a heavily documented forceFlush in the API, and if even then a user called it, presumably they're also calling System.gc() and have other things to worry about? A rename to forceFlushBeforeRuntimeFreeze to clarify the use case is obtuse but also an option :) But I'm ok with just adding the caveat and not adding to the API if that seems better to folks.

@Oberon00
Copy link
Member

FaaS are peculiar environments and you will often need hacks and special configurations to support them. So it seems OK to have that particular instrumentation depend on the SDK. Alternatively, such instrumentations could have a configurable hook that is called whenever a possible suspension point is reached (i.e. after a request is finished).

Note that the API vs SDK question was briefly touched in #351 (comment) but everyone seemed to agree that it should be in the SDK so there was not much discussion on that particular point.

@andrewhsu andrewhsu added the needs discussion Need more information before all suitable labels can be applied label Dec 15, 2020
@andrewhsu
Copy link
Member

talked about this at the spec sig mtg this morning, would like to discuss with @anuraaga at the spec sig mtg later today

@tedsuo
Copy link
Contributor

tedsuo commented Dec 16, 2020

Discussed further with @anuraaga on the spec call.

API should not get a Flush method, as its presence encourages the wrong behavior in instrumentation authors, who will incorrectly believe they need to manage flushing themselves. We saw this issue in prior tracing implementations and do not want to revisit that scenario with OpenTelemetry.

In this special case, Lambda performs a freeze, but does not provide the user with an onFreeze hook. So ,to a certain extent lambda needs to provide an SDK manager of some kind, or a flush callback, etc. Rather than having flush baked into the instrumentation itself, this should be separate. There's no issue with framework code that manages the lifecycle of the SDK (Spring Sleuth is an example), but this should not be mixed up with instrumentation in a way that creates a scenario where users cannot escape from flush or shutdown being called if they don't want it to happen.

@carlosalberto
Copy link
Contributor

@tedsuo Just to confirm: can we close this issue?

@anuraaga
Copy link
Contributor Author

Yeah I think we can close and maybe revisit if there happen to be more use cases in the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs discussion Need more information before all suitable labels can be applied spec:trace Related to the specification/trace directory
Projects
None yet
Development

No branches or pull requests

6 participants