-
Notifications
You must be signed in to change notification settings - Fork 435
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
profiler: Disable agentless mode for WithAPIKey #906
profiler: Disable agentless mode for WithAPIKey #906
Conversation
This pull request warns users of WithAPIKey that the option alone will no longer enable agentless uploading by default in the future. It can still be enabled via WithAgentlessUpload but this is discouraged and not officially supported.
Co-authored-by: Gabriel Aszalos <gabriel.aszalos@gmail.com>
// key configured, we warn the customers that this is probably a | ||
// misconfiguration. | ||
if cfg.apiKey != "" { | ||
log.Warn("You are currently setting profiler.WithAPIKey or the DD_API_KEY env variable, but as of dd-trace-go v1.30.0 this value is getting ignored by the profiler. Please see the profiler.WithAPIKey go docs and verify that your integration is still working. If you can't remove DD_API_KEY from your environment, you can use WithAPIKey(\"\") to silence this warning.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: This might be too "aggressive" as it's rather big, specially on those cases where people have no control. You do mention they can remove the warning by changing but is it gaining us anything? Anyway not a big deal for me, I just know that some tracers care about reducing verbosity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, why are we warning when agentless is still a legitimate use case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: This might be too "aggressive" as it's rather big, specially on those cases where people have no control. You do mention they can remove the warning by changing but is it gaining us anything? Anyway not a big deal for me, I just know that some tracers care about reducing verbosity.
This PR introduces a backwards incompatible change in a minor release of the profiling library. Arguably a big warning is warranted. But if you think dropping the bit about removing the warning would help, I'd be okay with that. Just let me know.
I'm also okay to change the entire message if you have a different proposal, just let me know.
Also, why are we warning when agentless is still a legitimate use case?
There is no legitimate production use case at this point. The main use case is for our own internal testing, and perhaps to help customers troubleshoot connectivity issues. But none of that code should ever run in production, so I think the warning makes sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no legitimate production use case at this point.
Going to push back on that a bit. I can think of a number of use cases that don't run afoul of billing issues where a customer might not want to run the agent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Legitimate production use cases aside (I'm going to kind of disagree with that assertion), I do not think this should be a warning. Documented or not, the client does support running agentless, even if it is no longer the default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pmbauer perhaps I don't fully understand the use cases you have in mind. Let's continue the discussion in the google doc thread and make sure we're aligned before continuing the code related discussions here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replied in the doc. The question of billing for profiles not submitted via the agent is a product and business decision. That in no way makes the production uses for agentless uploads illegitimate from a customer perspective.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I do feel strongly about the business decision to force profiles through the agent (it's bad for customers and bad for business long term as severless becomes the norm) I don't feel strongly about the warning. Feel free to ship.
* 'v1' of https://github.com/DataDog/dd-trace-go: Implement DD_PROFILING_OUTPUT_DIR for dev/debug (DataDog#924) contrib/go-pg/pg.v10: add INTEGRATION flag check for tests (DataDog#921) contrib/go-redis/redis.v8: optimize BeforeProcess and BeforeProcessPipeline (DataDog#920) CI: check go.sum is up-to-date (DataDog#918) README: Fix go get instructions (DataDog#913) internal/log: Improve API for testing (DataDog#916) ddtrace/tracer: add support for agent discovery endpoint, feature flags, stats & drops (DataDog#859) internal/version: bump to v1.31.0 (DataDog#910) CONTRIBUTING.md: add link to contrib/README.md (DataDog#802) profiler: Disable agentless mode for WithAPIKey (DataDog#906) contrib/Shopify/sarama: fix possible deadlock in WrapAsyncProducer (DataDog#907) contrib/database/sql.tracedConn implement driver.SessionResetter (DataDog#908) ddtrace/opentracer: consider FollowsFrom references as children (DataDog#905) ddtrace/opentracer: add support for opentracing.TracerContextWithSpanExtension (DataDog#855) ddtrace/tracer: follow noDebugStack setting when using SetTag with an error (DataDog#900) contrib/net/http: add a getter to retrieve original round tripper (DataDog#903) ddtrace/tracer: ensure Flush call is synchronous (DataDog#901)
We’ve just hit this warning while trying to integrate profiler into our golang app that runs on an AWS lambda. We are able to use tracer without having any datadog agent running. do I understand correctly that now, to run this profiler, we would need to move our lambda into VPC, set up a new EC2 instance with a datadog agent in the same VPC, and then configure datadog SDK on our lambda to send profile tracers this agent? this sounds like a huge overkill. What is a technical reason preventing this profiler to work in the same way as the tracer does? From our perspective, we will probably have to give up the use of the profiler if the condition is setting up an agent |
@MatejBalantic first of all, sorry for the late reply, this message slipped through the cracks on my end. I agree that setting up an agent on a separate EC2 instance is overkill : /. There are various reasons for why profiling requires the agent right now, but I think our host-centric billing model is the main reason. That being said, I've just kicked off a few internal threads to discuss this problem and will try to see if I can find a better solution for you than what you've outlined. |
@MatejBalantic from internal discussions the answer is that we don't support serverless profiling right now, but adding first-class support for it is on our roadmap. For now you'll have to setup an agent, but even then you might find that the lambda container shutdown doesn't play nice with our profile uploading logic (i.e. the container might shut down without uploading the profiling data in some cases). So it's probably better to wait until we fully support profiling on lambda. |
This is a backwards incompatible change, but deemed neccessary by the
profiling team. Since it's been analyzed to only impact a tiny minority
of users, it's done without a major version change.
UPGRADING:
If you are currently using profiler.WithAPIKey() or DD_API_KEY to upload
profiling data, please remove this configuration. dd-trace-go will now
default to upload through the datadog agent running at localhost:8126.
If your agent is running on another port, please use
profiler.WithAgentAddr() to configure the correct agent address.
The new profiler.WithAgentlessUpload() is not officially supported for
anything other than debugging/testing when hitting integration issues.
If you think you need it for your production environment, please contact
our support.
BACKGROUND:
Historically the profiler only supported direct uploading to the backend
without using an agent, aka "agentless mode". This integration method
was succeeded by agent based uploading which should always be the
default behavior by now. The old agentless mode is not officially
supported anymore and only meant for testing and debugging at this
point.
Up until now setting an API Key via WithAPIKey or DD_API_KEY still
allowed users to opt into the agentless mode, but this is problematic
for two reasons:
by accident.
uploading.
Users were warned against agentless uploading via a banner that was
shown from July 2020 to February 2021 and various email campaigns. It is
now time to make sure nobody is uploading via agentless mode in
production anymore.