profiler: Disable agentless mode for WithAPIKey #906

felixge · 2021-04-19T10:28:31Z

This is a backwards incompatible change, but deemed neccessary by the
profiling team. Since it's been analyzed to only impact a tiny minority
of users, it's done without a major version change.

UPGRADING:

If you are currently using profiler.WithAPIKey() or DD_API_KEY to upload
profiling data, please remove this configuration. dd-trace-go will now
default to upload through the datadog agent running at localhost:8126.
If your agent is running on another port, please use
profiler.WithAgentAddr() to configure the correct agent address.

The new profiler.WithAgentlessUpload() is not officially supported for
anything other than debugging/testing when hitting integration issues.
If you think you need it for your production environment, please contact
our support.

BACKGROUND:

Historically the profiler only supported direct uploading to the backend
without using an agent, aka "agentless mode". This integration method
was succeeded by agent based uploading which should always be the
default behavior by now. The old agentless mode is not officially
supported anymore and only meant for testing and debugging at this
point.

Up until now setting an API Key via WithAPIKey or DD_API_KEY still
allowed users to opt into the agentless mode, but this is problematic
for two reasons:

A DD_API_KEY might be set in the environment for unrelated reasons or
by accident.
The user may have forgotten switch their integration to agent
uploading.

Users were warned against agentless uploading via a banner that was
shown from July 2020 to February 2021 and various email campaigns. It is
now time to make sure nobody is uploading via agentless mode in
production anymore.

This pull request warns users of WithAPIKey that the option alone will no longer enable agentless uploading by default in the future. It can still be enabled via WithAgentlessUpload but this is discouraged and not officially supported.

profiler/options.go

profiler/profiler.go

Co-authored-by: Gabriel Aszalos <gabriel.aszalos@gmail.com>

AlexJF · 2021-04-19T13:16:38Z

profiler/profiler.go

+		// key configured, we warn the customers that this is probably a
+		// misconfiguration.
+		if cfg.apiKey != "" {
+			log.Warn("You are currently setting profiler.WithAPIKey or the DD_API_KEY env variable, but as of dd-trace-go v1.30.0 this value is getting ignored by the profiler. Please see the profiler.WithAPIKey go docs and verify that your integration is still working. If you can't remove DD_API_KEY from your environment, you can use WithAPIKey(\"\") to silence this warning.")


Nit: This might be too "aggressive" as it's rather big, specially on those cases where people have no control. You do mention they can remove the warning by changing but is it gaining us anything? Anyway not a big deal for me, I just know that some tracers care about reducing verbosity.

Also, why are we warning when agentless is still a legitimate use case?

Nit: This might be too "aggressive" as it's rather big, specially on those cases where people have no control. You do mention they can remove the warning by changing but is it gaining us anything? Anyway not a big deal for me, I just know that some tracers care about reducing verbosity.

This PR introduces a backwards incompatible change in a minor release of the profiling library. Arguably a big warning is warranted. But if you think dropping the bit about removing the warning would help, I'd be okay with that. Just let me know.

I'm also okay to change the entire message if you have a different proposal, just let me know.

Also, why are we warning when agentless is still a legitimate use case?

There is no legitimate production use case at this point. The main use case is for our own internal testing, and perhaps to help customers troubleshoot connectivity issues. But none of that code should ever run in production, so I think the warning makes sense?

There is no legitimate production use case at this point.

Going to push back on that a bit. I can think of a number of use cases that don't run afoul of billing issues where a customer might not want to run the agent.

Legitimate production use cases aside (I'm going to kind of disagree with that assertion), I do not think this should be a warning. Documented or not, the client does support running agentless, even if it is no longer the default.

@pmbauer perhaps I don't fully understand the use cases you have in mind. Let's continue the discussion in the google doc thread and make sure we're aligned before continuing the code related discussions here.

Replied in the doc. The question of billing for profiles not submitted via the agent is a product and business decision. That in no way makes the production uses for agentless uploads illegitimate from a customer perspective.

While I do feel strongly about the business decision to force profiles through the agent (it's bad for customers and bad for business long term as severless becomes the norm) I don't feel strongly about the warning. Feel free to ship.

…-mode

* 'v1' of https://github.com/DataDog/dd-trace-go: Implement DD_PROFILING_OUTPUT_DIR for dev/debug (DataDog#924) contrib/go-pg/pg.v10: add INTEGRATION flag check for tests (DataDog#921) contrib/go-redis/redis.v8: optimize BeforeProcess and BeforeProcessPipeline (DataDog#920) CI: check go.sum is up-to-date (DataDog#918) README: Fix go get instructions (DataDog#913) internal/log: Improve API for testing (DataDog#916) ddtrace/tracer: add support for agent discovery endpoint, feature flags, stats & drops (DataDog#859) internal/version: bump to v1.31.0 (DataDog#910) CONTRIBUTING.md: add link to contrib/README.md (DataDog#802) profiler: Disable agentless mode for WithAPIKey (DataDog#906) contrib/Shopify/sarama: fix possible deadlock in WrapAsyncProducer (DataDog#907) contrib/database/sql.tracedConn implement driver.SessionResetter (DataDog#908) ddtrace/opentracer: consider FollowsFrom references as children (DataDog#905) ddtrace/opentracer: add support for opentracing.TracerContextWithSpanExtension (DataDog#855) ddtrace/tracer: follow noDebugStack setting when using SetTag with an error (DataDog#900) contrib/net/http: add a getter to retrieve original round tripper (DataDog#903) ddtrace/tracer: ensure Flush call is synchronous (DataDog#901)

MatejBalantic · 2022-01-17T23:02:17Z

We’ve just hit this warning while trying to integrate profiler into our golang app that runs on an AWS lambda. We are able to use tracer without having any datadog agent running.

do I understand correctly that now, to run this profiler, we would need to move our lambda into VPC, set up a new EC2 instance with a datadog agent in the same VPC, and then configure datadog SDK on our lambda to send profile tracers this agent?

this sounds like a huge overkill. What is a technical reason preventing this profiler to work in the same way as the tracer does?

From our perspective, we will probably have to give up the use of the profiler if the condition is setting up an agent

felixge · 2022-01-28T10:40:12Z

@MatejBalantic first of all, sorry for the late reply, this message slipped through the cracks on my end.

I agree that setting up an agent on a separate EC2 instance is overkill : /. There are various reasons for why profiling requires the agent right now, but I think our host-centric billing model is the main reason.

That being said, I've just kicked off a few internal threads to discuss this problem and will try to see if I can find a better solution for you than what you've outlined.

felixge · 2022-01-28T15:59:21Z

@MatejBalantic from internal discussions the answer is that we don't support serverless profiling right now, but adding first-class support for it is on our roadmap. For now you'll have to setup an agent, but even then you might find that the lambda container shutdown doesn't play nice with our profile uploading logic (i.e. the container might shut down without uploading the profiling data in some cases). So it's probably better to wait until we fully support profiling on lambda.

felixge added 5 commits April 19, 2021 11:39

profiler: WithAPIKey Warning, WithAgentlessUpload

6994492

This pull request warns users of WithAPIKey that the option alone will no longer enable agentless uploading by default in the future. It can still be enabled via WithAgentlessUpload but this is discouraged and not officially supported.

env var

7637001

profiler: Change WithAPIKey default behavior

9664117

Fix test case

1aa0f15

Improve tests

c43d1d3

felixge added this to the 1.30.0 milestone Apr 19, 2021

Make golint happy

a6651bc

felixge changed the title ~~Remove automatic agentless mode~~ profiler: Disable agentless mode for WithAPIKey Apr 19, 2021

felixge marked this pull request as ready for review April 19, 2021 11:04

felixge requested review from pmbauer, gbbr and AlexJF April 19, 2021 11:04

gbbr previously approved these changes Apr 19, 2021

View reviewed changes

profiler/options.go Outdated Show resolved Hide resolved

profiler/profiler.go Outdated Show resolved Hide resolved

Update profiler/profiler.go

9979c9e

Co-authored-by: Gabriel Aszalos <gabriel.aszalos@gmail.com>

felixge dismissed gbbr’s stale review via 9979c9e April 19, 2021 12:32

AlexJF previously approved these changes Apr 19, 2021

View reviewed changes

Use internal.BoolEnv

ec0b130

felixge dismissed AlexJF’s stale review via ec0b130 April 19, 2021 14:45

pmbauer approved these changes Apr 21, 2021

View reviewed changes

Merge branch 'v1' into felix.geisendoerfer/remove-automatic-agentless…

4e19879

…-mode

felixge merged commit d9472a0 into v1 Apr 21, 2021

felixge deleted the felix.geisendoerfer/remove-automatic-agentless-mode branch April 21, 2021 19:58

JianyiGao mentioned this pull request Sep 19, 2023

[V2 experimentation]profiler: remove WithAPIKey and WithAgentlessUpload #2224

Merged

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

profiler: Disable agentless mode for WithAPIKey #906

profiler: Disable agentless mode for WithAPIKey #906

felixge commented Apr 19, 2021 •

edited

Loading

AlexJF Apr 19, 2021 •

edited

Loading

pmbauer Apr 19, 2021

felixge Apr 19, 2021

pmbauer Apr 19, 2021 •

edited

Loading

pmbauer Apr 19, 2021

felixge Apr 20, 2021

pmbauer Apr 20, 2021

pmbauer Apr 21, 2021 •

edited

Loading

MatejBalantic commented Jan 17, 2022

felixge commented Jan 28, 2022

felixge commented Jan 28, 2022

profiler: Disable agentless mode for WithAPIKey #906

profiler: Disable agentless mode for WithAPIKey #906

Conversation

felixge commented Apr 19, 2021 • edited Loading

AlexJF Apr 19, 2021 • edited Loading

Choose a reason for hiding this comment

pmbauer Apr 19, 2021

Choose a reason for hiding this comment

felixge Apr 19, 2021

Choose a reason for hiding this comment

pmbauer Apr 19, 2021 • edited Loading

Choose a reason for hiding this comment

pmbauer Apr 19, 2021

Choose a reason for hiding this comment

felixge Apr 20, 2021

Choose a reason for hiding this comment

pmbauer Apr 20, 2021

Choose a reason for hiding this comment

pmbauer Apr 21, 2021 • edited Loading

Choose a reason for hiding this comment

MatejBalantic commented Jan 17, 2022

felixge commented Jan 28, 2022

felixge commented Jan 28, 2022

felixge commented Apr 19, 2021 •

edited

Loading

AlexJF Apr 19, 2021 •

edited

Loading

pmbauer Apr 19, 2021 •

edited

Loading

pmbauer Apr 21, 2021 •

edited

Loading