profiler upload does always wait a second before retrying upload #1395

Closed
sumerc opened this issue Jul 21, 2022 · 1 comment · Fixed by #1396

sumerc commented Jul 21, 2022

Interestingly, I have spotted this while I was reading the code below:

if rerr, ok := err.(*retriableError); ok {
	statsd.Count("datadog.profiling.go.upload_retry", 1, nil, 1)
	wait := time.Duration(rand.Int63n(p.cfg.period.Nanoseconds()))
	log.Error("Uploading profile failed: %v. Trying again in %s...", rerr, wait)
	p.interruptibleSleep(time.Second)
	continue
}

It seems that wait is only logged and never passed to the sleep, so the retry always waits one second. I think the line should be the following?

...
p.interruptibleSleep(wait * time.Nanosecond)
sumerc changed the title from "profiler upload does not wait before retrying upload" to "profiler upload does always wait a second before retrying upload" Jul 21, 2022

nsrip-dd commented Jul 21, 2022

Thanks, great catch! It looks like the randomized wait was introduced in #827, but it was lost by #961. Yes, it should be p.interruptibleSleep(wait * time.Nanosecond) to match the pre-#961 behavior. I'll make a PR shortly to address this.

nsrip-dd self-assigned this Jul 21, 2022
nsrip-dd added a commit that referenced this issue Jul 21, 2022
PR #827 introduced a randomized wait duration before retrying a failed
profile upload. This was done to reduce the likelihood of flooding the
backend with uploads. The wait was chosen to average half the
profiling period. However, PR #961 lost this behavior when introducing
interruptible sleeps. This commit restores the lost behavior.

Fixes #1395
Kyle-Verhoog pushed a commit that referenced this issue Jul 21, 2022
PR #827 introduced a randomized wait duration before retrying a failed
profile upload. This was done to reduce the likelihood of flooding the
backend with uploads. The wait was chosen to average half the
profiling period. However, PR #961 lost this behavior when introducing
interruptible sleeps. This commit restores the lost behavior.

Fixes #1395