Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

repeatedly calling jwk.(Cache).Register() with WithWaitReady(false) may cause a deadlock #1250

Open
shasderias opened this issue Dec 12, 2024 · 3 comments
Assignees

Comments

@shasderias
Copy link

shasderias commented Dec 12, 2024

Describe the bug

Calling jwk.(Cache).Register() many times with the jwk.WithWaitReady(false) option may result in a deadlock.

Please attach the output of go version

go version go1.23.4 darwin/arm64

(also reproducible on 1.23.1)

To Reproduce / Expected behavior

Repro Repo

Run the following test.

package main

import (
	"context"
	"fmt"
	"testing"

	"github.com/lestrrat-go/httprc/v3"
	"github.com/lestrrat-go/jwx/v3/jwk"
)

func TestRepro(t *testing.T) {
	ctx := context.Background()

	c, err := jwk.NewCache(ctx, httprc.NewClient())
	if err != nil {
		panic(err)
	}

	for i := 0; i < 50; i++ {
		u := fmt.Sprintf("https://www.googleapis.com/oauth2/v3/certs/%d", i)
		fmt.Println("registering", u)
		if err := c.Register(ctx, u, jwk.WithWaitReady(false)); err != nil {
			panic(err)
		}
	}
}

Expected: the program to terminate.
Actual: the program deadlocks (see log below).

The issue is also reproducible with:

  • a simple go test -v run from the command line;
  • real URLs for endpoints that return valid JWKS files.

The issue does not appear to reproduce with go run, although it may be the case that the problem is still present, just less likely to occur.

The program does not always deadlock in the same iteration of the for loop.

GOROOT=<myHomeDir>go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.1.darwin-arm64 #gosetup
GOPATH=<myHomeDir>go #gosetup
<myHomeDir>go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.1.darwin-arm64/bin/go test -c -o <myHomeDir>Library/Caches/JetBrains/IntelliJIdea2024.3/tmp/GoLand/___TestRepro_in_jwk_deadlock_repro.test jwk_deadlock_repro #gosetup
<myHomeDir>go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.1.darwin-arm64/bin/go tool test2json -t <myHomeDir>Library/Caches/JetBrains/IntelliJIdea2024.3/tmp/GoLand/___TestRepro_in_jwk_deadlock_repro.test -test.v=test2json -test.paniconexit0 -test.run ^\QTestRepro\E$ #gosetup
=== RUN   TestRepro
registering https://www.googleapis.com/oauth2/v3/certs/0
registering https://www.googleapis.com/oauth2/v3/certs/1
registering https://www.googleapis.com/oauth2/v3/certs/2
registering https://www.googleapis.com/oauth2/v3/certs/3
registering https://www.googleapis.com/oauth2/v3/certs/4
registering https://www.googleapis.com/oauth2/v3/certs/5
registering https://www.googleapis.com/oauth2/v3/certs/6
registering https://www.googleapis.com/oauth2/v3/certs/7
registering https://www.googleapis.com/oauth2/v3/certs/8
registering https://www.googleapis.com/oauth2/v3/certs/9
registering https://www.googleapis.com/oauth2/v3/certs/10
registering https://www.googleapis.com/oauth2/v3/certs/11
registering https://www.googleapis.com/oauth2/v3/certs/12
registering https://www.googleapis.com/oauth2/v3/certs/13
registering https://www.googleapis.com/oauth2/v3/certs/14
registering https://www.googleapis.com/oauth2/v3/certs/15
registering https://www.googleapis.com/oauth2/v3/certs/16
registering https://www.googleapis.com/oauth2/v3/certs/17
registering https://www.googleapis.com/oauth2/v3/certs/18
registering https://www.googleapis.com/oauth2/v3/certs/19
registering https://www.googleapis.com/oauth2/v3/certs/20
registering https://www.googleapis.com/oauth2/v3/certs/21
registering https://www.googleapis.com/oauth2/v3/certs/22
fatal error: all goroutines are asleep - deadlock!

goroutine 1 [chan receive]:
testing.(*T).Run(0x14000110680, {0x1050075b6?, 0x14000153b48?}, 0x105148c20)
	<myHomeDir>go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.1.darwin-arm64/src/testing/testing.go:1751 +0x328
testing.runTests.func1(0x14000110680)
	<myHomeDir>go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.1.darwin-arm64/src/testing/testing.go:2168 +0x40
testing.tRunner(0x14000110680, 0x14000153c68)
	<myHomeDir>go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.1.darwin-arm64/src/testing/testing.go:1690 +0xe4
testing.runTests(0x1400000c2a0, {0x105363e50, 0x1, 0x1}, {0xc800000000000000?, 0xc8e7685b4c65a659?, 0x0?})
	<myHomeDir>go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.1.darwin-arm64/src/testing/testing.go:2166 +0x3ac
testing.(*M).Run(0x1400007ebe0)
	<myHomeDir>go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.1.darwin-arm64/src/testing/testing.go:2034 +0x588
main.main()
	_testmain.go:45 +0x90

goroutine 7 [select]:
github.com/lestrrat-go/httprc/v3.sendBackend[...]({0x10514ee30, 0x1053a2da0}, 0x14000120460, {0x14000121420, {0x1051510c0, 0x1400012f9e0}, {0x0, 0x0}}, 0x14000121420)
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/v3@v3.0.0-beta1/controller.go:102 +0x100
github.com/lestrrat-go/httprc/v3.(*controller).Add(0x14000120850, {0x10514ee30, 0x1053a2da0}, {0x1051510c0, 0x1400012f9e0}, {0x14000028e40?, 0x1400006e280?, 0x0?})
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/v3@v3.0.0-beta1/controller.go:154 +0x198
github.com/lestrrat-go/jwx/v3/jwk.(*Cache).Register(0x140000285c0, {0x10514ee30, 0x1053a2da0}, {0x1400001cb70, 0x2d}, {0x14000104f38?, 0xb9fb12f6?, 0x10537c540?})
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/jwx/v3@v3.0.0-alpha1/jwk/cache.go:152 +0x4d0
jwk_deadlock_repro.TestRepro(0x14000110820?)
	<myHomeDir>dev/jwk_deadlock_repro/repro_test.go:23 +0x1bc
testing.tRunner(0x14000110820, 0x105148c20)
	<myHomeDir>go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.1.darwin-arm64/src/testing/testing.go:1690 +0xe4
created by testing.(*T).Run in goroutine 1
	<myHomeDir>go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.1.darwin-arm64/src/testing/testing.go:1743 +0x314

goroutine 8 [select]:
github.com/lestrrat-go/httprc/v3.worker.Run({{0x10514bdf8, 0x10537b3c0}, 0x14000120460, 0x140001204d0, 0x14000120540, {0x10514c180, 0x1053a2da0}, {0x10514bdd8, 0x1053a2da0}}, {0x10514eed8?, ...}, ...)
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/v3@v3.0.0-beta1/worker.go:32 +0x2ac
created by github.com/lestrrat-go/httprc/v3.(*Client).Start in goroutine 7
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/v3@v3.0.0-beta1/client.go:155 +0x644

goroutine 9 [select]:
github.com/lestrrat-go/httprc/v3.worker.Run({{0x10514bdf8, 0x10537b3c0}, 0x14000120460, 0x140001204d0, 0x14000120540, {0x10514c180, 0x1053a2da0}, {0x10514bdd8, 0x1053a2da0}}, {0x10514eed8?, ...}, ...)
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/v3@v3.0.0-beta1/worker.go:32 +0x2ac
created by github.com/lestrrat-go/httprc/v3.(*Client).Start in goroutine 7
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/v3@v3.0.0-beta1/client.go:155 +0x644

goroutine 10 [select]:
github.com/lestrrat-go/httprc/v3.worker.Run({{0x10514bdf8, 0x10537b3c0}, 0x14000120460, 0x140001204d0, 0x14000120540, {0x10514c180, 0x1053a2da0}, {0x10514bdd8, 0x1053a2da0}}, {0x10514eed8?, ...}, ...)
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/v3@v3.0.0-beta1/worker.go:32 +0x2ac
created by github.com/lestrrat-go/httprc/v3.(*Client).Start in goroutine 7
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/v3@v3.0.0-beta1/client.go:155 +0x644

goroutine 11 [select]:
github.com/lestrrat-go/httprc/v3.worker.Run({{0x10514bdf8, 0x10537b3c0}, 0x14000120460, 0x140001204d0, 0x14000120540, {0x10514c180, 0x1053a2da0}, {0x10514bdd8, 0x1053a2da0}}, {0x10514eed8?, ...}, ...)
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/v3@v3.0.0-beta1/worker.go:32 +0x2ac
created by github.com/lestrrat-go/httprc/v3.(*Client).Start in goroutine 7
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/v3@v3.0.0-beta1/client.go:155 +0x644

goroutine 12 [select]:
github.com/lestrrat-go/httprc/v3.worker.Run({{0x10514bdf8, 0x10537b3c0}, 0x14000120460, 0x140001204d0, 0x14000120540, {0x10514c180, 0x1053a2da0}, {0x10514bdd8, 0x1053a2da0}}, {0x10514eed8?, ...}, ...)
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/v3@v3.0.0-beta1/worker.go:32 +0x2ac
created by github.com/lestrrat-go/httprc/v3.(*Client).Start in goroutine 7
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/v3@v3.0.0-beta1/client.go:155 +0x644

goroutine 13 [select]:
github.com/lestrrat-go/httprc/v3.sendWorker({0x10514eed8, 0x140000101e0}, 0x140001204d0, {0x1051510c0, 0x1400012e900})
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/v3@v3.0.0-beta1/backend.go:113 +0x94
github.com/lestrrat-go/httprc/v3.(*controller).loop(0x14000120850, {0x10514eed8, 0x140000101e0}, 0x1400000e4e0)
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/v3@v3.0.0-beta1/backend.go:162 +0x350
created by github.com/lestrrat-go/httprc/v3.(*Client).Start in goroutine 7
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/v3@v3.0.0-beta1/client.go:175 +0x8d8

goroutine 14 [semacquire]:
sync.runtime_Semacquire(0x0?)
	<myHomeDir>go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.1.darwin-arm64/src/runtime/sema.go:71 +0x2c
sync.(*WaitGroup).Wait(0x1400000e4e0)
	<myHomeDir>go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.1.darwin-arm64/src/sync/waitgroup.go:118 +0x74
github.com/lestrrat-go/httprc/v3.(*Client).Start.func4(...)
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/v3@v3.0.0-beta1/client.go:178
created by github.com/lestrrat-go/httprc/v3.(*Client).Start in goroutine 7
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/v3@v3.0.0-beta1/client.go:177 +0x940


Process finished with the exit code 1
@lestrrat
Copy link
Collaborator

Thanks, I'll look into it tomorrow.

@lestrrat
Copy link
Collaborator

@shasderias Hmmm, I'm not sure if this is blocking or if it's just ... slow. In my testing I did observe things timing out, but after adding some print logs, the call to AddResource seem to all be returning eventually, in order.

And if I give it enough workers, let's just say 50 (via httprc.WithWorkers(50)) it seems to pass immediately.

I acknowledge there's something to be desired, but I have a feeling it's just bad performance. (which I'll work on, but just want to make sure if we're seeing the same symptoms)

@shasderias
Copy link
Author

I'm a little short on time now, but my gut feeling is:

  • this is a true deadlock as in, I think there's a genuine race condition that has a chance of causing a program to halt;
  • this is probably not (just) a performance problem;
  • I'm not surprised that adding print statements make the problem goes away, as terminal I/O takes a comparatively "long time", which can mask the deadlock;
  • the preceding point is also true for adding workers; one goroutine for each resource monitored = no chance for the goroutines to "fight" over jobs.

I've also managed to repro on go run and on a Windows machine on 1.23.1.

I'll have a bit more time after the 19th, will try to validate the above/investigate further then if it's still necessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants