Skip to content

Commit

Permalink
identity: default to RS256 for new workload ids (hashicorp#18882)
Browse files Browse the repository at this point in the history
OIDC mandates the support of the RS256 signing algorithm so in order to maximize workload identity's usefulness this change switches from using the EdDSA signing algorithm to RS256.

Old keys will continue to use EdDSA but new keys will use RS256. The EdDSA generation code was left in place because it's fast and cheap and I'm not going to lie I hope we get to use it again.

**Test Updates**

Most of our Variables and Keyring tests had a subtle assumption in them that the keyring would be initialized by the time the test server had elected a leader. ed25519 key generation is so fast that the fact that it was happening asynchronously with server startup didn't seem to cause problems. Sadly rsa key generation is so slow that basically all of these tests failed.

I added a new `testutil.WaitForKeyring` helper to replace `testutil.WaitForLeader` in cases where the keyring must be initialized before the test may continue. However this is mostly used in the `nomad/` package.

In the `api` and `command/agent` packages I decided to switch their helpers to wait for keyring initialization by default. This will slow down tests a bit, but allow those packages to not be as concerned with subtle server readiness details. On my machine rsa key generation takes 63ms, so hopefully the difference isn't significant on CI runners.

**TODO**

- Docs and changelog entries.
- Upgrades - right now upgrades won't get RS256 keys until their root key rotates either manually or after ~30 days.
- Observability - I'm not sure there's a way for operators to see if they're using EdDSA or RS256 unless they inspect a key. The JWKS endpoint can be inspected to see if EdDSA will be used for new identities, but it doesn't technically define which key is active. If upgrades can be fixed to automatically rotate keys, we probably don't need to worry about this.

**Requiem for ed25519**

When workload identities were first implemented we did not immediately consider OIDC compliance. Consul, Vault, and many other third parties support JWT auth methods without full OIDC compliance. For the machine<-->machine use cases workload identity is intended to fulfill, OIDC seemed like a bigger risk than asset.

EdDSA/ed25519 is the signing algorithm we chose for workload identity JWTs because of all these lovely properties:

1. Deterministic keys that can be derived from our preexisting root keys. This was perhaps the biggest factor since we already had a root encryption key around from which we could derive a signing key.
2. Wonderfully compact: 64 byte private key, 32 byte public key, 64 byte signatures. Just glorious.
3. No parameters. No choices of encodings. It's all well-defined by [RFC 8032](https://datatracker.ietf.org/doc/html/rfc8032).
4. Fastest performing signing algorithm! We don't even care that much about the performance of our chosen algorithm, but what a free bonus!
5. Arguably one of the most secure signing algorithms widely available. Not just from a cryptanalysis perspective, but from an API and usage perspective too.

Life was good with ed25519, but sadly it could not last.

[IDPs](https://en.wikipedia.org/wiki/Identity_provider), such as AWS's IAM OIDC Provider, love OIDC. They have OIDC implemented for humans, so why not reuse that OIDC support for machines as well? Since OIDC mandates RS256, many implementations don't bother implementing other signing algorithms (or at least not advertising their support). A quick survey of OIDC Discovery endpoints revealed only 2 out of 10 OIDC providers advertised support for anything other than RS256:

- [PayPal](https://www.paypalobjects.com/.well-known/openid-configuration) supports HS256
- [Yahoo](https://api.login.yahoo.com/.well-known/openid-configuration) supports ES256

RS256 only:

- [GitHub](https://token.actions.githubusercontent.com/.well-known/openid-configuration)
- [GitLab](https://gitlab.com/.well-known/openid-configuration)
- [Google](https://accounts.google.com/.well-known/openid-configuration)
- [Intuit](https://developer.api.intuit.com/.well-known/openid_configuration)
- [Microsoft](https://login.microsoftonline.com/fabrikamb2c.onmicrosoft.com/v2.0/.well-known/openid-configuration)
- [SalesForce](https://login.salesforce.com/.well-known/openid-configuration)
- [SimpleLogin (acquired by ProtonMail)](https://app.simplelogin.io/.well-known/openid-configuration/)
- [TFC](https://app.terraform.io/.well-known/openid-configuration)
  • Loading branch information
schmichael authored and nvanthao committed Mar 1, 2024
1 parent bf2891d commit afc0013
Show file tree
Hide file tree
Showing 23 changed files with 416 additions and 149 deletions.
26 changes: 17 additions & 9 deletions api/internal/testutil/server.go
Original file line number Diff line number Diff line change
Expand Up @@ -225,7 +225,7 @@ func NewTestServer(t testing.T, cb ServerConfigCallback) *TestServer {

// Wait for the server to be ready
if nomadConfig.Server.Enabled && nomadConfig.Server.BootstrapExpect != 0 {
server.waitForLeader()
server.waitForServers()
} else {
server.waitForAPI()
}
Expand Down Expand Up @@ -296,21 +296,29 @@ func (s *TestServer) waitForAPI() {
)
}

// waitForLeader waits for the Nomad server's HTTP API to become
// available, and then waits for a known leader and an index of
// 1 or more to be observed to confirm leader election is done.
func (s *TestServer) waitForLeader() {
// waitForServers waits for the Nomad server's HTTP API to become available,
// and then waits for the keyring to be intialized. This implies a leader has
// been elected and Raft writes have occurred.
func (s *TestServer) waitForServers() {
f := func() error {
// Query the API and check the status code
// Using this endpoint as it is does not have restricted access
resp, err := s.HTTPClient.Get(s.url("/v1/status/leader"))
resp, err := s.HTTPClient.Get(s.url("/.well-known/jwks.json"))
if err != nil {
return fmt.Errorf("failed to get leader: %w", err)
return fmt.Errorf("failed to contact leader: %w", err)
}
defer func() { _ = resp.Body.Close() }()
if err = s.requireOK(resp); err != nil {
return fmt.Errorf("leader response is not ok: %w", err)
}

jwks := struct {
Keys []interface{} `json:"keys"`
}{}
if err := json.NewDecoder(resp.Body).Decode(&jwks); err != nil {
return fmt.Errorf("error decoding jwks response: %w", err)
}
if len(jwks.Keys) == 0 {
return fmt.Errorf("no keys found")
}
return nil
}
must.Wait(s.t,
Expand Down
2 changes: 1 addition & 1 deletion client/vaultclient/vaultclient_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -32,7 +32,7 @@ const (
jwtAuthConfigTemplate = `
{
"jwks_url": "<<.JWKSURL>>",
"jwt_supported_algs": ["EdDSA"],
"jwt_supported_algs": ["RS256", "EdDSA"],
"default_role": "nomad-workloads"
}
`
Expand Down
11 changes: 6 additions & 5 deletions client/widmgr/mock.go
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,8 @@
package widmgr

import (
"crypto/ed25519"
"crypto/rand"
"crypto/rsa"
"fmt"
"slices"
"time"
Expand All @@ -22,13 +23,13 @@ type MockWIDSigner struct {
// SignIdentities will use it to find expirations or reject invalid identity
// names
wids map[string]*structs.WorkloadIdentity
key ed25519.PrivateKey
key *rsa.PrivateKey
keyID string
mockNow time.Time // allows moving the clock
}

func NewMockWIDSigner(wids []*structs.WorkloadIdentity) *MockWIDSigner {
_, privKey, err := ed25519.GenerateKey(nil)
privKey, err := rsa.GenerateKey(rand.Reader, 2048)
if err != nil {
panic(err)
}
Expand Down Expand Up @@ -63,7 +64,7 @@ func (m *MockWIDSigner) JSONWebKeySet() *jose.JSONWebKeySet {
jwk := jose.JSONWebKey{
Key: m.key.Public(),
KeyID: m.keyID,
Algorithm: "EdDSA",
Algorithm: "RS256",
Use: "sig",
}
return &jose.JSONWebKeySet{
Expand Down Expand Up @@ -95,7 +96,7 @@ func (m *MockWIDSigner) SignIdentities(minIndex uint64, req []*structs.WorkloadI
}
}
opts := (&jose.SignerOptions{}).WithHeader("kid", m.keyID).WithType("JWT")
sig, err := jose.NewSigner(jose.SigningKey{Algorithm: jose.EdDSA, Key: m.key}, opts)
sig, err := jose.NewSigner(jose.SigningKey{Algorithm: jose.RS256, Key: m.key}, opts)
if err != nil {
return nil, fmt.Errorf("error creating signer: %w", err)
}
Expand Down
4 changes: 2 additions & 2 deletions command/agent/http_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -1560,7 +1560,7 @@ func benchmarkJsonEncoding(b *testing.B, handle *codec.JsonHandle) {
func httpTest(t testing.TB, cb func(c *Config), f func(srv *TestAgent)) {
s := makeHTTPServer(t, cb)
defer s.Shutdown()
testutil.WaitForLeader(t, s.Agent.RPC)
testutil.WaitForKeyring(t, s.Agent.RPC, s.Config.Region)
f(s)
}

Expand All @@ -1572,7 +1572,7 @@ func httpACLTest(t testing.TB, cb func(c *Config), f func(srv *TestAgent)) {
}
})
defer s.Shutdown()
testutil.WaitForLeader(t, s.Agent.RPC)
testutil.WaitForKeyring(t, s.Agent.RPC, s.Config.Region)
f(s)
}

Expand Down
10 changes: 1 addition & 9 deletions command/agent/testagent.go
Original file line number Diff line number Diff line change
Expand Up @@ -192,15 +192,7 @@ RETRY:

failed := false
if a.Config.NomadConfig.BootstrapExpect == 1 && a.Config.Server.Enabled {
testutil.WaitForResult(func() (bool, error) {
args := &structs.GenericRequest{}
var leader string
err := a.RPC("Status.Leader", args, &leader)
return leader != "", err
}, func(err error) {
a.T.Logf("failed to find leader: %v", err)
failed = true
})
testutil.WaitForKeyring(a.T, a.RPC, a.Config.Region)
} else {
testutil.WaitForResult(func() (bool, error) {
req, _ := http.NewRequest(http.MethodGet, "/v1/agent/self", nil)
Expand Down
2 changes: 1 addition & 1 deletion command/recommendation_dismiss_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -96,6 +96,7 @@ func TestRecommendationDismissCommand_Run(t *testing.T) {
}

func TestRecommendationDismissCommand_AutocompleteArgs(t *testing.T) {
ci.Parallel(t)
srv, client, url := testServer(t, false, nil)
defer srv.Shutdown()

Expand All @@ -113,7 +114,6 @@ func TestRecommendationDismissCommand_AutocompleteArgs(t *testing.T) {
}

func testRecommendationAutocompleteCommand(t *testing.T, client *api.Client, srv *agent.TestAgent, cmd *RecommendationAutocompleteCommand) {
ci.Parallel(t)
require := require.New(t)

// Register a test job to write a recommendation against.
Expand Down
13 changes: 9 additions & 4 deletions command/var_list_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@ import (
"github.com/hashicorp/nomad/api"
"github.com/hashicorp/nomad/ci"
"github.com/mitchellh/cli"
"github.com/shoenig/test/must"
"github.com/stretchr/testify/require"
)

Expand Down Expand Up @@ -130,7 +131,7 @@ func TestVarListCommand_Online(t *testing.T) {

nsList := []string{api.DefaultNamespace, "ns1"}
pathList := []string{"a/b/c", "a/b/c/d", "z/y", "z/y/x"}
variables := setupTestVariables(client, nsList, pathList)
variables := setupTestVariables(t, client, nsList, pathList)

testTmpl := `{{ range $i, $e := . }}{{if ne $i 0}}{{print "•"}}{{end}}{{printf "%v\t%v" .Namespace .Path}}{{end}}`

Expand Down Expand Up @@ -349,14 +350,14 @@ type testSVNamespacePath struct {
Path string
}

func setupTestVariables(c *api.Client, nsList, pathList []string) SVMSlice {
func setupTestVariables(t *testing.T, c *api.Client, nsList, pathList []string) SVMSlice {

out := make(SVMSlice, 0, len(nsList)*len(pathList))

for _, ns := range nsList {
c.Namespaces().Register(&api.Namespace{Name: ns}, nil)
for _, p := range pathList {
setupTestVariable(c, ns, p, &out)
must.NoError(t, setupTestVariable(c, ns, p, &out))
}
}

Expand All @@ -369,8 +370,12 @@ func setupTestVariable(c *api.Client, ns, p string, out *SVMSlice) error {
Path: p,
Items: map[string]string{"k": "v"}}
v, _, err := c.Variables().Create(testVar, &api.WriteOptions{Namespace: ns})
if err != nil {
return err
}

*out = append(*out, *v.Metadata())
return err
return nil
}

type NSPather interface {
Expand Down
2 changes: 1 addition & 1 deletion e2e/vaultcompat/cluster_setup_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,7 @@ var roleLegacy = map[string]interface{}{
func authConfigJWT(jwksURL string) map[string]any {
return map[string]any{
"jwks_url": jwksURL,
"jwt_supported_algs": []string{"EdDSA"},
"jwt_supported_algs": []string{"RS256", "EdDSA"},
"default_role": "nomad-workloads",
}
}
Expand Down
15 changes: 9 additions & 6 deletions nomad/auth/auth_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,8 @@
package auth

import (
"crypto/ed25519"
"crypto/rand"
"crypto/rsa"
"crypto/x509"
"crypto/x509/pkix"
"errors"
Expand All @@ -20,7 +21,6 @@ import (

"github.com/hashicorp/nomad/acl"
"github.com/hashicorp/nomad/ci"
"github.com/hashicorp/nomad/helper/crypto"
"github.com/hashicorp/nomad/helper/pointer"
"github.com/hashicorp/nomad/helper/testlog"
"github.com/hashicorp/nomad/helper/uuid"
Expand Down Expand Up @@ -1217,20 +1217,23 @@ func (ctx *testContext) Certificate() *x509.Certificate {
}

type testEncrypter struct {
key ed25519.PrivateKey
key *rsa.PrivateKey
}

func newTestEncrypter() *testEncrypter {
buf, _ := crypto.Bytes(32)
k, err := rsa.GenerateKey(rand.Reader, 2048)
if err != nil {
panic(err)
}
return &testEncrypter{
key: ed25519.NewKeyFromSeed(buf),
key: k,
}
}

func (te *testEncrypter) signClaim(claims *structs.IdentityClaims) (string, error) {

opts := (&jose.SignerOptions{}).WithType("JWT")
sig, err := jose.NewSigner(jose.SigningKey{Algorithm: jose.EdDSA, Key: te.key}, opts)
sig, err := jose.NewSigner(jose.SigningKey{Algorithm: jose.RS256, Key: te.key}, opts)
if err != nil {
return "", err
}
Expand Down
19 changes: 13 additions & 6 deletions nomad/client_agent_endpoint_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -156,13 +156,20 @@ func TestMonitor_Monitor_RemoteServer(t *testing.T) {
servers := []*Server{s1, s2}
var nonLeader *Server
var leader *Server
for _, s := range servers {
if !s.IsLeader() {
nonLeader = s
} else {
leader = s
testutil.WaitForResult(func() (bool, error) {
for _, s := range servers {
if !s.IsLeader() {
nonLeader = s
} else {
leader = s
}
}
}

return nonLeader != nil && leader != nil && nonLeader != leader, fmt.Errorf(
"leader=%p follower=%p", nonLeader, leader)
}, func(err error) {
t.Fatalf("timed out trying to find a leader: %v", err)
})

cases := []struct {
desc string
Expand Down
6 changes: 3 additions & 3 deletions nomad/core_sched_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -2444,7 +2444,7 @@ func TestCoreScheduler_CSIBadState_ClaimGC(t *testing.T) {
}
}
return true
}, time.Second*5, 10*time.Millisecond, "invalid claims should be marked for GC")
}, 10*time.Second, 50*time.Millisecond, "invalid claims should be marked for GC")

}

Expand All @@ -2454,7 +2454,7 @@ func TestCoreScheduler_RootKeyGC(t *testing.T) {

srv, cleanup := TestServer(t, nil)
defer cleanup()
testutil.WaitForLeader(t, srv.RPC)
testutil.WaitForKeyring(t, srv.RPC, "global")

// reset the time table
srv.fsm.timetable.table = make([]TimeTableEntry, 1, 10)
Expand Down Expand Up @@ -2559,7 +2559,7 @@ func TestCoreScheduler_VariablesRekey(t *testing.T) {

srv, cleanup := TestServer(t, nil)
defer cleanup()
testutil.WaitForLeader(t, srv.RPC)
testutil.WaitForKeyring(t, srv.RPC, "global")

store := srv.fsm.State()
key0, err := store.GetActiveRootKeyMeta(nil)
Expand Down
Loading

0 comments on commit afc0013

Please sign in to comment.