Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial work on OIDC (SSO) integration #126

Merged
merged 26 commits into from
Oct 31, 2021
Merged

Initial work on OIDC (SSO) integration #126

merged 26 commits into from
Oct 31, 2021

Conversation

unreality
Copy link
Contributor

This pull adds initial support for SSO. If you add oidc_endpoint, oidc_client_id and oidc_client_secret to config.json, when a new machine attempts to register, the OIDC endpoint will be used for authentication. A new namespace is created using the email address of the user, and the machine is added to that namespace.

Expiry times probably need to be implemented/used so that Logging out works correctly, and i think further work needs to be done on the register endpoint to become secure.

oidc_endpoint needs to be the base url of the OIDC endpoint (ie if your well-known configuration is at http://localhost:8080/auth/realms/master/.well-known/openid-configuration your endpoint url should be http://localhost:8080/auth/realms/master/ -- it does not do proper path joining at the moment, so the training / is required

@kradalby
Copy link
Collaborator

Hi @unreality This is great!

I have not come around to starting to review the code yet, but I have built it and played a bit with it.

On the first attempt I got the following panic when the callback came back from my OIDC (I use dex):

GET /oidc/callback?code=xatlvybq3mfbs3l6vm4jfdr57&state=0c22a99162dfaacbbe31287a666f41e4 HTTP/2.0
Host: headscale.example.no
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding: gzip, deflate, br
Accept-Language: en-us
Referer: https://id.example.no/dex/approval?req=zepbtkldlpe7h3nrlhh7emot5
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.2 Safari/605.1.15

runtime error: index out of range [0] with length 0
/usr/lib/go-1.17/src/runtime/panic.go:90 (0x439054)
        goPanicIndex: panic(boundsError{x: int64(x), signed: true, y: y, code: boundsIndex})
/root/git/headscale_oidc/oidc.go:81 (0xbdc3dd)
/root/git/headscale_oidc/oidc.go:256 (0xbde16b)
/root/go/pkg/mod/github.com/gin-gonic/gin@v1.7.4/context.go:165 (0x977081)
/root/go/pkg/mod/github.com/gin-gonic/gin@v1.7.4/recovery.go:99 (0x97706c)
/root/go/pkg/mod/github.com/gin-gonic/gin@v1.7.4/context.go:165 (0x9762e6)
/root/go/pkg/mod/github.com/gin-gonic/gin@v1.7.4/logger.go:241 (0x9762c9)
/root/go/pkg/mod/github.com/gin-gonic/gin@v1.7.4/context.go:165 (0x97581d)
/root/go/pkg/mod/github.com/gin-gonic/gin@v1.7.4/gin.go:489 (0x9754a5)
/root/go/pkg/mod/github.com/gin-gonic/gin@v1.7.4/gin.go:445 (0x975004)
/usr/lib/go-1.17/src/net/http/server.go:2878 (0x785bba)
        serverHandler.ServeHTTP: handler.ServeHTTP(rw, req)
/usr/lib/go-1.17/src/net/http/server.go:3479 (0x787944)
        initALPNRequest.ServeHTTP: h.h.ServeHTTP(rw, req)
/usr/lib/go-1.17/src/net/http/h2_bundle.go:5832 (0x75af77)
        (*http2serverConn).runHandler: handler(rw, req)
/usr/lib/go-1.17/src/runtime/asm_amd64.s:1581 (0x46bc40)
        goexit: BYTE        $0x90        // NOP

I have not had any time to dig into what can cause this and will have a look tomorrow if no one beats me to it. Heck, it could even be me having misconfigured something.

Other than that I think it is a good start, but I would make a little list of things to discuss before merging (@juanfont, @qbit):

  • We would probably need to handle the log out/expiry before merger
  • We need to understand what kind of work needs to be done for it to be secure
  • Automatically making namespace based on email probably make sense, but I need to think a bit about it

@unreality
Copy link
Contributor Author

That trace is probably a bug on my part, I have made an assumption about the JWKs returned by the OIDC server. I'll see if I can spin up a Dex server and do some testing. All my testing has been done using Keycloak.

  • I agree that logout/expiry should probably be implemented before merging. I dont think it would be too hard to implement basic expiry.
  • My main concern with making it secure is the expiry times, and also adding a confirmation when new machines are added. It also looks like there is also a state in which registration is requested, but since the machine is already registered, it skips any change to the machine in the database. I think Machine.Registered usage needs to be clarified/cleaned up?
  • An additional todo is also to grab the groups and/or other data from the OIDC server and make a decision on namespace based on that, but i think email address would be sufficient for now?

@kradalby
Copy link
Collaborator

That trace is probably a bug on my part, I have made an assumption about the JWKs returned by the OIDC server. I'll see if I can spin up a Dex server and do some testing. All my testing has been done using Keycloak.

That would be good, we want to ensure that we are compliant if we are advertising OpenID Connect.

  • I agree that logout/expiry should probably be implemented before merging. I dont think it would be too hard to implement basic expiry.

Sounds good

  • My main concern with making it secure is the expiry times, and also adding a confirmation when new machines are added.

Do you mean expiry as in on the server-side?

It also looks like there is also a state in which registration is requested, but since the machine is already registered, it skips any change to the machine in the database. I think Machine.Registered usage needs to be clarified/cleaned up?

There is quite a lot of "hard-to-follow" stuff with the registration and all the database calls all over the place. It is on my cleanup list, but time.

I'll try to have a look at that while reviewing.

  • An additional todo is also to grab the groups and/or other data from the OIDC server and make a decision on namespace based on that, but i think email address would be sufficient for now?

I agree, I think email should be good, or at least a good default.

My concern is that I would concern headscale a hobbyist project, which means that a setup might have users from all over the place, domain-wise.
And if you use gmail.com, and your friend uses gmail.com, this is a problem if you share a headscale instance.

I dont have a good solution as of now, maybe I can think of something while reviewing the code.

…ame oidc_endpoint to oidc_issuer to be more inline with spec
@unreality
Copy link
Contributor Author

This new build should work with Dex now, i replaced my code with go-oidc. I'll try to tackle nodekey expiration etc as well, unless you want to separate that into a separate pull request?

@kradalby
Copy link
Collaborator

kradalby commented Oct 6, 2021

Let's make it feature complete before we merge :), so implementing expiry and logout makes sense.

Great work on using the upstream lib! I'll take a look. Sorry for the delay I was hunting some other bugs and got distracted.

api.go Outdated Show resolved Hide resolved
api.go Outdated Show resolved Hide resolved
oidc.go Outdated Show resolved Hide resolved
oidc.go Outdated Show resolved Hide resolved
oidc.go Outdated Show resolved Hide resolved
oidc.go Outdated Show resolved Hide resolved
unreality and others added 2 commits October 8, 2021 15:26
Co-authored-by: Kristoffer Dalby <kradalby@kradalby.no>
@unreality
Copy link
Contributor Author

@kradalby ive implemented most of your suggestions, and implemented some basic expiry checks, although i'd love to have some review over the expiry code since im sure ive missed some scenarios

Copy link
Collaborator

@kradalby kradalby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, it is looking like we are closing in :)

I posted mostly comments about adding comments. Machines are good at reading code, but extracting logic and intents by reading this much code is hard and I think we need more "why" explanations of decisions to understand why they were made.

If we dont have that, people will look in the future and wonder why, but not dear to change it.

That said, just using it works for me, I have pulled down the artefact from the build pipeline and the registration is good, will check back tomorrow and do some more client tests to look at the behaviour.

oidc.go Outdated Show resolved Hide resolved
oidc.go Outdated Show resolved Hide resolved
cmd/headscale/cli/utils.go Outdated Show resolved Hide resolved
api.go Show resolved Hide resolved
api.go Outdated Show resolved Hide resolved
cmd/headscale/cli/utils.go Outdated Show resolved Hide resolved
oidc.go Outdated Show resolved Hide resolved
oidc.go Outdated Show resolved Hide resolved
oidc.go Outdated Show resolved Hide resolved
@unreality
Copy link
Contributor Author

I've done some more work on the expiry code and attempted to clean up the registration method.

@kradalby
Copy link
Collaborator

Im on Holiday so ill have a look when I get back :)

@kradalby
Copy link
Collaborator

Could you see if you can resolve the conflicts so I can have that as part of the next review?

oidc.go Outdated Show resolved Hide resolved
Copy link
Collaborator

@kradalby kradalby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job!

Mostly just comments and making the tests pass left, then we can merge and let users try it in main.

README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
api.go Outdated
return
}

resp := tailcfg.RegisterResponse{}

// We have the updated key!
if m.NodeKey == wgkey.Key(req.NodeKey).HexString() {
if m.Registered {

// The client sends an Expiry in the past if the client is requesting a logout
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a reference for this? would be good to have in the comment, for example if it can be found in the Tailscale source code.

@@ -161,6 +161,18 @@ func getHeadscaleApp() (*headscale.Headscale, error) {
return nil, err
}

// maxMachineRegistrationDuration is the maximum time a client can request for a client registration
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need a comment explaining why maxMachineRegistrationDuration is the max time, and why we have a max time.

maxMachineRegistrationDuration = viper.GetDuration("max_machine_registration_duration")
}

// defaultMachineRegistrationDuration is the default time assigned to a client registration if one is not specified by the client
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need a comment explaining why defaultMachineRegistrationDuration is the default time, and why we have a default time.

machine.go Outdated
Comment on lines 65 to 67
// If the Machine is expired, updateMachineExpiry updates the Machine Expiry time to the maximum allowed duration,
// or the default duration if no Expiry time was requested by the client
func (h *Headscale) updateMachineExpiry(m *Machine) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does actually expire here? The full login? as in, does the machine have to log in again after this expire? Or is it the expiry of how long the oidc login can last?

I think we need larger godoc comment here explaining:

  • What is this function doing
  • Why is it necessary
  • An example to make it easier to understand

ClientSecret: viper.GetString("oidc.client_secret"),
},

MaxMachineRegistrationDuration: maxMachineRegistrationDuration, // the maximum duration a client may request for expiry time
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need a comment here explaining what "expiry time" means, preferably with an example.

machine.go Outdated
@@ -36,6 +36,7 @@ type Machine struct {
LastSeen *time.Time
LastSuccessfulUpdate *time.Time
Expiry *time.Time
RequestedExpiry *time.Time // when a client connects, it may request a specific expiry time, use this field to store it
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
RequestedExpiry *time.Time // when a client connects, it may request a specific expiry time, use this field to store it
// when a client connects, it may request a specific expiry time, use this field to store it
RequestedExpiry *time.Time

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need a comment explaining what is the expiry time and why it might request a dedicated one.

Please consider adding an example.

README.md Outdated Show resolved Hide resolved
oidc.go Show resolved Hide resolved
README.md Outdated
"domain_map": {
".*": "default-namespace"
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
}
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, github suggestions does not handle this well, there is a "end of codeblock" missing

@unreality
Copy link
Contributor Author

Ive added some comments and hopefully made a few things clearer, let me know what you think

@kradalby kradalby merged commit fbdfa55 into juanfont:main Oct 31, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants