Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unlock OS thread after netns is restored #455

Merged
merged 1 commit into from
Mar 4, 2020

Conversation

booxter
Copy link
Contributor

@booxter booxter commented Feb 20, 2020

The current ns package code is very careful about not leaving the calling
thread with the overridden namespace set, for example when origns.Set() fails.
This is achieved by starting a new green thread, locking its OS thread, and
never unlocking it. Which makes golang runtime to scrap the OS thread backing
the green thread after the go routine exits.

While this works, it's probably not as optimal: stopping and starting a new OS
thread is expensive and may be avoided if we unlock the thread after resetting
network namespace to the original. On the other hand, if resetting fails, it's
better to leave the thread locked and die.

While it won't work in all cases, we can still make an attempt to reuse the OS
thread when resetting the namespace succeeds. This can be achieved by unlocking
the thread conditionally to the namespace reset success.

Signed-off-by: Ihar Hrachyshka ihrachys@redhat.com

The current ns package code is very careful about not leaving the calling
thread with the overridden namespace set, for example when origns.Set() fails.
This is achieved by starting a new green thread, locking its OS thread, and
never unlocking it. Which makes golang runtime to scrap the OS thread backing
the green thread after the go routine exits.

While this works, it's probably not as optimal: stopping and starting a new OS
thread is expensive and may be avoided if we unlock the thread after resetting
network namespace to the original. On the other hand, if resetting fails, it's
better to leave the thread locked and die.

While it won't work in all cases, we can still make an attempt to reuse the OS
thread when resetting the namespace succeeds. This can be achieved by unlocking
the thread conditionally to the namespace reset success.

Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
@booxter
Copy link
Contributor Author

booxter commented Feb 20, 2020

@rmohr @davidvossel

Copy link
Member

@jellonek jellonek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good comments in code.
LGTM.

Copy link
Contributor

@rmohr rmohr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. I don't think that anything below golang 1.10 is relevant right? With 1.10 we got the reference counting on thread locks. The missing reference counting was afaik a reason for go <=1.9 to not unlock, since people could get strange results when they were using lock/unlock themselves in the callbacks.

If this repo would in general work on golang <= 1.9 we may want to include a // +build go1.10 constraint.

@jellonek
Copy link
Member

We don't have this noted anywhere specifically (and probably this should be changed) that we are trying to be compliant only with last 3 stable versions of golang, so right now with go >=1.11.

Our tests are running only against https://github.com/containernetworking/plugins/blob/master/.travis.yml#L5 these versions.

@rmohr
Copy link
Contributor

rmohr commented Feb 21, 2020

We don't have this noted anywhere specifically (and probably this should be changed) that we are trying to be compliant only with last 3 stable versions of golang, so right now with go >=1.11.

Then the PR should be good. Thanks.

@mars1024
Copy link
Member

@jellonek @rmohr A small question, why should we force-unlock the OS thread? I think the lock will be released after the goroutine exits.

@rmohr
Copy link
Contributor

rmohr commented Feb 21, 2020

@jellonek @rmohr A small question, why should we force-unlock the OS thread? I think the lock will be released after the goroutine exits.

Instead of unlocking the thread, the thread gets reaped when the routine exits when it is not unlocked: https://golang.org/src/runtime/proc.go?s=102346:102367#L3574.

We will get new threads if needed, but it makes the operation definitely heavier.

@jellonek
Copy link
Member

@mars1024 To understand that you need to know about long standing bug in golang which was leading to reusing threads with changed netns by other goroutines, while they were expecting that they are running in the original one netns.
PTAL on https://www.weave.works/blog/linux-namespaces-golang-followup

Copy link
Member

@mars1024 mars1024 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@jellonek
Copy link
Member

Let's wait for next maintainers meeting (02.26) with merge.

@bboreham
Copy link
Contributor

Historical note: here is the statement that "UnlockOSThread is not safe for use in any library code"
golang/go#20458.
This was fixed in Go 1.10 as noted.

Copy link
Contributor

@bboreham bboreham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine post Go 1.10. Thanks!

@dcbw
Copy link
Member

dcbw commented Mar 4, 2020

/lgtm

@mccv1r0 mccv1r0 merged commit 47a9fd8 into containernetworking:master Mar 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants