Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

socketpair_unix: avoid double close(), set FD_CLOEXEC #66

Merged
merged 1 commit into from
Jan 31, 2024

Conversation

champtar
Copy link
Contributor

@champtar champtar commented Jan 29, 2024

We were calling the close() syscall multiple time
with the same fd number, leading to random issues like
closing containerd stream port.

In newLaunchedPlugin() we have:

sockets, _ := net.NewSocketPair()
defer sockets.Close()
conn, _ := sockets.LocalConn()
peerFile := sockets.PeerFile()
defer func() {
  peerFile.Close()
  if retErr != nil {
    conn.Close()
  }
}()
cmd.Start()

so we were doing:

close(local) (in LocalConn())
cmd.Start()
close(peer) (peerFile.Close())
close(local) (sockets.Close())
close(peer) (sockets.Close())

If the NRI plugin that we launch with cmd.Start() is not cached or
the system is a bit busy, cmd.Start() gives a large enough window
for another goroutine to open a file that will get the same fd number
as local was, thus being closed by accident.

Fix the situation by storing os.Files instead of ints, so that closing
multiple times just returns an error.

Also set FD_CLOEXEC on the sockets so we don't leak them.

Fixes 1da2cdf

@champtar
Copy link
Contributor Author

ping @klihub (initial author)

@klihub klihub self-requested a review January 30, 2024 16:10
@klihub
Copy link
Member

klihub commented Jan 30, 2024

not tested with containerd / windows version also needs fixing

Yeah, we haven't really had the time to impement and test NRI for Windows, yet. That said, on Windows, SocketPair emulates a socketpair using two ends of a pre-connected TCPv4 socket, which are then stored as sys.Handle's. Shouldn't that protect us from a similar double-close error ?

klihub
klihub previously approved these changes Jan 30, 2024
Copy link
Member

@klihub klihub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine to me.

@klihub klihub requested a review from fuweid January 30, 2024 16:55
@champtar champtar changed the title net/sockerpair_unix: avoid closing unrelated fds sockerpair_unix: avoid double close(), set FD_CLOEXEC Jan 30, 2024
@champtar champtar force-pushed the multiple-close branch 2 times, most recently from c4da6cf to c1a14af Compare January 31, 2024 00:01
@champtar
Copy link
Contributor Author

champtar commented Jan 31, 2024

@klihub just pushed a version fixing windows and moving all the common functions in socketpair.go (windows part not tested)

@champtar champtar changed the title sockerpair_unix: avoid double close(), set FD_CLOEXEC socketpair: avoid double close(), set FD_CLOEXEC Jan 31, 2024
pkg/net/socketpair_unix.go Outdated Show resolved Hide resolved
@fuweid
Copy link
Member

fuweid commented Jan 31, 2024

Hi @champtar would you please file a ticket to track window part and focus on linux part in this pull request? I don't have too much knowledge on windows platform actually. we can ask other maintainers to take over this. Thanks

@klihub
Copy link
Member

klihub commented Jan 31, 2024

Hi @champtar would you please file a ticket to track window part and focus on linux part in this pull request? I don't have too much knowledge on windows platform actually. we can ask other maintainers to take over this. Thanks

+1 I would also leave Windows out of this now and focus on fixing the bug on un*x, since we anyway don't have all the necessary functionality in place for Windows. Sorry if my earlier question/comment was misleading and could be interpreted otherwise.

@klihub klihub self-requested a review January 31, 2024 07:30
@champtar champtar changed the title socketpair: avoid double close(), set FD_CLOEXEC socketpair_unix: avoid double close(), set FD_CLOEXEC Jan 31, 2024
@codecov-commenter
Copy link

codecov-commenter commented Jan 31, 2024

Codecov Report

Attention: 3 lines in your changes are missing coverage. Please review.

Comparison is base (6f5a4d2) 64.50% compared to head (7f878a1) 64.58%.

❗ Current head 7f878a1 differs from pull request most recent head 5d0b52b. Consider uploading reports for the commit 5d0b52b to get more accurate results

Files Patch % Lines
pkg/net/socketpair_unix.go 87.50% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #66      +/-   ##
==========================================
+ Coverage   64.50%   64.58%   +0.07%     
==========================================
  Files           9       10       +1     
  Lines        1834     1838       +4     
==========================================
+ Hits         1183     1187       +4     
  Misses        500      500              
  Partials      151      151              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@champtar
Copy link
Contributor Author

Pushed back the unix only changes, I'll open a second PR once this one is merged

pkg/net/socketpair_unix.go Outdated Show resolved Hide resolved
pkg/net/socketpair_unix.go Outdated Show resolved Hide resolved
@klihub klihub dismissed their stale review January 31, 2024 10:21

Let's wait for final version.

We were calling the close() syscall multiple time
with the same fd number, leading to random issues like
closing containerd stream port.

In newLaunchedPlugin() we have:
sockets, _ := net.NewSocketPair()
defer sockets.Close()
conn, _ := sockets.LocalConn()
peerFile := sockets.PeerFile()
defer func() {
  peerFile.Close()
  if retErr != nil {
    conn.Close()
  }
}()
cmd.Start()

so we were doing:
close(local) (in LocalConn())
cmd.Start()
close(peer) (peerFile.Close())
close(local) (sockets.Close())
close(peer) (sockets.Close())

If the NRI plugin that we launch with cmd.Start() is not cached or
the system is a bit busy, cmd.Start() gives a large enough window
for another goroutine to open a file that will get the same fd number
as local was, thus being closed by accident.

Fix the situation by storing os.Files instead of ints, so that closing
multiple times just returns an error (that we ignore).

Also set FD_CLOEXEC on the sockets so we don't leak them.

Fixes 1da2cdf

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
@champtar
Copy link
Contributor Author

@klihub hopefully final version

@klihub klihub requested a review from fuweid January 31, 2024 12:13
Copy link
Member

@fuweid fuweid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@fuweid fuweid merged commit 4298b41 into containerd:main Jan 31, 2024
8 checks passed
@champtar champtar deleted the multiple-close branch January 31, 2024 15:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants