Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wasi: fix nonblocking sockets on *NIX (gotip net/http) #1503

Merged
merged 8 commits into from
Jun 12, 2023
Merged

Conversation

codefromthecrypt
Copy link
Contributor

@codefromthecrypt codefromthecrypt commented Jun 3, 2023

This PR fixes nonblocking I/O when handling sockets with net/http.

In this PR we further separate the Unix implementation from the Windows implementation; in the original implementation, we rely on Go's stdlib and the net package: this implementation is now Windows only.

On other Unix platforms, we instead go straight to the syscall layer and we pass-through most parameters.

We also add a baseSockFile struct that implements base behavior for sock files; this is extended as follows:

  • unix: tcpListenerFile, tcpConnFile
  • windows: winTcpListenerFile, winTcpConnFile
  • unsupported: unsupportedSockFile
    (implements only socketapi.TCPSock.Accept(), returns ENOSYS: no need for TCPConn, as nothing is able to return it)

Adrian Cole added 2 commits June 3, 2023 11:06
Signed-off-by: Adrian Cole <adrian@tetrate.io>
Signed-off-by: Adrian Cole <adrian@tetrate.io>
req, err := http.NewRequest(http.MethodPost, "http://"+tcpAddr.String(), body)
require.NoError(t, err)

// TODO: test hangs here
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here's the hang. from the log, the request is never read.

Copy link
Contributor

@evacchi evacchi Jun 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks like it's stuck in a loop here:

// poll/fd_unix.go
func (fd *FD) Accept() (int, syscall.Sockaddr, string, error) {
	...
	for {
		...
		case syscall.EAGAIN:
			if fd.pd.pollable() {
				if err = fd.pd.waitRead(fd.isFile); err == nil {
					continue
				}
			}
		...

TBH I don't know if you can actually leave that loop ever 😬

	for {
		s, rsa, errcall, err := accept(fd.Sysfd)
		if err == nil {
			return s, rsa, "", err
		}
		switch err {
		case syscall.EINTR:
			continue
		case syscall.EAGAIN:
			if fd.pd.pollable() {
				if err = fd.pd.waitRead(fd.isFile); err == nil {
					continue
				}
			}
		case syscall.ECONNABORTED:
			// This means that a socket on the listen
			// queue was closed before we Accept()ed it;
			// it's a silly error, so try again.
			continue
		}
		return -1, nil, errcall, err
	}

I mean, sure. But it doesn't look like there is a way to interrupt gracefully.

Stupidest I can think, SetDeadline()...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok I am now calling SetDeadline(), catching the timeout error, and returning EAGAIN instead. This I think causes the Wasm module to do the loop above 😬 so the wasm module is effectively unstuck but I think it's still looping until the connection becomes available; OTOH this lets other goroutines interleave; in fact I can see a poll_oneoff happening:

	==> wasi_snapshot_preview1.clock_time_get(id=monotonic,precision=0)
	<== (timestamp=81477104823666,errno=ESUCCESS)
	==> wasi_snapshot_preview1.sock_accept(fd=3,flags=)
	<== (fd=,errno=EAGAIN)
	==> wasi_snapshot_preview1.poll_oneoff(in=21422080,out=21446656,nsubscriptions=2)
	<== (nevents=2,errno=ESUCCESS)
	==> wasi_snapshot_preview1.sock_accept(fd=3,flags=)
	<== (fd=,errno=EAGAIN)
	==> wasi_snapshot_preview1.poll_oneoff(in=21422080,out=21446656,nsubscriptions=2)
	<== (nevents=2,errno=ESUCCESS)
	==> wasi_snapshot_preview1.sock_accept(fd=3,flags=)
	<== (fd=,errno=EAGAIN)
	==> wasi_snapshot_preview1.poll_oneoff(in=21422080,out=21446656,nsubscriptions=2)
	<== (nevents=2,errno=ESUCCESS)
	==> wasi_snapshot_preview1.sock_accept(fd=3,flags=)
	<== (fd=,errno=EAGAIN)
	==> wasi_snapshot_preview1.poll_oneoff(in=21422080,out=21446656,nsubscriptions=2)
	<== (nevents=2,errno=ESUCCESS)
	==> wasi_snapshot_preview1.sock_accept(fd=3,flags=)
	<== (fd=,errno=EAGAIN)
	==> wasi_snapshot_preview1.poll_oneoff(in=21422080,out=21446656,nsubscriptions=2)
	<== (nevents=2,errno=ESUCCESS)
	==> wasi_snapshot_preview1.sock_accept(fd=3,flags=)
	<== (fd=,errno=EAGAIN)
	==> wasi_snapshot_preview1.poll_oneoff(in=21422080,out=21446656,nsubscriptions=2)
	<== (nevents=2,errno=ESUCCESS)
	==> wasi_snapshot_preview1.sock_accept(fd=3,flags=)

Signed-off-by: Edoardo Vacchi <evacchi@users.noreply.github.com>
@evacchi
Copy link
Contributor

evacchi commented Jun 9, 2023

I have pushed a change to use straight syscalls on *Nix, and fallbacks to the original code on Windows.
I haven't had time to layout the code properly, so it should be better move things in the right _unix and _windows files. Apologies for this not being great.

I have also updated the examples, hopefully correctly: even after the accept() was made pollable, the example failed on io.EOF, which however I think should be fine, so I modified the example to not fail on io.EOF. In that case, the buffer is read correctly and it contains headers+the string wazero.

On Windows I am skipping the new gotip test because it won't work, since we don't support nonblocking sockets (tested and yep, it hangs)

Signed-off-by: Edoardo Vacchi <evacchi@users.noreply.github.com>
Copy link
Contributor Author

@codefromthecrypt codefromthecrypt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with this approach and looking forward to having this shipped. We can do a patch even.

Signed-off-by: Edoardo Vacchi <evacchi@users.noreply.github.com>
@evacchi
Copy link
Contributor

evacchi commented Jun 12, 2023

pushed a cleaner version. This may be good to go.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

notice something (unrelated?) happened to flags when logging

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

related, probably intended? /cc @codefromthecrypt

Comment on lines 464 to 467
// FIXME: this hangs because the example listens forever;
// however mod.Close() does not result in a clean shudown.
// console := <-ch
// require.Equal(t, "", console)
Copy link
Contributor

@evacchi evacchi Jun 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

solving by serving only 1 request and then exiting the Wasm routine; probably worth investigating what happens if we mod.Close() instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hah I was thinking about this route also.

evacchi added 2 commits June 12, 2023 16:23
Signed-off-by: Edoardo Vacchi <evacchi@users.noreply.github.com>
Signed-off-by: Edoardo Vacchi <evacchi@users.noreply.github.com>
@evacchi evacchi changed the title WIP: http gotip fix wasi: fix nonblocking sockets on *NIX (gotip net/http) Jun 12, 2023
Signed-off-by: Edoardo Vacchi <evacchi@users.noreply.github.com>
@evacchi
Copy link
Contributor

evacchi commented Jun 12, 2023

Added description to the head post, marking ready for review.

@evacchi evacchi marked this pull request as ready for review June 12, 2023 15:43
@evacchi evacchi requested a review from mathetake as a code owner June 12, 2023 15:43
Copy link
Collaborator

@achille-roussel achille-roussel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙌

@mathetake mathetake merged commit f3778ca into main Jun 12, 2023
@mathetake mathetake deleted the http-fix branch June 12, 2023 20:51
<== errno=ESUCCESS
`, "\n"+log.String())
`, "\n"+log.String()) // FIXME? flags==0 prints 'flags='
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's fine IMHO because there are no flags and the formatted flags are pipe delimited. that said it is easy to change to special case none differently. @achille-roussel @ncruces any preferences on this one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this default applies to every flag type we have in wasi, btw.

<== errno=ESUCCESS
`, "\n"+log.String())
`, "\n"+log.String()) // FIXME? flags==1 prints 'flags=APPEND'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

according to docs, flags is an fdflags type, so usually in log formatting we format the flag name vs have people need to look up what it was and do the bit flag math in their head. Maybe it was surprising that the logging "fixed" in a different change? https://github.com/WebAssembly/WASI/blob/snapshot-01/phases/snapshot/docs.md#-fdflags-flagsu16

Copy link
Contributor Author

@codefromthecrypt codefromthecrypt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks so much for sorting this out!

n, errno := file.Write([]byte("wazero"))
file := newTcpConn(tcp)
errno := syscall.Errno(0)
// Ensure we don't interrupt until we get a non-zero errno,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great comment

panic(err)
}
// Once one request was served, close the channel.
close(e.ch)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is what I didn't get right when I tried to have the handler quit after one call and not loop back into the accept loop. glad you got it!

evacchi added a commit to evacchi/wazero that referenced this pull request Jun 15, 2023
)

Signed-off-by: Adrian Cole <adrian@tetrate.io>
Co-authored-by: Edoardo Vacchi <evacchi@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants