Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

eio_posix: initial support for subprocesses #461

Merged
merged 7 commits into from
Mar 20, 2023

Conversation

talex5
Copy link
Collaborator

@talex5 talex5 commented Mar 15, 2023

To spawn a child executable on Unix, the parent forks a copy of itself, then has the child copy set up the environment and execve the new program.

However, we cannot run any OCaml code in the forked child process. This is because fork only duplicates its own domain. To the child, it appears that all other domains have stopped responding and if it tries to e.g. perform a GC then the child process will hang.

Therefore, the fork and all child actions need to be written in C. To keep this flexible, this PR adds a new fork_action type and lets the user provide a list of such actions to perform.

Some of this support is in Eio_unix as it should be possible to share it with Eio_linux.

At present, there are three actions (chdir, fchdir and execve). It should be fairly easy to plug in new ones (e.g. for renumbering file descriptors, dropping privileges, execveat, etc). Different backends may provide different functions.

In future, it might be good to support vfork too, as well as clone3 on Linux. The SIGCHLD stuff could also be massively simplified on systems with process descriptors, but I think that's only Linux and FreeBSD at the moment (and Eio_linux can do its own thing anyway).

To spawn a child executable on Unix, the parent `fork`s a copy of
itself, then has the child copy set up the environment and `execve` the
new program.

However, we cannot run any OCaml code in the forked child process. This
is because `fork` only duplicates its own domain. To the child, it
appears that all other domains have stopped responding and if it tries
to e.g. perform a GC then the child process will hang.

Therefore, the `fork` and all child actions need to be written in C.
To keep this flexible, this PR adds a new `fork_action` type and lets
the user provide a list of such actions to perform.
@patricoferris patricoferris self-requested a review March 15, 2023 12:33
@talex5 talex5 added the enhancement New feature or request label Mar 15, 2023
uerror("fork", Nothing);
}

CAMLreturn(Val_long(child_pid));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised this holds, since a pid_t might not fit on an int. The standard library does the same though.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

man system_data_types says:

According to POSIX, [pid_t] shall be a signed integer type, and the implementation shall support one or more programming environments where the width of pid_t is no greater than the width of the type long.

That could indeed be too big for an OCaml int. However, https://unix.stackexchange.com/questions/16883/what-is-the-maximum-value-of-the-process-id seems to indicate that the limit is usually quite small.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, I remember Stevens mentioning that the correct practice whas to for example cast a pid_t to (long long) in order to print it safely, since it was impossible to know its size.

child_pid = fork();
if (child_pid == 0) {
eio_unix_run_fork_actions(Int_val(v_errors), v_actions);
} else if (child_pid < 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
} else if (child_pid < 0) {
} else if (child_pid == -1) {

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? PIDs can't be negative, so if it returns e.g. -5 then it would be better to raise an exception. The kernel may do interesting things if passed a negative PID. e.g. kill says:

If pid is less than -1, then sig is sent to every process in the process group whose ID is -pid.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly, PIDs can't be negative and fork cannot return anything below -1.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it never returns anything below -1 then this code change makes no difference. And if it does, then the original version was safer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was just more "precise" 😄

Copy link
Collaborator

@patricoferris patricoferris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! I did a little bit of stress testing on macOS and nothing seemed to break! Would standard output redirection be another action?

@@ -102,6 +102,8 @@ module Private : sig
(<Eio.Flow.source; Eio.Flow.close; unix_fd> * <Eio.Flow.sink; Eio.Flow.close; unix_fd>) Effect.t (** See {!pipe} *)

module Rcfd = Rcfd

module Fork_action = Fork_action
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Fork_action is a sensible name, but I wonder if people will get confused with Fiber.fork?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be Child_action I suppose. But if you're using the low-level API to get precise control over forking, you probably know what Process.Fork_action means.

@@ -22,3 +22,5 @@ val run : (stdenv -> 'a) -> 'a

module Low_level = Low_level
(** Low-level API for making POSIX calls directly. *)

module Process = Process
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should Process be exposed at the top-level of Eio_posix or inside Low_level?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this was a bit of a hack because Process depends on Low_level. I've moved it now, by making Children a separate file instead.


val spawn : sw:Switch.t -> Fork_action.t list -> t
(** [spawn ~sw actions] forks a child process, which executes [actions].
The last action should be {!Fork_action.execve}.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to document this more? You could in theory infer that the following is allowed:

let p = Eio_posix.Process.(spawn ~sw [
      Fork_action.chdir "./bench";
      Fork_action.execve "/bin/ls" ~argv:[| "ls" |] ~env:(Unix.environment ());
      Fork_action.chdir "..";
      Fork_action.execve "/bin/ls" ~argv:[| "ls" |] ~env:(Unix.environment ())
    ]) in
ignore (Eio.Promise.await p.exit_status)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a comment to execve that it only makes sense to put it last.


let rec reap_nonblocking () =
if Mutex.try_lock lock then (
reap ();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a chance the reap function above raises a different exception than the ones in the pattern-match and we end up not unlocking the lock?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope not. It's not clear whether it would be safe to release the mutex in that case, as we're probably in a bad state anyway.


type t = {
pid : int;
exit_status : Unix.process_status Promise.t; (** Resolves once the process has finished. *)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we expose the concrete type for this, or abstract it and add functions to access pid and exit_status ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. I've made it abstract now.

haesbaert and others added 3 commits March 16, 2023 15:58
Also, make `Process.t` abstract and document that `execve` can only go
last.

Suggested by Patrick Ferris.
while (len > 0) {
int wrote = write(fd, buf, len);

if (wrote <= 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's worth checking for EINTR here isn't it? Reasonable chance that if there's an error from a forked child, that some signal is coming back to the parent process and interrupting things.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code runs in the child. Possibly we should be masking signals there though.

@talex5
Copy link
Collaborator Author

talex5 commented Mar 17, 2023

Would standard output redirection be another action?

I've made a branch with some experimental support for inheriting FDs here: https://github.com/talex5/eio/compare/spawn...talex5:eio:inherit-fds?expand=1

We can operate directly on the safe wrapper, which also simplifies the
code a bit.

Also, add a test for `fchdir`.
This simplifies the code a bit.

Also, check the type of the function passed to `Val_fork_fn`.
@talex5 talex5 merged commit a48368c into ocaml-multicore:main Mar 20, 2023
@talex5 talex5 deleted the spawn branch March 20, 2023 12:28
@RyanGibb
Copy link
Contributor

RyanGibb commented Apr 3, 2023

This allows fixing the issue described at #435 (comment) by writing custom fork actions wrapping dup2 and ioctl: RyanGibb/ocaml-exec-shell@eedf2b...main#diff-5f310ca7290aedf0d4ea03186711c91b09aa4f211b0c4f3956260ea10ab98c18

How would you feel about exposing this to and documenting for users? (I.e. not putting it behind Eio_unix.Private)

@talex5
Copy link
Collaborator Author

talex5 commented Apr 5, 2023

@RyanGibb : we could just add setsid and setting the tty actions to Eio itself.

If we want to make this public, it will need a bit of work to put the public header files where other packages can find them.

@RyanGibb
Copy link
Contributor

RyanGibb commented Apr 5, 2023

If we want to make this public, it will need a bit of work to put the public header files where other packages can find them.

That makes sense, I did have to duplicate some declarations and definitions from fork_action.h.

we could just add setsid and setting the tty actions to Eio itself.

I'll try and create a separate PR doing that.

talex5 added a commit to talex5/opam-repository that referenced this pull request Apr 11, 2023
CHANGES:

New features:

- Add eio_posix backend (@talex5 @haesbaert ocaml-multicore/eio#448 ocaml-multicore/eio#477, reviewed by @avsm @patricoferris @polytypic).
  This replaces eio_luv on all platforms except Windows (which will later switch to its own backend). It is a lot faster, provides access to more modern features (such as `openat`), and can safely share OS resources between domains.

- Add subprocess support (@patricoferris @talex5 ocaml-multicore/eio#461 ocaml-multicore/eio#464 ocaml-multicore/eio#472, reviewed by @haesbaert @avsm).
  This is the low-level API support for eio_linux and eio_posix. A high-level cross-platform API will be added in the next release.

- Add `Fiber.fork_seq` (@talex5 ocaml-multicore/eio#460, reviewed by @avsm).
  This is a light-weight alternative to using a single-producer, single-consumer, 0-capacity stream, similar to a Python generator function.

Bug fixes:

- eio_linux: make it safe to share FDs across domains (@talex5 ocaml-multicore/eio#440, reviewed by @haesbaert).
  It was previously not safe to share file descriptors between domains because if one domain used an FD just as another was closing it, and the FD got reused, then the original operation could act on the wrong file.

- eio_linux: release uring if Linux is too old (@talex5 ocaml-multicore/eio#476).
  Avoids a small resource leak.

- eio_linux: improve error handling creating pipes and sockets (@talex5 ocaml-multicore/eio#474, spotted by @avsm).
  If we get an error (e.g. too many FDs) then report it to the calling fiber, instead of exiting the event loop.

- eio_linux: wait for uring to finish before exiting (@talex5 ocaml-multicore/eio#470, reviewed by @avsm).
  If the main fiber raised an exception then it was possible to exit while a cancellation operation was still in progress.

- eio_main: make `EIO_BACKEND` handling more uniform (@talex5 ocaml-multicore/eio#447).
  Previously this environment variable was only used on Linux. Now all platforms check it.

- Tell dune about `EIO_BACKEND` (@talex5 ocaml-multicore/eio#442).
  If this changes, dune needs to re-run the tests.

- eio_linux: add some missing close-on-execs (@talex5 ocaml-multicore/eio#441).

- eio_linux: `read_exactly` fails to update file offset (@talex5 ocaml-multicore/eio#438).

- Work around dune `enabled_if` bug on non-Linux systems (@polytypic ocaml-multicore/eio#475, reviewed by @talex5).

- Use raw system call of `getrandom` for glibc versions before 2.25 (@zenfey ocaml-multicore/eio#482).

Documentation:

- Add `HACKING.md` with hints for working on Eio (@talex5 ocaml-multicore/eio#443, reviewed by @avsm @polytypic).

- Improve worker pool example (@talex5 ocaml-multicore/eio#454).

- Add more Conditions documentation (@talex5 ocaml-multicore/eio#436, reviewed by @haesbaert).
  This adds a discussion of conditions to the README and provides examples using them to handle signals.

- Condition: fix the example in the docstring (@avsm ocaml-multicore/eio#468).

Performance:

- Add a network benchmark using an HTTP-like protocol (@talex5 ocaml-multicore/eio#478, reviewed by @avsm @patricoferris).

- Add a benchmark for reading from `/dev/zero` (@talex5 ocaml-multicore/eio#439).

Other changes:

- Add CI for macOS (@talex5 ocaml-multicore/eio#452).

- Add tests for `pread`, `pwrite` and `readdir` (@talex5 ocaml-multicore/eio#451).

- eio_linux: split into multiple files (@talex5 ocaml-multicore/eio#465 ocaml-multicore/eio#466, reviewed by @avsm).

- Update Dockerfile (@talex5 ocaml-multicore/eio#471).

- Use dune.3.7.0 (@patricoferris ocaml-multicore/eio#457).

- Mint exclusive IDs across domains (@TheLortex ocaml-multicore/eio#480, reported by @haesbaert, reviewed by @talex5).
  The tracing currently only works with a single domain anyway, but this will change when OCaml 5.1 is released.
talex5 added a commit to talex5/opam-repository that referenced this pull request Jun 2, 2023
CHANGES:

New features:

- Add eio_posix backend (@talex5 @haesbaert ocaml-multicore/eio#448 ocaml-multicore/eio#477, reviewed by @avsm @patricoferris @polytypic).
  This replaces eio_luv on all platforms except Windows (which will later switch to its own backend). It is a lot faster, provides access to more modern features (such as `openat`), and can safely share OS resources between domains.

- Add subprocess support (@patricoferris @talex5 ocaml-multicore/eio#461 ocaml-multicore/eio#464 ocaml-multicore/eio#472, reviewed by @haesbaert @avsm).
  This is the low-level API support for eio_linux and eio_posix. A high-level cross-platform API will be added in the next release.

- Add `Fiber.fork_seq` (@talex5 ocaml-multicore/eio#460, reviewed by @avsm).
  This is a light-weight alternative to using a single-producer, single-consumer, 0-capacity stream, similar to a Python generator function.

Bug fixes:

- eio_linux: make it safe to share FDs across domains (@talex5 ocaml-multicore/eio#440, reviewed by @haesbaert).
  It was previously not safe to share file descriptors between domains because if one domain used an FD just as another was closing it, and the FD got reused, then the original operation could act on the wrong file.

- eio_linux: release uring if Linux is too old (@talex5 ocaml-multicore/eio#476).
  Avoids a small resource leak.

- eio_linux: improve error handling creating pipes and sockets (@talex5 ocaml-multicore/eio#474, spotted by @avsm).
  If we get an error (e.g. too many FDs) then report it to the calling fiber, instead of exiting the event loop.

- eio_linux: wait for uring to finish before exiting (@talex5 ocaml-multicore/eio#470, reviewed by @avsm).
  If the main fiber raised an exception then it was possible to exit while a cancellation operation was still in progress.

- eio_main: make `EIO_BACKEND` handling more uniform (@talex5 ocaml-multicore/eio#447).
  Previously this environment variable was only used on Linux. Now all platforms check it.

- Tell dune about `EIO_BACKEND` (@talex5 ocaml-multicore/eio#442).
  If this changes, dune needs to re-run the tests.

- eio_linux: add some missing close-on-execs (@talex5 ocaml-multicore/eio#441).

- eio_linux: `read_exactly` fails to update file offset (@talex5 ocaml-multicore/eio#438).

- Work around dune `enabled_if` bug on non-Linux systems (@polytypic ocaml-multicore/eio#475, reviewed by @talex5).

- Use raw system call of `getrandom` for glibc versions before 2.25 (@zenfey ocaml-multicore/eio#482).

Documentation:

- Add `HACKING.md` with hints for working on Eio (@talex5 ocaml-multicore/eio#443, reviewed by @avsm @polytypic).

- Improve worker pool example (@talex5 ocaml-multicore/eio#454).

- Add more Conditions documentation (@talex5 ocaml-multicore/eio#436, reviewed by @haesbaert).
  This adds a discussion of conditions to the README and provides examples using them to handle signals.

- Condition: fix the example in the docstring (@avsm ocaml-multicore/eio#468).

Performance:

- Add a network benchmark using an HTTP-like protocol (@talex5 ocaml-multicore/eio#478, reviewed by @avsm @patricoferris).

- Add a benchmark for reading from `/dev/zero` (@talex5 ocaml-multicore/eio#439).

Other changes:

- Add CI for macOS (@talex5 ocaml-multicore/eio#452).

- Add tests for `pread`, `pwrite` and `readdir` (@talex5 ocaml-multicore/eio#451).

- eio_linux: split into multiple files (@talex5 ocaml-multicore/eio#465 ocaml-multicore/eio#466, reviewed by @avsm).

- Update Dockerfile (@talex5 ocaml-multicore/eio#471).

- Use dune.3.7.0 (@patricoferris ocaml-multicore/eio#457).

- Mint exclusive IDs across domains (@TheLortex ocaml-multicore/eio#480, reported by @haesbaert, reviewed by @talex5).
  The tracing currently only works with a single domain anyway, but this will change when OCaml 5.1 is released.
talex5 added a commit to talex5/opam-repository that referenced this pull request Jun 2, 2023
CHANGES:

New features:

- Add eio_posix backend (@talex5 @haesbaert ocaml-multicore/eio#448 ocaml-multicore/eio#477, reviewed by @avsm @patricoferris @polytypic).
  This replaces eio_luv on all platforms except Windows (which will later switch to its own backend). It is a lot faster, provides access to more modern features (such as `openat`), and can safely share OS resources between domains.

- Add subprocess support (@patricoferris @talex5 ocaml-multicore/eio#461 ocaml-multicore/eio#464 ocaml-multicore/eio#472, reviewed by @haesbaert @avsm).
  This is the low-level API support for eio_linux and eio_posix. A high-level cross-platform API will be added in the next release.

- Add `Fiber.fork_seq` (@talex5 ocaml-multicore/eio#460, reviewed by @avsm).
  This is a light-weight alternative to using a single-producer, single-consumer, 0-capacity stream, similar to a Python generator function.

Bug fixes:

- eio_linux: make it safe to share FDs across domains (@talex5 ocaml-multicore/eio#440, reviewed by @haesbaert).
  It was previously not safe to share file descriptors between domains because if one domain used an FD just as another was closing it, and the FD got reused, then the original operation could act on the wrong file.

- eio_linux: release uring if Linux is too old (@talex5 ocaml-multicore/eio#476).
  Avoids a small resource leak.

- eio_linux: improve error handling creating pipes and sockets (@talex5 ocaml-multicore/eio#474, spotted by @avsm).
  If we get an error (e.g. too many FDs) then report it to the calling fiber, instead of exiting the event loop.

- eio_linux: wait for uring to finish before exiting (@talex5 ocaml-multicore/eio#470, reviewed by @avsm).
  If the main fiber raised an exception then it was possible to exit while a cancellation operation was still in progress.

- eio_main: make `EIO_BACKEND` handling more uniform (@talex5 ocaml-multicore/eio#447).
  Previously this environment variable was only used on Linux. Now all platforms check it.

- Tell dune about `EIO_BACKEND` (@talex5 ocaml-multicore/eio#442).
  If this changes, dune needs to re-run the tests.

- eio_linux: add some missing close-on-execs (@talex5 ocaml-multicore/eio#441).

- eio_linux: `read_exactly` fails to update file offset (@talex5 ocaml-multicore/eio#438).

- Work around dune `enabled_if` bug on non-Linux systems (@polytypic ocaml-multicore/eio#475, reviewed by @talex5).

- Use raw system call of `getrandom` for glibc versions before 2.25 (@zenfey ocaml-multicore/eio#482).

Documentation:

- Add `HACKING.md` with hints for working on Eio (@talex5 ocaml-multicore/eio#443, reviewed by @avsm @polytypic).

- Improve worker pool example (@talex5 ocaml-multicore/eio#454).

- Add more Conditions documentation (@talex5 ocaml-multicore/eio#436, reviewed by @haesbaert).
  This adds a discussion of conditions to the README and provides examples using them to handle signals.

- Condition: fix the example in the docstring (@avsm ocaml-multicore/eio#468).

Performance:

- Add a network benchmark using an HTTP-like protocol (@talex5 ocaml-multicore/eio#478, reviewed by @avsm @patricoferris).

- Add a benchmark for reading from `/dev/zero` (@talex5 ocaml-multicore/eio#439).

Other changes:

- Add CI for macOS (@talex5 ocaml-multicore/eio#452).

- Add tests for `pread`, `pwrite` and `readdir` (@talex5 ocaml-multicore/eio#451).

- eio_linux: split into multiple files (@talex5 ocaml-multicore/eio#465 ocaml-multicore/eio#466, reviewed by @avsm).

- Update Dockerfile (@talex5 ocaml-multicore/eio#471).

- Use dune.3.7.0 (@patricoferris ocaml-multicore/eio#457).

- Mint exclusive IDs across domains (@TheLortex ocaml-multicore/eio#480, reported by @haesbaert, reviewed by @talex5).
  The tracing currently only works with a single domain anyway, but this will change when OCaml 5.1 is released.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants