Eliminate the POSIX API layer #6600

Open
LemonBoy opened this issue Oct 7, 2020 · 33 comments
Labels: breaking, proposal, standard library

@LemonBoy
Contributor

LemonBoy commented Oct 7, 2020

The os namespace (to be renamed posix) is tackling the problem of providing a cross-platform abstraction over the operating system facilities from a, in my opinion, wrong angle. In this essay I will briefly explain why.

The aim here is having a single level of abstraction that works well enough for both direct consumers (users of the std.os namespace) and indirect consumers (e.g. users of the std.fs abstractions built on std.os) without leaking too many details about the underlying implementation. The current approach tries to clump everything under the posix name, a name that carries heavy baggage of dos and don'ts that other platforms (mainly Windows) may not agree with.

A few examples:

  • rename doesn't follow the posix semantics on Windows, unless you have a very recent Windows 10 version and FILE_RENAME_FLAG_POSIX_SEMANTICS is used.
  • open accepts a well-defined set of options that's not even shared across all the posix-compatible systems, and for Windows this means we're pinky-swearing to faithfully translate all of them into something equivalent. What if there's no direct equivalent? If you promise posix compatibility I'd expect something akin to mingw.
    • What we're interested in is getting a handle on a given file, with a given set of flags.
  • stat doesn't exist on Windows; GetFileAttributesEx or equivalents are used and the resulting info is chopped into a posix stat structure. It's a clear case of a square peg in a round hole, as many fields are unix-specific.
    • What we're interested in is getting a basic set of info about the file; platform-specific extra stuff may be retrieved/modified with a companion function.
  • preadv/pwritev are not available on Windows and, at the moment at least, you get a nice compile error trying to use them.
    • What we're interested in is a {write,read}Many function, implemented using p{write,read}v or ReadFileScatter/WriteFileGather.
    • iovec is a posix thing, I just want to write/read a [][]u8!
  • copy_file_range is a Linux-only syscall with a fallback on pwrite/pread that in turn falls back on ReadFile/WriteFile on Windows.
    • IMO this doesn't even belong to os as it can be safely implemented somewhere in fs, calling copy_file_range or other platform-specific APIs as needed.
    • It's not portable! It's not part of the posix standard!
    • copy_file_range is all about efficiency as it's done at the FS level, if the fallback path is triggered you get a disappointing read/write pair (not even a loop!)

The point is that we should aim at breaking free from the posix rules and write our own; a full-blown posix compatibility layer is not something that belongs in the stdlib. I see Rust took this very same approach (I'm looking at the filesystem-related part): their API surface is small and comprises all the common bits required by users (external ones and those working on the stdlib). A small note lets the user know what platform-specific method is used, but that's it.

The gist of this proposal is:

  • No more posix compatibility layer
  • os becomes the home of all the cross-platform native interactions with the OS. No fallbacks, no posix names, no posix guarantees.
    • No fallbacks really means no half-assed fallbacks, if something can be implemented across all the different platforms with a consistent set of characteristics it belongs to os, otherwise down the platform-specific namespace it goes.
      • ✔️ writeMany implemented with WriteFileGather and pwritev
      • copy_file_range cannot be implemented as zero-copy on Windows! (On other posix-compatible systems sendfile may be used, but the gains are really small unless you're copying a lot of data (I'm using it to copy whole files in std: Make file copy ops use zero-copy mechanisms #6516))

Thanks for watching.

@kprotty
Member

kprotty commented Oct 7, 2020

A note on the pread/pwrite bit: WriteFileGather on Windows looks like it only supports page-aligned userspace buffers and requires null-terminating its "iovec equivalent" structure. General-purpose vectored IO exists on Windows for sockets via WSASend & friends, but the generic "HANDLE" equivalent doesn't seem to be as flexible.

Another point on vectored IO is that having the function take in a generic iovec (which for Windows sockets could be WSABUF) would relieve the os function from allocating its own in order to represent [][]u8, as the layout of slices is currently either undefined or does not match the layout of socket iovec types, making casting to and from them probably not correct.
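The layout concern can be made concrete with a small sketch. The `IoVec` struct and `toIoVecs` function below are hypothetical stand-ins (not the std types), and the caller-provided `out` buffer is one assumed way to avoid an allocation inside the os function when the API takes `[]const []const u8`:

```zig
const std = @import("std");

// Hypothetical stand-in for the platform's iovec/WSABUF. The real types
// live in the platform-specific namespaces and differ in field order and width.
const IoVec = extern struct {
    base: [*]const u8,
    len: usize,
};

// Fills `out` with one entry per buffer and returns the used prefix.
// Reports error.TooManyBuffers instead of allocating when `out` is too small.
fn toIoVecs(bufs: []const []const u8, out: []IoVec) error{TooManyBuffers}![]IoVec {
    if (bufs.len > out.len) return error.TooManyBuffers;
    var i: usize = 0;
    while (i < bufs.len) : (i += 1) {
        // Each slice is copied field by field; no cast between layouts is involved.
        out[i] = .{ .base = bufs[i].ptr, .len = bufs[i].len };
    }
    return out[0..bufs.len];
}

test "slices are converted without allocating" {
    const bufs = [_][]const u8{ "hello, ", "world" };
    var storage: [4]IoVec = undefined;
    const vecs = try toIoVecs(&bufs, &storage);
    try std.testing.expectEqual(@as(usize, 2), vecs.len);
    try std.testing.expectEqual(@as(usize, 7), vecs[0].len);
}
```

The explicit copy is the price of accepting slices; taking the platform's own vector type in the signature would avoid it, at the cost of leaking that type to callers.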

I would like to propose taking the idea of restructuring std.os even further and getting rid of it entirely (or at least hiding it from the user), similar to what the Rust stdlib does. Replace the exposed functions by having them in their own higher level structs that they act upon (File/Dir, Socket, Pipe, Process, Thread, Random, IoPoll, Memory, etc.) and not requiring everything to go through a posix-based API, as that can restrict and otherwise prohibit certain functionality from non-posix systems like Windows:

Make an std.os.posix which acts like the std.os.linux or partially the current std.os but only for posix systems while windows sticks to using std.os.windows. Then design the apis of the higher level structs based on the greatest common functionality for all stdlib supported OS instead of through the lens of posix only.

@ikskuh
Contributor

ikskuh commented Oct 7, 2020

@kprotty i fully support this! Nothing more to add

@katesuyu
Contributor

katesuyu commented Oct 7, 2020

I completely agree with this issue. Even Rust isn't a great example of how to do things, although we do fairly well at avoiding the most glaring flaw of Rust's OS abstractions at the moment: not supporting widechar APIs for Windows. I'm not sure we do this everywhere but I've seen Z and W functions enough to be fairly satisfied.

Also, relying on the existence of a given compatibility layer on every platform, even if you restrict it to things you can currently shim over with reasonable precision, has proven to be a huge mistake for stabilized language ecosystems. For example, the introduction of WASI as a target has made the Rust standard library shim over a huge amount of details in a quite inefficient manner, because their APIs and API consumers are hard-dependent on absolute paths and/or being able to access files and directory paths without an existing directory handle. Zig mostly works with directory and file handles already, and supports iterating over WASI preopens instead of pretending paths like /abc or ./abc are even remotely coherent on WASI.

In essence, although plenty of things can be reasonably supported in a cross-platform manner, you cannot commit yourself to such a specific standard like POSIX and expect it to work flawlessly everywhere and fit every use case. std.os should be native rather than trying to stretch itself over unnatural targets, and features elsewhere in the standard library that depend on OS-specific details should not promise features that are impossible to implement correctly for every target. They should be exposed, just not with the proposition that they will work anywhere. A perfect example of this is std.fs.cwd(), which makes the correct choice by emitting a compile error on WASI instead of attempting to shim a concept that is wholly inapplicable to WASI.

@jayschwa
Contributor

jayschwa commented Oct 7, 2020

I think the standard library has too much if wasi, else if windows, else posix logic sprinkled throughout it. I can't tell if this proposal would help or hurt that.

My half-baked hot take: a namespace like std.fs shouldn't know what an operating system is. Instead, it should provide a conservative interface for file-like objects that other namespaces (OS or otherwise) can satisfy. io.Reader and io.Writer are good (albeit simpler) examples of what I mean.

This could be layered. If I'm writing a package that allows Zig to target "Jayschwa OS", I could choose to satisfy the std.fs interface directly. Or instead, I could choose to satisfy the lower-level POSIX interface (how BYOS kind of works now) and get its support for std.fs (and probably other interfaces) for free.
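A minimal sketch of that layering, under stated assumptions: `FileSystem`, `JayschwaOs`, and `PosixLikeOs` are made-up names, and the bodies return canned values instead of touching a real file system. The high-level function uses the backend's direct implementation if one is declared and otherwise derives the answer from a lower-level, POSIX-shaped call:

```zig
const std = @import("std");

// Hypothetical helper: builds a tiny "fs" surface on top of whatever
// the target package provides.
fn FileSystem(comptime Backend: type) type {
    return struct {
        pub fn fileSize(path: []const u8) u64 {
            if (@hasDecl(Backend, "fileSize")) {
                // The backend satisfies the high-level interface directly.
                return Backend.fileSize(path);
            } else if (@hasDecl(Backend, "stat")) {
                // Otherwise derive it from the lower-level, POSIX-shaped call.
                return Backend.stat(path).size;
            } else {
                @compileError("backend provides neither fileSize nor stat");
            }
        }
    };
}

// Two made-up targets with canned answers (the "size" is just the path length).
const JayschwaOs = struct {
    pub fn fileSize(path: []const u8) u64 {
        return path.len;
    }
};

const PosixLikeOs = struct {
    pub const Stat = struct { size: u64 };

    pub fn stat(path: []const u8) Stat {
        return .{ .size = path.len };
    }
};

test "both backends satisfy the same interface" {
    try std.testing.expectEqual(@as(u64, 4), FileSystem(JayschwaOs).fileSize("/abc"));
    try std.testing.expectEqual(@as(u64, 4), FileSystem(PosixLikeOs).fileSize("/abc"));
}
```

Because the choice is made with `@hasDecl` at compile time, a backend that offers neither entry point fails to build instead of failing at run time.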

@katesuyu
Contributor

katesuyu commented Oct 7, 2020

@jayschwa I agree that std.fs should allow overriding the basic file-like objects directly, however I disagree that the switches are bad. Referring to specific concrete APIs, as appropriate, massively improves the readability and navigability of the codebase. Deferring to a duck-typed platform abstraction layer is the real issue. Switches avoid forcing readers of the code to sift through a heap of files to find how each platform implements the same (usually simple) function, and avoid code duplication when multiple platforms are handled by the same switch arm.

Sticking the implementation of everything in separate platform-specific modules is what makes the Rust standard library (and many Rust crates that copy the standard library's organization) horrific to navigate, and the more layers of this you have, the more you end up like our present day std.os (which has already fallen for this trap: see const system and all the conditional usingnamespace declarations).

BYOS should be split up into more discretely implementable interfaces, but there should be no internal abstraction layer: std.fs should keep its switch statements, but allow its structures and functions to be overridden on a case-by-case basis. And specific APIs should not have implementations on platforms where this is impractical!

@Rocknest
Contributor

Rocknest commented Oct 7, 2020

I like the idea of a zig-style posix layer that unifies error codes, checks if arguments are correct, takes care of version prefixed/extended/safe apis. This is a must-have until zig's own higher level apis provide all possible interactions with the os.
However POSIX as a fit-any-os api is obsolete: existing "posix-compatible" oses continue to diverge from it, and some posix abstractions prevent efficient software, so most new oses are not compatible with posix.

I think std.os.posix should exist mostly how it is now, however without any fallbacks, so if i want to use a posix api i must check first if it exists: std.os.posix.has("name"). Maybe we should have a common place for fallbacks that std uses, named std.os.@"$internal$" (or other not trivially accessible name), for example copy_file_range would be moved there.
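One way such a check could work is sketched below. The `posix` struct here is a local stand-in with a canned `getpid`, and `has` is a hypothetical helper built on the `@hasDecl` builtin; nothing in this block is an existing std API:

```zig
const std = @import("std");

// Stand-in for a fallback-free posix namespace: only what the target
// really provides is declared. The body is a canned value for the sketch.
const posix = struct {
    pub fn getpid() i32 {
        return 42;
    }
};

// Hypothetical `has`: true only if the namespace declares `name`.
fn has(comptime name: []const u8) bool {
    return @hasDecl(posix, name);
}

// A caller that degrades gracefully when the api is missing.
fn currentPid() ?i32 {
    if (comptime has("getpid")) {
        return posix.getpid();
    } else {
        return null;
    }
}

test "presence is decided at compile time" {
    try std.testing.expect(has("getpid"));
    try std.testing.expect(!has("copy_file_range"));
    try std.testing.expectEqual(@as(?i32, 42), currentPid());
}
```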

@squeek502
Collaborator

squeek502 commented Oct 8, 2020

I've run into confusion regarding this as well, e.g. with realpath (#4658 (comment)):

(note that fs.realpath is just an alias for os.realpath so the following applies to os.realpath)

Right now, the status quo is:

  • On Linux, std.fs.realpath resolves symlinks before "..". This matches the system behavior as far as I can tell (cat link/../file will output the contents of linked/../file and fail if it doesn't exist), although there seem to be some edge cases (from my testing, if link is a symlink to linked, then cd link/../dir takes you to ./dir if it exists [ignoring the symlink], otherwise it will take you to linked/../dir [resolving the symlink before ..]).

  • On Windows, std.fs.realpath resolves .. before symlinks. This matches the system behavior as far as I can tell (cd link\..\dir will never take you to linked\..\dir; if .\dir does not exist, it will fail with "The system cannot find the path specified.". Same deal with type link\..\file). Please correct me if I'm wrong on this, but I'm not even sure if there exists a function in the Windows API that resolves symlinks before "..". daurnimator has mentioned that Zig might need something like RtlDosPathNameToRelativeNtPathName_U_WithStatus but from using the test code at the bottom of this article, that function does not resolve symlinks before .. either. If there is precedent for Linux-like symlink resolution on Windows, it would be helpful to get that information added to std.fs.realpath bugs/inconsistencies on Windows #4658

It's unclear to me what behavior is 'correct,' and currently it feels like both std.fs and std.os are not very clear about how/if they should handle platform-specific behavior.

alexnask added the proposal and standard library labels Oct 8, 2020
@qgcarver

qgcarver commented Oct 8, 2020

If we go through with this proposal, are we making a decision on #1840 ?

Writing platform code for Windows is pretty shaky right now, and whether or not we rename it to posix is related. #5037 #4426

Edit: to clarify, #1840 'prefers' ntdll, but #4426 is on shakier ground, no proposal has been accepted.

@Rocknest
Contributor

Rocknest commented Oct 8, 2020

@squeek502 in my opinion all platform dependent behaviour in std.fs and other high level apis is a bug. For example in the case of realpath, on windows it should be emulated with a few more syscalls, however if some behaviour is unemulatable then it should return error.Unsupported.

@andrewrk andrewrk added this to the 0.8.0 milestone Oct 9, 2020
@ron-wolf

I arrived here via the Zig English Telegram group. Here’s a related essay (found on Lobste.rs) about Go’s somewhat similar issues when compared with Rust: “Early Impressions of Go from a Rust Programmer”

@fivemoreminix

@squeek502 in my opinion all platform dependent behaviour in std.fs and other high level apis is a bug. For example in the case of realpath, on windows it should be emulated with a few more syscalls, however if some behaviour is unemulatable then it should return error.Unsupported.

IMO only common behavior which is known to function fine on most operating systems should be exposed in a high-level interface. There shouldn't be a need for an error.Unsupported when the high-level wrapper over OS functions implements interfaces to common OS functions like File.open(self, URI, mode), File.close(self) etc. OS equivalence should be chosen at compile time, to provide compile-time errors for impossible instructions.

When a user (of the OS library) requires an OS-specific API, they can get it from that OS-specific library. Fits with Zig zen of communicating intent precisely and preferring compile errors opposed to runtime crashes.

@andrewrk
Member

andrewrk commented Oct 27, 2020

I generally support this. That said, std.os currently has 5,657 lines of code, and it's useful logic that needs to exist somewhere. So I'm not sure what it would look like to slap an "accept" label on this and then start implementing it. But I would be in favor of and supportive of self-contained patches that start moving the standard library towards @LemonBoy's vision, provided that we have clear upgrade paths for the existing use cases we support.

@LemonBoy Here are some questions I have about your vision. And note these are not arguments against it; my intent is to help move your project forward.

  • What does it look like when I want to write zig code that calls execve and I don't care about Windows or other operating systems that don't support it? Currently that looks like calling std.os.execve. In C, it looks like #include <unistd.h> and calling execve. Same question for: pipe, waitpid, kevent, and a few more.
  • What about the current pattern of std.time, std.fs, std.process, std.net which are the high level operating system abstractions? How will these be affected by this?
  • How does your vision handle the (underrated IMO) feature of calling libc functions when libc is linked but not otherwise?
  • Can you incorporate the Bring-Your-Own-Operating-System-Layer idea into this vision?

Also just to clarify - nothing in the std lib is required to go through std.os as a "lower level API". That was only ever done when it made sense in terms of code organization. It just happens to be a convenient place to put a bunch of abstractions.

Can we come up with some clear bullet points which act as guidelines for creating patches towards the goal of this issue? Maybe an example patch that incrementally moves the std lib towards this?

(edit: also let's wait until 0.7.0 is tagged before starting to work on this, seems like a lot of breaking changes)

@LemonBoy
Contributor Author

What does it look like when I want to write zig code that calls execve and I don't care about Windows or other operating systems that don't support it? Currently that looks like calling std.os.execve. In C, it looks like #include <unistd.h> and calling execve. Same question for: pipe, waitpid, kevent, and a few more.

The idea is for std.os.<os name> to contain all the syscall/libc wrappers with no sugar coating: if a piece of functionality is not available there's no fallback implementation, and if the host doesn't support a given syscall the wrapper returns error.Unsupported.
This will be the cornerstone for all the platform-specific code.

If a os.posix is still wanted it will only re-export a few fns from the std.os.<os name> namespace.
Again, no cross-platform abstractions here, it should simply act as a convenience layer for the advanced user.

What about the current pattern of std.time, std.fs, std.process, std.net which are the high level operating system abstractions? How will these be affected by this?

The high-level abstractions build upon std.os.<os name> primitives and will bend over backwards to perform a given operation across all the different platforms. Whether to put those abstractions in std.<module> (I love go's approach, they have eg. time_unix and time_windows to keep the different implementations separate and avoid the ir.cpp effect :P) or in the newly-vacated std.os is an open question.

How does your vision handle the (underrated IMO) feature of calling libc functions when libc is linked but not otherwise?

As stated in the first point std.os.<os name> will either syscall or call the libc equivalent (and convert the error codes).
The transparent libc/syscall switch comes with its own set of problems such as potential ABI incompatibilities (eg. the stat structure used by the kernel is likely to be different from the one used by the libc, especially after all the Y2038 changes) and subtle differences in behaviour (eg. #1337).

Can you incorporate the Bring-Your-Own-Operating-System-Layer idea into this vision?

Well this requires a bit of thought; the easiest solution would be to let the high-level abstractions mentioned above add an extra arm in the platform-switching code.
Eg. in std.fs (or std.os) we may want something like this:

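// openFile is fn (dir_handle: OSHandle, path: []const u8, options: OpenFileOptions) OSHandle
const openFile = switch (builtin.os.tag) {
    // ... plus the other Unix-like targets
    .linux, .openbsd, .netbsd => openFileUnix,
    .windows => openFileWindows,
    // rootOs is the user-supplied (BYOS) implementation
    .freestanding => rootOs.openFile,
    else => @compileError("openFile: unsupported OS"),
};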

But what if the OS you're targeting has no openat semantics? Well you're screwed, unless you restrict the dir_handle parameter to be the cwd... but what if the OS has no concept of cwd?

CC @IridescentRose as the PSP libc suffers from the lack of openat semantics.

You can see that trying to accommodate every single use case means that we're painting ourselves into a corner and will have to target an unknown minimal set of features when designing the interface APIs.

This idea proposed by @jayschwa is interesting on paper as it moves part of the complexity out of the stdlib and back into the BYOS implementer's court.

This could be layered. If I'm writing a package that allows Zig to target "Jayschwa OS", I could choose to satisfy the std.fs interface directly. Or instead, I could choose to satisfy the lower-level POSIX interface (how BYOS kind of works now) and get its support for std.fs (and probably other interfaces) for free.

Can we come up with some clear bullet points which act as guidelines for creating patches towards the goal of this issue? Maybe an example patch that incrementally moves the std lib towards this?

Well the first step would be defining the set of high-level abstractions to build and define their behaviour and how to implement it on Linux/BSD/Windows/Wasi.
For example let's focus on a stat replacement:

const FileInfo = struct {
    size: u64,
    kind: enum {
        Directory,
        Regular,
        Pipe,
        // ...
    },
    mod_time: SomeCoolY38KTimestamp,
    access_time: SomeCoolY38KTimestamp,
    // permissions (as bools? as bitmap + accessors?)
    // file mode
    // a tagged union could be added to hold all the other platform-specific infos
};
fn getFileInfoByHandle(handle: OSHandle) FileInfoError!FileInfo {
    // implement with statx on linux (or stat64 if not available)
    // implement with stat on Darwin/BSD
    // implement with GetFileInformationByHandleEx on Windows
    // implement with path_filestat_get on Wasi
}

And voilà, we got a nice ergonomic (let's make extensive use of enums and getters) cross-platform abstraction that's even 2038-compliant!
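To show how the tagged union mentioned in the comments could sit next to the portable fields, here is a self-contained sketch. All names (`FileKind`, `PlatformInfo`, `isReadOnly`, `mod_time_ns`) and the choice of an `i128` nanosecond timestamp are assumptions made for illustration, not part of the proposal:

```zig
const std = @import("std");

const FileKind = enum { directory, regular, pipe, other };

// Platform-specific details ride along in a tagged union, so portable
// code can ignore them and platform code can switch on them.
const PlatformInfo = union(enum) {
    none,
    unix: struct { mode: u32, uid: u32, gid: u32 },
    windows: struct { attributes: u32 },
};

const FileInfo = struct {
    size: u64,
    kind: FileKind,
    // Nanoseconds since the Unix epoch; i128 is an assumption chosen
    // here so the value cannot overflow in 2038.
    mod_time_ns: i128,
    platform: PlatformInfo,

    // Getter that answers a portable question from platform-specific
    // data, or null when no such data was collected.
    pub fn isReadOnly(self: FileInfo) ?bool {
        switch (self.platform) {
            .none => return null,
            // No write bit set for user, group, or other.
            .unix => |u| return (u.mode & 0o222) == 0,
            // 0x1 is FILE_ATTRIBUTE_READONLY.
            .windows => |w| return (w.attributes & 0x1) != 0,
        }
    }
};

test "portable fields plus optional platform details" {
    const info = FileInfo{
        .size = 10,
        .kind = .regular,
        .mod_time_ns = 0,
        .platform = .{ .unix = .{ .mode = 0o444, .uid = 1000, .gid = 1000 } },
    };
    try std.testing.expect(info.isReadOnly().?);
    try std.testing.expectEqual(FileKind.regular, info.kind);
}
```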

@andrewrk
Member

OK thanks for clarifying the vision! I can see how this would work going forward.

@rhencke

rhencke commented Nov 1, 2020

@LemonBoy You may find it interesting to look at how SQLite does it, too. I believe it's similar in spirit to your proposal. (But it's also hot-pluggable with support for custom implementations and multiple implementations at once which is pretty neat)

@heidezomp
Contributor

@LemonBoy Do you already have an idea of how the high-level abstraction will deal with OS-specific errors? Specifically, should there be error values that will only be returned on specific OSes (like the current NetworkSubsystemFailed error that is only returned on Windows), or should the high-level abstraction abstract away the OS-specific errors as well?

@LemonBoy
Contributor Author

LemonBoy commented Nov 3, 2020

You may find it interesting to look at how SQLite does it, too. I believe it's similar in spirit to your proposal. (But it's also hot-pluggable with support for custom implementations and multiple implementations at once which is pretty neat)

Yep, the idea is to build a similar (but wider) cross-platform abstraction in the stdlib so that not every app/library has to invent their own.

Do you already have an idea of how the high-level abstraction will deal with OS-specific errors?

If error unions had values we could just add an OSError: <error-type> and let the caller handle the weird cases.
No idea about errors yet: grouping them loses part of the information they carry, while on the other hand having enormous error sets is not a pleasant experience for the caller, who has to pick what to handle and what to rethrow.

@ghost

ghost commented Jan 11, 2021

If I'm writing a new OS, I'm not even going to think about bugging Zig proper to upstream support for it until it's well and truly done (not that we would or should even consider it), so until then I'll be dependent on BYOOS for development. If BYOOS is optimised for POSIX, that's going to push me to make it at least POSIX-like. In my eyes, that's discouraging experimentation, and entrenching half-century-old ideas.

I think the interface exposed by os, in std as well as the RSF, should be a bit higher-level, so as to abstract over as-yet-unforeseen system designs. That is, it provides a basic "standard library-esque" interface to spawning threads, creating processes, reading and writing files etc., rather than attempting to exactly provide all system or library calls for the platform. That is, rather than the standard library calling into os for low-level operations within its own high-level logic, os itself would provide the basic logic, and the standard library would provide extra patterns, functionality or ergonomics. (Note it would still be possible to provide a POSIX layer to depend on if such a thing were desired, by including a posix member in os; such a member would of course be included in the standard library's own os.) This way, entirely new classes of systems could be integrated with no legacy and no language contortion.

Apologies if this has already been said, it's 4am and my focus isn't great at the best of times.

matu3ba added a commit to matu3ba/zig that referenced this issue Feb 25, 2023
- Alphabetically sort things, where reasonable.
- Document, that **only** non-portable posix things belong into posix.zig
  * If there is a portable abstraction, do not offer one in posix.zig
  * Reason: Prevent useless abstractions and needless strong coupling.
- Move posix functions into posix.zig
- Move wasi functions into wasi.zig

Closes ziglang#6600.
matu3ba added a commit to matu3ba/zig that referenced this issue Mar 3, 2023
- Alphabetically sort things, where reasonable.
- Document, that **only** non-portable posix things belong into posix.zig
  * If there is a portable abstraction, do not offer one in posix.zig
  * Reason: Prevent useless abstractions and needless strong coupling.
- Move posix functions into posix.zig
- Move wasi functions into wasi.zig

Closes ziglang#6600.
matu3ba added a commit to matu3ba/zig that referenced this issue Mar 4, 2023
- Alphabetically sort things, where reasonable.
- Document, that **only** non-portable posix things belong into posix.zig
  * If there is a portable abstraction, do not offer one in posix.zig
  * Reason: Prevent useless abstractions and needless strong coupling.
- Move posix functions into posix.zig
- Move wasi functions into wasi.zig

Closes ziglang#6600.
matu3ba added a commit to matu3ba/zig that referenced this issue Mar 7, 2023
- Alphabetically sort things, where reasonable.
- Document, that **only** non-portable posix things belong into posix.zig
  * If there is a portable abstraction, do not offer one in posix.zig
  * Reason: Prevent useless abstractions and needless strong coupling.
- Move posix-only functions into posix.zig, which have either incompatible
  or more extensive execution semantics than their counterparts and can be
  grouped into
  * File permission system
  * Process management
  * Memory management
  * IPC
  * Signaling

Work on ziglang#6600.
matu3ba added a commit to matu3ba/zig that referenced this issue Mar 8, 2023
- Alphabetically sort things, where reasonable.
- Document, that **only** non-portable posix things belong into posix.zig
  * If there is a portable abstraction, do not offer one in posix.zig
  * Reason: Prevent useless abstractions and needless strong coupling.
- Move posix-only functions into posix.zig, which have either incompatible
  or more extensive execution semantics than their counterparts and can be
  grouped into
  * File permission system
  * Process management
  * Memory management
  * IPC
  * Signaling

Work on ziglang#6600.
matu3ba added a commit to matu3ba/zig that referenced this issue Mar 12, 2023
- Alphabetically sort things, where reasonable.
- Document, that **only** non-portable posix things belong into posix.zig
  * If there is a portable abstraction, do not offer one in posix.zig
  * Reason: Prevent useless abstractions and needless strong coupling.
- Move posix-only functions into posix.zig, which have either incompatible
  or more extensive execution semantics than their counterparts and can be
  grouped into
  * File permission system
  * Process management
  * Memory management
  * IPC
  * Signaling

Work on ziglang#6600.
@blblack
Contributor

blblack commented Mar 8, 2024

As someone with a background in writing systems-level software (as in network daemons and such) in the traditional *nix/POSIX/C world, and who loves the idea of Zig-the-language as a C-the-language replacement for both porting old and developing new systems-level software, I'd like to offer an opinionated take on the current state of affairs, the reasonable stuff I've seen outlined above, and a few opinions of my own. Maybe this can at least restart the debate process towards an implementable outcome everyone can aim towards.

Keep in mind I'm relatively-new to Zig itself and still finding my way around. If I've straight up misunderstood something, please let me know!

Current state of affairs in master

  • The first ~80 lines or so of lib/std/os.zig, up through the definition of pub const system and pub const use_libc, basically pull all the std.c + std.os.somesystem stuff together in a conditional way that results in a std.os.system that will vary based on platform and whether libc is linked, and does a pretty smart job. The std.os.system we end up with here has no portability standards, it's just a convenient namespace to hold the non-portable raw interfaces of whatever platform/libc-ness we happen to be building on. This seems like a Good Thing.
  • The rest of lib/std/os.zig is in a somewhat messy state. The vast majority of it seems to be POSIX interfaces that either try to be inclusive of non-POSIX platforms through some kind of emulation, and/or have conditionals to exclude them with some kind of if (builtin.os.tag == .windows) @compileError("Unsupported OS"); sort of thing.
  • Even in the case that the underlying platform is POSIXy, these interfaces tend to do a few things which some might consider to be a bit beyond the most basic way things could be done:
    • They clean up errno-based return statuses into proper Zig errors, mostly while still preserving the ability for calling code to discern the difference for logical reasons.
    • They sometimes have a very different set of arguments than the natural POSIX/C calls would have had. std.os.nanosleep() is a good example. A faithful "real" POSIX interface to this, with errnos made into Errors, would look something like: nanosleep(req: *const timespec, rem: ?*timespec) NanosleepError!void, but the current os.zig variant has the interface: nanosleep(seconds: u64, nanoseconds: u64) void.
    • Relatedly they also make some behavioral abstraction changes vs real POSIX. The best example is that it's very common for many of them to automatically retry themselves on EINTR before returning to the caller. Once again, nanosleep() is a great example of this, and it's what allows the interface divergence above to work at all.
  • Then to bring all this together at the end, we have this going on in std/std.zig, which basically aliases all the os.zig stuff to the namespace std.posix as well:
/// POSIX-like API layer.
pub const posix = @import("os.zig");
/// Non-portable Operating System-specific API.
pub const os = @import("os.zig");
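The behavioral difference described above (a faithful wrapper that surfaces the interruption versus a convenience wrapper that retries) can be sketched without touching a real syscall. Everything below is hypothetical: `rawSleep` is a fake that fails a configurable number of times, and the two wrappers only illustrate the two interface styles, not the actual os.zig code:

```zig
const std = @import("std");

const SleepError = error{Interrupted};

// Fake "syscall" for the sketch: reports an interruption for the first
// `interruptions_left` calls, then succeeds. `calls` counts invocations.
var interruptions_left: u32 = 0;
var calls: u32 = 0;

fn rawSleep() SleepError!void {
    calls += 1;
    if (interruptions_left > 0) {
        interruptions_left -= 1;
        return error.Interrupted;
    }
}

// Faithful flavour: the caller sees the interruption and decides what to do.
fn sleepOnce() SleepError!void {
    return rawSleep();
}

// Convenience flavour: hides the interruption by retrying until success,
// which is why it can return plain void.
fn sleepRetrying() void {
    while (true) {
        rawSleep() catch |err| switch (err) {
            error.Interrupted => continue,
        };
        return;
    }
}

test "the retrying wrapper swallows what the faithful one reports" {
    interruptions_left = 2;
    calls = 0;
    try std.testing.expectError(error.Interrupted, sleepOnce());
    sleepRetrying();
    try std.testing.expectEqual(@as(u32, 3), calls);
}
```

The retrying flavour can always be built on top of the faithful one, but not the other way around, which is one argument for keeping the lower layer faithful.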

Opinions and what I think might be a reasonable capture or at least interpretation of current consensus, maybe?

  • Overall, I agree with the gist of @LemonBoy's initial proposal and followup responses. I do not think POSIX, even conceptually, should be some unifying fundamental abstraction that Zig std aims to interject between all the low-level OS interfaces for all the platforms it supports and the higher-level abstractions within std. The platforms are not all natively POSIX, emulations tend to be poor/leaky/broken abstractions at best, etc.
  • I do think we should have a POSIX abstraction layer for those platforms which are POSIXy, which makes no real attempt to support the non-POSIXy platforms. Its goal should be to make life easier for both systems-level application software that needs to call those interfaces across the POSIXy platforms, and to make it easier to write the POSIX platform support for higher-level abstractions in places like std.fs, std.time, etc (which also support the other platforms via non-POSIX pathways). I think this should live in a true lib/std/posix.zig, separately from lib/std/os.zig. Move most of the POSIX stuff in os.zig out to posix.zig and stop aliasing them.
  • I think if there are cases where std could use a generic non-POSIX wrapper for something that covers portability across most/all platforms, that might belong properly belong directly in the new lib/std/os.zig, like @LemonBoy 's example of "writeMany implemented with WriteFileGather and pwritev".
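
As a rough sketch of the split described in the bullets above: a POSIX-only layer that refuses to compile elsewhere, next to a portable wrapper that picks a native backend per platform. The helper names (isPosixLike, writeManyBackend, requirePosix) and the exact list of POSIXy targets are assumptions made for illustration, and the example only selects a backend rather than performing any I/O:

```zig
const std = @import("std");
const builtin = @import("builtin");
const Tag = std.Target.Os.Tag;

/// Hypothetical classification; which targets count as "POSIXy" would be
/// a project decision, and this list is only an assumption.
fn isPosixLike(tag: Tag) bool {
    return switch (tag) {
        .linux, .macos, .freebsd, .netbsd, .openbsd, .dragonfly => true,
        else => false,
    };
}

/// Native primitives a portable `writeMany` could sit on top of.
const WriteManyBackend = enum { pwritev, WriteFileGather };

/// Portable layer (the role suggested for the new os.zig): choose a native
/// backend per platform instead of emulating POSIX everywhere.
fn writeManyBackend(tag: Tag) ?WriteManyBackend {
    if (tag == .windows) return .WriteFileGather;
    if (isPosixLike(tag)) return .pwritev;
    return null;
}

/// POSIX-only layer (the role suggested for posix.zig): fail at compile
/// time on other targets rather than attempt an emulation.
fn requirePosix(comptime tag: Tag) void {
    if (comptime !isPosixLike(tag)) {
        @compileError("POSIX layer is unavailable on " ++ @tagName(tag));
    }
}

pub fn main() void {
    // Compiles; requirePosix(.windows) would be rejected at compile time.
    requirePosix(.linux);

    const backend = writeManyBackend(builtin.os.tag);
    const name: []const u8 = if (backend) |b| @tagName(b) else "none";
    std.debug.print("writeMany backend on this host: {s}\n", .{name});
}
```

The point of the sketch is only the shape: higher-level code such as std.fs would call the portable wrapper, while code that wants POSIX semantics opts into the POSIX-only layer and accepts that it does not build on other targets.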

@blblack
Contributor

blblack commented Mar 8, 2024

Further bits (which perhaps belong in a separate proposal?)

Assuming the above seems reasonable, I do have some other thoughts beyond that which might merit either further debate here or perhaps a separate proposal:

  • I think this std.posix abstraction layer should stay as faithful to the basic POSIX way of doing things as it can. It shouldn't be fundamentally changing the inputs and outputs and behaviors (esp EINTR!) in the way that os.nanosleep() currently does. It's going to be std abstraction authors and developers writing systems software that use them, and neither camp probably wants to be hampered by excess abstractions/limitations in this layer.
  • I do think it should do just a little bit more than merely a direct call to os.system.foo() (which you could already do without a std.posix): (a) very light portability work, where required, across the various POSIXy platforms, to paper over minor divergences and (b) translating errno-like errors to distinct real Zig Errors like most of them do today (so that traditional errno-handling logic can still be applied to the various Error cases by the caller).
  • In this way std.posix serves the same basic purpose as a generic libc implementation wrapper does on current POSIX systems in the C world (and no purpose at all on non-POSIXy platforms). From the view of a zig user writing systems software, anywhere they called some libc or syscall interface in a daemon written in C, they'd call either std.posix.foo() (basic usage of common POSIXy interfaces with Zig types and Zig Errors) or either of std.os.system.foo() or std.os.linux.foo() when they know they're using less-portable interfaces from platform-conditional code for a reason and don't mind interpreting errno and such all on their own.
  • I think cases like the auto-EINTR-retrying nanosleep code could live either in the new os.zig (under some other function name to avoid confusion, perhaps), or be something that's handled when necessary in higher-level abstractions up in places like std.time.
  • As for the argv debate: probably std.os.argv shouldn't exist, but std.posix.argv should exist and work only on POSIXy platforms that have a native argv.

@blblack
Contributor

blblack commented Mar 19, 2024

Note very-related work ongoing in #19354

@andrewrk andrewrk removed the accepted This proposal is planned. label Mar 19, 2024
@andrewrk andrewrk changed the title A hate letter to std.os Eliminate the POSIX compatibility layer Mar 19, 2024
@andrewrk andrewrk changed the title Eliminate the POSIX compatibility layer Eliminate the POSIX API layer Mar 19, 2024
@andrewrk andrewrk added breaking Implementing this issue could cause existing code to no longer compile or have different behavior. accepted This proposal is planned. and removed accepted This proposal is planned. labels Mar 19, 2024