Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std.os.foo functions return usize error codes instead of error unions #17870

Open
shmuelamit opened this issue Nov 5, 2023 · 5 comments
Open

Comments

@shmuelamit
Copy link

Note: There was a small previous discussion I posted on Ziggit regarding this matter. I got recommended to file an issue to discuss this potential change.

I tried writing some code that used Linux-specific syscalls (specifically mount) and was surprised that the implementation of these syscalls returned actual usize integers as return codes and not zig error-sets as with most other functions in std. I will try and present my objections, and hope someone will refute them.
I tried looked into it and found #2380 which reorganized std.os, and it states:

In Zig-flavored POSIX [what is now std.os], errno is not exposed; instead actual zig error unions and error sets are used. When not linking libc on Linux there will never be an errno variable, because the syscall return code contains the error code in it. However for OS-specific APIs, where that OS requires the use of libc, errno may be exposed, for example as std.os.darwin.errno().

I personally do not understand the reasons for the choice of making std.os.linux syscall return a number instead of an error set, and would like someone to explain to me why it makes sense. It degrades syscall error handling to a C-style error handling style. And so for system-specific syscalls Zig shares the faults of the C error handling system (or at least up to ignoring return values):

// C raw syscalls
int main() {
    int fd = open("does_not_exist/foo.txt", O_CREAT);
    write(fd, "hi", 2);
    close(fd);
    return 0;
}
// Zig raw syscalls
pub fn main() !void {
    var fd: isize = @bitCast(linux.open("does_not_exist/foo.txt", 0, linux.O.CREAT));
    _ = linux.write(@truncate(fd), "hi", 2);
    _ = linux.close(@truncate(fd));
}

This style of course half-implicitly ignores any errors returned from these without any try keywords. This leads people who write code which heavily uses OS-specific interfaces to write a wrapper file with error-set wrappers like this one with code that should honestly be in std.
Another problem created which one might notice in the above code is the current interface creates a lot of redundant casting, it would make more sense for std.os.linux.open to return an i32 type with an error set instead of a usize.
A supposedly correct program which at least panics on errors (not even gracefully handles cases) from these functions will look like this:

const std = @import("std");
const linux = @import("std").os.linux;

inline fn panic_on_err(value: usize) void {
    if (linux.getErrno(value) != .SUCCESS) {
        std.debug.panic("There was _some_ error, but to know which enum values to check we must consult manpages! {}", .{linux.getErrno(value)});
    }
}

pub fn main() !void {
    var ret = linux.open("does_not_exist/foo.txt", 0, linux.O.CREAT);
    panic_on_err(ret);

    var fd: i32 = @truncate(@as(isize, @bitCast(ret)));
    ret = linux.write(fd, "hi", 2);
    panic_on_err(ret);

    ret = linux.close(fd);
    panic_on_err(ret);
}

Which, for me feels too unziggy. I would like to understand the reasons for this design which (for me at least) feels like it makes it harder to write code which uses syscalls and propose that a change should be made to these functions.

@squeek502
Copy link
Collaborator

squeek502 commented Nov 5, 2023

One complication is that the full set of errors from a syscall is practically unknowable, and may change based on the use-case. Because of this, 'forcing' users of direct syscall functions to handle the open-endedness of the return value is a slight benefit of how things are currently IMO.

I would also point out that the intended way to use the direct syscall functions is to use getErrno and switch on the resulting enum, and this is how they are used throughout the Zig standard library. For example:

zig/lib/std/os.zig

Lines 1499 to 1523 in e9a6197

const rc = open_sym(file_path, flags, perm);
switch (errno(rc)) {
.SUCCESS => return @as(fd_t, @intCast(rc)),
.INTR => continue,
.FAULT => unreachable,
.INVAL => unreachable,
.ACCES => return error.AccessDenied,
.FBIG => return error.FileTooBig,
.OVERFLOW => return error.FileTooBig,
.ISDIR => return error.IsDir,
.LOOP => return error.SymLinkLoop,
.MFILE => return error.ProcessFdQuotaExceeded,
.NAMETOOLONG => return error.NameTooLong,
.NFILE => return error.SystemFdQuotaExceeded,
.NODEV => return error.NoDevice,
.NOENT => return error.FileNotFound,
.NOMEM => return error.SystemResources,
.NOSPC => return error.NoSpaceLeft,
.NOTDIR => return error.NotDir,
.PERM => return error.AccessDenied,
.EXIST => return error.PathAlreadyExists,
.BUSY => return error.DeviceBusy,
else => |err| return unexpectedErrno(err),
}

Also worth noting that the same pattern (switch on enum) applies to Windows APIs, see the File.stat implementation for example:

zig/lib/std/fs/file.zig

Lines 390 to 400 in e9a6197

const rc = windows.ntdll.NtQueryInformationFile(self.handle, &io_status_block, &info, @sizeOf(windows.FILE_ALL_INFORMATION), .FileAllInformation);
switch (rc) {
.SUCCESS => {},
// Buffer overflow here indicates that there is more information available than was able to be stored in the buffer
// size provided. This is treated as success because the type of variable-length information that this would be relevant for
// (name, volume name, etc) we don't care about.
.BUFFER_OVERFLOW => {},
.INVALID_PARAMETER => unreachable,
.ACCESS_DENIED => return error.AccessDenied,
else => return windows.unexpectedStatus(rc),
}


Anyway, it might make sense to have essentially three layers. Using the open syscall as an example:

  • std.os.linux.syscall.open that performs the syscall and returns usize (like std.os.linux.open currently does)
  • std.os.linux.open that performs the syscall and returns an error set with 1:1 translation of the known possible errors and no interpretation (this doesn't really have any precedence in Zig currently AFAIK, since going from errno -> Zig error set currently always involves interpretation and decisions about what is/isn't unreachable)
  • std.os.open that returns an error set common to all supported platforms, and makes decisions about what is actually reachable (see also Eliminate the POSIX API layer #6600)

@shmuelamit
Copy link
Author

@squeek502 I think the interface you suggested is quite good, maybe renaming the namespace from syscall to raw will be better (after all they are both syscalls).

Because of this, 'forcing' users of direct syscall functions to handle the open-endedness of the return value is a slight benefit of how things are currently IMO.

However I am having trouble understanding this point, can you give me as an example a use case where one would like to manually switch between the usize options, instead of receiving all the options in an error set and using standard error handling? If there will be a 1-1 correspondence between errno and the error set it seems like it's impossible to get an unexpected return value (unless the OS is acting silly).

I think your proposal is good but I am failing to understand the use cases for std.os.linux.syscall.

@squeek502
Copy link
Collaborator

squeek502 commented Nov 6, 2023

I think your proposal is good but I am failing to understand the use cases for std.os.linux.syscall.

The thing I had in mind would be syscalls that can be used for many purposes, where each use case could have a different error set / take different parameter types / return different things. One example I could find of this type of thing is the ioctl syscall, which currently has one usage-specific wrapper in std.os:

zig/lib/std/os.zig

Lines 6941 to 6961 in e74ced2

pub const IoCtl_SIOCGIFINDEX_Error = error{
FileSystem,
InterfaceNotFound,
} || UnexpectedError;
pub fn ioctl_SIOCGIFINDEX(fd: fd_t, ifr: *ifreq) IoCtl_SIOCGIFINDEX_Error!void {
while (true) {
switch (errno(system.ioctl(fd, SIOCGIFINDEX, @intFromPtr(ifr)))) {
.SUCCESS => return,
.INVAL => unreachable, // Bad parameters.
.NOTTY => unreachable,
.NXIO => unreachable,
.BADF => unreachable, // Always a race condition.
.FAULT => unreachable, // Bad pointer parameter.
.INTR => continue,
.IO => return error.FileSystem,
.NODEV => return error.InterfaceNotFound,
else => |err| return unexpectedErrno(err),
}
}
}

And there's another usage within isatty:

zig/lib/std/os.zig

Lines 3275 to 3280 in e74ced2

const rc = linux.syscall3(.ioctl, fd, linux.T.IOCGWINSZ, @intFromPtr(&wsz));
switch (linux.getErrno(rc)) {
.SUCCESS => return true,
.INTR => continue,
else => return false,
}

which could be moved into its own IOCGWINSZ-specific wrapper.

This type of thing is much more relevant for Windows APIs, which can have very disparate and non-overlapping usages for the same function. See #14842 for an example of where a generic wrapper was removed in favor of direct syscalls (or more specific wrappers in the future).

Each caller may need to handle errors differently, different errors might be possible depending on the FILE_INFORMATION_CLASS, etc, and making a wrapper that handles all of those different use-cases nicely seems like it'd be more trouble than it's worth (FILE_INFORMATION_CLASS has 76 different possible values)

@shmuelamit
Copy link
Author

@squeek502 Actually I didn't know about those windows APIs, that's a good point. But what will be the problem for windows APIs like NtQueryInformationFile which you mentioned to just return an all-encompassing error set, in this example it will return an error set which corresponds to every possible value of NTSTATUS, I really didn't find a source which lists which error codes are returned from each function so we may as well assume all of them may be and let the users switch on the possible errors and use else for the default case.
A different procedure can be used for ioctl where ioctl will return an error only for EBADF, EFAULT, EINVAL, ENOTTY (which are universal for ioctl calls afaik) and let the user handle the returned value as an integer however they want. if the resulting integer is an errno then the user can use it as such and use the errno function to parse it. It actually makes logical sense to do 2 error checks for ioctl calls, one for knowing if the ioctl syscall itself failed or the underlying driver code.
Do these solutions seem satisfactory to you?

@squeek502
Copy link
Collaborator

squeek502 commented Nov 6, 2023

in this example it will return an error set which corresponds to every possible value of NTSTATUS

This is one reason I'm a bit skeptical of this approach. Such a function would not really provide much benefit over the current enum return, since the correct way to use it would always be a switch on the error--using try with such a function would basically be a footgun that would balloon your error set (and same with a switch that doesn't properly narrow the error set before returning).

The non-exhaustive-enum part of NTSTATUS and the fact that it can encode more than just a single number complicate things further.


Not sure if I fully follow your ioctl suggestion, think I'd need to see an example of that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants