Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std: introduce a thread-local CSPRNG for general use #7482

Merged
merged 9 commits into from
Dec 19, 2020
Merged

Conversation

andrewrk
Copy link
Member

std.crypto.random

  • cross platform, even freestanding
  • can't fail. on initialization for some systems requires calling
    os.getrandom(), in which case there are rare but theoretically
    possible errors. The code panics in these cases, however the
    application may choose to override the default seed function and then
    handle the failure another way.
  • thread-safe
  • supports the full Random interface
  • cryptographically secure
  • no syscall required to initialize on Linux (AT_RANDOM)
  • calls arc4random on systems that support it

std.crypto.randomBytes is removed in favor of std.crypto.random.bytes.

I moved some of the Random implementations into their own files in the
interest of organization.

stage2 no longer requires passing a RNG; instead it uses this API.

Closes #6704

cc @jedisct1

lib/std/start.zig Outdated Show resolved Hide resolved
lib/std/crypto/tlcsprng.zig Outdated Show resolved Hide resolved
@andrewrk
Copy link
Member Author

Failing test case in ReleaseSafe mode. ReleaseFast, ReleaseSmall, and Debug do not trip the segfault.

pub fn main() void {}
$ ./zig build-exe test.zig -OReleaseSafe
$ ./test
Segmentation fault (core dumped)

In GDB I see:

Program received signal SIGSEGV, Segmentation fault.
std.crypto.tlcsprng.init (seed=...) at /home/andy/Downloads/zig/lib/std/crypto/tlcsprng.zig:57
57	    csprng_state = std.crypto.core.Gimli.init(initial_state);
(gdb) disassemble 
Dump of assembler code for function std.start.posixCallMainAndExit:
=> 0x0000000000204c04 <+1172>:	vmovaps YMMWORD PTR fs:0xffffffffffffffc0,ymm0

This is why I added those @fence calls but they didn't affect the situation. If I scroll up I see

   0x0000000000204b68 <+1016>:	add    r13,QWORD PTR [rip+0x111d1]        # 0x215d40 <tls_image.3>

So it does look like the TLS code ran before the segfaulting line. This is also evident from strace:

$ strace ./test
execve("./test", ["./test"], 0x7ffdf6d6b550 /* 127 vars */) = 0
arch_prctl(ARCH_SET_FS, 0x215c50)       = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL, si_addr=NULL} ---
+++ killed by SIGSEGV (core dumped) +++
Segmentation fault (core dumped)

@andrewrk
Copy link
Member Author

Strangely enough, the failure seems to be caused by a95458a, although the segfault happens before that line.

@andrewrk
Copy link
Member Author

Issue fixed in f6d6709 - it was a long-standing bug with thread local storage not being properly aligned.

lib/std/start.zig Outdated Show resolved Hide resolved
lib/std/start.zig Outdated Show resolved Hide resolved
} else {
self.state.permute();
}
mem.set(u8, self.state.toSlice()[0..std.crypto.core.Gimli.RATE], 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Zeroing the state like this is.. weird?
Let's add a method to Gimli (and/or the random interface) so the caller doesn't touch the internal state?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is copied from

zig/lib/std/rand.zig

Lines 767 to 776 in ad05509

fn fill(r: *Random, buf: []u8) void {
const self = @fieldParentPtr(Gimli, "random", r);
if (buf.len != 0) {
self.state.squeeze(buf);
} else {
self.state.permute();
}
mem.set(u8, self.state.toSlice()[0..std.crypto.core.Gimli.RATE], 0);
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but don't you agree it'd be better to keep this implementation detail inside the implementation?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes!

@LemonBoy
Copy link
Contributor

Thread-safe but not fork-safe, that's the hardest (?) part.

@andrewrk
Copy link
Member Author

Thread-safe but not fork-safe, that's the hardest (?) part.

I do not recognize fork-safety as a valid use case. Do you?

@LemonBoy
Copy link
Contributor

I do not recognize fork-safety as a valid use case. Do you?

Yes? OpenSSL folks didn't think that could be a problem until they were hit by a nasty CVE...
Given Zig has a way of forking the current process (and C does too, people may use libraries written in Zig from other languages) I think this is a valid concern.

@semarie
Copy link
Contributor

semarie commented Dec 18, 2020

fork-safety is important. but it could be tricky to implement it on wide range of platform.

For posix+libc, it is possible to call pthread_atfork() to pass a function to reseed the internal state.
For others, maybe look how is defined _ARC4_ATFORK in libressl implementation

@andrewrk
Copy link
Member Author

andrewrk commented Dec 18, 2020

Please, can someone tell me what is a use case for an application to ever call fork() without calling execve() directly after?

fork() is OS developers thinking they are clever while avoiding the actual real use case of spawning a child process.

@semarie
Copy link
Contributor

semarie commented Dec 18, 2020

@andrewrk it is a common way to use fork() to have a server handling multiple clients with fully separated context (aka not using threads). a simple example would be Apache MPM prefork.

but I agree that the current state of art for fork based server is to reexec the binary after forking, as it "gives to each process a fresh & unique address space for ASLR, stack protector -- as protection against address space discovery attacks" (citing openbsd fork+exec strategy)

if you prefer to not implement it, I think it should be explicitly documented and a way to force the CSPRNG reseeding should be possible (I didn't check if it is already possible). Some crypto are just completely insecure if some random is reused.

@LemonBoy
Copy link
Contributor

The whole nginx architecture is organized in single-threaded processes that are repatey forked to take advantage of the multi-core processing capabilities.

The idea is to avoid the cross-thread synchronization overhead by giving each process an independent copy of the process address space.

@jedisct1
Copy link
Contributor

Hi,

I still have mixed feelings about using AT_RANDOM.

Looks like AT_RANDOM was mainly designed to randomize stack canaries. It may be better to not reuse a seed for different purposes in order to minimize the implications of a leak.

Zeroing the seed after we use it is great for forward security, however, it may not be far fetched to imagine that other parts of the application, in particular imported C code, may also use AT_RANDOM. Now, that code would use a static seed.

Instead of zeroing, we could overwrite it with the output of our own RNG. But does a scenario where AT_RANDOM is expected to return the same value before and after we modify it exist? Especially with Zig code being embedded in an application written in a different language?

A boring call to getentropy() (if libc is linked) or a simple getrandom syscall (which is actually equivalent) would bring more peace of mind. Sure, this requires a system call, but a single system call at initialization time is not that big of a deal.

There are very valid cases for fork() without exec().

  • Mitigation against speculative execution. Google proved that process-level isolation was the only viable mitigation.
  • Snapshotting. For example, Pincaster and Redis (not sure about current versions, but it definitely used to work that way) can dump a snapshot of their in-memory data structures by simply fork()ing and leveraging COW to save a consistent snapshot without blocking the main process.
  • This is actually as simple and fine way to do concurrency, as long as shared mutable data is not required. For example FTP servers need to fork() because different client connections may have to run under different system user IDs. That can be achieved with fork()+exec() (and OpenBSD's ftpd probably works that way these days), but a single fork() is easier and faster to inherit an existing configuration without having to reparse configuration files, download a configuration from a remote database, etc.
  • fork() exists, people WILL use it. And I quite agree that no having a fork-safe RNG is a CVE waiting to happen.

Transparent and reliable fork() safety is not simple to achieve. However, we could at least reseed the RNG (with getrandom) when Zig's own fork() wrapper is called.

While performance is important for non-cryptographically secure RNGs, preventing misuse may be more important for CSPRNGs, whose main use case is generating secret keys. The speed of a CSPRNG is rarely a bottleneck in real world applications. Plus, we have better ways to improve performance, such as using faster functions to generate the random stream.

@semarie
Copy link
Contributor

semarie commented Dec 18, 2020

I still have mixed feelings about using AT_RANDOM.

libressl is using it as fallback (but with lot of others inputs) for seeding random generator on Linux. But beck@ commented it as "Not as random as you think but we take what we are given". I could ask for more details if wanted.

However, we could at least reseed the RNG (with getrandom) when Zig's own fork() wrapper is called.

yes, it could be a good direction to take.

@andrewrk
Copy link
Member Author

Thank you for the review, @jedisct1 and others.

For others, maybe look how is defined _ARC4_ATFORK in libressl implementation

This is unnecessary because on those targets this functionality simply calls arc4random_buf.

It may be better to not reuse a seed for different purposes in order to minimize the implications of a leak.

Wouldn't this be an argument for having the implementation of std.crypto.random unconditionally make a syscall to getrandom()? I think if we accept the premise of managing our own CSPRNG in application memory space, then it would allow reusing the seed for stack canaries as well as general purpose use.

Instead of zeroing, we could overwrite it with the output of our own RNG. But does a scenario where AT_RANDOM is expected to return the same value before and after we modify it exist? Especially with Zig code being embedded in an application written in a different language?

In the code path where we zero or overwrite the value, it only runs when Zig controls the start code. If Zig code is embedded in an application written in a different language, this code path is not executed, and the lazy initialization with getrandom() is the active path.

In the code path where zig controls the start code, we are guaranteed to be the first code path to inspect AT_RANDOM.

A boring call to getentropy() (if libc is linked) or a simple getrandom syscall (which is actually equivalent) would bring more peace of mind. Sure, this requires a system call, but a single system call at initialization time is not that big of a deal.

This patch does not use AT_RANDOM when libc is linked. It does exactly what you said, it calls getrandom at initialization time.

Snapshotting. For example, Pincaster and Redis (not sure about current versions, but it definitely used to work that way) can dump a snapshot of their in-memory data structures by simply fork()ing and leveraging COW to save a consistent snapshot without blocking the main process.

I do not recognize this as a valid use case. Redis is a database in which it is intended to use close to 100% of the memory available on a system as a cache. Forking to save a snapshot suddenly doubles the amount of memory that the application possibly needs, depending on end-user-controlled usage (hitting the cache). Enough cache writes, and redis gets OOM killed. It invites nondeterminism into whether your snapshot will cause a production service outage. This is a design flaw in Redis and it caused a production outage when I worked at OkCupid.

However,

fork() exists, people WILL use it. And I quite agree that no having a fork-safe RNG is a CVE waiting to happen.

This is enough for me. I agree.

Transparent and reliable fork() safety is not simple to achieve. However, we could at least reseed the RNG (with getrandom) when Zig's own fork() wrapper is called.

That sounds fine, although I will want to call the raw version for std.ChildProcess which does not need random data between fork() and execve().

Now that you have convinced me fork safety is something that we want to achieve, we have quite a few tricks up our sleeves.

@semarie pointed out a possible solution for posix+libc:

For posix+libc, it is possible to call pthread_atfork() to pass a function to reseed the internal state.

For systems with arc4random the problem is solved.

For Windows, can we agree we do not need fork safety? Implementing fork() on Windows is possible but it is quite a hack.

For systems that don't have pthread_atfork and need libc:

  • If we provide the libc and it is statically linked, we can patch it to add fork safety that integrates with the zig standard library.
  • If we provide the libc and it is dynamically linked, I think it is possible to provide a fork() wrapper that C code will call, which implements the fork safety, and then calls the real fork function.

There are not many OS's left after this flow chart.


I can also offer my own argument for why to ignore AT_RANDOM and rely exclusively on getrandom(): it's useless startup code that does not need to run when an application does not use std.crypto.random. Maybe the lazy init cost is a better tradeoff.

@LemonBoy
Copy link
Contributor

The usual trick is to use minherit on BSD/Darwin and MADV_WIPEONFORK on Linux, this way you get the state/flag cleared for free once the os forks the parent process.

std.crypto.random

* cross platform, even freestanding
* can't fail. on initialization for some systems requires calling
  os.getrandom(), in which case there are rare but theoretically
  possible errors. The code panics in these cases, however the
  application may choose to override the default seed function and then
  handle the failure another way.
* thread-safe
* supports the full Random interface
* cryptographically secure
* no syscall required to initialize on Linux (AT_RANDOM)
* calls arc4random on systems that support it

`std.crypto.randomBytes` is removed in favor of `std.crypto.random.bytes`.

I moved some of the Random implementations into their own files in the
interest of organization.

stage2 no longer requires passing a RNG; instead it uses this API.

Closes #6704
 * get rid of the pointless fences
 * make seed_len 16 instead of 32, which is accurate since it was
   already padding the rest anyway; now we do 1 pad instead of 2.
 * secureZero to clear the AT_RANDOM auxval
 * add a flag root source files can use to disable the start code. This
   is in case people want to opt out of the initialization when they
   don't depend on it.
@jedisct1
Copy link
Contributor

Wouldn't this be an argument for having the implementation of std.crypto.random unconditionally make a syscall to getrandom()?

I'd be all for this. It would not only reduce the code size, but also avoid being limited to a 128 bit seed.

@andrewrk
Copy link
Member Author

@LemonBoy thanks for the tip, will have a new commit for you to review momentarily. On BSD/Darwin we don't need minherit because we call arc4random_buf. Yeesh, align(4096) just for 1 boolean tho, my goodness.

Everybody gets what they want!

 * AT_RANDOM is completely ignored.
 * On Linux, MADV_WIPEONFORK is used to provide fork safety.
 * On pthread systems, `pthread_atfork` is used to provide fork safety.
 * For systems that do not have the capability to provide fork safety,
   the implementation falls back to calling getrandom() every time.
 * If madvise is unavailable or returns an error, or pthread_atfork
   fails for whatever reason, it falls back to calling getrandom() every
   time.
 * Applications may choose to opt-out of fork safety.
 * Applications may choose to opt-in to unconditionally calling
   getrandom() for every call to std.crypto.random.fillFn.
 * Added `std.meta.globalOption`.
 * Added `std.os.madvise` and related bits.
 * Bumped up the size of the main thread TLS buffer. See the comment
   there for justification.
 * Simpler hot path in TLS initialization.
@andrewrk
Copy link
Member Author

andrewrk commented Dec 18, 2020

Everybody gets what they want!

const std = @import("std");

pub fn main() !void {
    std.debug.print("Before fork, random = {d}\n", .{std.crypto.random.int(i32)});
    const pid = try std.os.fork();
    std.debug.print("pid = {d}, random = {d}\n", .{ pid, std.crypto.random.int(i32) });
}

fork safety demonstrated on linux 5.4.77, no libc.

strace reveals the MADV_WIPEONFORK:

[nix-shell:~/Downloads/zig/build]$ ./zig build-exe test.zig

[nix-shell:~/Downloads/zig/build]$ ./test
Before fork, random = -1031859893
pid = 20828, random = 1654510737
pid = 0, random = -1388834103

[nix-shell:~/Downloads/zig/build]$ ./test
Before fork, random = -1009874062
pid = 20830, random = -1607484668
pid = 0, random = 2094790016

[nix-shell:~/Downloads/zig/build]$ strace ./test
execve("./test", ["./test"], 0x7ffea71be610 /* 127 vars */) = 0
arch_prctl(ARCH_SET_FS, 0x23f000)       = 0
rt_sigaction(SIGSEGV, {sa_handler=0x22d5f0, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART|SA_RESETHAND|SA_SIGINFO, sa_restorer=0x204dc0}, NULL, 8) = 0
rt_sigaction(SIGILL, {sa_handler=0x22d5f0, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART|SA_RESETHAND|SA_SIGINFO, sa_restorer=0x204dc0}, NULL, 8) = 0
rt_sigaction(SIGBUS, {sa_handler=0x22d5f0, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART|SA_RESETHAND|SA_SIGINFO, sa_restorer=0x204dc0}, NULL, 8) = 0
madvise(0x23e000, 64, MADV_WIPEONFORK)  = 0
getrandom("\xd6\xb3\xb2\x63\xb1\xd7\xbc\x9a\xa8\xbf\x2e\x99\x0e\x0d\x9d\xf2\xf3\xc3\x9e\x86\xb6\x2a\x74\x84\x52\xe8\xdd\x9a\xda\xa0\x5b\x21"..., 48, 0) = 48
write(2, "Before fork, random = ", 22Before fork, random = )  = 22
write(2, "1458597416", 101458597416)              = 10
write(2, "\n", 1
)                       = 1
fork()                                  = 20835
write(2, "pid = ", 6pid = )                   = 6
pid = 0, random = -1923794652
write(2, "20835", 520835)                    = 5
write(2, ", random = ", 11)             = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=20835, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
write(2, ", random = ", 11, random = )             = 11
write(2, "-503010136", 10-503010136)              = 10
write(2, "\n", 1
)                       = 1
exit_group(0)                           = ?
+++ exited with 0 +++

fork safety demonstrated on linux 4.13, with libc

madvise did not have MADV_WIPEONFORK yet. pthread_atfork gets called here:

[nix-shell:~/Downloads/zig/build]$ ./zig build-exe test.zig -target native-linux.4.13...4.13-gnu -lc --dynamic-linker /nix/store/9df65igwjmf2wbw0gbrrgair6piqjgmi-glibc-2.31/lib/ld-linux-x86-64.so.2 

[nix-shell:~/Downloads/zig/build]$ ./test
Before fork, random = 1636270633
pid = 20876, random = 1766424307
pid = 0, random = 219361779

[nix-shell:~/Downloads/zig/build]$ strace ./test
...
rt_sigaction(SIGSEGV, {sa_handler=0x2269f0, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART|SA_RESETHAND|SA_SIGINFO, sa_restorer=0x7f58335d10f0}, NULL, 8) = 0
rt_sigaction(SIGILL, {sa_handler=0x2269f0, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART|SA_RESETHAND|SA_SIGINFO, sa_restorer=0x7f58335d10f0}, NULL, 8) = 0
rt_sigaction(SIGBUS, {sa_handler=0x2269f0, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART|SA_RESETHAND|SA_SIGINFO, sa_restorer=0x7f58335d10f0}, NULL, 8) = 0
getrandom("\xdf\x8f\x26\x52\xaa\x30\xb0\xb7\xff\xcd\x74\xd5\x5f\xb0\xfe\x3f\x12\x64\xaa\x02\xca\xa6\x42\xf5\xa4\xc9\x88\x6a\x18\xd8\xa2\x78"..., 48, 0) = 48
write(2, "Before fork, random = ", 22Before fork, random = )  = 22
write(2, "-636077314", 10-636077314)              = 10
write(2, "\n", 1
)                       = 1
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f583341ea50) = 20882
write(2, "pid = ", 6pid = )                   = 6
pid = 0, random = -652967661
write(2, "20882", 520882)                    = 5
write(2, ", random = ", 11, random = )             = 11
write(2, "180791767", 9180791767)                = 9
write(2, "\n", 1)                       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=20882, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
write(2, "\n", 1
)                       = 1
exit_group(0)                           = ?
+++ exited with 0 +++

fork safety demonstrated on linux 4.13, without libc.

No safety is available so it always calls getrandom().

[nix-shell:~/Downloads/zig/build]$ ./zig build-exe test.zig -target native-linux.4.13...4.13-gnu 

[nix-shell:~/Downloads/zig/build]$ ./test
Before fork, random = 896978600
pid = 21207, random = 537954109
pid = 0, random = 731750322

[nix-shell:~/Downloads/zig/build]$ ./test
Before fork, random = -1562203124
pid = 21209, random = -1223503357
pid = 0, random = -1921193724

[nix-shell:~/Downloads/zig/build]$ strace ./test
execve("./test", ["./test"], 0x7ffc1367fd20 /* 127 vars */) = 0
arch_prctl(ARCH_SET_FS, 0x23c050)       = 0
rt_sigaction(SIGSEGV, {sa_handler=0x22d390, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART|SA_RESETHAND|SA_SIGINFO, sa_restorer=0x204c20}, NULL, 8) = 0
rt_sigaction(SIGILL, {sa_handler=0x22d390, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART|SA_RESETHAND|SA_SIGINFO, sa_restorer=0x204c20}, NULL, 8) = 0
rt_sigaction(SIGBUS, {sa_handler=0x22d390, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART|SA_RESETHAND|SA_SIGINFO, sa_restorer=0x204c20}, NULL, 8) = 0
getrandom("\x14\x79\xfb\x0c", 4, 0)     = 4
write(2, "Before fork, random = ", 22Before fork, random = )  = 22
write(2, "217807124", 9217807124)                = 9
write(2, "\n", 1
)                       = 1
fork()                                  = 21215
getrandom("\x6d\x5b\x60\x21", 4, 0)     = 4
write(2, "pid = ", 6pid = )                   = 6
write(2, "21215", 5pid = 212150, random = 447144867)                    = 5

write(2, ", random = ", 11, random = )             = 11
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=21215, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
write(2, "559962989", 9559962989)                = 9
write(2, "\n", 1
)                       = 1
exit_group(0)                           = ?
+++ exited with 0 +++

fork safety explicitly disabled by the application:

const std = @import("std");

pub const crypto_fork_safety = false;

pub fn main() !void {
    std.debug.print("Before fork, random = {d}\n", .{std.crypto.random.int(i32)});
    const pid = try std.os.fork();
    std.debug.print("pid = {d}, random = {d}\n", .{ pid, std.crypto.random.int(i32) });
}
[nix-shell:~/Downloads/zig/build]$ ./zig build-exe test.zig

[nix-shell:~/Downloads/zig/build]$ ./test
Before fork, random = -888418107
pid = 21299, random = 366249267
pid = 0, random = 366249267

[nix-shell:~/Downloads/zig/build]$ strace ./test
execve("./test", ["./test"], 0x7ffd9905c710 /* 127 vars */) = 0
arch_prctl(ARCH_SET_FS, 0x23f000)       = 0
rt_sigaction(SIGSEGV, {sa_handler=0x22d380, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART|SA_RESETHAND|SA_SIGINFO, sa_restorer=0x204c20}, NULL, 8) = 0
rt_sigaction(SIGILL, {sa_handler=0x22d380, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART|SA_RESETHAND|SA_SIGINFO, sa_restorer=0x204c20}, NULL, 8) = 0
rt_sigaction(SIGBUS, {sa_handler=0x22d380, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART|SA_RESETHAND|SA_SIGINFO, sa_restorer=0x204c20}, NULL, 8) = 0
getrandom("\x32\x08\x1c\xc7\x18\xf6\x51\x10\x9a\x70\xed\xd9\x73\x86\x77\x56\x7e\x77\x07\xe0\x96\x06\x3f\xeb\xff\x69\x28\x5f\x88\x32\xae\xf6"..., 48, 0) = 48
write(2, "Before fork, random = ", 22Before fork, random = )  = 22
write(2, "1997434850", 101997434850)              = 10
write(2, "\n", 1
)                       = 1
fork()                                  = 21304
write(2, "pid = ", 6pid = pid = )                   = 6
0, random = -1237140226
write(2, "21304", 521304)                    = 5
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=21304, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
write(2, ", random = ", 11, random = )             = 11
write(2, "-1237140226", 11-1237140226)             = 11
write(2, "\n", 1
)                       = 1
exit_group(0)                           = ?
+++ exited with 0 +++

Explicitly request to always make the OS syscall:

const std = @import("std");

pub const crypto_always_getrandom = true;

pub fn main() !void {
    std.debug.print("Before fork, random = {d}\n", .{std.crypto.random.int(i32)});
    const pid = try std.os.fork();
    std.debug.print("pid = {d}, random = {d}\n", .{ pid, std.crypto.random.int(i32) });
}
[nix-shell:~/Downloads/zig/build]$ ./zig build-exe test.zig

[nix-shell:~/Downloads/zig/build]$ ./test
Before fork, random = -1195260306
pid = 24401, random = -1970555568
pid = 0, random = -95624964

[nix-shell:~/Downloads/zig/build]$ strace ./test
execve("./test", ["./test"], 0x7ffebb37fd70 /* 127 vars */) = 0
arch_prctl(ARCH_SET_FS, 0x23c008)       = 0
rt_sigaction(SIGSEGV, {sa_handler=0x22cd40, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART|SA_RESETHAND|SA_SIGINFO, sa_restorer=0x204c00}, NULL, 8) = 0
rt_sigaction(SIGILL, {sa_handler=0x22cd40, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART|SA_RESETHAND|SA_SIGINFO, sa_restorer=0x204c00}, NULL, 8) = 0
rt_sigaction(SIGBUS, {sa_handler=0x22cd40, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART|SA_RESETHAND|SA_SIGINFO, sa_restorer=0x204c00}, NULL, 8) = 0
getrandom("\x24\xf0\x77\x38", 4, 0)     = 4
write(2, "Before fork, random = ", 22Before fork, random = )  = 22
write(2, "947384356", 9947384356)                = 9
write(2, "\n", 1
)                       = 1
fork()                                  = 24412
getrandom("\x78\x3a\x08\xbe", 4, 0)     = 4
write(2, "pid = ", 6pid = )                   = 6
write(2, "24412", 524412)                    = 5
write(2, ", random = ", 11, random = )             = 11
pid = 0write(2, "-1106757000", 11, random = 1509413993-1106757000
)             = 11
write(2, "\n", 1
)                       = 1
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=24412, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
exit_group(0)                           = ?
+++ exited with 0 +++

@andrewrk
Copy link
Member Author

andrewrk commented Dec 19, 2020

That bcrypt issue is tripping on both linux (with -lc) and freebsd:

Test [602/1634] crypto.bcrypt.test "bcrypt"... 
Thread 1 received signal SIGBUS, Bus error.
0x000000000050e918 in crypto.utils.secureZero (s=...)
    at /usr/home/andrewrk/zig/lib/std/crypto/utils.zig:48
48	    @memset(ptr, 0, length);
(gdb) bt
#0  0x000000000050e918 in crypto.utils.secureZero (s=...)
    at /usr/home/andrewrk/zig/lib/std/crypto/utils.zig:48
#1  crypto.bcrypt.strHashInternal (password=..., rounds_log=5 '\005', salt=...)
    at /usr/home/andrewrk/zig/lib/std/crypto/bcrypt.zig:230
#2  0x000000000037f7e9 in crypto.bcrypt.strHash (rounds_log=5 '\005', password=...)
    at /usr/home/andrewrk/zig/lib/std/crypto/bcrypt.zig:266
#3  crypto.bcrypt.test "bcrypt" () at /usr/home/andrewrk/zig/lib/std/crypto/bcrypt.zig:295
#4  0x0000000000421ddb in std.special.main ()
    at /usr/home/andrewrk/zig/lib/std/special/test_runner.zig:61
#5  start.callMain () at /usr/home/andrewrk/zig/lib/std/start.zig:345
#6  start.initEventLoopAndCallMain () at /usr/home/andrewrk/zig/lib/std/start.zig:287
#7  start.callMainWithArgs (argc=<optimized out>, argv=<optimized out>, envp=...)
    at /usr/home/andrewrk/zig/lib/std/start.zig:250
#8  start.main (c_argc=<optimized out>, c_argv=<optimized out>, c_envp=<optimized out>)
    at /usr/home/andrewrk/zig/lib/std/start.zig:257
(gdb) down
Bottom (innermost) frame selected; you cannot go down.
(gdb) disassemble 
Dump of assembler code for function crypto.bcrypt.strHashInternal:
=> 0x000000000050e918 <+2136>:	movaps %xmm0,0x137(%rsp)

The @memset in secureZero seems to be trying to do a SIMD instruction onto a non-16-byte aligned stack address. I'm confused why this would be happening though because the pointer is 1 byte aligned:

pub fn secureZero(comptime T: type, s: []T) void {
    const ptr = @ptrCast([*]volatile u8, s.ptr);
    const length = s.len * @sizeOf(T);
    @memset(ptr, 0, length);
}

Relevant bcrypt code is:

    var password_buf: [73]u8 = undefined;
    // ...
    utils.secureZero(u8, &password_buf);

It's also inconsistent. It happens or does not happen depending on how many other tests are run. If I isolate the bcrypt test it works fine.

Edit: I double checked the LLVM IR output for secureZero:

; Function Attrs: nobuiltin nounwind
define internal fastcc void @crypto.utils.secureZero(%"[]u8"* nonnull readonly align 8 %0) unnamed_addr #1 !dbg !288689 {
Entry:
  %ptr = alloca i8*, align 8
  %length = alloca i64, align 8
  call void @llvm.dbg.declare(metadata %"[]u8"* %0, metadata !288691, metadata !DIExpression()), !dbg !288695
  %1 = getelementptr inbounds %"[]u8", %"[]u8"* %0, i32 0, i32 0, !dbg !288696
  %2 = load i8*, i8** %1, align 8, !dbg !288696
  store i8* %2, i8** %ptr, align 8, !dbg !288697
  call void @llvm.dbg.declare(metadata i8** %ptr, metadata !288692, metadata !DIExpression()), !dbg !288698
  %3 = getelementptr inbounds %"[]u8", %"[]u8"* %0, i32 0, i32 1, !dbg !288699
  %4 = load i64, i64* %3, align 8, !dbg !288699
  %5 = mul nuw i64 %4, 1, !dbg !288700
  store i64 %5, i64* %length, align 8, !dbg !288700
  call void @llvm.dbg.declare(metadata i64* %length, metadata !288694, metadata !DIExpression()), !dbg !288701
  %6 = load i8*, i8** %ptr, align 8, !dbg !288702
  %7 = load i64, i64* %length, align 8, !dbg !288703
  call void @llvm.memset.p0i8.i64(i8* align 4096 %6, i8 0, i64 %7, i1 true), !dbg !288704
  ret void, !dbg !288705
}

It's calling @llvm.memset with align 4096 which is completely wrong.

Edit 2: I think I have it figured out, there was another long standing alignment bug that this is exposing.

fn childAtForkHandler() callconv(.C) void {
    const wipe_slice = @ptrCast([*]u8, &wipe_me)[0..@sizeOf(@TypeOf(wipe_me))];
    std.crypto.utils.secureZero(u8, wipe_slice);
}

This is the new code from the PR. wipe_slice gets alignment 4096 for some configurations. secureZero takes a []u8 slice parameter which has 1 byte alignment, but somehow the overalignment of the slice makes it to llvm ir codegen. However it does not go into the generic function parameter cache hash because a []u8 is supposed to be 1 byte aligned.

And so secureZero for all []u8 is generated with a highly aligned slice, which is incorrect and leads to the crash.

@andrewrk andrewrk merged commit 506af7e into master Dec 19, 2020
@andrewrk andrewrk deleted the tlcsprng branch December 19, 2020 04:57
@mikdusan mikdusan added the release notes This PR should be mentioned in the release notes. label Jan 24, 2021
@andrewrk andrewrk added this to the 0.9.0 milestone Apr 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release notes This PR should be mentioned in the release notes.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

introduce a thread-local Cryptographically Secure Pseudo Random Number Generator for general use
6 participants