Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

design flaw: semantics of loading values with undefined bits #19634

Open
mlugg opened this issue Apr 12, 2024 · 3 comments
Open

design flaw: semantics of loading values with undefined bits #19634

mlugg opened this issue Apr 12, 2024 · 3 comments
Labels
research This proposal requires a considerable amount of investigation before it can be considered. use case Describes a real use case that is difficult or impossible, but does not propose a solution.
Milestone

Comments

@mlugg
Copy link
Member

mlugg commented Apr 12, 2024

Whilst working on #19630, it was suggested by @andrewrk to introduce the semantic rule that when loading a value from a pointer, if any bits are undefined, the entire value is considered to be undefined. This rule simplifies the language considerably, because it avoids the need to deal with the awkward concept of a comptime-known value with some undefined bits. However, it causes breakages in some code.

The main example I've noticed is std.PackedIntArray, which uses a backing buffer to store a sequence of integer values bit-packed in memory. The problem is that if we initialize the buffer to undefined, any load which crosses a boundary onto an adjacent element -- even if the element being loaded was set previously -- is considered undefined. Thus, the only way to obey the new semantics is to always @memset the buffer on creation, which is potentially wasteful.

This issue serves to track this design complexity and potential solutions to it. The example I brought up above could potentially be solved with runtime bit-pointers (related: #16271), but I can't say for sure whether that would solve all potential problems.

For now, this rule is never enforced at runtime, only at comptime. Workarounds have been placed in the standard library where necessary behind @inComptime() checks (such checks link to this issue).

@mlugg mlugg added use case Describes a real use case that is difficult or impossible, but does not propose a solution. research This proposal requires a considerable amount of investigation before it can be considered. labels Apr 12, 2024
@mlugg
Copy link
Member Author

mlugg commented Apr 16, 2024

There's also some occurrences of what's definitely a bug: after a store, the value is turned back into its "true" type for our comptime representation. So this code fails:

test {
    comptime var x: u32 = undefined;
    const buf: *[4]u8 = @ptrCast(x);
    buf[0] = 123;
    // The byte is changed, but the value is then read back as a `u32`, so becomes `undefined`!
    // So, this check fails:
    try @import("std").testing.expectEqual(123, buf[0]);
}

This will be solved by adding a representation to MutableValue for a "bundle of bytes" -- or, more accurately, a set of bit-packed values of arbitrary types. This representation will only be applicable to types which are not comptime-only.

@jacobly0
Copy link
Member

jacobly0 commented Apr 16, 2024

The main example I've noticed is std.PackedIntArray, which uses a backing buffer to store a sequence of integer values bit-packed in memory. The problem is that if we initialize the buffer to undefined, any load which crosses a boundary onto an adjacent element -- even if the element being loaded was set previously -- is considered undefined. Thus, the only way to obey the new semantics is to always @memset the buffer on creation, which is potentially wasteful.

Not sure if it's a good idea or not, but my first thought as to a possible solution:

diff --git a/lib/std/packed_int_array.zig b/lib/std/packed_int_array.zig
index 02c721e7cf..6c8243d9c5 100644
--- a/lib/std/packed_int_array.zig
+++ b/lib/std/packed_int_array.zig
@@ -123,7 +123,7 @@ pub fn PackedIntIo(comptime Int: type, comptime endian: Endian) type {
 
             //read existing bytes
             const target_ptr: *align(1) Container = @ptrCast(&bytes[start_byte]);
-            var target = target_ptr.*;
+            var target = @freeze(target_ptr.*);
 
             if (endian != native_endian) target = @byteSwap(target);
 

mlugg added a commit to mlugg/zig that referenced this issue Apr 16, 2024
This commit reverts the handling of partially-undefined values in
bitcasting to transform these bits into an arbitrary arithmetic value,
like happens on `master` today.

As @andrewrk rightly points out, ziglang#19634 has unfortunate consequences
for the standard library, and likely requires more thought. To avoid
a major breaking change, it has been decided to revert this design
decision for now, and make a more informed decision further down the
line.
mlugg added a commit to mlugg/zig that referenced this issue Apr 16, 2024
This commit reverts the handling of partially-undefined values in
bitcasting to transform these bits into an arbitrary numeric value,
like happens on `master` today.

As @andrewrk rightly points out, ziglang#19634 has unfortunate consequences
for the standard library, and likely requires more thought. To avoid
a major breaking change, it has been decided to revert this design
decision for now, and make a more informed decision further down the
line.
mlugg added a commit to mlugg/zig that referenced this issue Apr 17, 2024
The operation `undefined & 0` ought to result in the value `0`, and likewise for
zeroing only some bits. `std/packed_int_array.zig` tests were failing because
this behavior was not implemented -- this issue was previously masked by faulty
bitcast logic which turned `undefined` values into `0xAA` on pointer loads.

Ideally, we would like to be able to track the undefined bits at comptime.
This is related to ziglang#19634.
mlugg added a commit to mlugg/zig that referenced this issue Apr 17, 2024
The operation `undefined & 0` ought to result in the value `0`, and likewise for
zeroing only some bits. `std/packed_int_array.zig` tests were failing because
this behavior was not implemented -- this issue was previously masked by faulty
bitcast logic which turned `undefined` values into `0xAA` on pointer loads.

Ideally, we would like to be able to track the undefined bits at comptime.
This is related to ziglang#19634.
mlugg added a commit to mlugg/zig that referenced this issue Apr 17, 2024
This commit reverts the handling of partially-undefined values in
bitcasting to transform these bits into an arbitrary numeric value,
like happens on `master` today.

As @andrewrk rightly points out, ziglang#19634 has unfortunate consequences
for the standard library, and likely requires more thought. To avoid
a major breaking change, it has been decided to revert this design
decision for now, and make a more informed decision further down the
line.
mlugg added a commit to mlugg/zig that referenced this issue Apr 17, 2024
The operation `undefined & 0` ought to result in the value `0`, and likewise for
zeroing only some bits. `std/packed_int_array.zig` tests were failing because
this behavior was not implemented -- this issue was previously masked by faulty
bitcast logic which turned `undefined` values into `0xAA` on pointer loads.

Ideally, we would like to be able to track the undefined bits at comptime.
This is related to ziglang#19634.
jedisct1 added a commit to jedisct1/zig that referenced this issue Apr 18, 2024
* master: (4430 commits)
  std.Build: revert --host-target, --host-cpu, --host-dynamic-linker
  Sema: cap depth of value printing in type names
  Sema: correctly make inferred allocs constant
  std.zig.system.linux: Use arm.zig detection for AArch64 CPU features
  Value: convert undefined values to 0xAA for bitwise operations
  Value: fix out-of-bounds slice access writing zero-bit undef value
  compiler: un-implement ziglang#19634
  test: disable newly failing test
  compiler: rework comptime pointer representation and access
  x86_64: fix miscompilation regression in package fetching code
  std: improve std.once tests
  lib/std/Build/Step/CheckObject: dump section as string
  zig init: Replace deprecated LazyPath.path with Build.path()
  std.Build.Step.ConfigHeader: better error message
  windows_argv standalone test: Only test against MSVC if it's available
  ArgIteratorWindows: Match post-2008 C runtime rather than CommandLineToArgvW
  fix namespacing of std.fs.Dir.Walker.Entry
  std.fs.Dir.Walker: maintain a null byte in path names
  wasi: change default os version to `0.1.0`
  Target: cleanup
  ...
jedisct1 added a commit to jedisct1/zig that referenced this issue Apr 18, 2024
* master: (29 commits)
  cmake: support setting the dynamic linker
  std.Build: revert --host-target, --host-cpu, --host-dynamic-linker
  Sema: cap depth of value printing in type names
  Sema: correctly make inferred allocs constant
  std.zig.system.linux: Use arm.zig detection for AArch64 CPU features
  Value: convert undefined values to 0xAA for bitwise operations
  Value: fix out-of-bounds slice access writing zero-bit undef value
  compiler: un-implement ziglang#19634
  test: disable newly failing test
  compiler: rework comptime pointer representation and access
  x86_64: fix miscompilation regression in package fetching code
  std: improve std.once tests
  lib/std/Build/Step/CheckObject: dump section as string
  zig init: Replace deprecated LazyPath.path with Build.path()
  std.Build.Step.ConfigHeader: better error message
  windows_argv standalone test: Only test against MSVC if it's available
  ArgIteratorWindows: Match post-2008 C runtime rather than CommandLineToArgvW
  fix namespacing of std.fs.Dir.Walker.Entry
  std.fs.Dir.Walker: maintain a null byte in path names
  wasi: change default os version to `0.1.0`
  ...
@xxxbxxx
Copy link
Contributor

xxxbxxx commented Apr 24, 2024

related: #18137

TUSF pushed a commit to TUSF/zig that referenced this issue May 9, 2024
This commit reverts the handling of partially-undefined values in
bitcasting to transform these bits into an arbitrary numeric value,
like happens on `master` today.

As @andrewrk rightly points out, ziglang#19634 has unfortunate consequences
for the standard library, and likely requires more thought. To avoid
a major breaking change, it has been decided to revert this design
decision for now, and make a more informed decision further down the
line.
@andrewrk andrewrk added this to the 0.15.0 milestone Aug 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
research This proposal requires a considerable amount of investigation before it can be considered. use case Describes a real use case that is difficult or impossible, but does not propose a solution.
Projects
None yet
Development

No branches or pull requests

4 participants