Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ZON #20271

Open
wants to merge 13 commits into
base: master
Choose a base branch
from
Open

ZON #20271

wants to merge 13 commits into from

Conversation

MasonRemaley
Copy link
Contributor

@MasonRemaley MasonRemaley commented Jun 12, 2024

ZON

This PR implements ZON, or Zig Object Notation (think JSON, but with Zig syntax instead of Javascript.)

In particular, it implements:

  • Runtime parsing of ZON data
  • Runtime stringification of ZON data
  • Compile time importing of ZON data

A lot of files are added since I added a lot of test cases, the bulk of the work is in src/zon.zig and lib/std/zon. If there's anything I can do to make this easier to review, let me know.

Tests cover all new features.

Runtime

The runtime code can be found at lib/std/zon.

Parsing

lib/std/zon/parse.zig

Detailed doc comments are provided in the source. At a high level, this module provides the following functions and some configuration options:

  • parseFromSlice
  • parseFromAst
  • parseFromAstNoAlloc
  • parseFromAstNode
  • parseFromAstNodeNoAlloc
  • parseFree

Most people will just want parseFromSlice and maybe parseFree, but the more fine grained functions are available for those who need them.

Stringify

lib/std/zon/stringify.zig

Detailed doc comments are provided in the source. At a high level, this module provides the following functions and some configuration options:

  • stringify
  • stringifyMaxDepth
  • stringifyArbitraryDepth
  • stringifier

Most users will just need stringify, or one of its variants.

However, stringifier returns a much more fine grained interface that is used in the implementation of the other functions. It may be used directly if you need to serialize something piece by piece--perhaps to apply different configuration to different parts of the value, or because the value does not actually exist laid out in memory in the same form in which you want to stringify it.

Compile Time

Almost all of the logic for this is in src/zon.zig.

This PR also implements importing ZON at compile time:

const Foo = @import("foo.zon");

Things that may change later...

& In ZON syntax

Right now, ZON slices need to be prefixed with & to distinguish them from tuples. This is necessary because we don't know the result type when importing. In the future I think ZON should not distinguish between tuples and slices, it should just coerce to whatever the destination type is.

Update: & is not allowed in ZON, ZON will automatically coerce tuples to arrays or slices when given a destination type.

Untagged nonexhaustive enum values not yet supported

Zig does not currently have a way to represent an untagged enum value as a literal, and as such there's no way to represent one in ZON.

We could resolve this by adding a syntax for untagged enum literals to Zig, or by allowing ZON to coerce integers to untagged enum literals (provided the destination type is known.) Personally I prefer the former solution.

  • After this PR is merged I'll file an issue to track this if still relevant.

Type mismatches do not always point at helpful location

This is an existing issue, but it can make type errors in ZON imports annoying.

Consider this incorrect code:

    const Struct = struct {
        outer: struct { inner: u8 },
    };
    const some_value = .{ .outer = .{ .inner = 1.5 } };
    const casted: Struct = some_value;

It produces the following error:

src/main.zig:22:28: error: fractional component prevents float value '1.5' from coercion to type 'u8'
    const casted: Struct = some_value;

This isn't super helpful. The location information should be pointed to the struct literal, or a note should be added that points to the struct literal.

This comes up all the time with ZON, e.g. any time you do something like this and make a mistake:

    const Struct = struct {
        outer: struct { inner: u8 },
    };
    const casted: Struct = @import("foo.zon");
  • After this PR is merged I'll file an issue to track this.

Allowing eliding outer braces may make sense

There's case to be made to allow eliding braces on an outer struct in ZON, like this:

.x = 10,
.y = 20,
  • I'll file an issue with some motivating use cases after this is merged, easy change if we decide to make it

Divergences from Zig

Zig has no NaN or inf literals right now. Zon does, since otherwise it would have no way to represent these values.

IMO we should add the same literals to Zig.

  • After this PR is merged I'll file an issue to track this.

See Also

@MasonRemaley MasonRemaley mentioned this pull request Jun 12, 2024
src/Module.zig Outdated Show resolved Hide resolved
@@ -88,6 +88,11 @@ pub const Case = struct {
expect_exact: bool = false,
backend: Backend = .stage2,
link_libc: bool = false,
/// A list of imports to cache alongside the source file.
imports: []const []const u8 = &.{},
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added support for importing files into test cases so I could test importing ZON. I'm sure there's a better way, but this seems to work fine as a stop gap. IIUC there are plans to rework this eventually anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also noticed that the existing compiler error tests pass if there is no error when one was expected. It's possible this has been resolved since I noticed it I have not checked recently.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also noticed that the existing compiler error tests pass if there is no error when one was expected.

Oh, that sounds like a problem! Any chance you could check if that's been fixed, and if not try to fix it up?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I phrased this poorly--I meant that it was a preexisting bug, not an issue with my changes to the test runner.

I'll check if it's still an issue before merging and file an issue to track it if it is (I don't want to fix it in this PR because it'll likely result in updates to a bunch of unrelated tests.)

src/Module.zig Outdated Show resolved Hide resolved
Copy link
Member

@mlugg mlugg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding & in ZON...

You're correct that ZON definitely shouldn't include this. Our main use of ZON today (build.zig.zon) already creates variable-length structures without this syntax. To avoid a divergence between ZON you can @import and build.zig.zon, I would consider this a merge blocker, but that's Andrew's call of course.

Implementation-wise, the ZIR instruction corresponding to @import should be modified to provide a result type if one is present (you can use .none to represent the case where there isn't one). Then, the ZON analysis logic should traverse into this result type as needed. This has the additional advantage of providing better source locations if the structure of an imported file doesn't match what's expected, since the type error will occur whilst parsing the relevant ZON expression.

src/Module.zig Outdated Show resolved Hide resolved
src/Module.zig Outdated Show resolved Hide resolved
src/Module.zig Outdated Show resolved Hide resolved
src/Module.zig Outdated Show resolved Hide resolved
src/Module.zig Outdated Show resolved Hide resolved
src/Module.zig Outdated Show resolved Hide resolved
@MasonRemaley
Copy link
Contributor Author

MasonRemaley commented Jun 12, 2024

Regarding & in ZON...

You're correct that ZON definitely shouldn't include this. Our main use of ZON today (build.zig.zon) already creates variable-length structures without this syntax. To avoid a divergence between ZON you can @import and build.zig.zon, I would consider this a merge blocker, but that's Andrew's call of course.

Implementation-wise, the ZIR instruction corresponding to @import should be modified to provide a result type if one is present (you can use .none to represent the case where there isn't one). Then, the ZON analysis logic should traverse into this result type as needed. This has the additional advantage of providing better source locations if the structure of an imported file doesn't match what's expected, since the type error will occur whilst parsing the relevant ZON expression.

Hmm I think I agree, better to knock it out now than to release it as is then immediately change the syntax. I would've done this up front but incorrectly thought it required major changes to the compiler.

I'll take a look at making this change. If no result type is specified, I'll have it default to a tuple rather than a slice unless there are any objections to that default.

@mlugg
Copy link
Member

mlugg commented Jun 12, 2024

Yup. I agree that it makes sense to use a tuple when there's no result type.

By the way, in case you're unfamiliar with the AstGen code, you'll just want try rl.ri.resultType(gz, node) orelse .none in the corresponding case in AstGen.builtinCall.

tiehuis added a commit to tiehuis/zig that referenced this pull request Jun 14, 2024
There were two primary issues at play here:
 1. The hex float prefix was not handled correctly when the stream was
    reset for the fallback parsing path, which occured when the mantissa was
    longer max mantissa digits.
 2. The implied exponent was not adjusted for hex-floats in this branch.

Additionally, some of the float parsing routines have been condensed, making
use of comptime.

closes ziglang#20271
tiehuis added a commit to tiehuis/zig that referenced this pull request Jun 14, 2024
There were two primary issues at play here:
 1. The hex float prefix was not handled correctly when the stream was
    reset for the fallback parsing path, which occured when the mantissa was
    longer max mantissa digits.
 2. The implied exponent was not adjusted for hex-floats in this branch.

Additionally, some of the float parsing routines have been condensed, making
use of comptime.

closes ziglang#20271
@andrewrk
Copy link
Member

andrewrk commented Jun 16, 2024

+ echo Looking for non-conforming code formatting...
+ stage3-debug/bin/zig fmt --check .. --exclude ../test/cases/ --exclude ../doc/ --exclude ../build-debug
Looking for non-conforming code formatting...
../test/behavior/zon/a_neg.zon:1:1: error: expected type expression, found '-'
../test/behavior/zon/slice-1.zon:1:1: error: expected type expression, found '&'
../test/behavior/zon/void.zon:1:1: error: expected type expression, found '{'
../test/behavior/zon/slice-empty.zon:1:1: error: expected type expression, found '&'
../test/behavior/zon/false.zon
../test/behavior/zon/slice-abc.zon:1:1: error: expected type expression, found '&'

In case you haven't found why this is happening, it's here:

# TODO: move this to a build.zig step (check-fmt)
echo "Looking for non-conforming code formatting..."
stage3-release/bin/zig fmt --check .. \
--exclude ../test/cases/ \
--exclude ../doc/ \
--exclude ../build-debug \
--exclude ../build-release

Edit: I'll address this in a PR shortly. Sit tight for a bit.

@andrewrk
Copy link
Member

andrewrk commented Jun 17, 2024

After #20321 you should see the same failures locally when running zig build test, and you can check only code formatting conformance with zig build test-fmt.

Edit: I think if you rebase and resolve that build.zig conflict, those CI failures will disappear.

@MasonRemaley
Copy link
Contributor Author

MasonRemaley commented Jun 19, 2024

Nice thanks! I'll do the rebase and get started on removing & from the syntax when I'm back in town next week.

lib/std/zon.zig Outdated Show resolved Hide resolved
@MasonRemaley
Copy link
Contributor Author

MasonRemaley commented Jun 28, 2024

By the way, in case you're unfamiliar with the AstGen code, you'll just want try rl.ri.resultType(gz, node) orelse .none in the corresponding case in AstGen.builtinCall.

Thanks, I'm not familiar with this code so this is helpful!

I started implementing storing the result type on imports, but got a bit stuck.

AstGen used to use addStrTok(...) to store the import path. Since we now want to store both the import path and a result type, I store a Zir.Inst.Bin Heres's what AstGen does now:

// AstGen.builtinCall
const operand_node = params[0];
const rhs = try expr(gz, scope, .{ .rl = .none }, operand_node);
try gz.addPlNode(.import, node, Zir.Inst.Bin{
    .lhs = result_type,
    .rhs = rhs,
});

Zir used to use .str_tok to get the import path, that's also been updated:

// Zir.zirImport
const inst_data = sema.code.instructions.items(.data)[@intFromEnum(inst)].pl_node;
const extra = sema.code.extraData(Zir.Inst.Bin, inst_data.payload_index).data;
const src = block.nodeOffset(inst_data.src_node);
const operand = try sema.resolveConstString(
    block,
    src,
    extra.rhs,
    .{ .needed_comptime_reason = "import path must be comptime-known" },
);

When this code runs, resolveConstString fails:

thread 7410 panic: index out of bounds: index 123056708, len 16
Analyzing lib/std/std.zig
    > %344 = import(%3306982439, %123057044)
      %345 = break_inline(%343, %344)
    For full context, use the command
      zig ast-check -t lib/std/std.zig

  in lib/std/std.zig
    > %436 = decl_val("start")

/home/mason/Documents/zon/zig/src/Sema.zig:299:25: 0x6884afc in contains (zig)
        return map.items[@intFromEnum(key) - @intFromEnum(map.start)] != .none;
                        ^
/home/mason/Documents/zon/zig/src/Sema.zig:255:26: 0x649170c in get (zig)
        if (!map.contains(key)) return null;
                         ^
/home/mason/Documents/zon/zig/src/Sema.zig:1857:39: 0x6491605 in resolveInst (zig)
        const inst = sema.inst_map.get(i).?;
                                      ^
/home/mason/Documents/zon/zig/src/Sema.zig:1887:42: 0x6d84bcf in resolveConstString (zig)
    const air_inst = try sema.resolveInst(zir_ref);
                                         ^
/home/mason/Documents/zon/zig/src/Sema.zig:13928:48: 0x68ae4b4 in zirImport (zig)
    const operand = try sema.resolveConstString(

It appears that the value I get back in Sema for extra.rhs is garbage, instead of the one that I put in during AstGen.

Any pointers would be much appreciated--I can also push the WIP commit if that would be helpful, but I imagine that I'm probably missing something that will be obvious to people more familiar with this part of the codebase.

@mlugg
Copy link
Member

mlugg commented Jun 28, 2024

You probably just need to clear your cache directory -- ZIR is cached based on zig version, so if you haven't created a commit with your changes, the compiler will use the old cached ZIR!

However, your lowering strategy for import isn't really suitable. Since the operand must be a string literal, it's just a waste of bytes and time to use expr to actually create a str instruction. So, you shouldn't use Zir.Inst.Bin as the payload here -- instead, create a new payload type which stores result_ty: Ref, name: NullTerminatedString, and use that as the payload. (For the union field, pl_node is fine; pl_tok would also be fine.)

@MasonRemaley
Copy link
Contributor Author

You probably just need to clear your cache directory -- ZIR is cached based on zig version, so if you haven't created a commit with your changes, the compiler will use the old cached ZIR!

Ha that was it, thanks! I made the same mistake yesterday and didn't quite understand how I fixed it--that had been bugging me glad to have it resolved.

However, your lowering strategy for import isn't really suitable. Since the operand must be a string literal, it's just a waste of bytes and time to use expr to actually create a str instruction. So, you shouldn't use Zir.Inst.Bin as the payload here -- instead, create a new payload type which stores result_ty: Ref, name: NullTerminatedString, and use that as the payload. (For the union field, pl_node is fine; pl_tok would also be fine.)

Makes sense, will do!

src/print_zir.zig Outdated Show resolved Hide resolved
@MasonRemaley
Copy link
Contributor Author

MasonRemaley commented Jul 31, 2024

@ianprime0509 thank you very much--realizing I was missing that flag unblocked me on this and that smaller repro is very helpful!

One of the three failing tests actually was a bug on my end, I fixed that one. The other two appear to be preexisting issues to I've filed separate issues to track them here:

@MasonRemaley
Copy link
Contributor Author

Alright, this should be ready for review now!

Copy link
Member

@mlugg mlugg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work -- this is good progress. Most of this looks good.

Unfortunately, the strategy for incorporating type information when lowering ZON isn't ideal. Rather than using the type information just to take refs when needed, and coercing the whole aggregate at the end of lowering, we should instead coerce as we go. There are a few reasons for this:

  • It gives better error messages; if a single field is wrong somewhere deep in the ZON, this will give the error at that field, rather than a less-than-helpful error at the @import.
  • It's faster; we construct only the final values, rather than constructing a bunch of "temporary" values and then a bunch of finalized coerced values.
  • It's necessary once Remove anonymous struct types from the language #16865 gets in, since under that proposal, you won't be able to coerce e.g. pointer-to-anon-struct to pointer-to-concrete-struct.

There are some other comments here, but that's the biggest issue.

@@ -427,7 +427,7 @@ pub fn build(b: *std.Build) !void {
const optimization_modes = chosen_opt_modes_buf[0..chosen_mode_index];

const fmt_include_paths = &.{ "lib", "src", "test", "tools", "build.zig", "build.zig.zon" };
const fmt_exclude_paths = &.{"test/cases"};
const fmt_exclude_paths = &.{ "test/cases", "test/behavior/zon" };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these excluded from formatting?

Copy link
Contributor Author

@MasonRemaley MasonRemaley Aug 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Auto format is good on code for readability, but normalizing the zon test data to standard formatting reduces coverage. (Tests cover invalid syntax and non standard formatting)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... I think incorrect formatting is something we could better test with some kind of automated testing (something to jiggle around whitespace in all of the behavior tests and check they work the same), and I don't see any particularly compelling reason to treat the ZON behavior tests differently to the other Zig behavior tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m open to more thorough tests being added for this in the future, but in the meantime, normalizing test data strictly decreases the quality of the tests and provides no upsides

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do see your point -- I'll defer to @andrewrk's opinion on this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @mlugg's idea exactly describes fuzz testing combined with using these existing tests as an input corpus. If we had that sub-project already finished and fuzzing done regularly, then I think it would make sense to go with that suggestion. Until that is implemented, I think @MasonRemaley's stance makes more sense. I'm actively working on it.

Comment on lines +19 to +22
//! * slices
//! * notated as a reference to a tuple literal
//! * this syntax will likely be removed in the future, at which point ZON will not distinguish
//! between slices and tuples
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hasn't this already been removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops yes--will fix

@@ -9120,7 +9113,18 @@ fn builtinCall(
} else if (str.len == 0) {
return astgen.failTok(str_lit_token, "import path cannot be empty", .{});
}
const result = try gz.addStrTok(.import, str.index, str_lit_token);
const res_ty = try ri.rl.resultType(gz, node) orelse .none;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A possible ZIR optimization here would now be to replace the rvalue call below with special handling in the ty case, since Sema will do the coercion for us. Not a merge blocker, just worth noting!

src/Zcu.zig Show resolved Hide resolved
src/zon.zig Outdated
file_index: Zcu.File.Index,

/// Lowers the given file as ZON. `res_ty` is a hint that's used to add indirection as needed to
/// match the result type, actual type checking is not done until assignment.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actual type checking is not done until assignment

This is kind of problematic for two reasons:

  • It leads to less useful error messages. Coercing as we lower will give more helpful type errors since the error location will be the specific sub-expression with an incorrect type.
  • More importantly: once we remove anonymous struct types (Remove anonymous struct types from the language #16865), coercing structs through pointers -- which this presumably relies on -- won't work.


// error
// backend=stage2
// output_mode=Exe
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than setting output_mode=Exe on these tests, just use some export fn rather than pub fn main.

// output_mode=Exe
// imports=zon/array.zon
//
// found 'struct{comptime comptime_int = 97, comptime comptime_int = 98, comptime comptime_int = 99}'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error message looks incomplete?

// output_mode=Exe
// imports=zon/large_number.zon
//
// 2:20: error: type 'i66' cannot represent integer value '36893488147419103232'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

include the leading colon, i.e. :2:20: error: ...

@@ -0,0 +1 @@
--1.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more missing trailing newlines

@@ -88,6 +88,11 @@ pub const Case = struct {
expect_exact: bool = false,
backend: Backend = .stage2,
link_libc: bool = false,
/// A list of imports to cache alongside the source file.
imports: []const []const u8 = &.{},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also noticed that the existing compiler error tests pass if there is no error when one was expected.

Oh, that sounds like a problem! Any chance you could check if that's been fixed, and if not try to fix it up?

@MasonRemaley
Copy link
Contributor Author

MasonRemaley commented Aug 6, 2024

@mlugg Thanks for looking this over!

It gives better error messages; if a single field is wrong somewhere deep in the ZON, this will give the error at that field, rather than a less-than-helpful error at the @import.

I agree, the error messages are very poor. You can also trigger the same issue without ZON like this:

const Struct = struct {
    outer: struct { inner: u8 },
};
const some_value = .{ .outer = .{ .inner = 1.5 } };
const casted: Struct = some_value;
src/main.zig:22:28: error: fractional component prevents float value '1.5' from coercion to type 'u8'
    const casted: Struct = some_value;

My plan was to get this merged, and then make a separate PR that fixes the handling of this case, thereby improving error handling overall instead of just fixing it for ZON. This also saves me from duplicating logic for type checking that already exists.

However...

Am I correct in understanding that after #16865 it won't even be possible to trigger this example case? If anonymous structs are no longer special, then you're not even allowed to attempt this coercion in the first place right? Instead of getting an error about the field with unhelpful line number info, you'd get an error just saying those are two completely different struct types.

If so, then my plan doesn't make sense, because I'd just be fixing the handling of a case that won't even be possible in Zig in the near future, and relying on existing error handling that will have been removed. If that's the case then I think it's better to do the type checking here as you suggested.

@mlugg
Copy link
Member

mlugg commented Aug 7, 2024

@MasonRemaley yes, your understanding of #16865 is correct. Sorry, I don't know why I focused on pointers in my explanation!

So, yeah, that example will fail after #16865 is implemented, because some_value will have a concrete struct type which doesn't coerce to Struct. Your thinking is completely right, and I reach the same conclusion that the type checks / coercions must happen during lowering.

Copy link
Member

@andrewrk andrewrk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't finish looking at all the changes in this massive patch, but between this feedback, the comments from @mlugg that are still unaddressed, and the rebase that needs doing, I think it's probably enough to keep you busy for a while.

ident_buf: []u8,

/// Configuration for the runtime parser.
pub const ParseOptions = struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://ziglang.org/documentation/master/#Avoid-Redundant-Names-in-Fully-Qualified-Namespaces

Applies to std.zon.Parse.ParseOptions as well as other declarations in this file.

Comment on lines +12 to +15
gpa: Allocator,
ast: *const Ast,
status: ?*ParseStatus,
ident_buf: []u8,
Copy link
Member

@andrewrk andrewrk Aug 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the struct that is this file has fields, the convention is Parse.zig.

@@ -298,3 +360,143 @@ test parseAlloc {
try expect(eql(u8, "foo", try parseAlloc(alloc, "\"f\x6f\x6f\"")));
try expect(eql(u8, "f💯", try parseAlloc(alloc, "\"f\u{1f4af}\"")));
}

/// Parses one line at a time of a multiline Zig string literal to a std.io.Writer type. Does not append a null terminator.
pub fn MultilineParser(comptime Writer: type) type {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see the intended use case for this API. This ZON implementation is based on having the complete source buffer, an already made list of tokens, and an already made AST, so there's no point in going back to streaming.

strLitNodeAsString in AstGen.zig shows an implementation of converting multi line string tokens into a string (in this case, appended to an array list of bytes). I think a more useful API would accept an ArrayList and append to it. That would also be reusable by this AstGen code.

lib/std/zon.zig Show resolved Hide resolved
Comment on lines +75 to +97
pub const ParseOptions = @import("zon/parse.zig").ParseOptions;
pub const ParseStatus = @import("zon/parse.zig").ParseStatus;
pub const parseFromSlice = @import("zon/parse.zig").parseFromSlice;
pub const parseFromAst = @import("zon/parse.zig").parseFromAst;
pub const parseFromAstNoAlloc = @import("zon/parse.zig").parseFromAstNoAlloc;
pub const parseFromAstNode = @import("zon/parse.zig").parseFromAstNode;
pub const parseFromAstNodeNoAlloc = @import("zon/parse.zig").parseFromAstNodeNoAlloc;
pub const parseFree = @import("zon/parse.zig").parseFree;

pub const StringifierOptions = @import("zon/stringify.zig").StringifierOptions;
pub const StringifyValueOptions = @import("zon/stringify.zig").StringifyValueOptions;
pub const StringifyOptions = @import("zon/stringify.zig").StringifyOptions;
pub const StringifyContainerOptions = @import("zon/stringify.zig").StringifyContainerOptions;
pub const Stringifier = @import("zon/stringify.zig").Stringifier;
pub const stringify = @import("zon/stringify.zig").stringify;
pub const stringifyMaxDepth = @import("zon/stringify.zig").stringifyMaxDepth;
pub const stringifyArbitraryDepth = @import("zon/stringify.zig").stringifyArbitraryDepth;
pub const stringifier = @import("zon/stringify.zig").stringifier;

test {
_ = @import("zon/parse.zig");
_ = @import("zon/stringify.zig");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub const ParseOptions = @import("zon/parse.zig").ParseOptions;
pub const ParseStatus = @import("zon/parse.zig").ParseStatus;
pub const parseFromSlice = @import("zon/parse.zig").parseFromSlice;
pub const parseFromAst = @import("zon/parse.zig").parseFromAst;
pub const parseFromAstNoAlloc = @import("zon/parse.zig").parseFromAstNoAlloc;
pub const parseFromAstNode = @import("zon/parse.zig").parseFromAstNode;
pub const parseFromAstNodeNoAlloc = @import("zon/parse.zig").parseFromAstNodeNoAlloc;
pub const parseFree = @import("zon/parse.zig").parseFree;
pub const StringifierOptions = @import("zon/stringify.zig").StringifierOptions;
pub const StringifyValueOptions = @import("zon/stringify.zig").StringifyValueOptions;
pub const StringifyOptions = @import("zon/stringify.zig").StringifyOptions;
pub const StringifyContainerOptions = @import("zon/stringify.zig").StringifyContainerOptions;
pub const Stringifier = @import("zon/stringify.zig").Stringifier;
pub const stringify = @import("zon/stringify.zig").stringify;
pub const stringifyMaxDepth = @import("zon/stringify.zig").stringifyMaxDepth;
pub const stringifyArbitraryDepth = @import("zon/stringify.zig").stringifyArbitraryDepth;
pub const stringifier = @import("zon/stringify.zig").stringifier;
test {
_ = @import("zon/parse.zig");
_ = @import("zon/stringify.zig");
}
pub const Parse = @import("zon/Parse.zig");
pub const stringify = @import("zon/stringify.zig");
test {
_ = Parse;
_ = stringify;
}

suggest for the titular function to be named write or serialize, resulting in a fully-qualified name of std.zon.stringify.write or std.zon.stringify.serialize respectively.

Comment on lines +251 to +255
if (@inComptime()) {
// Happens if given e.g. @typeOf(null), the default error we get is hard
// to understand.
@compileError("Runtime parser cannot run at comptime.");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't use @inComptime. This is a perfect example of why I almost vetoed including it in the language. #868 (comment)

}

fn maxIdentLength(comptime T: type) usize {
// Keep in sync with `parseExpr`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Keep in sync with `parseExpr`.
_ = foobar_logic;

along with this somewhere:

/// rename when the foobar logic changes
const foobar_logic = {};

and another _ = foobar_logic; inside parseExpr.

🪄 logic synchronization safety 🪄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh neat, I'm gonna start using this trick in some of my own stuff too

@@ -427,7 +427,7 @@ pub fn build(b: *std.Build) !void {
const optimization_modes = chosen_opt_modes_buf[0..chosen_mode_index];

const fmt_include_paths = &.{ "lib", "src", "test", "tools", "build.zig", "build.zig.zon" };
const fmt_exclude_paths = &.{"test/cases"};
const fmt_exclude_paths = &.{ "test/cases", "test/behavior/zon" };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @mlugg's idea exactly describes fuzz testing combined with using these existing tests as an input corpus. If we had that sub-project already finished and fuzzing done regularly, then I think it would make sense to go with that suggestion. Until that is implemented, I think @MasonRemaley's stance makes more sense. I'm actively working on it.

src/Zcu.zig Outdated
Comment on lines 757 to 457
/// Whether the file is Zig or ZON. This field is always populated.
mode: Ast.Mode,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks like this is trivially computable from another field (sub_file_path). why bother storing it?

/// try serializer.value(2.5);
/// try vec2.finish();
/// ```
pub fn Stringifier(comptime Writer: type) type {
Copy link
Member

@andrewrk andrewrk Aug 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you use std.io.AnyWriter you can avoid this being a generic type. I think that's probably the right move, since it will result in less bloat, and there's already a more optimal, streaming output API, that also happens to have a simpler implementation. The json equivalent is std.json.WriteStream. In fact that's such a good abstraction I bet this stringification code could be written cleanly and optimally on top of (the zon equivalent of) it.

@MasonRemaley
Copy link
Contributor Author

@andrewrk Thanks for the feedback! I'll work through all the review comments, and then ping here again when I'm done.

@VisenDev
Copy link

I hope this PR isn't abandoned! Excited to finally have a ZON parser in the stdlib

@MasonRemaley
Copy link
Contributor Author

@VisenDev it's not abandoned--this feature is very important to my work. I'm working on some other stuff w/ tight deadlines right now, but I'm planning on getting back to this in about a week and a half.

* Gets all tests passing after rebase
* Resolves a bunch of TODOs
* WIP commit while I build release version of main
* Errors on embedded nulls in identifiers
* Parsing identifiers properly without allocating
* Properly frees parsed idents
* Fixes integer negation logic
* Uses existing big int parser
* Cleans up runtime int parsing
* Some investigation into float issue
* More float investigation
* Adds missing test
* Resolves TODOs
* Fixes some error tokens
* Cleans up remaining parsing todos
* Removes TODOs
* Excludes tests format formatting
* Formats
* Parses directly as desired float type
* Moves zon lowering into own file

Records result type for imports, does not yet use

Fixes print zir for imports, better encoding of instruction

Uses the result type to add indirection where necessary

Does not yet remove & from syntax

Removes & from comptime ZON syntax

Yet to be removed from runtime syntax

Fixes after rebase

Allow coercion to slices, not to any other pointer type, adds tests

No longer allows `&` in the runtime parser

Also fixes merge

Fixes bug where you could parse structs as tuples and vice versa

Also cleans up error handling

Reworks runtime error handling

Explains usage of unreachable

Fixes CI failure, and exposes notes in API

Shows notes on integer literal errors when parsing ZON

Fixes free of undefined value if struct with allocated fields is missing a field

Skips tests failing due to C backend issue

Notes why tests are skipped, renames to make easier to use test filter
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants