replace anytype #17198

Open
the-argus opened this issue Sep 19, 2023 · 32 comments
Labels
proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Comments

@the-argus

the-argus commented Sep 19, 2023

EDIT: this issue is about both anytype and comptime T: type. All generics in zig, and how the types are constrained. for the most part, when I say anytype, just think "generics".

anytype is a tool to defer all type logic into the function, where it can be dealt with imperatively at compile time. The result is a much poorer experience with language servers and poor ordering of information: the type constraints of the arguments of a function are the first thing you want to know. And yet anytype leaves them for later.

The frustration of wanting to know what a function does and being greeted with args: anytype is what this issue is about. It sucks, and I know zig can do better. I'm going to try to summarize some previous issues on the subject, and offer some suggestions/proposals, but I don't want this issue to be closed if a suggestion is rejected, because the point of this issue is discussion, not proposal. If people like the idea then someone can make another issue marked as proposal. This issue will be closed if it's deemed not an issue but a feature, or if all solutions are out of scope.

Example

Bad example which was originally in the issue:

`std.Build.dependency` is a nightmare. The function signature is `pub fn dependency(b: *Build, name: []const u8, args: anytype) *Dependency` and it takes three go-to-definitions to eventually arrive at `std.Build.applyArgs` where we can actually learn what `args` does. There are no documentation comments above any of these stdlib functions. Perhaps this could be solved by using an options struct of some kind. But nonetheless `anytype` is what got used and it's a problem that zig made it available for use in place of other options.

Ideally, we'd like some sort of annotation in the function signature of dependency which makes it clear that args should contain a target and user input options. And give a helpful error message when it doesn't, checking at the call site of dependency instead of three functions down in applyArgs.

EDIT: the above example is a bad one, as pointed out here. Consider instead this snippet from the HashMap implementation:

// If you get a compile error on this line, it means that your generic eql
// function is invalid for these parameters.
const eql = ctx.eql(key, test_key.*);

Where ctx is a generic type. This constraint is quite difficult to find if you don't find it the hard way, as it is buried in the private API of the HashMap. I think the use of anytype in the HashMap implementation is smart but it suffers in readability because there is no separation between comptime type validation logic and runtime logic (the former of which is much more important when learning the API). I think the most conservative solution to this problem would be if zig had some way of annotating a block as "for the type validation logic". Then some comptime asserts with nice error messages or comments could replace the example seen above.
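As a rough illustration of the "comptime asserts with nice error messages" idea, here is a minimal sketch in status-quo Zig. The names `assertHasEql`, `keysEqual` and `U32Context` are hypothetical and are not taken from the std HashMap; the sketch only checks that the context type declares `eql`, not that its signature is right.

```zig
const std = @import("std");

// Hypothetical helper (not part of std): fails compilation with a readable
// message when the context type does not declare an `eql` function.
fn assertHasEql(comptime Context: type) void {
    if (!@hasDecl(Context, "eql")) {
        @compileError(@typeName(Context) ++ " must declare an eql function");
    }
}

// Hypothetical generic function: validation sits on the first line,
// separate from the runtime logic that follows.
fn keysEqual(ctx: anytype, a: u32, b: u32) bool {
    comptime assertHasEql(@TypeOf(ctx));
    return ctx.eql(a, b);
}

// Hypothetical context type that satisfies the check.
const U32Context = struct {
    pub fn eql(_: U32Context, a: u32, b: u32) bool {
        return a == b;
    }
};

test "context with eql is accepted" {
    const ctx = U32Context{};
    try std.testing.expect(keysEqual(ctx, 1, 1));
    try std.testing.expect(!keysEqual(ctx, 1, 2));
}
```

Passing a struct without `eql` would stop at the `@compileError` in `assertHasEql` rather than at the `ctx.eql` call deeper in the body.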

Precedent

First, let's take a look at an existing solution, Zig compile-time contracts. We know type constraints can be accomplished in the language already (and that's part of why a proposed solution needs to be really good) so it's worth looking at how it would be implemented given the status quo now. A sample from zig-ctc readme:

fn signature_contract(t: anytype) contracts.RequiresAndReturns(
    contracts.is(@TypeOf(t), u8),
    void,
) {}

Compile time logic has to be inserted as a weird wrapper function around the return type. This could be helped a bit with infer T from #9260, but it still ends up pretty weird to read. This feels like something that should be a native language feature. It could be a stdlib feature instead, and maybe it should be.

In #1669, a highly related issue, a user proposed using comptime functions which return bool as types. It's readable, offers a nice experience to a user with an editor (they go-to-definition on the type, see instead it's a function, read the logic). Maybe not so nice for someone reading the code on a webpage. And, if #9260 were to be added to the language, we could remove the implicit function call and get the type of the thing in scope of the function signature:

// some hypothetical function which takes a slice of types and an input type to compare against
const oneOf = @import("std").meta.traits.oneOf;

fn doSomethingWithAStringAndANumber(str: []u8, n: oneOf(&.{u8, u16}), infer T) void {
  // do something with n. it is a u8 or a u16!
}

An improvement which I think is worth considering, in the spirit of rethinking these old, still relevant issues. But it doesn't solve the real problem, as we'll see in a moment.

Another, extremely related proposal is #6615, which proposed something like this:

fn Constraint(comptime T: type, predicate: fn (type) bool) type {
    if (!comptime predicate(T)) @compileError("wrong type");
    return T;
}

fn write2(w: Constraint(@TypeOf(w), isWriter), data: []const u8) void {
    w.write(data);
}

Additionally there were #7232 and #8008 (the latter of which was quite well thought out, proposing an @trait builtin). However, all these proposals were ultimately rejected:

There are a lot of problems with generic code. Generic code is harder to read, reason about, and optimize than code using concrete types. Even if it compiles successfully for one type, you may see errors only later when a user passes a different type. Generic code with type validation code has an even worse problem - the validation code has to match with the implementation when it changes, and there’s no way to validate that. So the position of Zig is that code using concrete types should be the primary focus and use case of the language, and we shouldn’t introduce extra complexity to make generics easier unless it provides new tools to solve these problems.

Here we come to understand that Zig's fundamental focus on clarity conflicts with generics, not that there is some problem with the proposal, really. I said the earlier example of zig-ctc felt like "should be a native language feature." I thought "hey, with some added features to Zig, this could be much more readable, and maybe also in the stdlib." But it's clear now that the zig maintainers don't want to solve that problem in the compiler if it means adding a bunch of complexity to zig only to get a slightly more readable version of the exact same functionality. However, anytype is still in the language and it still is used for things other than anonymous tuples and it still sucks.

Distinct types offer a partial solution

A big part of the goal here is just to provide more readable function signatures. Yes, it would be nice to do checking in the function signature, but maybe it's enough to just provide aliases for anytype? Like a slightly better doc comment. Maybe anytype could be replaced by something like anontuple and then only allow anytype when defining type aliases. Consider #5132 (typedef for zig) and #1595 (distinct types).

Do something!

anytype is a problem when used for anything other than anonymous tuples in string formatting. Is the best answer the zig language can give doc comments? We need to come up with some functionality which actually improves the experience of writing type-constrained generic code. Something that solves problems like the type checking code buried in applyArgs.

Conclusion/Suggestion

It doesn't make sense to say "generic code is bad, its hard to reason about, so we will not add facilities to make it easier because we don't want to encourage its use" when anytype still exists and still gets used. It seems to me that people would much rather write generic code with bad facilities like anytype and deferred type-constraining logic than maintain multiple well-tested but highly duplicated implementations of the same thing. Anything short of removing comptime T: type and anytype seems unable to stop this.

Maybe the reasons we have for anytype being bad are enough to warrant reconsidering the previously closed issues like #1669? The fact that constraints on anytype are evaluated later, and not at the function where they're passed in, leads to confusing error messages. Additionally, it makes the function signature harder to read, and obfuscates information from tooling like ZLS.

My only original suggestion is that Zig could automatically generate tests for functions with anytype or comptime T: type or equivalent. These tests would only test compilation of the function with all types, not actual functionality. It would require some new builtin @constrainedTypeCompileError or something along those lines, to skip tests for types you know shouldn't work with the function. Doing such an out-of-language solution means sort of giving up and saying "there's no way to make a language with maintainable generic code, so let's just test most of the possible inputs to these bad functions." I am hoping that someone else can come up with something better.

@ghost

ghost commented Sep 19, 2023

@the-argus,

fn doSomethingWithAStringAndANumber(str: []u8, n: unsigned8BitOr16Bit) void {
    // use n. its either a u8 or a u16!
}

This syntax is certainly going to be rejected, because unsigned8BitOr16Bit looks like a concrete type, while actually being generic.

Otherwise this looks similar to #6615; have you seen that proposal?

Edit: outdated comment

@the-argus
Author

the-argus commented Sep 19, 2023

@zzyxyzz I have edited the original issue so now it's almost totally unrecognizable. Sorry to make you read the scrapped original version, but at the time I didn't know about all the issues before me that had tried to do the same thing. In fact, in some cases, the exact same thing. GitHub's search feature is a bit of a joke if you don't know the exact name of the issue you're looking for. So thanks for linking that issue, I very well might not have found it otherwise. The new version of this issue is more of an editorial summary of the state of anytype and type-constraining features in zig, instead of a proposal like the original one was.

@ghost

ghost commented Sep 19, 2023

@the-argus,
My personal 2c on this is that Zig simply took a wrong turn with its procedural generics. In addition to what you mention (readability of type signatures, complexity of tooling, need to manually implement things that could be language features), there are also a lot of semantic dark corners where the procedural and declarative parts of the type system (and also runtime/comptime) clash. For a language that generally discourages use of generics and tries to be small enough to "fit into one's head" completely, it would have been better to have a simple parametric type system with interfaces and minimal conditional logic. (Plus introspection, but certainly not reification.)

However, at this point, the ship has sailed for any such radical redesign of the type system. FYI, the official position of the core team is that the language is mostly done and most open proposals will be rejected. As for the existing anytype-style generics, a lot of people have thought long and hard about how to improve them, but without too much success. Procedural generics are kind of contagious, and it's really hard to carve out a declarative subset that 1) is general enough, 2) will not cause problems elsewhere and 3) offers substantial benefits over a manual type assertion at the top of the function. My own proposal (#9260) is one of the last ones open, and it is ultimately fairly superficial.

In short, I would not get my hopes up high that there will be a substantial paradigm shift in Zig's generics. They are as powerful as they are unergonomic, but it is genuinely difficult to shift this tradeoff. Since you're not actually proposing any solution, I'd recommend closing this issue.

P.S.: Concerning the args parameter of Build.dependency, the idea is that such things will be carefully documented, even though right now they aren't.

@mlugg
Member

mlugg commented Sep 19, 2023

Ideally, we'd like some sort of annotation in the function signature of dependency which makes it clear that args should contain a target and user input options.

Okay, but... that's not what we want. The reason args there is anytype is because it can be literally any struct - the arguments passed to a dependency are arbitrary key-value mappings in the form of a struct value, which act the same way -Dfoo=bar options do to the "root" build function. By convention, we commonly pass target and optimize to dependencies to be picked up in the dependency by standardOptimizeOption and standardTargetOptions, but this is not a requirement, and there are cases where you wouldn't want this. The args parameter there is just like the args parameter in std.fmt.format.

[...] and the zig maintainers don't want to solve that problem in the compiler.

To be clear, none of this is really a problem of implementational complexity - any of the ideas talked about here could be integrated into the compiler fairly trivially. The problem is not that we don't want to bother solving the problem in the compiler, but instead that we don't necessarily think any such idea is a good fit for Zig as a language.

It doesn't make sense to say "generic code is bad, its hard to reason about, so we will not add facilities to make it easier because we don't want to encourage its use" when anytype still exists and still gets used.

I don't think this is really anyone's position on the matter, at least not to this extreme. It's not controversial to say that generic code can be useful, and often clearly the right tool for the job. The danger is that if Zig has language-level support for more complicated generic patterns - the kind of thing that requires a lot of complex validation logic - then we implicitly encourage such constructs, which often results in a web of unnecessary abstraction - look at the modern C++ standard library for instance. Status quo Zig allows these patterns, and allows you to easily integrate validation logic when you need (see the std.HashMap "adapted" APIs for a great example of this), but does not implicitly encourage such patterns, so they are only used when they are clearly the right tool for the job.

anytype is a problem when used for anything other than anonymous tuples in string formatting. Is the best answer the zig language can give doc comments?

I'd just like to close out by saying that as someone who's been using Zig as their primary language for a few years now, I've never actually had any major issues with anytype. It's quite rare to see generic functions which use it (much more common are comptime arguments, particularly comptime T: type), and when I do see them, it's not the kind of thing which could be trivially specified in a type signature: you'd need a function containing separate validation logic anyway (e.g. the adapter API), so being able to move this validation up one line into the signature rather than the body doesn't seem in any way helpful. The other case where I commonly see anytype is in functions polymorphic over numeric types (e.g. a lot of functions in std.math), but here it's always obvious from context what is valid: this being explicit in the signature would not really be helpful, because there's no confusion to begin with.

(As a note, I do remain in support of the infer T proposal, but that's essentially a small syntactic sugar to avoid using @TypeOf all over the place - I don't like the extensions people have suggested to that proposal for things like reversing type-creating functions.)

@the-argus
Author

Ideally, we'd like some sort of annotation in the function signature of dependency which makes it clear that args should contain a target and user input options.

Okay, but... that's not what we want. The reason args there is anytype is because it can be literally any struct - the arguments passed to a dependency are arbitrary key-value mappings

I think it is odd that args accepts both target and the build options all in one big tuple. I can see that they are all quite literally command line arguments, though, so args is descriptive. I guess this example is just flawed and more of just a problem with lack of documentation.

if Zig has language-level support for more complicated generic patterns - the kind of thing that requires a lot of complex validation logic - then we implicitly encourage such constructs, which often results in a web of unnecessary abstraction - look at the modern C++ standard library for instance. Status quo Zig allows these patterns, and allows you to easily integrate validation logic when you need (see the std.HashMap "adapted" APIs for a great example of this), but does not implicitly encourage such patterns, so they are only used when they are clearly the right tool for the job.

From std.HashMap:

// If you get a compile error on this line, it means that your generic eql
// function is invalid for these parameters.
const eql = ctx.eql(key, test_key.*);

Is this really... good? Bits of the type validation end up buried in the code.

What if you could just go-to-definition on the anytype or type argument and hop right to the code that constrains the type? That's purely a QOL improvement just by allowing some way of annotating type-constraining code. I think that if something like interfaces were to be added, it would encourage people to constantly look for commonalities and sometimes over-architect their code. But would adding a way of defining where type-constraining code is found do the same?

It's quite rare to see generic functions which use it (much more common are comptime arguments, particularly comptime T: type),

I think that anytype and comptime T: type are effectively interchangeable when it comes to most of the problems I posed.

it's not the kind of thing which could be trivially specified in a type signature

This is true! And to allow for specifying such complex constraints, we should not create a complex type declaration syntax. And then make it accidentally Turing complete. And then make that a feature. That would be bad. Instead, we could just... put the type constraining code in a block and give it a helpful name.

@the-argus
Author

However, at this point, the ship has sailed for any such radical redesign of the type system. FYI, the official position of the core team is that the language is mostly done and most open proposals will be rejected.

This issue was sort of a cry for help... I know proposals are not being accepted but to me generics feel like a huge hole in what is otherwise an awesome language. I just want perfection, is that too much to ask? /s

In short, I would not get my hopes up high that there will be a substantial paradigm shift in Zig's generics. They are as powerful as they are unergonomic,

Reminds me of C++.

Since you're not actually proposing any solution, I'd recommend closing this issue.

Yeah... all I've got is the suggestion to test all type combinations. I'm going to leave it open for a day or so but then I will close it based on your input.

P.S.: Concerning the args parameter of Build.dependency, the idea is that such things will be carefully documented, even though right now they aren't.

yay :)

@iacore
Contributor

iacore commented Sep 19, 2023

You can avoid anytype if you put everything inside of fn xxxx(comptime T: type) { ... }.

Then, you have to use xxxx(WHATEVER_HERE).f(foo) instead of f(foo).
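A minimal sketch of that pattern, with hypothetical names (`Doubler`, `double`) chosen only for illustration: the type parameter moves to an outer function that returns a namespace, so the inner function has a fully concrete signature.

```zig
const std = @import("std");

// Hypothetical example: instead of `fn double(x: anytype) @TypeOf(x)`,
// the type is passed once to an outer function returning a namespace.
fn Doubler(comptime T: type) type {
    return struct {
        // The parameter type is spelled out here, with no anytype involved.
        pub fn double(x: T) T {
            return x + x;
        }
    };
}

test "call through the type-returning function" {
    // The call site becomes Doubler(u8).double(...) instead of double(...).
    try std.testing.expectEqual(@as(u8, 42), Doubler(u8).double(21));
}
```

The cost is the one described above: every call has to name the type explicitly.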

@andrewrk added the proposal label Sep 19, 2023
@andrewrk added this to the 0.13.0 milestone Sep 19, 2023
@andrewrk
Member

proposals are not being accepted

I see you have some actual experience using zig in earnest so it's ok 👍

@iacore
Contributor

iacore commented Sep 20, 2023

proposals are not being accepted

I see you have some actual experience using zig in earnest so it's ok 👍

@andrewrk Can you update the issue template to clarify this? I was banned once so it seemed arbitrary.

@marler8997
Contributor

marler8997 commented Sep 20, 2023

Thanks for providing a nice history/summary!

Let me restate the fundamental problem to set the stage for an idea:

anytype makes it hard to figure out what a parameter type is supposed to be.

I would categorize the following as symptoms of this fundamental problem:

  • it's hard for developers to know what they can pass into a function
  • it's hard for IDEs and doc generators to know what to do with that parameter
  • it's hard for the compiler to provide good error messages, is it a problem with the parameter type or in the function itself?

So far the ideas we've seen give the developer additional tools they can use to help alleviate the fundamental problem. However, I wonder if we could improve the situation without requiring any new language syntax or any extra specification from library authors? It seems to me there's already a lot of information about an anytype parameter that can be derived. Take this example:

fn example(c: anytype) void {
    c.foo();
}

It's clear that c is some type that needs to have a function foo. My understanding is that Zig re-analyzes generic functions for every unique set of concrete types it receives, however, one could imagine an additional pass that attempts to analyze the nature of its generic arguments, and derives a contract for them. This could be used to make error messages much nicer, i.e.

// call the example above
example(42);
// zig: error: function `example` requires a type that has a function 'foo' but you provided comptime_int

This could also be used in doc generation:

fn example(c: anytype)

c: a struct that has a method `fn foo() void`

Now, this is a very simple example. Deriving a contract for a type based on code is hard to generalize I think. Take this example:

fn example(c: anytype) void {
    if (some_condition) {
        return c.foo();
    } else {
        return c.bar();
    }
}

Now what is the compiler to do? c might need to have a method foo or a method bar. Now make some_condition some arbitrarily convoluted comptime code and it's unclear what should be done.
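To make the difficulty concrete, here is a small sketch of how this plays out in status-quo Zig when the condition is comptime-known. The names `use_foo` and `OnlyFoo` are hypothetical. Only the branch that is actually taken gets analyzed, so a type that has just `foo` is accepted in one instantiation, and no single flat contract describes `c`.

```zig
const std = @import("std");

// Hypothetical variant of the example above, with the condition made an
// explicit comptime parameter.
fn example(comptime use_foo: bool, c: anytype) u32 {
    if (use_foo) {
        return c.foo();
    } else {
        // Not analyzed when use_foo is true, so `bar` need not exist then.
        return c.bar();
    }
}

// Hypothetical type that only provides `foo`.
const OnlyFoo = struct {
    pub fn foo(_: OnlyFoo) u32 {
        return 1;
    }
};

test "only the taken comptime branch constrains the type" {
    try std.testing.expect(example(true, OnlyFoo{}) == 1);
    // example(false, OnlyFoo{}) would fail to compile: OnlyFoo has no `bar`.
}
```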

I think this is a bit of an unexplored space, but what's nice is if we can find improvements here, you make the lives of all Zig developers better at no cost to them, no new language syntax to learn/use. Also keep in mind that if we don't explore this direction, then any new feature that we add to the language to alleviate the problem may actually be rendered unnecessary because the code was already giving the compiler all the information it needed.

@ghost

ghost commented Sep 20, 2023

@marler8997,
Such analysis would be limited to simple cases, because nested functions, comptime conditionals and custom compile errors make it infeasible to derive type constraints from code in general.

This is a problem shared by all languages that do not have formal interfaces and use some form of duck typing for polymorphism (Python, Julia, C++ templates, etc.): faults can only be consistently detected for concrete instantiations, so you often have an error bubbling up from somewhere deep in the call stack and have no help in diagnosing the actual problem. Zig is actually not the worst offender in this regard, because building very many layered comptime abstractions is not idiomatic, but the basic problem is the same.

The only real countermeasure is to provide detailed (and up-to-date) documentation and manually insert comptime checks.

@marler8997
Contributor

marler8997 commented Sep 20, 2023

I see the concerns but I have a feeling we could do a lot more here than it may seem at first. For example, nested functions don't seem like an issue: the caller simply inherits the contract established in the callee. It's hard to say where the real pain points will be so let's explore. Let's take a look at std.mem.len:

pub fn len(value: anytype) usize {
    switch (@typeInfo(@TypeOf(value))) {
        .Pointer => |info| switch (info.size) {
            .Many => {
                const sentinel_ptr = info.sentinel orelse
                    @compileError("invalid type given to std.mem.len: " ++ @typeName(@TypeOf(value)));
                const sentinel = @ptrCast(*align(1) const info.child, sentinel_ptr).*;
                return indexOfSentinel(info.child, sentinel, value);
            },
            .C => {
                assert(value != null);
                return indexOfSentinel(info.child, 0, value);
            },
            else => @compileError("invalid type given to std.mem.len: " ++ @typeName(@TypeOf(value))),
        },
        else => @compileError("invalid type given to std.mem.len: " ++ @typeName(@TypeOf(value))),
    }
}

The way I imagine this working is that we analyze the len function like we would with any concrete type except use special Contract objects for every generic parameter. The analyzer sends these Contract objects all the operations the code performs on them, and it tracks those operations.

We start our "Generic Type Analysis" pass of the len function and come to this expression:

switch (@typeInfo(@TypeOf(value)))

Since value is a Contract object, the @TypeOf builtin will return a special object, let's call it TypeContract that will forward any operations performed on it to the underlying Contract object, except that those operations put constraints on the type of value rather than on value itself.

The same is done for @typeInfo, where operations performed on it will apply to a TypeInfoContract object and put constraints on the type info of the type of value rather than value itself.

So next the analyzer sees that we call switch on the TypeInfoContract for value. Zig can immediately see that the only valid tag is Pointer, because all other tags result in a @compileError. So the analyzer tells the TypeInfoContract that it must be a pointer, which tells the TypeContract object it must be a pointer, which tells Contract it must have a pointer type! So now if we see this code, zig has everything it needs to generate its own error message!

std.mem.len(0);
// error: std.mem.len requires a pointer but received comptime_int

Keep in mind that with this sort of analysis, len no longer needs to provide its own error message in this switch statement, the compiler can already derive the proper error message.

pub fn len(value: anytype) usize {
    switch (@typeInfo(@TypeOf(value))) {
        .Pointer => {
            // ...
        },
        else => @compileError(), // no need for message!
    }
}

I don't know about you but this alone already seems promising enough to explore. Even if it's not possible to solve this in every case, it could make using the language greatly improved I think.

@marler8997
Contributor

marler8997 commented Sep 20, 2023

Some afterthoughts. Note that since this idea adds an "analysis pass" on every generic function, we can not only detect issues with the caller types, we can also detect errors in the function body regardless of the parameter types passed in.

fn example(c: anytype) void {
    c.foo(x);
}

In this example, when zig does its "analysis pass", it sees that x is undefined regardless of what c is. Zig can now emit an error with the example function without it even being instantiated and knows it doesn't need to involve any of the caller's context in its "notes"; the buck stops at this function.

Also to add to my previous comment. Capturing type constraints like I outlined above would explode if you try to perfectly capture all the specific ways code operations can affect a type, but keep in mind that this analysis doesn't have to be perfect. The type constraints don't affect compilation, they only affect error messages and documentation. So maybe you have some sort of nested branching code that's 10 levels deep based on the type of a value, so you end up with thousands of possible type combinations depending upon the input, but we don't need to worry about perfectly tracking all those combinations. Instead we can reduce them to a flat set of valid/invalid general types. For example, maybe the type of x needs to be a u32 if y is greater than 0 but then x should be a []const u8 if y is less than -10 but only on Sundays... we don't really need to capture all that detail. Our analysis captured that x needs to be a u32 or []const u8 and that's probably all we need.

Here's a shorter way of saying the same thing:

We don't need to know exactly WHEN an anytype can or can't be something...we only need to know IF it can. I think not having to know the WHEN might be the key to preventing this from exploding.

@marler8997
Contributor

marler8997 commented Sep 20, 2023

Afterthoughts Part 2. First, note that this "type analysis pass" also applies to any function that accepts a type parameter, not just to parameters declared as anytype.

Second, if Zig employed this "type analysis pass", we could introduce a new builtin @genericTypeError which would enable us to rewrite std.mem.len to this:

pub fn len(value: anytype) usize {
    switch (@typeInfo(@TypeOf(value))) {
        .Pointer => |info| switch (info.size) {
            .Many => {
                const sentinel_ptr = info.sentinel orelse @genericTypeError();
                const sentinel = @ptrCast(*align(1) const info.child, sentinel_ptr).*;
                return indexOfSentinel(info.child, sentinel, value);
            },
            .C => {
                assert(value != null);
                return indexOfSentinel(info.child, 0, value);
            },
            else => @genericTypeError(),
        },
        else => @genericTypeError(),
    }
}

This gives the programmer some control over making sure that Zig is able to make sense of what their code is doing. For example, if you called @genericTypeError with no generic type in the current context, you would get an error:

pub fn example() void {
    @genericTypeError();
}
// error: misuse of `@genericTypeError`, there are no generic types in the current context

Or if you're inside a generic function and call @genericTypeError but the Contract object isn't currently tracking any conditionals, you would get another misuse of @genericTypeError error:

pub fn example(c: anytype) void {
    @genericTypeError();
}
// error: misuse of `@genericTypeError`, all generic types `c` are currently outside any conditional

@the-argus
Author

@marler8997 thanks for giving me hope that this experience can be improved! This is an exciting potential solution and I think this level of introspection from tooling should be considered required by any language with duck typing elements. I know the pyright LSP implementation is capable of doing stuff like this, and it can be very nice to work with. Other languages with duck typing and no tooling for it (C++ and clangd, zig and zls) really make it a pain. Particularly when you call or write a generic function and then immediately lose all completion functionality. I think the analysis you're describing generalizes to namespace resolution for generic types: the language server can autocomplete fields and functions which exist in the contract. Maybe only to a certain nested function call depth, depending on the exploding you mentioned.

@moshe-kabala

Hey there

I'm reaching out as I'm currently learning Zig and investigating runtime polymorphism. While diving into Zig, one thing that's been bugging me is the lack of convenient built-in compile-time polymorphism, which forces me to use anytype as described in this proposal.

So after reading some discussions and articles on this topic, I want to suggest a type constraint mechanism that looks native to the language to me (just a compile-time checking step).

Current code:

const std = @import("std");

pub fn main() void {
    const some1 = SomeStruct{ .data = 42 };
    const some2 = SomeAntherStruct{ .data = [_]u8{1,2,3,4} };

    print(some1);
    print(some2);
}

fn print(f: anytype) void {
    f.print();
}

// return type is the same as the parameter type
fn printAndReturn(f: anytype) @TypeOf(f) {
    f.print();
    return f;
}


const SomeStruct = struct {
    const Self = @This();
    data: u8,
    fn print(self: Self) void {
        std.debug.print("data = {}\n", .{self.data});
    }
};

const SomeAntherStruct = struct {
    const Self = @This();
    data: [4]u8,
    fn print(self: Self) void {
        std.debug.print("data = {any}\n", .{self.data});
    }
};

Suggested:

const std = @import("std");

pub fn main() void {
    const some1 = SomeStruct{ .data = 42 };
    const some2 = SomeAntherStruct{ .data = [_]u8{1,2,3,4} };

    print(some1);
    print(some2);
}

// ~ means type constraints - in this case it must contain the Printer methods 
fn print(f: ~Printer) void {
    f.print();
}

// return type is the same as the parameter type
fn printAndReturn(f: ~Printer) @TypeOf(f) {
    f.print();
    return f;
}

// kind of interface/trait
const Printer = struct {
    const Self = @This();
    // use ~ again for self type
    print: fn(self: ~Self) void,
};

const SomeStruct = struct {
    const Self = @This();
    data: u8,
    fn print(self: Self) void {
        std.debug.print("data = {}\n", .{self.data});
    }
};

const SomeAntherStruct = struct {
    const Self = @This();
    data: [4]u8,
    fn print(self: Self) void {
        std.debug.print("data = {any}\n", .{self.data});
    }
};

The proposed changes introduce type constraints using the ~ symbol, allowing developers to specify the methods that a parameter's type must implement, checked at compile time. This would solve these problems mentioned by @marler8997:

  • it's hard for developers to know what they can pass into a function
  • it's hard for IDEs and doc generators to know what to do with that parameter
  • it's hard for the compiler to provide good error messages, is it a problem with the parameter type or in the function itself?

Note:
the ~ could be anything else, such as:
fn print(f: >Printer ) void
fn print(f: contains Printer ) void
fn print(f: @contains(Printer) ) void

@VisenDev

VisenDev commented Oct 8, 2023

I would also prefer the sort of syntax proposed by @moshe-kabala so as to formally document type constraints on generic input parameters

However, I think this sort of interface-like syntax has been proposed and rejected in the past.

My own two cents is that it seems much better for there to be specific syntax for when anytype should just be an anonymous tuple, like the previously suggested anontuple type.

And that instead of anytype, something along the lines of this could be done

pub const Foo = struct {
   a: u32 = 0,
   b: f32 = 0,

   pub fn print(self: *@This()) void {
      ...
   }
};

pub fn doSomething(printable: @hasDecl("print")) void {
    printable.print();
    ...
}

const foo = Foo{};
doSomething(foo);

@hasDecl just asserts that the input anytype has a declaration named print, however, this wouldn't really tell you anything about the details of the print function. A more formal interface type would do this, however, I'm pretty sure that has been rejected as I said before

But something along the lines of this might help clarify what exactly the anytype parameter needs to implement in order for compilation to be successful
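As a point of comparison, here is a rough sketch of how the same requirement can be written in current Zig, using the real two-argument @hasDecl(T, name) builtin as a manual assertion at the top of the function. The names Foo and doSomething mirror the example above; the error message text is an assumption for illustration.

```zig
const std = @import("std");

const Foo = struct {
    a: u32 = 0,

    pub fn print(self: Foo) void {
        std.debug.print("a = {}\n", .{self.a});
    }
};

pub fn doSomething(printable: anytype) void {
    const T = @TypeOf(printable);
    // Manual constraint: reject the type up front with a readable message.
    // The condition is comptime-known, so the branch is skipped for valid types.
    if (!@hasDecl(T, "print")) {
        @compileError(@typeName(T) ++ " must declare `print`");
    }
    printable.print();
}

pub fn main() void {
    doSomething(Foo{});
}
```

This checks the requirement, but the signature still reads printable: anytype, so the constraint stays invisible to anyone reading only the declaration.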

@the-argus
Author

@moshe-kabala As VisenDev said, stuff like what you're describing has already been rejected, with the explanation that I quoted in the original post. #1669 in particular is similar to what you've proposed. There have been multiple other similar proposals closed for the same reason, such as the one proposing @trait which I linked in the original post.

@the-argus
Author

the-argus commented Oct 8, 2023

@VisenDev I think this is promising! @hasDecl reminds me of @cInclude, maybe it could have a similar functionality with something like @decls

const MyConstraint = @decls({
  @hasDecl("read"),
  @hasDecl("write"),
  @hasDecl("print"),
});

pub fn doSomething(readWritePrintable: MyConstraint) void {
    readWritePrintable.print();
    ...
}

And this is interesting. Only being able to ask for decls and not specific function interfaces reduces complexity. But I think this is the worst of both worlds, not the best of both worlds.

The type returned by your proposed @hasDecl (or my @decls) does not reflect what is expected to come after a colon in a function declaration. In my example code, MyConstraint looks like a type and sits in the same place as a type in the function definition, but it is actually some simple type-validation code which is instantiated at compile time. Unless @hasDecl resolves to anytype and is just a hint for the user? In which case you would have to do duck typing or type-validation code, and then also provide this @hasDecl hint. That seems like code duplication, and only useful for tooling (which I think @marler8997's proposal accomplishes better, and without any additions to language syntax or builtins).

I think you ultimately get all the problems of adding @trait to the language but without the ability to use procedural programming at compile time to do any sort of complex checking.

@VisenDev

VisenDev commented Oct 8, 2023

I think you ultimately get all the problems of adding @trait to the language but without the ability to use procedural programming at compile time to do any sort of complex checking.

Yes, that is a fair point

Another idea would be to add a constraint which ensures an input type is derived from a specific generic
For example

pub fn Writer(comptime T: type) type {
    return struct {
          pub fn print() void {
              ...
          }
     };
}

pub fn doSomething(foo: Writer(anytype)){
     ....
}

The syntax is just a placeholder, but the idea is that you ensure that the anytype is derived from a specific generic function

Other syntax:

pub fn doSomething(foo: @any(Writer)){}
pub fn doSomething(foo: Writer(comptime T: type)){}
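For reference, the closest spelling that works in current Zig is to pass the type parameter explicitly and name the generic in the parameter type. The following is only a sketch: the value field and the typeName method are invented for illustration and are not part of any real Writer.

```zig
const std = @import("std");

pub fn Writer(comptime T: type) type {
    return struct {
        value: T,

        // Hypothetical method, used here only to show which instantiation we got.
        pub fn typeName(self: @This()) []const u8 {
            _ = self;
            return @typeName(T);
        }
    };
}

// `foo` must be exactly Writer(T) for the T the caller names.
pub fn doSomething(comptime T: type, foo: Writer(T)) []const u8 {
    return foo.typeName();
}

pub fn main() void {
    const w = Writer(u8){ .value = 7 };
    std.debug.print("{s}\n", .{doSomething(u8, w)});
}
```

The cost is that the caller has to spell out T at every call site, which is the redundancy the placeholder syntax above is trying to remove.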

@ghost

ghost commented Oct 8, 2023

@the-argus, @VisenDev
You might be interested in proposals #1268 and #6615 (both rejected, unfortunately). They are fairly similar to your current line(s) of thinking.

Edit: oh, I see that #6615 is already referenced in the top post; sorry.

@the-argus
Author

Yeah, @VisenDev I don't think there's much to be thought about here that hasn't already been thought about. Marler's idea is the most novel and really the thing keeping me from closing this issue. I feel pretty certain that there is no language feature we could add which would solve this problem and also keep everyone happy.

@VisenDev

That's fair, I guess we just have to live with the current situation.

It seems the main issue here is the difficulty in creating the proper tooling to accompany the language, not the language design itself

@bb010g

bb010g commented Nov 13, 2023

If Zig is extended to allow comptime struct fields without default values and referring to previous comptime struct fields (e.g. via a builtin that allows access to previous comptime fields of the current struct, which I'll call @thisPtr() here), then fully replicating anytype is trivial:

pub const AnyType = packed struct {
    comptime T: type,
    value: @thisPtr().T,
};

This doesn't address your discontent with comptime T: type, but it at least removes a redundant feature with strange limitations.

@nektro
Contributor

nektro commented Nov 13, 2023

comptime fields afaik are essentially a differently-accessed decl and have the same value for all instances of the type. so generics and ducktyping are still the only way

@bb010g

bb010g commented Nov 15, 2023

comptime fields afaik are essentially a differently-accessed decl and have the same value for all instances of the type.

In that case, what about the following that returns fresh instantiations of the type?

pub fn AnyType() type {
    return packed struct {
        comptime T: type,
        value: @thisPtr().T,
    };
}

The important thing here is delaying resolution of the value of the field T, and thus the type of the field value that depends on the value of the field T, until AnyType(){ .T = u0, .value = 0 } fully instantiates the type with a value for the field T and anything depending on it can also be resolved. Zig can already somewhat handle this sort of partial knowledge at comptime, as shown by the following:

const std = @import("std");

pub const ComptimeAnyTypeConstPtr = struct {
    T: type,
    ptr: *const anyopaque,

    pub fn dereferencePtrType(comptime PtrT: type) type {
        return switch (@typeInfo(PtrT)) {
            .Pointer => |ptrInfo| ptrInfo.child,
            else => @compileError("ComptimeAnyTypeConstPtr ptr must be a pointer"),
        };
    }

    pub fn init(comptime ptr: anytype) ComptimeAnyTypeConstPtr {
        const T: type = dereferencePtrType(comptime @TypeOf(ptr));
        return .{ .T = T, .ptr = @as(*const anyopaque, @ptrCast(ptr)) };
    }

    pub fn get(comptime self: *const ComptimeAnyTypeConstPtr) *const self.T {
        return @as(*const self.T, @alignCast(@ptrCast(self.ptr)));
    }

    pub fn set(comptime self: *ComptimeAnyTypeConstPtr, ptr: *const self.T) void {
        self.ptr = @as(*const anyopaque, @ptrCast(ptr));
    }
};

pub fn main() !void {
    const str: *const []const u8 = comptime blk: {
        var ptr = ComptimeAnyTypeConstPtr.init(&@as([]const u8, "base"));
        ptr.set(&@as([]const u8, "codebase"));
        break :blk ptr.get();
    };
    std.debug.print("All your {s} are belong to us.\n", .{str.*});
}

The big restriction here is that instances of ComptimeAnyTypeConstPtr have to fully exist within comptime, whereas comptime fields are meant to allow instances of structs to only have to partially exist within comptime. Zig handles the assignment of the field T in ComptimeAnyTypeConstPtr.init, instead of in a field default, just fine here.

@octanejohn

I think this is related, though the other proposals are closed: is this project of interest? https://github.com/permutationlock/zimpl

@mbartelsm

mbartelsm commented May 18, 2024

There are a lot of problems with generic code. Generic code is harder to read, reason about, and optimize than code using concrete types. Even if it compiles successfully for one type, you may see errors only later when a user passes a different type. Generic code with type validation code has an even worse problem - the validation code has to match with the implementation when it changes, and there’s no way to validate that. So the position of Zig is that code using concrete types should be the primary focus and use case of the language, and we shouldn’t introduce extra complexity to make generics easier unless it provides new tools to solve these problems.

I find this quote very interesting because it is conflating two very distinct things: making generics easier to implement, and making generics easier to use.

I agree that generic code is hard. Hard to implement correctly, hard to test, hard to use, hard to reason about. But, for better or worse, Zig has generics. That is something that cannot be ignored. The presence of generic capabilities means that generic code will be written; most of the std relies on generic code.

Andrew once described the idea of friction:

The Zig language is designed to prevent bugs by making strategic use of friction. For example this is why coercing a u16 to a u32 works without any syntax at all, yet to go the other way requires the friction of @intCast. @intCast does not mean your code is wrong; it documents code that correlates with a higher chance of bugs.

Following this train of thought, I agree that, if generics are hard to get right, there should be friction to implementing generic code. anytype is the exact opposite of friction. anytype is as smooth as butter.

By making the implementation of generics this easy, all of the reasoning overhead is pushed down onto the consumers of our code: developers using the std, developers using zig libraries.

What these proposals are attempting to do is to push that friction back to the implementors. It is not making generics easier to implement, it's adding gotchas and checks that can be used to verify our own generic code while at the same time informing our users about our expectations, so that they can have an easier time, instead of us.

Zig already supports expressions as types, thanks to comptime evaluation. It stands to reason, to me, to push that one step further and allow comptime function predicates to be passed in place of types, to be evaluated and checked on compilation, and to inform users via readable names and centralized logic what a given function expects them to provide as an argument.

This would be equivalent to this proposal

pub fn fooable(comptime T: type) bool { ... }
pub fn foo(value: fooable) void { ... }

I've also seen the following alternative that is essentially equivalent

pub fn fooable(comptime T: type) type {
    // --- comptime checks ---
    return T;
}
pub fn foo(value: fooable(@TypeOf(value))) void { ... }

But to me this is just a roundabout way of doing predicates that is harder to read for little gain. An argument could be made that additional arguments can be passed to the function but this can also be achieved with the predicate option if we treat it as a function pointer

pub fn sizedFooable(size: u8) fn(comptime T: type) bool {
    return struct {
        const inner_size: u8 = size;
        pub fn fooable(comptime T: type) bool { ... }
    }.fooable;
}

pub fn foo(value: sizedFooable(4)) void { ... }

It's more of a hassle to write but that aligns with the idea of friction for the implementer. Meanwhile, the user gets to reap the benefits by getting a readable type constraint whose specific constraints are contained in a centralized place.
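To make the comparison concrete, here is a sketch of how a parameterized predicate can be approximated in current Zig, where the predicate has to be invoked by hand at the top of the function instead of appearing in the signature. Everything here (SizedFooable, check, callFoo, Small, Big, and the choice of "declares foo and fits in N bytes" as the rule) is assumed for illustration.

```zig
const std = @import("std");

// Hypothetical parameterized predicate: T must declare `foo`
// and occupy at most `max_size` bytes.
fn SizedFooable(comptime max_size: usize) type {
    return struct {
        pub fn check(comptime T: type) bool {
            return @hasDecl(T, "foo") and @sizeOf(T) <= max_size;
        }
    };
}

const Small = struct {
    x: u16 = 0,
    pub fn foo(self: Small) u16 {
        return self.x + 1;
    }
};

const Big = struct {
    data: [16]u8 = [_]u8{0} ** 16,
    pub fn foo(self: Big) u16 {
        return self.data[0];
    }
};

pub fn callFoo(value: anytype) u16 {
    const T = @TypeOf(value);
    // The constraint lives in the body, not in the signature.
    comptime {
        if (!SizedFooable(4).check(T)) {
            @compileError(@typeName(T) ++ " does not satisfy SizedFooable(4)");
        }
    }
    return value.foo();
}

pub fn main() void {
    std.debug.print("{}\n", .{callFoo(Small{})});
    // callFoo(Big{}) would be rejected at compile time: Big is 16 bytes.
}
```

The predicate itself is centralized and readable, but a reader of the signature still only sees value: anytype, which is the gap the predicate-as-type idea would close.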

@mdsitton

mdsitton commented Jul 11, 2024

I like the route that @moshe-kabala is suggesting, though might I suggest the name "shape", which is effectively just a set of expectations that a given type should meet at compile time. Then the $ token can be used instead of "~". Additionally, I've added the ability to construct sets of shapes, similar to how error types work currently.

const std = @import("std");

fn main() void {
    const some1 = SomeStruct{ .data = 42 };
    const some2 = SomeAntherStruct{ .data = [_]u8{1,2,3,4} };

    print(some1);
    print(some2);
}

fn print(comptime f: $Printer ) void {
    f.print();
}

fn printAndSend(comptime f: $SendPrinter) void {
    f.print();
    f.send();
}

// A "shape" is a set of expectations a given type must fulfill at compile time 
const Printer = shape {
    const Self = @This();
    fn print(self: $Self) void;
};

const Sender = shape {
    const Self = @This();
    fn send(self: $Self) void;
};

const SendPrinter = $Printer || $Sender; // Merge multiple shapes into a set similar to error sets

const SomeStruct = struct {
    const Self = @This();
    data: u8,
    fn print(self: Self) void {
        std.debug.print("data = {}\n", .{self.data});
    }
    fn send(self: Self) void {
        // send over a socket
    }
};

const SomeAntherStruct = struct {
    const Self = @This();
    data: [4]u8,
    fn print(self: Self) void {
        std.debug.print("data = {}\n", .{self.data});
    }
};

Additionally this comment to me best describes the general issue here and why this is so much needed in the language:
#1669 (comment)

TLDR: It's the fact that duck typing makes knowing the expectations of a set of functions extremely difficult for more complex codebases. Declaring those expectations up front using this type of "Shape" based system resolves the issues i see with the current system.

Also note: This is something that does not solve the "problem" of multiple concrete types being declared as parameters of a function. It's explicitly around declaring the "shape" of structs during compile time, basically a more explicit form of duck typing.

edit:
The terminology here is inspired by a C# language proposal from several years back dotnet/csharplang#164

@jfalcon

jfalcon commented Aug 28, 2024

Full disclosure, I haven't read all the replies yet. So, if this was covered... oops. But, when designing a language (or anything really) there are two camps generically speaking:

Camp 1

Those who don't want change... much. These people came from C, etc. "We did it this way for years and that's just the way it is."

Camp 2

Those who don't want change... :). These people may have come from a higher level language and are used to newer concepts like generics for instance.

To be completely objective, you have to ask yourself who's right and who's wrong here? And also, who's the target audience(s) and what is the goal of Zig?

If the goal of Zig is to be "C evolved" and low level enough for C devs but modern and friendly enough for non C devs, then it behooves us to think not only like a C dev but also like a modern dev.

To bring it back to generics, generics help reduce code complexity and promote code re-usability. Literally just about every modern language supports them. And the compiler is smart enough to figure out the typing. Can they be abused? Sure. But so can pointers. This is where the artistic side of coding comes in.

You had mentioned that type inference may be hard on the developer, but as long as Zig provides a way to have type constraints and the compiler errors as a result of a generic not meeting that type, the developer using such a routine will very soon figure out type A or type B is not allowed.

In closing, I'll just say... if Zig is just a renamed C then there's less incentive to use Zig and just stick with C. So we need to meet in the middle between old skool and modern.

@sweetbbak
Contributor

sweetbbak commented Dec 11, 2024

why is something like this out of the cards?

pub const Writer = interface {
    fn write(b: []u8) !usize,
};

pub const Reader = interface {
    fn read(b: []u8) ![]u8,
};

// and you could potentially chain them like errors
pub const ReadWriter = interface{ Reader && Writer };

// fulfills writer
pub const NetworkThing = struct {
    funny_thing: usize,

    pub fn write() !void {
        // implementation of a net connection
    }

    // not relevant to writer implementation
    pub fn irrelevant() !void {
        return;
    }
};

// fulfills writer and reader
pub const FileThing = struct {
    pub fn write() !void {
        // implementation of writing to a file
    }
    pub fn read() ![]u8 {
        // implementation of reading from a file
    }
};

pub fn Hello(writer: Writer) !void {
    _ = try writer.write("Hello, world!\n");
}

pub fn main() !void {
    const thing = NetworkThing{ .funny_thing = 1 };
    const thing2 = FileThing{};
    try Hello(thing);
    try Hello(thing2);
}

basically just a contract that an object is expected to have XYZ functions or struct members but making no assumptions about the underlying implementation of those things. If an object fulfills that contract, then it implements that interface type. Then you can ensure that a type passed to a function is guaranteed to have those functions. It would be nice if this resulted in a compile time check that the passed type does indeed fulfill the contract. Then the LSP can also infer that an object like writer: anytype has functions like writer.writeAll etc...

A lot of the ideas here look very complicated to me and I'm not really sure how this is worse than anytype. I'm no design expert obviously, just wanted to throw my 2 cents in.

@the-argus
Author

@sweetbbak As for why this is not in the cards, one of the first comments on this issue:

at this point, the ship has sailed for any such radical redesign of the type system. FYI, the official position of the core team is that the language is mostly done and most open proposals will be rejected. As for the existing anytype-style generics, a lot of people have thought long and hard about how to improve them, but without too much success. Procedural generics are kind of contagious, and it's really hard to carve out a declarative subset that is 1) general enough 2) will not cause problems elsewhere and 3) offers substantial benefits over a manual type assertion at the top of the function

There are plenty of proposals for how to do this. There are also some excellent arguments for why it should be done. I really like @mbartelsm 's point about friction, in particular. I like your proposal a lot, better than any of my own. But like ghost said, the feature is contagious: if you make an API with it, other users have to implement your interface/shape, or whatever. It changes the whole face of the language and the type of problems its users have to solve. And also as ghost said: the language is basically done in terms of features.

On a kind of unrelated note, I am back at this issue more than a year after posting it so I wanted to mention/reply to @mdsitton who linked this comment above which better describes the issue (and why it should be considered urgent) than any of the many paragraphs I have written here.

That comment^ mentions static analysis to provide runtime type documentation through tooling. Such a thing was proposed above by @marler8997 and it really seems doable. If you like zig but this is a deal breaking issue for you, I think the clearest plan of action is to implement a contract system in a language server such as ZLS rather than proposing a feature for the language.

I'm still leaving this issue open because I agree with @mbartelsm :

generic code is hard. Hard to implement correctly, hard to test, hard to use, hard to reason about. But, for better or worse, Zig has generics. That is something that cannot be ignored. The presence of generic capabilities means that generic code will be written; most of the std relies on generic code.

That is, this is certainly still an "issue" for us, by the English meaning of the word. If a core team member decides this is out of scope for the project, I'll let them close this issue.
