Proposal: User definable type constraints on polymorphic parameters #1669

Closed
tgschultz opened this issue Oct 20, 2018 · 19 comments

@tgschultz
Contributor

tgschultz commented Oct 20, 2018

As shown in #1268 (and to an extent #130), several users have requested some form of comptime interfaces/contracts/concepts/typeclasses/insert-terminology-here. While the status quo, duck typing, is sufficient to provide the desired functionality, it offers few clues as to what the parametric function expects, and errors from performing an illegal operation on the passed-in type can occur in the middle of large function bodies, obscuring the source of the problem (the function was passed an inappropriate type).

I propose a change to the var keyword allowing it to take a parameter of type fn(type)bool. At comptime, when a new type is passed in the parametric parameter, the compiler will evaluate the function against the type of the passed parameter and, if the result is false, throw an appropriate compile error identifying the constraint function, the passed parameter type, and the call site of the parametric procedure.

std.meta.trait, recently merged in #1662, provides several simple introspection functions that could be used in this way:

const trait = std.meta.trait;

pub fn reverseInPlace(list: var<isIndexable>) void {
    ...
}

pub fn createWorld(allocator: var<hasFn("alloc")>) []Things {
    ...
}

const isVector = multiTrait(
    TraitList.{
        hasField("x"),
        hasField("y"),
        hasField("z"),
    }
);

pub fn dot(a: var<isVector>, b: @typeOf(a)) @typeOf(a) {
    ...
}

But since the constraint functions are completely user-definable, they can be arbitrarily complex.

pub fn isChildOf(comptime P: type) trait.TraitFn {
    return struct.{
        pub fn trait(comptime T: type) bool {
            switch(@typeId(P)) {
                builtin.TypeId.Pointer => {
                    const info = @typeInfo(P).Pointer;
                    switch(info.Size) {
                        builtin.TypeInfo.Pointer.Size.One => {
                            if(@typeId(info.child) == builtin.TypeId.Array) return meta.Child(info.child) == T;
                            return info.child == T;
                        },
                        else => return info.child == T,
                    }
                },
                builtin.TypeId.Array,
                builtin.TypeId.Optional,
                builtin.TypeId.Promise => return meta.Child(P) == T,
                else => return false,
            }
        }
    }.trait;
}

pub fn indexOf(list: var<isIndexable>, item: var<isChildOf(@typeOf(list))>) !usize {
    ...
}

See this gist for an overly complex example.

@andrewrk andrewrk added this to the 0.5.0 milestone Oct 23, 2018
@andrewrk andrewrk added the proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. label Oct 23, 2018
@williamcol3

I'm a much bigger fan of this form of constraint checking compared to any other option I've seen (e.g. an interface definition). It is a much more general solution that avoids locally maximizing.

@Hejsil
Contributor

Hejsil commented Apr 10, 2019

We can take this to the extreme if we want. We really don't need var with a system like this in place. We can define that a parameter is Keyword_comptime? <Identifier>: <Expr>, where <Expr> is any expression that evaluates to something of type type or fn(type)bool.

With this, we can define var like this if we want to:

fn isAny(comptime T: type) bool {
    return true;
}

fn generic(any: isAny) @typeOf(any) {
    return any;
}

I don't think we lose much readability. type variables should be TitleCase and functions should be camelCase according to the style guide, so it should be clear that isAny is not a type based on the name alone.

It's always good to remove features when possible, and with a system like this, var is no longer needed.

@Hejsil
Contributor

Hejsil commented Apr 10, 2019

Another thing that would be nice is if functions without a body were allowed to be declared. This would allow us to have an abstraction like this:

const isSomething = interface(struct {
    x: u8,
    y: u8,

    pub fn add(@This(), @This()) @This();
});

Currently, this is not possible:

pub fn a() void; // test.zig:1:5: error: non-extern function has no body

test "" {
    @compileLog(@typeOf(a)); // test.zig:4:25: error: use of undeclared identifier 'a'
}

I really think the error error: non-extern function has no body should happen when a leaves comptime and goes into runtime (aka, trying to call a or storing it in a variable).

@Rocknest
Contributor

Rocknest commented Apr 14, 2019

@Hejsil I do like the explicitness of var.

Also, if I wanted to use var the way it works now, I would probably need to declare something like fn func(any: std.meta.any) @typeOf(any), or write the function myself instead of importing it.

@ghost

ghost commented May 12, 2019

Just want to share my idea of what zig interfaces could look like. Note the angle brackets to distinguish interface implementations from normal types:

// wrap "iface" keyword in compiletime function to get generics
fn createIEventListener(comptime message: type){
	return iface{
		fire : fn(@This(),message) void,
	}
}

// defining the interface identifier
const IEventListenerMessageU8 = createIEventListener(u8);
const IEventListenerMessageMouseEvent = createIEventListener(MouseEvent);

const MyListenerU8 = struct{
	comptime{
		// tell compiler that instances of this struct shall comply with the given interface
		assertImplements(IEventListenerMessageU8,@This());
	}
	fn fire(msg : u8) void{
		switch(msg){ ....
		}
	}
}

// (an artificial) example of making a parameter taking interface implementations
fn fireIfGameOver(gamestate: *GameState, listener : <IEventListenerMessageU8>){
	if(...) listener.fire(5)
}

// passing implementations into a function taking an interface:
// -> compile error if the passed type was determined to not implement the interface properly.
fn main(){
	...
	const listener = MyListenerU8.init();
	while(gamestate.running()){
		fireIfGameOver(gamestate,listener);
	}
}

Another example here: https://gist.github.com/user00e00/85f106624557b718673d51a8458dbab8

@daurnimator
Contributor

With @Hejsil's proposal above, how would we tell if the thing should be called or used as a type?
Just based on type (fn a(b: c) if @TypeOf(c) == type use as is, otherwise call as a function?)? Would this be unfriendly to static analysis?

@hryx
Contributor

hryx commented Jan 6, 2020

@daurnimator It is ambiguous, yep. It would break in this case:

fn f(a: fn(comptime T: type) bool) void {
    // The compiler converted `a` to `type` because it thought it was a type constraint
    _ = a(usize); // error: expected function, found 'type'
}

So I think a syntactic change per the original proposal would be necessary.

Note that since var is a keyword we could use parens unambiguously, as opposed to introducing pointy brackets. This precedent exists with align(x).

pub fn write(w: var(isWriter)) !usize {
    // ...
}

@ghost

ghost commented Jan 16, 2020

@hryx

pub fn write(w: var(isWriter)) !usize {
    // ...
}

I must say I really like this. Earlier, I was thinking about writing a proposal for allowing the aliasing of var when it is used as a parameter type, with something like @VarAlias(predicate: fn(type) bool)

fn isActionListener(b : type) bool { ...}

/// Single point of reference for documentation
const ActionListener = @VarAlias(isActionListener);

fn doActionIfCondition(al : ActionListener, cond: bool) void { ...}
fn doActionIfOk(al : ActionListener, state: State) void { ...}

With your idea, the above becomes

/// Single point of reference for documentation
fn isActionListener(b : type) bool { ...}

fn doActionIfCondition(al : var(isActionListener), cond: bool) void { ...}
fn doActionIfOk(al : var(isActionListener), state: State) void { ...}

I prefer the latter. It is simpler, and makes it easier to see the difference between a type constraint and a normal type.

@pron

pron commented May 22, 2020

@hryx Can you explain the ambiguity? The type of the expression fn(comptime T: type) bool is type not fn(type)bool.

@eriksik2

Why not

fn ActionListener(b : type) type { ...}

fn doActionIfCondition(al : ActionListener(var), cond: bool) void { ...}
fn doActionIfOk(al : ActionListener(var), state: State) void { ...}

This doesn't introduce any new mental overhead of having a special syntax for type constraints, since it's just a generic generalization of the already existing type syntax. I also think this would be better than user defined constraints since with those you would have to write a fn isGenericType(type) bool for every fn GenericType( ...) type, and then also have to keep both those names in mind when working with GenericType.

This is also more versatile than var(isActionListener) since you can state exactly what you want to constrain, enabling stuff like

fn Array(len: usize, T : type) type { ...}

fn doSmtnWithArrayOfInt(arr : Array(var, i32)) void { ...}
fn doSmtnWithArrayOfLen5(arr : Array(5, var)) void { ...}

Essentially pattern matching, and since Zig doesn't have function overloading we automatically avoid the problem of ambiguity that usually comes with it.

I don't think this would be a complicated implementation either. An expression FnName(var) would simply be the same as var but also compile error if given a type that doesn't originate from a call to FnName. Zig should already know where all types originate from.
An expression OtherFn(var, u8, var) would only allow types that originate from a call to OtherFn where the second argument was u8, etc. (Maybe a step too far into complexity?)

On the parsing side you wouldn't be able to distinguish between a type constraint and a regular function call until after the parsing stage, I don't think. But this doesn't really matter since Zig already doesn't distinguish between types and values and could easily enforce a type constraint to only be used where a type is expected at the semantic analysis stage, just like with types.

A downside of taking this approach is that you can't express more complicated constraints than "this is this kind of generic type" in the function declaration, whereas user defined constraints could express anything that can be expressed in comptime code, for example, "this is any struct which has a field 'count' with type usize". But I think the user defined approach is a non-solution to the problem it's trying to solve for these reasons:

  • The reason you'd want a constraint in the function declaration is so that you can glance at it and immediately know what types are accepted. But if the constraint is a call to a different function you still have to know the implementation of that function to know what types are accepted.
  • User defined type constraints are already supported as comptime asserts in the function body. Moving them to the function arguments does almost literally nothing in terms of benefits, while adding language complexity.

I think pattern-matching-like type constraints are better suited for language support because:

  • You can look at the function declaration and know what types are supported. No semi-hidden function call.
  • More advanced type constraints can still go in the function body. (Where, IMO, they belong).
  • This would work way better with eventual IntelliSense-like tooling.
  • There is currently no way to refer to "any instance of some generic type", this would be syntax for that. std.fmt.format would become something like fn format(std.io.OutStream(var), []const u8, var). This is clear and intuitive.

@tgschultz
Contributor Author

At first reading I had some criticisms of the above proposal; however, while typing them up I realized they weren't really problems at all. To me, this is a sign that the idea is a good one. The more I think about it, the more I think @eriksik2's proposal is better than my original idea.

@ifreund
Member

ifreund commented Jun 12, 2020

I feel like I'm missing something, but is this not just slightly more sugary syntax for the following?

fn ActionListener(b : type) type { ...}

fn doActionIfCondition(T: type, al : ActionListener(T), cond: bool) void { ...}
fn doActionIfOk(T: type, al : ActionListener(T), state: State) void { ...}
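
For reference, a minimal sketch of the explicit form above, assuming a recent Zig release. Pair and sum are hypothetical names used only for illustration; they stand in for ActionListener and the functions that consume it.

```zig
const std = @import("std");

// Hypothetical generic type, standing in for ActionListener(T).
fn Pair(comptime T: type) type {
    return struct {
        first: T,
        second: T,
    };
}

// The caller names T explicitly, so the parameter type Pair(T) is fully
// known at the call site and no `anytype` is needed.
fn sum(comptime T: type, p: Pair(T)) T {
    return p.first + p.second;
}

test "explicit type parameter" {
    const p = Pair(i32){ .first = 1, .second = 2 };
    try std.testing.expectEqual(@as(i32, 3), sum(i32, p));
}
```

The cost of this form is that every call site has to spell out T, even when the argument already determines it.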

@pfgithub
Contributor

@ifreund That requires that you know the type of al, which isn't always possible.

doActionIfCondition(? what is T ?, .{"1", "2", "3"}, true);

This is slightly more sugary (and more readable) syntax for:

fn doActionIfCondition(al: var, cond: bool) void {
    checkActionListener(@TypeOf(al));
}

@pfgithub
Contributor

@hryx It is ambiguous, yep. It would break in this case:

fn f(a: fn(comptime T: type) bool) void {
    // The compiler converted `a` to `type` because it thought it was a type constraint
    _ = a(usize); // error: expected function, found 'type'
}

I don't think this is an issue

@TypeOf(fn(comptime T: type) bool) == type, while @TypeOf(typeConstraintFn) == fn(comptime T: type) bool

If the @TypeOf() of the expression is type, treat it normally and do casting and stuff

If the @TypeOf() of the expression is fn(comptime T: type) bool, treat it like anytype and use the function as a type constraint

Also, even if it was a type constraint, @TypeOf(a) would be the type of the argument you passed in, not type.

fn f(a: typeConstraintFn) void {
    _ = a(usize); // error: expected function, found 'comptime_int'
}
f(25); // note: called from here

@Rocknest
Contributor

Rocknest commented Sep 12, 2020

After reading @eriksik2's comment I think that we have three distinct use cases for var/anytype

The first is status quo anytype:

fn eatsEverything(x: anytype) void {...}
// in terms of existing semantics that would be expressed as
fn eatsEverything2(x: @TypeOf(x)) void {...} 
// status quo: use of undeclared

Second is proposed type constraints (a more generalised version of the first usecase):

fn typeConstraint(x: var/anytype(isSomething)) void {...}
// compile error when type constraint is not satisfied 
fn isSomething(comptime T: type) bool {...}

And the third being pattern matching (an unrelated use case):

fn List(comptime len: usize, comptime T: type) type {...}

fn listLenAny_Typei32(x: List(var/anytype, i32)) void {...}
fn listLen5_TypeAny(x: List(5, var/anytype)) void {...}

 
The first and the second use cases could be unified if we were to allow @TypeOf to refer to the parameter it is defining the type of, and possibly remove a keyword that disguises itself as a type

fn eatsEverything(x: @TypeOf(x)) void {...}
fn typeConstraint(x: IsSomething(@TypeOf(x))) void {...}
fn IsSomething(comptime T: type) type {
  if (std.meta.trait.hasFn("alloc")(T)) {
    return T;
  } else {
    @compileError("type does not have function 'alloc'");
  }
}

// another possibility is to define a new builtin instead of generalising @TypeOf()
fn eatsEverything2(x: @Infer()) void {...}
fn typeConstraint2(x: IsSomething(@Infer())) void {...}
// @Infer() works in a scope of type declaration of a parameter of the function
// something like @This()

 
And pattern matching, as a separate proposal, may work like this:

fn List(comptime len: usize, comptime T: type) type {...}

fn listLenAny_Typei32(x: List(@Anything(), i32)) void {...}
fn listLen5_TypeAny(x: List(5, @Anything())) void {...}
// @Anything() is not a real type, but it makes the compiler create a type pattern
// if the type of the value matches this pattern it is accepted, otherwise it's a compile error

const list7i32 = List(7, i32).init(.{-1, 1000, 8, 2, -345, 4, 5});
const list5u8 = List(5, u8).init(.{0x1, 0x2, 0x3, 0x4, 0x5});
listLenAny_Typei32(list7i32); // ok
listLen5_TypeAny(list5u8); // ok
listLenAny_Typei32(list5u8); // error: expected type List(_, i32) found List(5, u8)
listLen5_TypeAny(list7i32); // error: expected type List(5, _) found List(7, i32)

@kristoff-it
Member

kristoff-it commented Oct 8, 2020

I support the anytype(isFoo) syntax.
One important use-case to consider is when a library is asking the user to provide a type that conforms to a given interface. In that case, the type-checking function should also provide hints as to why the constraint was not fulfilled.

All of this can (and should, IMO) be implemented in userland. I'm doing something similar in my Redis client:
https://github.com/kristoff-it/zig-okredis/blob/master/src/traits.zig#L18-L47

Having dedicated functions in, say, std.meta.constraint would help automate and streamline this process.
For example (from a discussion with @MasterQ32):

fn isRedisParser(comptime T: type) void
{
  if (std.meta.hasDecl(T, "Redis")) {
     std.meta.constraint.hasFunction(T, "parse", fn(u8,comptime type, anytype) !T);
     std.meta.constraint.hasFunction(T, "destroy", fn(…) void);
  }
}

If you look at this composition, you can also see why it might make sense for type-checking functions to have a void return value instead of bool: if something is wrong, you expect the function to @compileError.
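
A minimal sketch of such void-returning checks, assuming a recent Zig release. requireDecl and requireParser are hypothetical userland helpers, not part of std or of zig-okredis, and the check is reduced to "the declaration exists" rather than a full signature match.

```zig
const std = @import("std");

// Hypothetical helper (not part of std): stops compilation with a hint
// when T does not declare `name`. Assumes T is a struct, enum, union,
// or opaque type, as @hasDecl requires.
fn requireDecl(comptime T: type, comptime name: []const u8) void {
    if (!@hasDecl(T, name)) {
        @compileError("type '" ++ @typeName(T) ++ "' must declare '" ++ name ++ "'");
    }
}

// Checks compose by plain sequencing, since a failure is a compile error
// rather than a `false` that the caller has to combine and report.
fn requireParser(comptime T: type) void {
    requireDecl(T, "parse");
    requireDecl(T, "destroy");
}

const GoodParser = struct {
    pub fn parse() void {}
    pub fn destroy() void {}
};

comptime {
    // Compiles as written; removing `destroy` from GoodParser would
    // trigger the hint produced by requireDecl.
    requireParser(GoodParser);
}
```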

@SpexGuy
Contributor

SpexGuy commented Dec 6, 2020

Discussed this issue with @andrewrk and @marler8997.

This proposal has clear ergonomic benefits for writing generic code. However, everything that this proposal introduces is already possible by calling the validation functions in the first few lines of the function.
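
A minimal sketch of that status-quo approach, assuming a recent Zig release. hasArea, totalArea, Square, and Rect are hypothetical names chosen for illustration; the predicate has the fn(type) bool shape from the original proposal and is simply called at the top of the function body.

```zig
const std = @import("std");

// Hypothetical predicate in the fn(type) bool shape from the proposal.
// Assumes T is a struct, enum, union, or opaque type, as @hasDecl requires.
fn hasArea(comptime T: type) bool {
    return @hasDecl(T, "area");
}

fn totalArea(a: anytype, b: anytype) f64 {
    // Validation up front: a bad argument type fails here, naming the
    // constraint, instead of somewhere deep in the body.
    comptime {
        if (!hasArea(@TypeOf(a))) @compileError(@typeName(@TypeOf(a)) ++ " does not satisfy hasArea");
        if (!hasArea(@TypeOf(b))) @compileError(@typeName(@TypeOf(b)) ++ " does not satisfy hasArea");
    }
    return a.area() + b.area();
}

const Square = struct {
    side: f64,

    pub fn area(self: Square) f64 {
        return self.side * self.side;
    }
};

const Rect = struct {
    w: f64,
    h: f64,

    pub fn area(self: Rect) f64 {
        return self.w * self.h;
    }
};

test "validated generic call" {
    const total = totalArea(Square{ .side = 2 }, Rect{ .w = 3, .h = 3 });
    try std.testing.expectEqual(@as(f64, 13), total);
}
```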

There are a lot of problems with generic code. Generic code is harder to read, reason about, and optimize than code using concrete types. Even if it compiles successfully for one type, you may see errors only later when a user passes a different type. Generic code with type validation code has an even worse problem - the validation code has to match with the implementation when it changes, and there’s no way to validate that. So the position of Zig is that code using concrete types should be the primary focus and use case of the language, and we shouldn’t introduce extra complexity to make generics easier unless it provides new tools to solve these problems.

Since everything in this proposal is possible with the current language, we don’t think this is worth the complexity it adds. Especially since this feature is only for generic functions, which should be used sparingly.

Because of that, we've decided to reject this proposal, with the aim of keeping the language simple. We know this may be somewhat unexpected, given the popularity of this issue. However, having a simple language means that there will always be places where it would improve ergonomics to have a little bit more language. In order to keep the language small, we will have to reject many proposals which introduce sugar for existing features.

@jumpnbrownweasel
Contributor

@SpexGuy I can accept this outcome for my personal use, but the following statement just doesn't match reality and makes me worry about the future of Zig:

So the position of Zig is that code using concrete types should be the primary focus and use case of the language

The Zig stdlib has tons of generic code for mundane things like array lists, hash maps, equality, etc. These are not all that different than code that users will write, especially for larger projects. The only way I can understand the statement above is if you only expect Zig to be used for small projects and that very few libraries will be created.

Generics are also very commonly used in all other statically typed languages I've tried. The one exception I know of, Go, is now adding generics.

It is unconvincing to say that concrete types are better than generic types, when people commonly use generic types and they're used commonly in Zig's code as well.

@ThadThompson

I'll add 2 cents along with @jumpnbrownweasel. Here are my pain points as a new Zig user, implementing a serialization library for Zig.

Readable Discovery
I would like to be able to abstractly serialize to an output stream of some kind. Digging into Zig's JSON serializer for inspiration, I see this:

pub fn stringify(
    value: anytype,
    options: StringifyOptions,
    out_stream: anytype,
) @TypeOf(out_stream).Error!void {
// ...

... seeming to indicate that value and out_stream could be anything. Of course, not only is that untrue, it is untrue in very specific ways: value actually could be almost anything that can reasonably be JSON serialized, whereas out_stream is a struct required to have a very specific form. From a glance at what is readable in my window, out_stream must:

  • Have a writeByte function
  • Have a writeAll function
  • Be something I can pass to std.fmt.formatFloatScientific and do whatever it needs
  • Be something I can pass to std.fmt.formatIntValue and do whatever it needs
  • Be something I can pass to child_whitespace.outputIndent and do whatever it needs
  • Be something I can pass to whitespace.outputIndent and do whatever it needs

Both value and out_stream represent types that must satisfy a contract to compile correctly. However, most of value's contract is sitting here in the stringify function ("oh, we can't send down an untagged union - got it - thanks for the useful error message") whereas out_stream's contract is smeared out in this function and down at least four other functions.

The requirements of out_stream are a protocol - without a centralized place for a protocol definition. Anyone implementing an out_stream or receiving a compatible struct has to dig through all consumers to see what it needs to do. Yes, it's compile time checked - which is a major step forward - but seeing an anytype gives me the same feeling as when I come to a C function that takes void *.

Tooling Discovery
In functions using an anytype parameter, our tooling also doesn't know anything about it. But - if our language server or code editing tool knows that out_stream in the above example is a Writer then it can surface documentation for it, give auto-complete suggestions for the methods it implements (and documentation for those) and provide inline feedback when you do something wrong, such as passing the wrong parameter type to one of its methods. Without that protocol definition somewhere, our tools don't know what to do with it.

Other languages have solved this with types/interfaces/contracts/traits - and while having to write and work around those contractual definitions does create overhead, having that information available in a centralized place at development time is really, really handy - especially when interfacing with other people's code (or my code over time).

On the other hand, if there were a tool that could dynamically run the Zig compiler or otherwise do whole program comptime analysis, and surface that contractual information in a coherent way during development... that could be a whole new ballgame.
