Proposal: Function Definition Types #8383
Comments
I don't see why we shouldn't allow that. I think it's much nicer than a new |
What does it mean to align a pointer to a function? Or to make it const vs non-const? What about volatile or allowzero? |
uh, it means the body of the function i.e. the machine code, is located at an address that has said alignment.
same as for any other pointer
|
That's not what it means. Function pointers in the 1717 proposal are the address of a function label, which is an abstract comptime-only entity. Writing to one of these pointers does not write the function code, it swaps out the underlying comptime-only function label. This is precisely why it's misleading. |
That wasn't my understanding: my understanding is that a function label is a 'label' for the machine code, as an address isn't known at comptime (it's only known at link-time). At link-time the label finally gets a 'real' address, and any static function pointers are updated to point at this link-time location (for dynamic executables/libraries this linking happens at load time). |
I'm talking about this specific part of the proposal which aims to solve this issue and is an accepted part of 1717. For the language to make sense on the whole, this code needs to work:
const a = fn() void { print("a"); };
const b = fn() void { print("b"); };
test "what is a fn ptr" {
    const fptr = comptime blk: {
        var x = a;
        var p = &x;
        p.* = b;
        break :blk p;
    };
    fptr();
}
But this means that the address of a function is not the address of some machine code. Instead it's the address of a comptime-only object. The fact that such a thing can exist at runtime is super weird and leads to misunderstandings about what it actually is, like yours. This proposal is an attempt to fix that, by separating function pointers and function definitions into separate categories, so that a pointer to a comptime function label object can be differentiated from a pointer to machine code. |
In the linked proposal this is a compile error:
|
That's at top level scope, which makes it a runtime variable. Runtime variables cannot contain comptime-only values, so that's a compile error. In my example above, I am using a comptime var, which works just fine. |
@SpexGuy,
const foo = fn(x: i32) i32 { return 2*x; }
const a = foo; // function alias, preserves all metadata
var b = foo; // probably not allowed
const p = &foo; // constant function pointer
var q = &foo; // variable function pointer
export const x = foo; // exported function
export const y = &foo; // exported function pointer |
The problem is that this model is different from the rest of the language, and doesn't make sense with comptime. Let me go through your example and annotate it to show the problem.
const foo = fn(x: i32) i32 { return 2*x; }
// has type `fn(i32) i32`
const a = foo; // function alias, preserves all metadata
// If you can do this, functions are not pinned. This doesn't create a new function. Therefore fn types are comptime references, not values.
// You can make a comptime var of this, take a pointer to that var, and mutate the value through the pointer.
// So a pointer to this is a pointer to a comptime reference to the function data.
var b = foo; // probably not allowed
// This needs to be disallowed for a specific reason. In this case, it's disallowed because functions are comptime-only
// types, and therefore you cannot have a runtime var of this type. But you can have a comptime var of this type.
// To disallow that is unreasonable and has no precedent in the language. It would be thoroughly unexpected and make no sense.
const p = &foo; // constant function pointer
// This has the type *const fn(i32) i32. The const comes from the fact that `foo` is stored in constant memory.
var q = &foo; // variable function pointer
// Same type, but this is allowed. This means you can have a runtime pointer to comptime-only data.
// So `q.*` is a compile error because the result is comptime-only but `q` is not comptime-known.
// Normally pointers to comptime-only types are themselves comptime-only, so this breaks all those rules as well.
// So now the conditions for is-comptime-only has these weird exceptions:
// comptime only primitive -> true
// fn -> true
// pointer -> payload is fn -> fn is not generic -> false
// pointer -> payload is comptime only -> true
// aggregate -> any field is comptime only -> true
// else -> false
// Whereas before it was very simple:
// comptime only primitive -> true
// pointer -> payload is comptime only -> true
// aggregate -> any field is comptime only -> true
// else -> false
// Looking at this type, you would think that the `const` here means that the function code is constant.
// But that is a lie, that's not what it means. The `const` here means that the underlying function reference is constant
// and cannot be changed through this pointer. Attributes on this pointer apply to `foo`, which is the label that
// was copied to `a`, not some list of function code.
export const x = foo; // exported function
// This is fine and makes sense
export const y = &foo; // exported function pointer
// This is also fine, no problem here. The binary representation of a function pointer
// is the pointer to machine code, even though the attributes apply to the underlying comptime object.
The other important problem this solves is parameter names being stored in decls. That absolutely has to go, it's not where that information belongs. |
@SpexGuy I don't see how functions behaving a bit differently is a problem. The tradeoff is either a) increase the number of rules for comptime-only-ness from four to five, the new rule concerning a core part of the language that it is absolutely reasonable to expect a user to take the time to understand or b) add an entirely new feature, keyword and builtin to the language, because we don't respect functions enough to give them the space to work how they should with existing features. It's like natural language: the most often used verbs are the irregular ones, because it's reasonable to expect a speaker to learn all the ins and outs because they use them so often. As far as I'm concerned, most of this proposal is ugly and superfluous. An alternative solution would be:
This has all the necessary functionality at comptime, and does not require any modification to work at runtime; the only downside is a few more characters to type, and the upside is no additional language features. Also, that
Quite literally the only downside of this would be that the type does not name parameters, but this is a non-issue with the use of comments, as seen above. One thing I do like is the naming scheme of |
Yes, function literals are not "values" in the same sense as integers or structs, since they cannot be copied, inspected or modified. Function literals are only touched directly by the compiler. The programmer only gets a handle. You could call this handle a "function label", but there's no real need to officially call it anything. It's an implementation detail. The programmer only needs to know that assigning a function to a variable creates an alias and not a bitwise copy. This is different from how assignment works for other types, but it shouldn't really surprise anybody, since it is the only reasonable behavior.
See below.
I actually don't see any big problems with allowing function vars. Since we already agree that functions are only handles (whatever we call them), function vars would effectively be function references (i.e. function pointers that are fully typechecked and do not support pointer arithmetic). These could be usable both at runtime and comptime.
The
I think it's reasonable to disallow dereferencing function pointers, for much the same reasons that you can't dereference a void pointer in C.
Yes, it's an exception. But it's rooted in a real underlying difference, so why not? I think that jumping through hoops to create consistency for consistency's sake is the wrong choice here.
Parameter names can be part of the function literal, since they don't belong in the type. But I'm out of my depth here implementation-wise. |
The proposal feels very much like constant Lua tables, but with more precise information. @SpexGuy Could you briefly elaborate on what the Function Definition Types should and could be used for?
In Lua everything is a table that you can modify, and metaprogramming could use the same functionality (copy + change stuff would be safer). Whether that is a smart idea, performant, or efficient to implement in the compiler is another question. |
People seem to be objecting a lot to the new keyword, but that's not actually necessary to this proposal. We could use Similarly, this proposal introduces no new functionality over 1717, just a different syntax to bring it about. 1717 still has conversion from functions to function pointers and back, runtime function pointers, and comptime functions. So the behavior here is not more complex, this flavor just allows things to behave more like other parts of the language, which I think is highly valuable.
You have it exactly backwards. The solution in 1717 either doesn't work with comptime var or behaves unexpectedly with pointer modifiers. This proposal is the version that gives functions the space to work how they should with existing features.
Consistency is extremely valuable. It allows a language to be intuited. People can guess how a feature should work, and then it does work that way. Zig is a highly consistent language, much more than other languages. This is a large part of what makes it feel simple. Arrays are value types because it's more consistent, even though C's approach is more pragmatic. Comptime loops and runtime loops use the same syntax because it's consistent. Errors and error unions behave like normal values because it's consistent. I think there's a very high cost to breaking consistency, and I would like to avoid that in a core part of the language like this. Not being able to make a comptime var of a specific type is unprecedented in the language, and raises all kinds of other questions. Can you embed one of these values in a mutable comptime-only struct instance? What about a comptime field? If not, why not? If so, can you take a pointer to it? What happens if you mutate through that pointer? Are all function pointers const? If not, how does one create a mutable function pointer? What does it mean to dereference it? Why can I create a pointer to a function but not a pointer to a comptime_int, even though both are comptime only? There's a huge amount of complexity involved in breaking consistency here, in the form of all of these questions. The answers are irrelevant, the problem is that these questions exist at all. Every newcomer to the language will ask them. They will be a stumbling point forever. You cannot look to any other feature in the language to help answer these questions, because this is unlike any other part of the language. I would not so easily introduce that sort of complexity.
They aren't necessary, I was just giving examples of information that makes sense in a function declaration type but not in a function pointer type. I don't know that TZIR representation is really relevant here.
They exist to make function pointers behave like pointers to instruction data, and functions behave like comptime objects. This is what people intuitively expect from these types. Function pointers don't need to be a special case in terms of comptime behavior. We can have their behavior be intuitive from other parts of the language. Making this distinction allows that. |
There's that word again. Personally, having only generic function types and anti-pinning function values is the most intuitive solution to me; the proposal as written is just bizarre (are generic non-pointer function types a thing? They're not mentioned, but seem to be implied...?). Optimising for intuition always assumes something of the developer which is not universally applicable. Attempting to have No Exceptions™ to the pointer rules is already a lost cause -- we already break the rules for opaque types (no dereferencing) and variable-width types (pointers cannot be read at runtime). Contorting the developer interface to functions to make them fit the rules for specifically non-opaque, fixed-width data, when functions in reality are neither of those things, I think is ten times more bizarre and crazy than making pointers to them work a bit differently. |
@SpexGuy, End rant. What I'm saying is
As mentioned before, I don't see a problem with function (handles) being vars, comptime or otherwise.
Sure.
Not a comptime pointer. The
You can't. Footgun eliminated, no?
Function pointers themselves can be reassigned. The function they point to cannot be modified. Direct dereferencing is not allowed, for the reasons pointed out by @EleanorNB above, in addition to the fact that there's nothing you could realistically do with the "value" of the function.
Because it makes sense and is in fact useful. Whether a useful analogue can be found for
The answers are very much relevant, IMO. Many of the questions sound rather hypothetical. The disallowed things are disallowed because it would be an error or undefined behavior to attempt them. And I don't think it will be a problem for learners. When they try to do these things, they will get an error message informing them that they are trying to do something impossible. |
I agree with most of what @zzyxyzz is saying, save three points:
|
After sleeping on it, I realized that I've been talking past @SpexGuy yesterday and missing some important points. There are some constraints to this problem that are not easy to satisfy simultaneously. Here I've tried to document them, since they weren't clearly explained before:
So long as functions only live in stack variables, the language has some leeway to make them "just work" without burdening the programmer with the details. We might even excuse some special behavior here and there. There are some cases, however, where implementation details are forced to a point. For example, you might have a struct that supports dynamic behavior by assigning a function to a field:
object.action = fn() void {}; // very exciting
Since structs are plain old data, we can't dance around the issue here, and have to settle on a binary representation for the function handle. Logic suggests that it must be a function pointer. But then it would be hard to justify it if a function assigned to a local variable wasn't a function pointer as well:
var f = fn() void {}; // variable f holds a function pointer internally
So far so good. But what happens when we do the same at
const std = @import("std");
const print = std.debug.print;
Since
But that's not all. There are places where you need to officially acknowledge the distinction between a function and the function pointer that may represent it. One such case is exported functions and function pointers at the ABI level. Another is low-level work with function pointers, where you might need to do pointer arithmetic. This raw function pointer can be neither
I think these are the main constraints, but not all. There's also propagation of constness and comptimeness, where to put things like argument names for introspection and metaprogramming, generic functions, and
I'm totally out of my depth here, and I should have taken more time to understand the issues involved before going on my commenting crusade yesterday. I don't know what the right solution is here. This proposal doesn't feel quite right yet. But the issue is far from simple, although it would have been if it weren't for Zig's comptime. |
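For anyone following the struct-field constraint described above, here is a minimal sketch of a plain-data struct holding a function pointer. It is written against the function pointer syntax that Zig later adopted (`*const fn () T`, assumed here to mean Zig 0.10 or newer), not against the `fn() void {}` literal syntax discussed in this thread. `Object`, `greet` and `wave` are hypothetical names used only for illustration.

```zig
const std = @import("std");

// Hypothetical functions used only for this illustration.
fn greet() u8 {
    return 'a';
}

fn wave() u8 {
    return 'b';
}

// Hypothetical plain-data struct: the field needs a concrete binary
// representation, so it holds a function pointer rather than a function body.
const Object = struct {
    action: *const fn () u8,
};

test "function pointer stored in a struct field" {
    var object = Object{ .action = &greet };
    try std.testing.expectEqual(@as(u8, 'a'), object.action());

    // The field is ordinary runtime data, so it can be reassigned.
    object.action = &wave;
    try std.testing.expectEqual(@as(u8, 'b'), object.action());
}
```

Run with `zig test` on the file. The field type spells out the pointer explicitly, which is the distinction between a function and the pointer representing it that the comment argues has to be acknowledged somewhere.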
@SpexGuy, The rules for transferring function pointers from comptime to runtime are not clear to me, but the exact same thing somehow works for literal strings, so maybe we could just apply the same rules to functions as well? The ABI issue would also have to be handled somehow, but I'd first like to know whether the idea has merit at all. |
What @SpexGuy points out, I think, is that there is a difference between
A) the function definition, say as an abstract syntax tree, analogue: struct (i.e. type) definition
const Foo = struct {
B) the function as the operation it defines on the virtual machine. Two such operations are equivalent if they have the same semantics. analogue: tuple (i.e. type) definition
const Bar1 = struct{i32, u8[]};
C) a function symbol, aka entry point for a stream of instructions with a well defined ABI (calling convention) analogue: external struct instance definition.
external foo: const external struct{
He then points out that in a language like Zig which does computation at compile time you cannot quite sweep these differences under the carpet. It seems to me that there is a little leeway in the relation between fn(i32: n) int and *const fn(i32) i32 and in particular what the address operator & should return but here is my 2C:
-- fn(int : n) int is the type of a function definition with argument named n (a compile time concept). E.g.
const inc = fn (n : i32) i32 { return n +% 1};
which is equivalent to
const inc : fn (n : i32) i32 = fn (n : i32) i32 { return n +% 1};
-- *const fn(int) int is the type of a specific function as an operation on the abstract machine (also a compile time concept). E.g.
which is equivalent to
const inc1: *const fn(i32) i32 = &inc;
The compiler may or may not set inc1 == inc2 because the argument name does not change the semantics and because of the rules of (modular) arithmetic say m +% 2 -% 1 == m +% 1 (and in this case, optimisation is likely to find that out).
-- fnptr (i32)i32 (as defined by @SpexGuy ) (or fnentry(i32) i32 or perhaps something like
@Cinclude{ const zig_inc = external inc;
which is equivalent to
const zig_inc: fnptr (i32) i32 = inc;
Only fnptr(i32) i32 (or fnentry(i32) i32, or ...) can be stored in structs and passed to functions at runtime, and for fnptr's with different calling conventions the compiler must generate a little shim function. Two fnptr's are equal if and only if they refer to the same function entry point in an executable. |
This should be unnecessary with the function pointer changes in the self-hosted compiler. |
Since #1717 is dead now, should this still be open? |
Closing this as handled by the function pointer change in stage2. |
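For readers arriving later, a minimal sketch of how the body/pointer distinction looks after the function pointer change, assuming Zig 0.10 or newer. There a bare `fn (i32) i32` is a comptime-only function body type and `*const fn (i32) i32` is a runtime function pointer. The names `double`, `triple`, `applyBody` and `applyPtr` are hypothetical and only serve the illustration; this is not the `fnptr` syntax proposed in this issue.

```zig
const std = @import("std");

// Hypothetical functions used only for this illustration.
fn double(x: i32) i32 {
    return 2 * x;
}

fn triple(x: i32) i32 {
    return 3 * x;
}

// A function body type is comptime-only, so the parameter is marked comptime.
fn applyBody(comptime f: fn (i32) i32, x: i32) i32 {
    return f(x);
}

// A function pointer is an ordinary runtime value.
fn applyPtr(f: *const fn (i32) i32, x: i32) i32 {
    return f(x);
}

test "function bodies versus function pointers" {
    try std.testing.expectEqual(@as(i32, 10), applyBody(double, 5));

    // The pointer variable can be reassigned at runtime.
    var op: *const fn (i32) i32 = &double;
    try std.testing.expectEqual(@as(i32, 10), applyPtr(op, 5));
    op = &triple;
    try std.testing.expectEqual(@as(i32, 15), applyPtr(op, 5));
}
```

The `const` in `*const fn (i32) i32` qualifies the pointee, while the variable `op` itself stays reassignable.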
This proposal solves the problem introduced by #1717, that function pointers are ambiguous with function labels on the ABI boundary. The suggested solution in that issue is to make a distinction between function labels, a comptime-only type which represents an actual function; and function pointers, which may exist at runtime. That issue suggests representing this difference using an actual pointer type, but that has some strange properties. Normally, pointers to comptime-only types are themselves comptime-only types. But not in this case. Additionally, information related to function definitions (parameter names, for example) is currently stored in the type info for Decls, which is kind of a weird place. This proposal suggests an alternate form of this distinction to solve both of these problems.
1. Function Definition Types
Like struct literals, every function literal has a distinct type. The type info for this type includes information about
Function definition types are comptime only, and pointers to these values are also comptime only. Their names are assigned in exactly the same way that struct names are assigned. There is no type literal for a function definition type (in the same way that there are no type literals for frame types).
2. Function Pointer Types
Function pointer types carry only a subset of the information stored in a function definition. They leave out parameter names, function names, and file and line info. They are declared using the new keyword `fnptr`. Function definitions may coerce to compatible function pointer types. Unlike definitions, function pointers may be mutable at runtime if the parameter and return types are not generic. We could decide to disallow generic function pointer types, but this would force the use of `anytype` in many places, which could damage type safety.
The result of peer type resolution on multiple function definition types is the compatible function pointer type, if one exists, or a compile error otherwise.
Unlike definitions, function pointer types are actively deduplicated by the compiler. `fnptr() void` in two places in a file will generate two references to the same type.
A comptime-known function pointer value can be converted back to a definition with the new `@fnDef(comptime ptr: fnptr) fndef` builtin.
3. Extern Functions
A function literal may be created to reference an extern function by replacing the body with the `extern` keyword. A symbol name may also be specified.
The parameter passed to `extern` will be of type `std.builtin.ExternInfo`, which is defined as
4. Function Pointer Comparison
Comparing function pointers is done by the following rules:
- If the two pointers are derived from the same function literal, comparison must return true.
- If the program may observe a difference between calls to the two pointers (they have different side effects or return values), comparison must return false.
- Otherwise comparison may return true or false, but the compiler must be consistent about any given pair. (It may not decide that they are equal in one part of the code but different in another.)
- Function pointers may compare as equal if and only if their underlying binary representation is equal.
These rules allow the compiler to deduplicate functions which generate identical code, without generating an extra stub for one of them to ensure they compare as distinct. They allow this deduplication to happen after optimization (like dead global removal), as long as the functions are not compared at compile time.
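The first two comparison rules can be sketched with function pointers as they exist in current Zig, assuming Zig 0.10 or newer and its `*const fn` syntax rather than the proposed `fnptr`. `double`, `triple`, `Unary` and `samePointer` are hypothetical names. The third rule (functions with identical behaviour may or may not compare equal) is deliberately not asserted, since its result is left to the compiler.

```zig
const std = @import("std");

// Hypothetical functions used only for this illustration.
fn double(x: i32) i32 {
    return 2 * x;
}

fn triple(x: i32) i32 {
    return 3 * x;
}

const Unary = *const fn (i32) i32;

// The parameters are runtime values, so this compares the pointers themselves.
fn samePointer(a: Unary, b: Unary) bool {
    return a == b;
}

test "function pointer comparison" {
    // Derived from the same function: must compare equal.
    try std.testing.expect(samePointer(&double, &double));
    // Observably different behaviour: must compare unequal.
    try std.testing.expect(!samePointer(&double, &triple));
}
```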
5. A note on coercion
Coercing a function definition to a function pointer always requires writing out (or constructing) the compatible function pointer type. This could be solved with comptime code, but may be common enough that we want a more dedicated solution. Here are the four ways I can see:
std.meta.fnPtr(comptime fnDef: anytype) FnPtr(fnDef) { return fnDef; }
`@fnDef` above).
`var ptr: fnptr = some_fndef;` to infer the needed function pointer type.