Proposal: Use separate operators for function calls and calls to function pointers #6966
Comments
What would the status quo be if #1717 is implemented? Was it planned that I support the idea of requiring explicit indirection, but it would create some dissonance with *not a technical term |
What exactly is the benefit of this? The main difference between function pointers and labelled functions is non-functional (pardon the pun), i.e., it is in how much optimization and static analysis can be performed at compile time, not in how the function behaves at runtime. I'm not sure that visually signalling this difference is important enough to require special syntax.
|
|
Under the hood, a function label is nothing but a named, comptime-known function pointer. So saying that the type of a dereferenced function pointer is a label doesn't sound right to me. However, on second thought, dereferencing a function pointer is more akin to inlining a function than to accessing a callable object, which would make the |
Are you sure that function pointers are so different and so bad that they deserve special treatment? It's almost identical to fields-via-pointer.
I think it's planned for function pointers to have stack size metadata.
Fun fact: short-circuit operators sometimes inhibit the branch predictor
Calls to async function pointers are done via builtin, right? |
I don't see a way this explicitness would benefit me personally.
I think general-enough code benefits from the ability to uniformly call a function (comptime-known or not). Having to differentiate them means an increase in complexity. You would need to update every call site if you choose to switch from one to the other - unless you provide a comptime-known wrapper function... which is just an extra step to get back to status-quo. |
That's a good point, I overlooked that initially. Although the same might also be achieved without distinct call syntax:
t.member();
(t.field)();
In all other cases, the different call notations seem to be purely ornamental, in the sense that they don't convey any information to the compiler that it does not have already. Writing Are there any other cases apart from the member function vs pointer-in-field situation, where the difference between the two call syntaxes would be semantic? |
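For illustration, here is a minimal sketch of the member function vs pointer-in-field situation under status-quo call syntax. It assumes Zig 0.11 or later, where function pointer types are written as *const fn. The names Widget, id, on_click and handler are hypothetical and do not come from this thread.

```zig
const std = @import("std");

// Hypothetical type, for illustration only.
const Widget = struct {
    // Field holding a function pointer; the target may differ per instance.
    on_click: *const fn () u32,

    // Function declared in the struct's namespace.
    pub fn id(self: Widget) u32 {
        _ = self;
        return 1;
    }
};

fn handler() u32 {
    return 2;
}

test "member function and pointer-in-field share one call syntax" {
    const w = Widget{ .on_click = &handler };

    // Namespace lookup: equivalent to Widget.id(w).
    try std.testing.expectEqual(@as(u32, 1), w.id());

    // Field read, then an indirect call through the stored pointer.
    try std.testing.expectEqual(@as(u32, 2), w.on_click());

    // Parenthesizing the field access calls the same pointer.
    try std.testing.expectEqual(@as(u32, 2), (w.on_click)());
}
```

At the call site, w.id() and w.on_click() look the same even though one resolves through the type's namespace and the other through a field, which is the case the two notations above would tell apart.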
I support a different syntax for function pointer calls, however my concern is that we need a way of communicating stack growth, as stated in the recursion issue (too lazy to find a link rn). Seems to me that we would benefit from making this mandatory, but I don't really see how we would do that without resorting to builtins, which would be cumbersome. However, maybe the relative nicheness and inherent danger of this operation would warrant that? |
One major reason to do this is that it disambiguates between "read a member of the struct and invoke that pointer" and "look in the namespace of the struct's type for a function with this name, and pass the struct as an implicit first parameter".
I take your point here, that it's extra syntax that may be unnecessary. @dbandstra's argument is more convincing: "it would create some dissonance with But I feel like I need to point out that duck typing usually refers to the same syntax working on objects of multiple types. Not accepting this proposal would be duck typing: allowing
Zig is openly not done yet. Once it hits 1.0, backwards compatibility will be considered if any further changes need to be made. Until then it is strongly advised not to depend on language stability at all. Please don't use the current version of Zig in production, or in anything that isn't a hobby project. |
This ambiguity won't be eliminated like this -- there's still ambiguity between member access, and referencing a function and not calling it (for instance |
Why should the programmer have to worry about how the language works internally? Unless there are hidden footguns, I'm in favor of keeping the status quo since this proposal just creates more work for the programmer with no apparent benefits. Remember: we should optimize the language for the programmer, not the compiler. |
The benefit for the programmer is knowledge. It's a huge difference if I call a function (which is always the same) or I call a function pointer (which may change between invocations). It's not at all a problem for the compiler to differentiate between function pointer and function invocation, but it makes a difference for the programmer who reviews the code and sees it for the first time.
|
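To make the "always the same" vs "may change between invocations" distinction concrete, here is a small sketch assuming Zig 0.11 or later. The functions double and triple and the variable op are hypothetical names chosen for this example.

```zig
const std = @import("std");

fn double(x: u32) u32 {
    return x * 2;
}

fn triple(x: u32) u32 {
    return x * 3;
}

test "direct call vs call through a mutable pointer" {
    // Direct call: the target is fixed at compile time.
    try std.testing.expectEqual(@as(u32, 6), double(3));

    // Indirect call: the target is whatever the pointer holds right now.
    var op: *const fn (u32) u32 = &double;
    try std.testing.expectEqual(@as(u32, 6), op(3));

    // Same call expression, different target after reassignment.
    op = &triple;
    try std.testing.expectEqual(@as(u32, 9), op(3));
}
```

Under status-quo syntax, double(3) and op(3) read identically. A reviewer has to look at the declaration of op to learn that the target can change.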
I think it shouldn't be this easy -- function pointers are not as "nice" as functions, their invocations should reflect that. Adding a different syntax does nothing if that syntax leaves just as much to the imagination. I don't think there should be any special syntax for function pointer invocation at all -- I think there should be a builtin, |
Doesn't this apply to all mutable variables? I still fail to see what makes functions in particular so different. We don't require
If you are doing something really, ahem, interesting, like hot-plugging new functionality into a global jump table while the interpreter is still running... then it's your own business to know about the dangers of function pointers. But under normal circumstances, I can't actually think of any substantial footguns involving FPs that could arise purely from inattention and could therefore be remedied by a more eye-catching syntax. I'd be interested in seeing some examples. |
Neutral on the choice, but wanted to provide some language design context from other communities: There are other languages that do "the equivalent of this", in particular, Elixir and Ruby: https://hashrocket.com/blog/posts/elixir-functions-ruby-lambdas. In those languages, the choice is more syntactical (both languages support calling a function without parentheses, so the syntax is strictly necessary). It does occasionally cause issues in forums when folks coming from Python are not sure why their lambdas aren't working. I imagine something similar could happen for people coming from C, but this can easily be fixed with a helpful compiler error message that reminds people how function pointers are called in Zig. This will be really easy since in Zig there is no variable shadowing (IIRC) and so identifier resolution should be dead easy. |
Other than the CPU's branch predictor getting trashed by this (at least potentially), are there any impacts to how a compiler would treat a function called through a mutable pointer? I think most optimizations like leaf functions still apply. Most compilers and CPUs do pretty well with function pointers (thanks to decades of C++). Obviously if you have a function pointer that you change all the time, then your branch predictor is going to give up and you'll take a large performance hit. But if you use this as something like vtable entries (relatively static), then the predictor should do fairly well and then your cost will be low. Perhaps even as low as a direct branch in terms of cycles. I think given how |
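As a rough sketch of the "relatively static" vtable-style usage mentioned above, assuming Zig 0.11 or later: the table below is filled in once and never modified, so each entry always points at the same function. Op, ops, add, mul and run are hypothetical names. How well a branch predictor handles such calls depends on the hardware, and the sketch does not measure that.

```zig
const std = @import("std");

// Hypothetical table entry: a label plus a function pointer.
const Op = struct {
    name: []const u8,
    apply: *const fn (u32, u32) u32,
};

fn add(a: u32, b: u32) u32 {
    return a + b;
}

fn mul(a: u32, b: u32) u32 {
    return a * b;
}

// Built at compile time and never written to afterwards.
const ops = [_]Op{
    .{ .name = "add", .apply = &add },
    .{ .name = "mul", .apply = &mul },
};

// The entry is chosen at runtime, so the call is indirect.
fn run(index: usize, a: u32, b: u32) u32 {
    return ops[index].apply(a, b);
}

test "dispatch through a static table of function pointers" {
    try std.testing.expectEqual(@as(u32, 7), run(0, 3, 4));
    try std.testing.expectEqual(@as(u32, 12), run(1, 3, 4));
}
```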
Given that we know statically whether a given pointer is to a function, I think we are actually perfectly safe with no new syntax -- |
I think I have lost the plot a bit here. The original idea from @SpexGuy was to solve the following:
(I numbered them instead of leaving the original bullet points so that I can reference them directly.) Let's take these one at a time.
That all said, how does a different syntax help here? All this does is make it a little easier for the programmer to know that the compiler is not going to be able to determine the stack size. There is @EleanorNB's proposal about a much more explicit call syntax. That might work if you can get that information about a function in a dynamically loaded DLL at runtime. Of all the different bikeshedding here, @EleanorNB's is the only way I can see to provide that stack information, and it would need to be either a fixed large number or a runtime-known number. You could flip this around and annotate the function pointer with this information and any function that was assigned to the function pointer would need to match that annotation.
IMHO, this is a weak reason. LTO seems like the only optimization pass that will not work as well and that is going to be true in the runtime load case no matter what.
I think that issues 1 and 5 identify the most urgent issues with function pointers. However, I am not sure how a different calling syntax helps with those issues. Of all the proposals, I think only @EleanorNB's really would provide the information needed to solve these cases, but it seems like a lot of ceremony for the very common case of using function pointers in a v-table. In that case you have a very limited number of possibilities. You can flip this around and add annotations to the pointer declaration instead of the call site. At least to me, and perhaps I am missing something important, it all boils down to this:
Is the need to know whether a function is accessed through a pointer sufficient to make this a place where Zig is more explicit than ordinary fields and variables? Doing so makes the language more explicit but also makes it more complex. |
I agree, this proposed feature does not solve the problems that it set out to. Additionally, as mentioned, having a distinction between |
Calls to function pointers have dramatically different characteristics than calls to comptime-known functions. To name a few:
async
In #1717, it is planned to make an explicit difference in the type system between function labels (comptime-only types which are always statically known) and function pointers (which may be runtime known). Because of the above differences, I propose that we also use separate operators to call functions and function pointers. Specifically, use () for functions and .() for pointers.
This also helps to disambiguate calls to member functions in types from calls to function pointers in fields.