-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
proposal: spec: redefine range loop variables in each iteration #20733
Comments
We considered this prior to Go 1, but considered it too quickly. We'll definitely reconsider it for Go 2. I actually think there's a duplicate tracking bug for this. |
I think this would be an important improvement to a situation that catches
every go developer at least once in their efforts to learn Go.
…On Tue, 20 Jun 2017, 06:10 Brad Fitzpatrick ***@***.***> wrote:
We considered this prior to Go 1, but considered it too quickly. We'll
definitely reconsider it for Go 2. I actually think there's a duplicate
tracking bug for this.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#20733 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AAAcA80GqcY1F4zKSo-optHwPwFOb1Jpks5sFtXGgaJpZM4N-uiT>
.
|
And I would argue even catches seasoned Go programmers occasionally. See https://golang.org/cl/40937. It took me a while staring at that logic trying to figure out why the code wasn't doing what it naturally read as. |
Adopting this proposal can make performance of perfectly valid code worse, sometimes dramatically. For example, when escape analysis cannot determine it's safe to keep the {k,v} value on stack while it's address is taken. Then we get O(n) heap allocations instead of the current O(1). |
@cznic can you give some examples of this. I would expect that most of these cases would actually be bugs because taking the address of the same variable over and over again was the problem in the first place. |
@davecheney Consider #20725 (comment) and imagine escape analysis is not able to prove foo only reads via the passed pointer. |
Why would you want to take the address of a copy of the current range value? If i'm not mistaken, any modification to it would be lost. |
Taking the address of a value does not imply intent to modify the pointee. |
But the value is already copied as part of the range iteration. Having a separate value escape on every iteration is sub optimal, but hopefully you'll agree it's reasonable to trade a potential performance problem for a very common correctness problem.
True, that is asking a lot of escape analysis and I really don't know the ways that escape analysis and the implicit taking of address interact. But I will note that your example explicitly took the address of the copy of the current index value, so if the intention was to pass a pointer to the value down to a function, you'd want to make sure you were passing a pointer to the original slice element, not a copy. In the code that I have read that is more commonly written as
Respectfully, generally it does, as in your method example above. |
Iff vals is a slice/array. There are also maps and channels.
Even the innocent looking Edit: Added '(if not already a pointer)'. |
If it's an array, it's already copied during the evaluation of the range statement. If it's a channel, it's a copy of the value received. Again i'm falling back on arguments of correctness over performance.
I believe a copy of |
Right. But with this proposal there will be, in certain situations, an additional allocation per iteration.
You're right. The interface runtime struct has the address, but that's not different with this proposal. I misspoke, sorry. |
Sure, but if it actually shows up on a profile there should be an easy workaround: var bigValue someType
for _, v := range bigValues {
bigValue = v // Compiler should be smart enough to elide this copy.
foo(&bigValue) // Only one copy of bigValue escapes.
} It is possible to write correct, efficient code with either definition of |
True, but still the proposal has the potential to cripple existing correct and efficient code. Also, your comment does not consider |
For this proposal (and Go 2 proposals in general), if the proposal is accepted we should either compile existing Go 1 programs as-is or provide a tool that converts them preserving existing semantics. That is: no "existing" Go 1 code should be affected either way.
Escape analysis of pointer methods is mostly straightforward; you only really need to worry about arguments to interface methods. And note that iterating over a container of n values is already O(n), so doing O(n) memory-management work changes only the constant factors (not the asymptotic behavior). |
I don't think this can be achieved without solving the halting problem. But let's assume it can be done. Even then we would have to put the particular compiler optimization that you've mentioned earlier, directly in the specification of the language. That can be done, no doubt. Isn't it better to avoid such leaks of abstractions? I think so. |
Why? Generally, the only time an optimization needs to go into a language spec is if it affects the asymptotic behavior of the program, the canonical example being tail-call-elimination in functional languages (which changes memory usage from O(1) to O(N) with the depth of recursion). As I noted above, per-iteration memory-management overhead affects only constant factors: running time for loops is already O(N) with the number of iterations, and peak memory usage is O(1) either way (because the value can be garbage-collected at each iteration). |
If by "cripple" you mean introduce bugs, this will only happen in code that depends on a variable having been overwritten after its capture. I find it quite hard to imaging cases where this would be a requirement for correctness (would appreciate examples). But even if it is, it's easy to make a tool (see #20725; original formulation) that would flag all relevant code for scrutinization as part of a migration process. As to the performance argument, I believe that if a good escape analysis can't figure out where a pointer is going, then neither can programmers (or it does in fact escape the loop). Thus, if such code is correct, it's probably by accident and not for performance reasons. |
I thought maybe something like the below just with := can be constructed where this happens too:
Edit: on closer look i guess currently := allows only simple variable names so this is not an issue. |
As this does not require changes to the for-range syntax, it is still valid Go 1 code. |
Ran across this interesting example today (https://play.golang.org/p/DUstLropfJ): …
func (k *key) f(v string) {
fmt.Printf("%#v.f(%#v)\n", k, v)
}
func main() {
…
for k, v := range m {
defer k.f(v)
}
} The interaction between loop variables and implicit pointer receivers makes it possible to accidentally close over a loop variable without taking its address or writing a function literal. |
For posterity, today someone posted on golang-nuts about stumbling into this with (*testing.T).Parallel: With regards to the performance-concerns: I'd expect the difference, in general, to be optimized out. Even if there are still edge-cases: Given the subtlety and prevalence of this problem, I'd strongly favor correctness over performance. If your code is too slow, you can always benchmark and hand-optimize. If your code is subtly incorrect, that's a lot harder to remedy. |
On 08/05/18 06:13 PM, Bryan C. Mills wrote:
From previous experience with managed change (which dates back to Multics and dinosaurs roaming the earth (:-)) one of the largest problems of trying to maintain compatibility is that it can lead to persons depending on a bug, and thereby preventing it from ever being fixed. I would argue that when this kind of deranged dependency occurs, that in order one
In the special case of linked-library changes, the caller may have lost the source code they need to change, so we instead made the last step
Go's case is not as hard: if you can do steps 1-4, then I strongly recommend you do so. --dave |
Change https://golang.org/cl/164119 mentions this issue: |
Fixes #30429 Updates #16520 Updates #20733 Change-Id: Iae41f06c09aaaed500936f5496d90cefbe8293e4 Reviewed-on: https://go-review.googlesource.com/c/164119 Run-TryBot: Bryan C. Mills <bcmills@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> Reviewed-by: Ian Lance Taylor <iant@golang.org>
This comment has been minimized.
This comment has been minimized.
I'm very supportive of this change. For code that really needs to be optimised then the allocation can be avoided by declaring the variables outside the loop like this:
|
Last year there was high profile example of this class of bug: |
I think this can be done now by keeping it behind a go.mod version flag. When you run for i := 0; i < n; i++ {
use(&i)
} to this as part of the upgrade: {
i := 0
for ; i < n; i++ {
use(&i)
}
} |
Hi @carlmjohnson, FWIW, the Go language changes design document uses this issue here as an example:
That document also describes using the So I suspect it is at least possible to remove this as a feature. One minor comment is I suspect |
FYI, there is a discussion covering how to redefine "for" loop variable semantics on #56010. |
This commit fixes a bug that was recently introduced when registering the costfuzz and unoptimized-query-oracle tests. This bug was due to the well-known range loop variable issue described here: golang/go#20733. This commit also adds some additional logging and debug information to help with future issues with these test cases. Fixes cockroachdb#90010 Release note: None
90221: roachtest: fix costfuzz and unoptimized-query-oracle setups r=rytaft a=rytaft This commit fixes a bug that was recently introduced when registering the costfuzz and unoptimized-query-oracle tests. This bug was due to the well-known range loop variable issue described here: golang/go#20733. This commit also adds some additional logging and debug information to help with future issues with these test cases. Fixes #90010 Release note: None Co-authored-by: Rebecca Taft <becca@cockroachlabs.com>
Closing as duplicate of #60078. |
As an alternative to #20725 (which was originally about go vet), let the variables of range loops be implicitly redefined in each iteration like in Dart's
for
loops. That is,should be equivalent to
This will make it "safe" to take the loop variable's addresses as well as capturing the loop variables in nested functions (see #16520).
The proposal could be expanded to vanilla
for
loops, although that would make it diverge semantically from other languages.The text was updated successfully, but these errors were encountered: