-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Disentangle arrays in iteration #15434
Conversation
for i in eachindex(a) | ||
nb += write(s, a[i]) | ||
for b in a | ||
nb += write(s, b) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be a little more readable to rename the input A::AbstractArray
then do for a in A
? Having nb
and b
nearby is a little offputting (nb
could be a few characters longer of course). Similar above with the for y in x
comprehensions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was tempted to do that, but decided to try to blend in with the surrounding code. I'll reverse that decision and change it.
This also replaces several instances of `for i in eachindex(A); a = A[i]` with `for a in A`. The latter is presumably easier for automatic bounds-checking removal.
fb359e1
to
3820cf3
Compare
For those who arrived here looking to help, here are some guidelines I'm using:
There's one important sticking point: comprehensions. See #10523. Currently Finally, it's a very minor issue, but I'm torn about the best choice in some situations. Should it be like this: for (idest, iA, iB) in zip(eachindex(dest), eachindex(A), eachindex(B))
dest[idest] = f(A[iA], B[iB])
end or for (idest, a, b) in zip(eachindex(dest), A, B)
dest[idest] = f(a, b)
end The second is more readable, but I'm tending to do the first because of the issues discussed in #15356. |
Looks like force-pushing nixed the still-pending benchmark report. Hopefully it will still appear at https://github.com/JuliaCI/BaseBenchmarkReports? |
I suggest you make something like 5-10 groups of sources files (e.g. in lexical order), so that each person willing to help can reserve a group without stepping on others' toes. I'm OK with doing some of this work. |
Should people open PRs against this branch or just make a new PR against master? |
@timholy Fortran-style array indexing: See https://github.com/eschnett/FlexibleArrays.jl. From the examples: # A 3d array with lower index bounds 0
typealias Arr3d_lb0 FlexArray(0, 0, 0)
a3 = Arr3d_lb0{Float64}(9, 9, 9) Here the lower index is part of the type, i.e. has no run-time overhead. You can also write: # A generic array, all bounds determined at creation time
typealias Arr4d_generic FlexArray(:, :, :, :)
a4 = Arr4d_generic{Float64}(1:10, 0:10, -1:10, 15:15) Fixed and open indices can also be mixed. Suggestions for improved syntax are always welcome. |
Your benchmark job has completed, but something went wrong when trying to upload the result data. cc @jrevels |
Remarkable. No cases that seem like real regressions. |
I would make separate PRs---they don't need to intersect at all. This can be the "organizer" and just hold the For comprehensions, maybe best for now would be to mark any that need fixing with Blocks of tasks (first check box is "claimed", second checkbox is "done"):
|
@@ -39,7 +39,7 @@ function _primesmask(lo::Int, hi::Int) | |||
sieve = ones(Bool, whi - wlo) | |||
hi < 49 && return sieve | |||
small_sieve = _primesmask(isqrt(hi)) | |||
@inbounds for i in eachindex(small_sieve) | |||
@inbounds for i = 1:length(small_sieve) # don't use eachindex here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It calls a function, wheel_prime(i)
, on i
. This means that you need to insist on a particular meaning for i
, in which case it's better to be explicit about what i
needs to be.
@timholy recommend an easy one I could try? (noob contributor, but I think some of this seems easy enough) |
I can't really predict without looking, but as a random guess how about Thanks for volunteering! |
Maybe we should announce what file we are going to work on. I call dibs Let me see if I understand. In function write(s::IO, a::AbstractArray)
nb = 0
for i in eachindex(a)
nb += write(s, a[i])
end
return nb
end The for loop can be trivially rewritten as: for i in a
nb += write(s, i)
end Is this the sort of thing we are trying to do? Assume I don't know anything. What do I do now?
The idea then is that I'd do the benchmarks before and after my change and report the result here? How do I submit a patch? |
@dcarrera This has some good general information: https://github.com/JuliaLang/julia/blob/master/CONTRIBUTING.md
|
I volunteer, too. I start working on the first block, |
I'll put the "claimed" checkmarks next to those two. Thanks! @dcarrera, any loop using |
WIP: Disentangle arrays in iteration
OK, this is merged to reduce the likelihood of conflicts. I encourage folks to check out #15459: new syntax |
I've been always interested in REPL-related stuff. I'll take |
Great! |
A question how far we should push this. For example in this function: function ipermutedims(A::AbstractArray,perm)
iperm = Array(Int,length(perm))
for i = 1:length(perm)
iperm[perm[i]] = i
end
return permutedims(A,iperm)
end Here, we obviously assume that
function ipermutedims(A::AbstractArray,perm)
iperm = Array(Int,length(perm))
for (i,j) = zip(1:length(perm),eachindex(perm))
iperm[perm[j]] = i
end
return permutedims(A,iperm)
end |
Sorry for asking again, here is another quite common pattern: What should we do about Iterations with a given offset. For example: function findnext(A, start::Integer)
for i = start:length(A)
if A[i] != 0
return i
end
end
return 0
end Is there a version of |
I think this is already "done". Very few loops, but
for i in 1:it.n
if done(it.xs, xs_state)
break
end
_, xs_state = next(it.xs, xs_state)
end where
for i in 1:numImages-1
name = bytestring(ccall(:_dyld_get_image_name, Cstring, (UInt32,), i))
push!(dynamic_libraries, name)
end Didn't even need a PR, lovely! |
@dfdx Thanks for the tips! I got |
Ok. I made a fairly trivial set of changes to Does this look ok? |
I think I can take everything from |
I've got an interesting one here. Here's an implementation of
So we iterate first over columns and then over rows of
Which seems to be both - more readable and efficient. Is there anything in Base we can use instead of |
@dfdx I also cam across on of those, so far I just marked it with Another case I came across was reverse iteration, so what to do about: for i=length(A):-1:1 Maybe this is something else to consider for the new |
You can do this: for I in CartesianRange(size(C)) However, the more serious issue is that for (i, x) in enumerate(xsd)
for (j, y) in enumerate(ysd)
C[i,j] /= x*y
end
end The only problem with that is what if |
@meggart, yeah, mark reversed ones too. This is proving to be a good exercise in finding out what we need to consider in generalizing our iteration! |
#15564 covering Meanwhile, I notices that |
I intend to do RNG work on Julia in the summer, if someone from GSoC won't work on it instead. Probably a good idea would be to get familiar with the code for those, and I'd like the two sections that include random.jl and dSFMT.jl, if they have not been claimed yet. I may take some others later, but for now I'd like to reserve those two tickboxes to get done over the next few days. |
I would also start another block, so I'd reserve ascii.jl ... collections.jl. |
I have some more spare time and can take points from |
Sounds great, thanks! |
I see that |
It's strictly internal, so just use an integer. |
Wow, this one isn't finished yet. Let's fix it. I see that there are several items that have been processed already, but aren't marked as done. I also scanned through a couple of other items, so now we can mark everything below the line I think I will be able to cover the rest within a week or so and submit one final PR. |
Done, and thanks in advance! Please do mention this issue in your PR, as back-references will be useful to me if I find there are future changes needed (e.g., https://github.com/timholy/ArrayIteration.jl). |
Good you revived this thread, I will also submit my next PR for the second block soon. |
EDIT: my bad, I forgot about #10523, so I will just leave a comment in the code. Below is my old comment. While working on
With this line changed tests in
I know that it is either |
I've just posted a PR (see above) that covers all previously unclaimed lines. Surprisingly, there were only 4 files that needed changes, so I'd say @timholy Also note that lines from |
It's great that we're looking so good. |
This PR is designed to allow us to expand our collection of array iterators (most proximally, to solve serious performance problems with
ReshapedArrays
).As a motivating example, suppose
copy!
is implemented like this:Let's suppose that
src
anddest
don't have an efficient iterator in common; in that case, the loop will be slow. This PR replaces such implementations withleveraging recent improvements (and planned additional improvements) in the performance of
zip
.This first commit addresses every use of
eachindex
in Base. I still have the classicfor i = 1:n
cases to go. (Where's that :janitor: emoticon when you need it?) I'm putting this out now in case there are reservations. It's also an almost guarantee that some workloads will see performance regressions. I'd rather find out now how bad they are, so:runbenchmarks(ALL, vs = "JuliaLang/julia:master")
.