Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Range indexing: error with scalar bool index like all other arrays #31829

Merged
merged 25 commits into from
Mar 8, 2021

Conversation

bkamins
Copy link
Member

@bkamins bkamins commented Apr 25, 2019

This is an initial proposal fixing #31726. It is a bit messy as @mbauman indicated there.

If we are OK with the proposed approach I will write tests for this PR.

@bkamins
Copy link
Member Author

bkamins commented Oct 24, 2020

Is there a green light to make this fix (if yes I will rebase and add tests to finalize the PR).

@bkamins
Copy link
Member Author

bkamins commented Jan 6, 2021

I can go back and update this PR if there is a green light to implement changes sanitizing Bool indexing for Julia 2.0.

base/range.jl Outdated Show resolved Hide resolved
@mbauman mbauman changed the title An initial proposal to check Bool case Range indexing: error with scalar bool index like all other arrays Jan 6, 2021
@mbauman
Copy link
Member

mbauman commented Jan 6, 2021

The goal here seems good to me. It initially felt like we should be able to do better in the implementation by punting to to_indices... but unlike other array types we don't want to convert everything to Int. Thus I think the approach here makes sense.

Co-authored-by: Matt Bauman <mbauman@gmail.com>
@mbauman mbauman added needs news A NEWS entry is required for this change needs tests Unit tests are required for this change labels Jan 6, 2021
@bkamins
Copy link
Member Author

bkamins commented Jan 6, 2021

OK - I will rebase then and finalize the PR

base/range.jl Outdated Show resolved Hide resolved
base/range.jl Outdated Show resolved Hide resolved
base/range.jl Outdated Show resolved Hide resolved
test/ranges.jl Outdated Show resolved Hide resolved
mbauman
mbauman previously requested changes Jan 7, 2021
Copy link
Member

@mbauman mbauman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still some ambiguities hiding here, too — that'll be the biggest painpoint for custom packages that subtype AbstractRange, but fortunately the ambiguities should themselves be errors (it's just not quite the error we want).

test/ranges.jl Outdated Show resolved Hide resolved
test/ranges.jl Outdated Show resolved Hide resolved
test/ranges.jl Outdated Show resolved Hide resolved
test/ranges.jl Outdated Show resolved Hide resolved
test/ranges.jl Outdated Show resolved Hide resolved
test/ranges.jl Outdated Show resolved Hide resolved
NEWS.md Outdated Show resolved Hide resolved
Co-authored-by: Matt Bauman <mbauman@gmail.com>
NEWS.md Outdated Show resolved Hide resolved
@bkamins
Copy link
Member Author

bkamins commented Jan 7, 2021

Still some ambiguities hiding here

Right, I am too much used to DataFrames.jl development recently where we are mostly in full control. Maybe then it is better not to add any new methods but use if/else in the existing methods with the condition using element type like this:

function getindex(v::UnitRange{T}, i::Integer) where T
    @_inline_meta
    if i isa Bool
        throw(ArgumentError("invalid index: $i of type Bool"))
    else
        val = convert(T, v.start + (i - 1))
        @boundscheck _in_unit_range(v, val, i) || throw_boundserror(v, i)
        return val
    end
end

this if/else should be (hopefully) optimized out by the compiler and will not introduce new ambiguities.

NEWS.md Outdated Show resolved Hide resolved
@mbauman mbauman removed needs news A NEWS entry is required for this change needs tests Unit tests are required for this change labels Jan 7, 2021
@bkamins
Copy link
Member Author

bkamins commented Jan 11, 2021

@mbauman - could you please comment if you think that doing checking within the original method as I proposed above would be preferred (it for sure will reduce the risk of ambiguities). I will update the PR following your recommendation and make the tests pass. Thank you!

@bkamins
Copy link
Member Author

bkamins commented Jan 17, 2021

bump (as otherwise probably everyone will forget about the details of the design and it is non-obvious in some places)

@bkamins
Copy link
Member Author

bkamins commented Jan 23, 2021

Bump again - is it OK to merge? Sorry for pushing - I know this PR has a non-obvious logic in several places, but I have a long list of PRs to JuliaLang/julia that are unfinished because I did not push for a decision on them.

@nalimilan

@bkamins
Copy link
Member Author

bkamins commented Jan 30, 2021

The error in https://build.julialang.org/#/builders/1/builds/22/steps/5/logs/stdio seems unrelated, but maybe I do not see something important that causes this error in this PR.

@mbauman
Copy link
Member

mbauman commented Feb 23, 2021

I keep forgetting this, sorry. Let's re-run CI since it looks a little stale and then get it in.

@mbauman mbauman closed this Feb 23, 2021
@mbauman mbauman reopened this Feb 23, 2021
@mbauman mbauman added the merge me PR is reviewed. Merge when all tests are passing label Feb 23, 2021
@bkamins
Copy link
Member Author

bkamins commented Feb 23, 2021

The CI failure on package_win32 seems unrelated (but I am not super experienced with reading them as I have not contributed to Julia Base for a while).

@bkamins
Copy link
Member Author

bkamins commented Feb 28, 2021

bump

1 similar comment
@bkamins
Copy link
Member Author

bkamins commented Mar 7, 2021

bump

@nalimilan
Copy link
Member

CI failure is #39667.

@nalimilan nalimilan merged commit 6fb3558 into JuliaLang:master Mar 8, 2021
@bkamins bkamins deleted the fix_bool_getindex branch March 8, 2021 09:26
@bkamins
Copy link
Member Author

bkamins commented Mar 8, 2021

As reported in JuliaData/DataFrames.jl#2648 this change catches the decision to be made with repeat:

julia> x = [1,2,3]
3-element Vector{Int64}:
 1
 2
 3

julia> repeat(x, inner=1,outer=1)
3-element Vector{Int64}:
 1
 2
 3

julia> repeat(x, inner=1,outer=false)
Int64[]

julia> repeat(x, inner=false,outer=1)
ERROR: ArgumentError: invalid index: false of type Bool

julia> repeat(1:3, true)
3-element Vector{Int64}:
 1
 2
 3

julia> repeat(1:3, false)
Int64[]

In DataFrames.jl I decided to recommend disallowing passing Bool to repeat (@nalimilan might want to change this recommendation though). In Julia Base I am not sure what is the intention. Should Bool should be allowed as count (then a PR is needed to fix the broken inner argument) or it should be disallowed (then it would be a mildly breaking change).

@bkamins
Copy link
Member Author

bkamins commented Mar 8, 2021

@timholy Do you agree that this should error:

julia> CartesianIndices((true,))
ERROR: ArgumentError: invalid index: true of type Bool

(as it does now - I am pinging you as you have designed CartesianIndices if I understand it correctly)

bkamins added a commit to bkamins/julia that referenced this pull request Mar 9, 2021
Follow up to the problem discussed in JuliaLang#31829 (comment).

I came to the conclusion that `CartesianIndices((true,))` should be allowed as in this context `true` represents a dimension length not an index.
if first(s)
return r
else
return range(r[1], length=0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return range(r[1], length=0)
return range(first(r), length=0)

In case someone makes an OffsetUnitRange?

return range(first(r), step=one(eltype(r)), length=0)
end
else # length(s) == 2
return range(r[2], step=one(eltype(r)), length=1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return range(r[2], step=one(eltype(r)), length=1)
step = one(eltype(r))
return range(first(r) + step; step, length=1)

vtjnash pushed a commit that referenced this pull request Mar 12, 2021
Follow up to the problem discussed in #31829 (comment).

I came to the conclusion that `CartesianIndices((true,))` should be allowed as in this context `true` represents a dimension length not an index.
@simeonschaub simeonschaub removed the merge me PR is reviewed. Merge when all tests are passing label Apr 3, 2021
ElOceanografo pushed a commit to ElOceanografo/julia that referenced this pull request May 4, 2021
Follow up to the problem discussed in JuliaLang#31829 (comment).

I came to the conclusion that `CartesianIndices((true,))` should be allowed as in this context `true` represents a dimension length not an index.
antoine-levitt pushed a commit to antoine-levitt/julia that referenced this pull request May 9, 2021
Follow up to the problem discussed in JuliaLang#31829 (comment).

I came to the conclusion that `CartesianIndices((true,))` should be allowed as in this context `true` represents a dimension length not an index.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants