-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add logical indexing in deleteat!() #20465
Conversation
@@ -792,6 +795,12 @@ julia> deleteat!([6, 5, 4, 3, 2, 1], 1:2:5) | |||
3 | |||
1 | |||
|
|||
julia> deleteat!([6, 5, 4, 3, 2, 1], [true, false, true, false, true, false]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
base/array.jl
Outdated
@@ -827,6 +836,18 @@ function deleteat!(a::Vector, inds) | |||
return a | |||
end | |||
|
|||
function deleteat!(a::Vector, inds::AbstractVector{Bool}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what all is defined for deleteat!
, but it may be possible to make this signature way more general and simpler with to_indices
. It could be:
_deleteat!(a::Vector, inds) = # existing method above
deleteat!(a::Vector, inds) = _deleteat!(a, to_indices(a, (inds,))[1]) # to_indices must return either Integers or AbstractArray of integers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds interesting, but I discovered several issues when trying this:
-
The fact that
to_indices
does not support arbitrary iterables would introduce a regression here: for example,deleteat!([1,2,3], (i for i in 2:3))
is currently supported. Or maybe support for this could be added toto_indices
? Else, we could callto_indices
only forAbstractVector
indices, and keep the existing general definition in other cases. -
The above
AbstractVector{Bool}
method is very simple and efficient, and theLogicalIndex
type returned byto_indices
makes it actually more complex. This is a case where integer indices are less useful than boolean indices. Again, one solution is to keep the specializedAbstractVector{Bool}
in parallel with the more general framework.
I've updated the PR to use to_indices
, but keeping both the most general (iterators) and the specialized AbstractArray{Bool}
signature. Tell me what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, shucks, right of course. That's unfortunate. Yup, knowing that the logical index is sorted is extremely valuable information. And maybe someday we'll extend indexing support for all iterables. I'm not sure that to_indices
will gain us much when it's restricted to only affecting arrays in the first place… but we may as well allow this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. Should be good for final review then.
base/array.jl
Outdated
function deleteat!(a::Vector, inds::AbstractVector{Bool}) | ||
n = length(a) | ||
length(inds) == n || throw(BoundsError(a, inds)) | ||
@boundscheck length(inds) == n || throw(BoundsError(a, inds)) | ||
p = 1 | ||
for (q, i) in enumerate(inds) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an interesting case to discuss the @inbounds
policy (#20469): am I allowed to move the @inbounds
before the for
here? That would allow getting rid of one bounds check per iteration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@boundscheck
won't have an effect here since deleteat!
is surely too complicated to inline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, right. So better remove it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've just added a commit to remove it.
More comments? |
More consistent with what getindex() supports. This can allow avoiding an allocation when the caller already has a boolean vector of logical indices.
fe91e5d
to
2348dc0
Compare
More consistent with what getindex() supports. This can allow
avoiding an allocation when the caller already has a boolean vector of
logical indices.
Turned out it could be useful for dropping null values in NullableArrays without calling
find
and allocating a new array of indices (JuliaStats/NullableArrays.jl#179), so I figured it would make sense to have this in general.Unfortunately, we cannot define a separate method for boolean iterators, only for
AbstractVector{Bool}
. The only way to support any boolean iterator is to add aneltype(a) <: Bool
branch to the most generic method. Which approach is better?Also, it would be great to allow passing a predicate just like e.g. to
find
, but since there's no specific predicate type, the method would be ambiguous. Should we add a separate function just like the Find & Search Julep has several functions forfind*
variants?I realize one can already write
deleteat!(a, (i for i in eachindex(a) if pred(i)))
withpred
an arbitrary predicate, so maybe that's not needed. OTOH, this form is harder to discover, and does more computations in each iteration than the simple implementation checking onlypred
for each entry.