Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add logical indexing in deleteat!() #20465

Merged
merged 3 commits into from
Feb 11, 2017
Merged

Add logical indexing in deleteat!() #20465

merged 3 commits into from
Feb 11, 2017

Conversation

nalimilan
Copy link
Member

More consistent with what getindex() supports. This can allow
avoiding an allocation when the caller already has a boolean vector of
logical indices.


Turned out it could be useful for dropping null values in NullableArrays without calling find and allocating a new array of indices (JuliaStats/NullableArrays.jl#179), so I figured it would make sense to have this in general.

Unfortunately, we cannot define a separate method for boolean iterators, only for AbstractVector{Bool}. The only way to support any boolean iterator is to add an eltype(a) <: Bool branch to the most generic method. Which approach is better?

Also, it would be great to allow passing a predicate just like e.g. to find, but since there's no specific predicate type, the method would be ambiguous. Should we add a separate function just like the Find & Search Julep has several functions for find* variants?

I realize one can already write deleteat!(a, (i for i in eachindex(a) if pred(i))) with pred an arbitrary predicate, so maybe that's not needed. OTOH, this form is harder to discover, and does more computations in each iteration than the simple implementation checking only pred for each entry.

@kshyatt kshyatt added the collections Data structures holding multiple items, e.g. sets label Feb 5, 2017
@@ -792,6 +795,12 @@ julia> deleteat!([6, 5, 4, 3, 2, 1], 1:2:5)
3
1

julia> deleteat!([6, 5, 4, 3, 2, 1], [true, false, true, false, true, false])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

base/array.jl Outdated
@@ -827,6 +836,18 @@ function deleteat!(a::Vector, inds)
return a
end

function deleteat!(a::Vector, inds::AbstractVector{Bool})
Copy link
Member

@mbauman mbauman Feb 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what all is defined for deleteat!, but it may be possible to make this signature way more general and simpler with to_indices. It could be:

_deleteat!(a::Vector, inds) = # existing method above
deleteat!(a::Vector, inds) = _deleteat!(a, to_indices(a, (inds,))[1]) # to_indices must return either Integers or AbstractArray of integers

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds interesting, but I discovered several issues when trying this:

  • The fact that to_indices does not support arbitrary iterables would introduce a regression here: for example, deleteat!([1,2,3], (i for i in 2:3)) is currently supported. Or maybe support for this could be added to to_indices? Else, we could call to_indices only for AbstractVector indices, and keep the existing general definition in other cases.

  • The above AbstractVector{Bool} method is very simple and efficient, and the LogicalIndex type returned by to_indices makes it actually more complex. This is a case where integer indices are less useful than boolean indices. Again, one solution is to keep the specialized AbstractVector{Bool} in parallel with the more general framework.

I've updated the PR to use to_indices, but keeping both the most general (iterators) and the specialized AbstractArray{Bool} signature. Tell me what you think.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, shucks, right of course. That's unfortunate. Yup, knowing that the logical index is sorted is extremely valuable information. And maybe someday we'll extend indexing support for all iterables. I'm not sure that to_indices will gain us much when it's restricted to only affecting arrays in the first place… but we may as well allow this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Should be good for final review then.

@nalimilan nalimilan changed the title RFC: Add logical indexing in deleteat!() Add logical indexing in deleteat!() Feb 6, 2017
base/array.jl Outdated
function deleteat!(a::Vector, inds::AbstractVector{Bool})
n = length(a)
length(inds) == n || throw(BoundsError(a, inds))
@boundscheck length(inds) == n || throw(BoundsError(a, inds))
p = 1
for (q, i) in enumerate(inds)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an interesting case to discuss the @inbounds policy (#20469): am I allowed to move the @inbounds before the for here? That would allow getting rid of one bounds check per iteration.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@boundscheck won't have an effect here since deleteat! is surely too complicated to inline.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right. So better remove it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just added a commit to remove it.

@nalimilan
Copy link
Member Author

More comments?

More consistent with what getindex() supports. This can allow
avoiding an allocation when the caller already has a boolean vector of
logical indices.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
collections Data structures holding multiple items, e.g. sets
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants