Improve sort!, searchsortedXXX and insorted doc
knuesel committed Jan 24, 2023
1 parent 4ff3842 commit 411df38
Showing 2 changed files with 87 additions and 30 deletions.
4 changes: 2 additions & 2 deletions base/operators.jl
@@ -163,8 +163,6 @@ Types with a partial order should implement [`<`](@ref).
See the documentation on [Alternate Orderings](@ref) for how to define alternate
ordering methods that can be used in sorting and related functions.
See also [`isequal`](@ref), [`isunordered`](@ref).
# Examples
```jldoctest
julia> isless(1, 3)
@@ -330,6 +328,8 @@ New types with a canonical partial order should implement this function for
two arguments of the new type.
Types with a canonical total order should implement [`isless`](@ref) instead.
See also [`isunordered`](@ref).
# Examples
```jldoctest
julia> 'a' < 'b'
113 changes: 85 additions & 28 deletions base/sort.jl
@@ -162,8 +162,9 @@ partialsort(v::AbstractVector, k::Union{Integer,OrdinalRange}; kws...) =
# reference on sorted binary search:
# http://www.tbray.org/ongoing/When/200x/2003/03/22/Binary

# index of the first value of vector a that is greater than or equal to x;
# returns lastindex(v)+1 if x is greater than all values in v.
# Index of the first value of vector v that doesn't sort before x (for the
# default order this means the first value greater than or equivalent to x).
# Returns lastindex(v)+1 if x sorts after all values of v.
function searchsortedfirst(v::AbstractVector, x, lo::T, hi::T, o::Ordering)::keytype(v) where T<:Integer
hi = hi + T(1)
len = hi - lo
@@ -181,8 +182,9 @@ function searchsortedfirst(v::AbstractVector, x, lo::T, hi::T, o::Ordering)::key
return lo
end

# index of the last value of vector a that is less than or equal to x;
# returns firstindex(v)-1 if x is less than all values of v.
# Index of the last value of vector v that doesn't sort after x (for the
# default order this means the last value less than or equivalent to x).
# Returns firstindex(v)-1 if x sorts before all values of v.
function searchsortedlast(v::AbstractVector, x, lo::T, hi::T, o::Ordering)::keytype(v) where T<:Integer
u = T(1)
lo = lo - u
@@ -198,7 +200,7 @@ function searchsortedlast(v::AbstractVector, x, lo::T, hi::T, o::Ordering)::keyt
return lo
end

# returns the range of indices of v equal to x
# returns the range of indices of v equivalent to x
# if v does not contain x, returns a 0-length range
# indicating the insertion point of x
function searchsorted(v::AbstractVector, x, ilo::T, ihi::T, o::Ordering)::UnitRange{keytype(v)} where T<:Integer
@@ -291,16 +293,24 @@ for s in [:searchsortedfirst, :searchsortedlast, :searchsorted]
end

"""
searchsorted(a, x; by=identity, lt=isless, rev=false)
searchsorted(v, x; by=identity, lt=isless, rev=false)
Return the range of indices of `a` that compare as equal to `x` (using binary
search), assuming that `a` is already sorted. Return an empty range located at
the insertion point if `a` does not contain values equal to `x`.
Return the range of indices in `v` where values are equivalent to `x`, or an
empty range located at the insertion point if `v` does not contain values
equivalent to `x` (see below for the definition of equivalence).
The range is found using binary search. The vector `v` must be sorted, or at
least partitioned with respect to `x` such that all values that sort before `x`
come first and all values that sort after `x` come last.
The `by`, `lt` and `rev` keywords modify what order is assumed for the data,
as described in the [`sort!`](@ref) documentation.
See also: [`insorted`](@ref), [`searchsortedfirst`](@ref), [`sort!`](@ref), [`findall`](@ref).
Values `x` and `y` are said to be equivalent if `lt(by(x), by(y))` and `lt(by(y),
by(x))` both return `false`, and `x` is said to sort before `y` if `lt(by(x),
by(y))` returns `true` (with `x` and `y` exchanged if `rev=true`).
See also: [`searchsortedfirst`](@ref), [`sort!`](@ref), [`insorted`](@ref), [`findall`](@ref).
# Examples
```jldoctest
@@ -318,20 +328,30 @@ julia> searchsorted([1, 2, 4, 5, 5, 7], 9) # no match, insert at end
julia> searchsorted([1, 2, 4, 5, 5, 7], 0) # no match, insert at start
1:0
julia> searchsorted([1, 0, 0, 2, 2, 7, 6], 2) # data unsorted but partitioned with respect to 2
4:5
julia> searchsorted([1, -1, -2, 2, -2, 3, -4, 4], 2, by=abs) # sorted by absolute value, -2 equivalent to 2
3:5
```
""" searchsorted

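The equivalence-based wording of the new `searchsorted` docstring can be checked against a plain linear scan. The following is a minimal sketch, not code from the commit; the helper `equivalent` is a hypothetical name introduced here for illustration and does not exist in Base.

```julia
# Hypothetical helper, not part of Base: the equivalence relation as worded in the
# docstring. `rev` is omitted because swapping the arguments leaves it unchanged.
equivalent(x, y; by=identity, lt=isless) = !lt(by(x), by(y)) && !lt(by(y), by(x))

v = [1, -1, -2, 2, -2, 3, -4, 4]                   # sorted by absolute value
r = searchsorted(v, 2, by=abs)                      # binary search
scan = findall(y -> equivalent(2, y, by=abs), v)    # linear scan, same notion of equivalence
println(r, " ", scan)

r_empty = searchsorted(v, 5, by=abs)                # nothing in v is equivalent to 5
println(isempty(r_empty), " ", first(r_empty))      # the empty range starts at the insertion point
```

Both approaches select indices 3 to 5, because `-2` and `2` are equivalent under `by=abs` even though they are not `==`.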
"""
searchsortedfirst(a, x; by=identity, lt=isless, rev=false)
searchsortedfirst(v, x; by=identity, lt=isless, rev=false)
Return the index of the first value in `a` greater than or equal to `x`,
assuming that `a` is already sorted. Return `lastindex(a) + 1` if `x` is
greater than all values in `a`.
Return the index of the first value in `v` that doesn't sort before `x`.
If all values in `v` sort before `x`, the function returns `lastindex(v) + 1`.
`insert!`ing `x` at this index will maintain sorted order.
The index is found using binary search. The vector `v` must be sorted, or at
least partitioned with respect to `x` such that all values that sort before `x`
come first and all values that sort after `x` come last. `insert!`ing `x` at
the returned index will maintain the sorted order (or partition).
The `by`, `lt` and `rev` keywords modify what order is assumed for the data,
as described in the [`sort!`](@ref) documentation.
as described in the [`sort!`](@ref) documentation. Value `x` is said to sort
before `y` if `lt(by(x), by(y))` returns `true` (with `x` and `y` exchanged if
`rev=true`).
See also: [`searchsortedlast`](@ref), [`searchsorted`](@ref), [`findfirst`](@ref).
@@ -351,16 +371,29 @@ julia> searchsortedfirst([1, 2, 4, 5, 5, 7], 9) # no match, insert at end
julia> searchsortedfirst([1, 2, 4, 5, 5, 7], 0) # no match, insert at start
1
julia> searchsortedfirst([1, 0, 0, 2, 2, 7, 6], 2) # data unsorted but partitioned with respect to 2
4
julia> searchsortedfirst([1, -1, -2, 2, -2, 3, -4, 4], 2, by=abs) # sorted by absolute value
3
```
""" searchsortedfirst

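The docstring's claim that `insert!`ing at the returned index maintains the order can be illustrated as follows. This is a small sketch using only `searchsortedfirst`, `insert!` and `issorted` from Base; the sample vectors are made up for the example.

```julia
# Default order: insert 3 at the index of the first value that does not sort before it.
v = [1, 2, 4, 5, 5, 7]
i = searchsortedfirst(v, 3)
insert!(v, i, 3)
println(i, " ", v, " ", issorted(v))

# Reverse order: with rev=true the roles of "before" and "after" are exchanged.
w = [7, 5, 5, 4, 2, 1]
j = searchsortedfirst(w, 3, rev=true)
insert!(w, j, 3)
println(j, " ", w, " ", issorted(w, rev=true))
```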
"""
searchsortedlast(a, x; by=identity, lt=isless, rev=false)
searchsortedlast(v, x; by=identity, lt=isless, rev=false)
Return the index of the last value in `a` less than or equal to `x`, assuming
that `a` is already sorted. Return `firstindex(a) - 1` if `x` is less than all
values in `a`. The `by`, `lt` and `rev` keywords modify what order is assumed
for the data, as described in the [`sort!`](@ref) documentation.
Return the index of the last value in `v` that doesn't sort after `x`.
If all values in `v` sort after `x`, the function returns `firstindex(v) - 1`.
The index is found using binary search. The vector `v` must be sorted, or at
least partitioned with respect to `x` such that all values that sort before `x`
come first and all values that sort after `x` come last.
The `by`, `lt` and `rev` keywords modify what order is assumed for the data,
as described in the [`sort!`](@ref) documentation. Value `x` is said to sort
before `y` if `lt(by(x), by(y))` returns `true` (with `x` and `y` exchanged if
`rev=true`).
# Examples
```jldoctest
@@ -378,16 +411,31 @@ julia> searchsortedlast([1, 2, 4, 5, 5, 7], 9) # no match, insert at end
julia> searchsortedlast([1, 2, 4, 5, 5, 7], 0) # no match, insert at start
0
julia> searchsortedlast([1, 0, 0, 2, 2, 7, 6], 2) # data unsorted but partitioned with respect to 2
5
julia> searchsortedlast([1, -1, -2, 2, -2, 3, -4, 4], 2, by=abs) # sorted by absolute value
5
```
""" searchsortedlast

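Taken together, the three docstrings suggest that the range returned by `searchsorted` runs from the `searchsortedfirst` index to the `searchsortedlast` index. The sketch below checks this on the vector used in the doctests. The endpoints are compared explicitly because empty ranges may compare equal to each other regardless of position; the `bounds` dictionary is only a name chosen for this example.

```julia
v = [1, 2, 4, 5, 5, 7]
bounds = Dict{Int,Tuple{Int,Int}}()   # x => (first index, last index)
for x in (0, 3, 5, 9)
    lo = searchsortedfirst(v, x)      # first value that does not sort before x
    hi = searchsortedlast(v, x)       # last value that does not sort after x
    r = searchsorted(v, x)
    bounds[x] = (lo, hi)
    println(x, " => ", lo, ":", hi, "  same endpoints as searchsorted: ",
            (first(r), last(r)) == (lo, hi))
end
```

When `x` is absent, `hi == lo - 1`, which is exactly the empty range located at the insertion point.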
"""
insorted(x, a; by=identity, lt=isless, rev=false) -> Bool
insorted(x, v; by=identity, lt=isless, rev=false) -> Bool
Determine whether an item `x` is in the sorted collection `a`, in the sense that
it is [`==`](@ref) to one of the values of the collection. The `by`, `lt` and
`rev` keywords modify what order is assumed for the collection, as described in
the [`sort!`](@ref) documentation.
Determine whether a vector `v` contains any value equivalent to `x` (see below
for the definition of equivalence).
The check is done using binary search. The vector `v` must be sorted, or at
least partitioned with respect to `x` such that all values that sort before `x`
come first and all values that sort after `x` come last.
The `by`, `lt` and `rev` keywords modify what order is assumed for the data,
as described in the [`sort!`](@ref) documentation.
Values `x` and `y` are said to be equivalent if `lt(by(x), by(y))` and `lt(by(y),
by(x))` both return `false`, and `x` is said to sort before `y` if `lt(by(x),
by(y))` returns `true` (with `x` and `y` exchanged if `rev=true`).
See also [`in`](@ref).
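The change from "`==` to one of the values" to "equivalent to `x`" matters once `by` or `lt` is customized. A short sketch of the difference between `insorted` and `in`, reusing the vector from the new doctest (requires Julia 1.6 or later for `insorted`):

```julia
v = [1, -1, -2, 3, -4, 4]               # sorted by absolute value, contains -2 but not 2

found_equivalent = insorted(2, v, by=abs)   # binary search on abs values: -2 is equivalent to 2
found_equal = 2 in v                        # linear scan with ==: no element equals 2
println(found_equivalent, " ", found_equal)
```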
@@ -407,6 +455,12 @@ false
julia> insorted(0, [1, 2, 4, 5, 5, 7]) # no match
false
julia> insorted(2, [1, 0, 2, 2, 7, 6]) # data unsorted but partitioned with respect to 2
true
julia> insorted(2, [1, -1, -2, 3, -4, 4], by=abs) # sorted by absolute value
true
```
!!! compat "Julia 1.6"
@@ -1338,11 +1392,14 @@ for available algorithms). Elements are first transformed by the function `by`
and then compared according to either the function `lt` or the ordering
`order`. Finally, the resulting order is reversed if `rev=true`.
The `lt` function should define a strict partial order, that is, it should be
The `lt` function should define a strict weak ordering, that is, it should be
- irreflexive: `lt(x, x)` always yields `false`,
- asymmetric: if `lt(x, y)` yields `true` then `lt(y, x)` yields `false`,
- transitive: `lt(x, y) && lt(y, z)` implies `lt(x, z)`.
- transitive: `lt(x, y) && lt(y, z)` implies `lt(x, z)`,
- transitive in incomparability: `incomparable(x, y) && incomparable(y, z)`
implies `incomparable(x, z)`, where `incomparable(x, y)` is defined as
`!lt(x, y) && !lt(y, x)`.
For example, `<` is a valid `lt` function but `≤` is not, because `≤` is not irreflexive.
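The four conditions listed for `lt` can be tested by brute force on a small sample, which also shows why the wording moved from "strict partial order" to "strict weak ordering". This is a sketch only: `is_strict_weak_order` is a hypothetical helper written for this illustration, and passing on a finite sample does not prove the property in general.

```julia
# Hypothetical helper, not part of Base: check the four conditions on a finite sample.
function is_strict_weak_order(lt, xs)
    incomparable(x, y) = !lt(x, y) && !lt(y, x)
    for x in xs
        lt(x, x) && return false                                    # irreflexive
        for y in xs
            lt(x, y) && lt(y, x) && return false                    # asymmetric
            for z in xs
                lt(x, y) && lt(y, z) && !lt(x, z) && return false   # transitive
                # transitive in incomparability
                incomparable(x, y) && incomparable(y, z) && !incomparable(x, z) && return false
            end
        end
    end
    return true
end

xs = -3:3
println(is_strict_weak_order(<, xs))                            # usual order on integers
println(is_strict_weak_order(≤, xs))                            # fails irreflexivity
println(is_strict_weak_order((x, y) -> abs(x) < abs(y), xs))    # order by absolute value
println(is_strict_weak_order((x, y) -> x + 1 < y, xs))          # strict partial order only
```

The last predicate is irreflexive, asymmetric and transitive, so it satisfies the old "strict partial order" wording, yet 0 and 1 are incomparable, 1 and 2 are incomparable, and 0 still sorts before 2. It therefore fails the added condition.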
