nn and knn do not accept points as arrays of arrays #121

BoZenKhaa · 2021-03-12T12:29:55Z

When passing points as array of arrays to the knn and nn functions:

using NearestNeighbors
data = rand(3, 10^4)
k = 3

kdtree = KDTree(data)

# Multiple points as array of arrays
points = [rand(3) for i in 1:4];

idxs, dists = knn(kdtree, points, k, true);

I get following error:

julia> idxs, dists = knn(kdtree, points, k, true);
ERROR: MethodError: no method matching length(::Type{Array{Float64,1}})
Closest candidates are:
  length(::Base.EnvDict) at env.jl:132
  length(::LibGit2.GitStatus) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/LibGit2/src/status.jl:21
  length(::Base.AsyncGenerator) at asyncmap.jl:410
  ...
Stacktrace:
 [1] check_input(::KDTree{StaticArrays.SArray{Tuple{3},Float64,1,3},Euclidean,Float64}, ::Array{Array{Float64,1},1}) at /home/user/.julia/packages/NearestNeighbors/jDOe3/src/NearestNeighbors.jl:27
 [2] knn(::KDTree{StaticArrays.SArray{Tuple{3},Float64,1,3},Euclidean,Float64}, ::Array{Array{Float64,1},1}, ::Int64, ::Bool, ::Function) at /home/user/.julia/packages/NearestNeighbors/jDOe3/src/knn.jl:18 (repeats 2 times)
 [3] top-level scope at REPL[139]:1

I understood the line "points can also be a vector of other vectors where each element in the outer vector is considered a point." in README to mean that this should work?

Anyway, thanks for a neat implementation of kdtrees!

The text was updated successfully, but these errors were encountered:

BoZenKhaa · 2021-03-12T12:37:42Z

Looking into the code, the check that throws the error is this:

function check_input(::NNTree{V1}, ::AbstractVector{V2}) where {V1, V2 <: AbstractVector}
    if length(V1) != length(V2)
        throw(ArgumentError(
            "dimension of input points:$(length(V2)) and tree data:$(length(V1)) must agree"))
    end
end

where the length of type V2 can't be determined. I suppose a length of a type should return length of the instances of said type?

KristofferC · 2021-03-12T12:54:25Z

The README says:

vector of vectors with fixed dimensionality, nd, which must be part of the type. Specifically, data should be a Vector{V}, where V is itself a subtype of an AbstractVector and such that eltype(V) and length(V) are defined.

In your case eltype(V) and length(V) is not defined. You can use an array from StaticArrays:

using NearestNeighbors, StaticArrays

data = rand(3, 10^4)
k = 3

kdtree = KDTree(data)

# Multiple points as array of arrays
points = [rand(SVector{3}) for i in 1:4];

idxs, dists = knn(kdtree, points, k, true)

BoZenKhaa · 2021-03-12T13:56:52Z

Thank you for a quick reply and for pointing that out to me, I completely missed that.

Maybe it could be worthwhile to mention in the README bullet that discusses points that same type restrictions that apply to points also apply to data? I could make the PR.

Alternatively, could having a more general check work as well? E.g. having

function check_input(::NNTree{V1}, points::AbstractVector{V2}) where {V1, V2 <: AbstractVector}
    dim=0
    try
        dim = length(V2)
    catch MethodError
        points_dims = unique([length(p) for p in points])
        if length(points_dims)==1
            dim = points_dims[1]
        else
            throw(ArgumentError(
                "all input points must have the same dimension: point dimensions: $points_dims and tree data:$(length(V1)) must agree"))
        end
    end

    if length(V1) != dim
        throw(ArgumentError(
            "dimension of input points:$(length(V2)) and tree data:$(length(V1)) must agree"))
    end
end

seems to work for my usecase and passes the tests, but I am not sure whether that doesn't affect the performance or potentially introduces some hidden issues...

Reason I am looking at this is that I would like to use NearestNeighbors inside a package where I set up the data and it's up to users to provide the points. I think I would prefer not to limit users to StaticArrays or repackage the user provided points into Matrix or StaticArrays if there wouldn't be a performance hit...

Thank you!

lrsantos11 · 2021-08-10T13:51:38Z

I really think that @BoZenKhaa solution is the better. I really don't want to rely on StaticArrays because of performance issues for large and sparse arrays.

davnn · 2022-01-21T10:12:22Z

I really think that @BoZenKhaa solution is the better. I really don't want to rely on StaticArrays because of performance issues for large and sparse arrays.

Alternatively, you could simply use StaticArrays.jl SizedVector to annotate your vectors with their size. Example to transform a matrix to a vec of points:

points = [SizedVector{length(v)}(v) for v in eachcol(mat)]

BoZenKhaa · 2022-01-21T10:46:08Z

Thanks for the tip @davnn, I haven't realized StaticArrays.jl provided this option. That could also be implemented in my package and the end users won't have to worry about using StaticArrays.

Anyway, it seems to me that at that point I am just working around a constraint imposed by NearestNeighbors.jl that may be too strict. Wouldn't relaxing this constraint be a better course of action? I can't tell whether there are some bad consequences to relaxing this constraint, maybe somewhere else in the package. As I said, relaxing the constraint passed the tests on my machine. Can you tell?

KristofferC · 2022-06-15T06:45:23Z

Dup of #85

KristofferC mentioned this issue Jun 15, 2022

BallTree and KDTree do not accept Vector{AbstractVector{Float64}} Type. Possible bug. #144

Closed

KristofferC closed this as completed Jun 15, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

nn and knn do not accept points as arrays of arrays #121

nn and knn do not accept points as arrays of arrays #121

BoZenKhaa commented Mar 12, 2021

BoZenKhaa commented Mar 12, 2021 •

edited

Loading

KristofferC commented Mar 12, 2021

BoZenKhaa commented Mar 12, 2021 •

edited

Loading

lrsantos11 commented Aug 10, 2021

davnn commented Jan 21, 2022 •

edited

Loading

BoZenKhaa commented Jan 21, 2022

KristofferC commented Jun 15, 2022

nn and knn do not accept points as arrays of arrays #121

nn and knn do not accept points as arrays of arrays #121

Comments

BoZenKhaa commented Mar 12, 2021

BoZenKhaa commented Mar 12, 2021 • edited Loading

KristofferC commented Mar 12, 2021

BoZenKhaa commented Mar 12, 2021 • edited Loading

lrsantos11 commented Aug 10, 2021

davnn commented Jan 21, 2022 • edited Loading

BoZenKhaa commented Jan 21, 2022

KristofferC commented Jun 15, 2022

BoZenKhaa commented Mar 12, 2021 •

edited

Loading

BoZenKhaa commented Mar 12, 2021 •

edited

Loading

davnn commented Jan 21, 2022 •

edited

Loading