Alternatives to libkdtree?

Fri May 2 06:38:37 UTC 2008

2008/5/2 Eric Fowler <eric.fowler at gmail.com>:
> Thanks for the suggestion. It would not have occurred to me to use a hash
> table to deal with ordered information. If I wanted to find the nearest
> point, wouldn't I have to check for the nearest point in each of 9
> neighboring squares, then sort? I don't have a firm req for that but it
> would be nice to know.
>

yep, you'd have to check the surrounding cells.  and if you were to do
a wider search, you'd have to search wider.

>
> I was thinking of taking a more conventional approach and storing the points
> in two set<>'s, one sorted on x, the other on y (I have plenty of memory)
> and using iterators through each set, one searching up, the other down. It's
> not sophisticated but I should be able to get something working soon.
>

2 sets means you would have to scan through both and then match the
common ones in each set.
so you'd have to
1) get a set of the nearby points in the x set<>   O(logN)
2) ditto for the y set<>  O(logN)
3) sort by pointer/id  O(M * logM)
4) do a "join" on the pointers so you end up with a set of pointers
that were in both sets.  O(M)

pretty inefficient.  set<>s are great if you want to find the next
highest/lowest one... but if you can tell it exactly what spot to look
(ie my hash_set idea) then its a lot faster to do the hash_set style.

hash_set lookup is O(1)   so you can afford to hunt through a much
smaller dataset for the nearest point.  make the grid small enough and
you can hunt in a radiating pattern until you find a cell with an item
in it - go far enough out to be sure that you have found the nearest
point.

> I am also looking at ANN. I assume you know what that is.
>

i've heard of it, haven't used it.
let me know how that goes...

cheers
Paul