Sunteți pe pagina 1din 2

K-nearest Neighbors Search by Random Projection Forests

ABSTRACT:

K-nearest neighbors (kNN) search is an important problem in data mining and


knowledge discovery. Inspired by the huge success of tree-based methodology and
ensemble methods over the last decades, we propose a new method for kNN
search, random projection forests (rpForests). rpForests finds nearest neighbors by
combining multiple kNN-sensitive trees with each constructed recursively through
a series of random projections. As demonstrated by experiments on a wide
collection of real datasets, our method achieves a remarkable accuracy in terms of
fast decaying missing rate of kNNs and that of discrepancy in the k-th nearest
neighbor distances. rpForests has a very low computational complexity as a tree-
based methodology. The ensemble nature of rpForests makes it easily parallelized
to run on clustered or multicore computers; the running time is expected to be
nearly inversely proportional to the number of cores or machines. We give
theoretical insights on rpForests by showing the exponential decay of neighboring
points being separated by ensemble random projection trees when the ensemble
size increases. Our theory can also be used to refine the choice of random
projections in the growth of rpForests; experiments show that the effect is
remarkable.

SYSTEM REQUIREMENTS:
HARDWARE REQUIREMENTS:

 System : i3 Processor
 Hard Disk : 500 GB.
 Monitor : 15’’ LED
 Input Devices : Keyboard, Mouse
 Ram : 4GB.

SOFTWARE REQUIREMENTS:

 Operating system : Windows 7/UBUNTU.


 Coding Language : Java 1.7 ,Hadoop 0.8.1
 IDE : Eclipse
 Database : MYSQL
REFERENCE:

Donghui Yan, Yingjie Wang, Jin Wang, Honggang Wang, and Zhenpeng Li, “K-
nearest Neighbors Search by Random Projection Forests”, IEEE Transactions on
Big Data, 2019.

S-ar putea să vă placă și