Pattern Classification (Duda, Hart, Stork)
Nearest Neighbor Pattern Classification (Cover and Hart)
Roberto Souza
DCA/FEEC/UNICAMP
March 16, 2012
Agenda
Developed in the 1960s;
Non-parametric, sub-optimal classifiers;
Often provide competitive results;
Simple to understand and implement.
p(error) = ∫ p(error, X) dX = ∫ p(error | X) p(X) dX,

where the posteriors follow from Bayes' rule,

p(ω_i | X) = p(X | ω_i) p(ω_i) / p(X),   p(X) = Σ_{j=1}^{M} p(X | ω_j) p(ω_j).
Bayesian Decision Rule (BDR): p(error | X) = 1 − max[p(ω_1 | X), ..., p(ω_M | X)]. The Bayes Error Rate (BER) is achieved by using the BDR.
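As a minimal sketch of the BDR and the BER, consider a toy two-class problem over a discrete feature space; the priors and likelihoods below are made-up numbers chosen only for illustration:

```python
priors = {1: 0.6, 2: 0.4}        # p(w_i), illustrative values
likelihoods = {                  # p(X | w_i) for X in {"a", "b"}
    1: {"a": 0.8, "b": 0.2},
    2: {"a": 0.3, "b": 0.7},
}

def posterior(x):
    """Bayes' rule: p(w_i | x) = p(x | w_i) p(w_i) / p(x)."""
    joint = {i: likelihoods[i][x] * priors[i] for i in priors}
    evidence = sum(joint.values())            # p(x)
    return {i: joint[i] / evidence for i in joint}

def bdr(x):
    """Bayes decision rule: pick the class with the largest posterior."""
    post = posterior(x)
    return max(post, key=post.get)

# BER = sum over x of p(error | x) p(x), with p(error | x) = 1 - max_i p(w_i | x)
ber = 0.0
for x in ("a", "b"):
    px = sum(likelihoods[i][x] * priors[i] for i in priors)
    ber += (1.0 - max(posterior(x).values())) * px

print(bdr("a"), bdr("b"), round(ber, 3))
```

With these numbers the BDR assigns "a" to class 1 and "b" to class 2, and no classifier can do better than the resulting BER on this problem.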
1-NN Overview
Given N labeled samples {(X_1, θ_1), ..., (X_N, θ_N)}, the nearest neighbor X' of X satisfies

d(X, X') = min_{i=1,...,N} d(X, X_i). (1)

The 1-NN rule assigns X the label θ' of X'. (2)
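The 1-NN rule can be sketched as a brute-force O(N) scan over the labeled samples; the training points and the Euclidean metric below are assumptions for illustration:

```python
from math import dist  # Euclidean distance (Python 3.8+)

def one_nn(x, samples):
    """1-NN rule: return the label of the training sample nearest to x.

    `samples` is a list of (point, label) pairs; a brute-force scan
    over all N samples finds the minimizer of d(x, x_i).
    """
    nearest_point, nearest_label = min(samples, key=lambda s: dist(x, s[0]))
    return nearest_label

train = [((0.0, 0.0), "A"), ((1.0, 1.0), "A"), ((4.0, 4.0), "B")]
print(one_nn((0.2, 0.1), train))  # nearest sample is (0, 0) -> "A"
```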
As the number of labeled samples N tends to infinity in an M-class classification problem, the 1-Nearest Neighbor Error Rate (1NNER) is bounded by the following expression:

BER ≤ 1NNER ≤ BER (2 − (M / (M − 1)) BER). (3)
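A short numeric check of the upper bound in (3): for any BER ≤ (M − 1)/M the bound lies between BER and 2·BER, i.e. asymptotically 1-NN is never worse than twice the Bayes error. The sample BER values below are arbitrary:

```python
def one_nn_upper_bound(ber, m):
    """Asymptotic Cover-Hart upper bound: BER * (2 - (M / (M - 1)) * BER)."""
    return ber * (2.0 - m / (m - 1.0) * ber)

# Evaluate the bound for a 2-class problem at a few illustrative BER values.
for ber in (0.05, 0.1, 0.2):
    ub = one_nn_upper_bound(ber, 2)
    assert ber <= ub <= 2 * ber   # bound sits between BER and 2*BER
    print(ber, round(ub, 4))
```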
1-NN Weaknesses
1-NN is sensitive to noise and outliers;
The 1NNER bound holds only in the limit of infinitely many labeled samples;
Its computational complexity increases with N.
k-NN Overview
k-NN is a natural extension of the 1-NN classifier.
k-NN classifies X by assigning it the label most frequent among its k nearest neighbors.
Because k-NN takes k neighbors into account, it is less sensitive to noise than 1-NN;
It can be shown that, as the number of samples N tends to infinity and k tends to infinity (with k/N → 0), the k-NN Error Rate (kNNER) tends to the BER.
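The majority-vote rule above can be sketched as follows; the training set, query point, and Euclidean metric are assumptions for illustration, chosen so that the single nearest neighbor is a "noisy" point that 3-NN out-votes:

```python
from collections import Counter
from math import dist

def knn(x, samples, k):
    """k-NN rule: majority label among the k training samples nearest to x."""
    neighbors = sorted(samples, key=lambda s: dist(x, s[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

train = [((0.0, 0.0), "A"), ((0.5, 0.0), "A"), ((0.6, 0.5), "B"),
         ((4.0, 4.0), "B"), ((4.5, 4.0), "B")]

# The single nearest neighbor of (0.7, 0.4) is the "B" at (0.6, 0.5),
# but with k = 3 the two nearby "A" samples out-vote it.
print(knn((0.7, 0.4), train, 1), knn((0.7, 0.4), train, 3))
```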
Although k-NN, for k > 1, is theoretically a better classifier than 1-NN, this may not hold if the number of training samples is not large enough; To avoid this anomalous behaviour of k-NN, a parameter d is introduced, i.e. k-NN is no longer a non-parametric classifier.