Machine learning
K-Nearest Neighbor Classifier



Question 1: Given two categories, category A = [2 4; 3 7; 5 4] and category B = [3 8; 5 9; 7 10; 6 8], determine to which category the input x = [4 7] belongs.

Solution:
1- Determine the parameter K, i.e. the number of nearest neighbors. Assume K = 3.
2- Calculate the distance between the new input and all the training data. The squared Euclidean distance is used since it is faster to compute (no square roots) and gives the same neighbor ranking as the Euclidean distance.
X1   X2   Group   Squared distance between the query input x = (4, 7) and the training point
2    4    1       (2 - 4)^2 + (4 - 7)^2 = 13
3    7    1       (3 - 4)^2 + (7 - 7)^2 = 1
5    4    1       (5 - 4)^2 + (4 - 7)^2 = 10
3    8    2       (3 - 4)^2 + (8 - 7)^2 = 2
5    9    2       (5 - 4)^2 + (9 - 7)^2 = 5
7    10   2       (7 - 4)^2 + (10 - 7)^2 = 18
6    8    2       (6 - 4)^2 + (8 - 7)^2 = 5
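
The distance column above can be reproduced in a few lines. The following is a minimal sketch, not part of the original solution: the variable names (X, x, d2, d) are illustrative assumptions, and bsxfun is used so that it also runs on releases without implicit expansion. It also checks the claim that ranking by squared distance gives the same neighbor order as ranking by the true Euclidean distance.

```matlab
% Training points (one per row) and the query point from Question 1.
X = [2 4; 3 7; 5 4; 3 8; 5 9; 7 10; 6 8];
x = [4 7];

% Difference between every training point and the query point.
diffs = bsxfun(@minus, X, x);

d2 = sum(diffs.^2, 2);   % squared Euclidean distances (the table column)
d  = sqrt(d2);           % true Euclidean distances

% Ranking by either measure yields the same neighbor order,
% because the square root is monotonically increasing.
[~, orderSquared] = sort(d2);
[~, orderTrue]    = sort(d);
sameOrder = isequal(orderSquared, orderTrue);

disp(d2');
```

Because only the ranking matters for choosing neighbors, skipping the square root changes nothing in the result for this data.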

3- Sort the distances and select the K nearest neighbors, i.e. the K training points with the smallest distances.


X1   X2   Group   Sorted distance   Neighbors
3    7    1       1                 Yes
3    8    2       2                 Yes
5    9    2       5                 Yes
6    8    2       5                 No
5    4    1       10                No
2    4    1       13                No
7    10   2       18                No

4- Collect the categories of these neighbors.


X1   X2   Group   Sorted distance   Neighbors   Categories
3    7    1       1                 Yes         1
3    8    2       2                 Yes         2
5    9    2       5                 Yes         2
6    8    2       5                 No          2
5    4    1       10                No          1
2    4    1       13                No          1
7    10   2       18                No          2

4th year Biomedical Engineering

Neural Network Tutorial

Helwan University

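5- Determine the category based on the majority vote. Of the three nearest neighbors, one belongs to group 1 and two belong to group 2, so the input is assigned to group 2 (category B).
Input     Category
(4, 7)    2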

MATLAB implementation:
close all; clear all; clc
% Initialization.
class1 = [2 4; 3 7; 5 4];
class2 = [3 8; 5 9; 7 10; 6 8];
input = [4 7];
k = 3;
labels = [ones(size(class1,1),1); 2*ones(size(class2,1),1)];
% Combine the two classes into one training set
classes = [class1; class2];
% Compute the squared Euclidean distance to every training point
distances = zeros(size(classes,1),1);
for i = 1:size(classes,1)
    distances(i) = (classes(i,1)-input(1))^2 + (classes(i,2)-input(2))^2;
end
% Sort the distances in ascending order
[dist, pos] = sort(distances);
% Gather the categories of the k nearest neighbors
neighborsIdx = pos(1:k);
neighborsLabels = labels(neighborsIdx);
% Majority vote: the index of the largest count is the predicted category
numClass1 = sum(neighborsLabels == 1);
numClass2 = sum(neighborsLabels == 2);
jointNum = [numClass1; numClass2];
[~, category] = max(jointNum);
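
For comparison, the same five steps can be written more compactly. This is a hedged sketch rather than the tutorial's own listing: the variable names are illustrative, and the built-in mode function stands in for the explicit per-class counts, which lets the vote work for more than two classes. Note that mode returns the smallest label when counts are tied, which may or may not be the tie-breaking rule you want.

```matlab
% Compact variant of the steps above; names are illustrative.
X      = [2 4; 3 7; 5 4; 3 8; 5 9; 7 10; 6 8];   % training points
labels = [1; 1; 1; 2; 2; 2; 2];                  % 1 = class A, 2 = class B
x      = [4 7];                                  % query point
k      = 3;                                      % number of neighbors

% Squared Euclidean distance from every training point to x.
d2 = sum(bsxfun(@minus, X, x).^2, 2);

% Ascending sort; order(1:k) indexes the k nearest training points.
[~, order] = sort(d2);
neighborLabels = labels(order(1:k));

% Majority vote: most frequent label (smallest label wins a tie).
category = mode(neighborLabels);
```

With the data of Question 1, the three nearest labels are 1, 2 and 2, so category is 2.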

Fig. 1: KNN plot. Blue squares represent class A, red circles represent class B, and the black asterisk represents the input x.

K-Means Clustering
Question 2: Determine which of the following data points belong to cluster one and which belong to the other cluster: (3, 3), (-1, -4), (2, 3), (0, -5).


Solution: Repeat the steps below until convergence.
1- Determine the centroid coordinates. Initialize the first two centroids, for example c1 = (3, 3) and c2 = (2, 3).
2- Calculate the Euclidean distance of each data point to the centroids.
3- Group the data points based on the minimum distance.
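
The centroid update behind step 1 can be written out explicitly. As a sketch, let S_j denote the set of points currently grouped with centroid c_j (this notation is an assumption made here, not taken from the original). Each centroid then moves to the mean of its points. For instance, after the first pass with the initial centroids above, the points (-1, -4), (2, 3) and (0, -5) are closest to the second centroid:

```latex
\[
c_j = \frac{1}{|S_j|} \sum_{p \in S_j} p
\]
\[
c_2 = \left( \frac{-1 + 2 + 0}{3},\; \frac{-4 + 3 - 5}{3} \right)
    = \left( \tfrac{1}{3},\; -2 \right) \approx (0.3,\, -2)
\]
```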
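Num   Data
1     (3, 3)
2     (-1, -4)
3     (2, 3)
4     (0, -5)

Iteration 1: c1 = (3, 3), c2 = (2, 3)
Num   Distance to c1   Distance to c2   Cluster
1     0                1                1
2     8.1              7.6              2
3     1                0                2
4     8.5              8.2              2

Iteration 2: c1 = (3, 3), c2 = (0.3, -2)
Num   Distance to c1   Distance to c2   Cluster
1     0                5.7              1
2     8.1              2.4              2
3     1                5.3              1
4     8.5              3                2

Iteration 3: c1 = (2.5, 3), c2 = (-0.5, -4.5)
Num   Distance to c1   Distance to c2   Cluster
1     0.5              8.3              1
2     7.8              0.7              2
3     0.5              7.9              1
4     8.4              0.7              2

The cluster assignments in iteration 3 are the same as in iteration 2, so the centroids no longer move and the algorithm has converged: points 1 and 3 form cluster one, points 2 and 4 form cluster two.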
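
The iterations above can be replayed step by step. The sketch below is an illustration under stated assumptions, not the tutorial's listing: it starts from the fixed centroids (3, 3) and (2, 3) of step 1 instead of a random choice, computes distances with bsxfun instead of a toolbox function, and adds a maxIter cap and an empty-cluster guard that the original does not mention.

```matlab
data     = [3 3; -1 -4; 2 3; 0 -5];   % the four data points
centroid = [3 3; 2 3];                % initial centroids from step 1
k        = size(centroid, 1);
maxIter  = 10;                        % assumed safety cap, not from the tutorial

for iter = 1:maxIter
    % Euclidean distance from every point (rows) to every centroid (columns).
    dist = zeros(size(data, 1), k);
    for j = 1:k
        dist(:, j) = sqrt(sum(bsxfun(@minus, data, centroid(j, :)).^2, 2));
    end

    % Assign each point to its nearest centroid.
    [~, clusters] = min(dist, [], 2);

    % Recompute each centroid as the mean of its assigned points.
    newCentroid = centroid;
    for j = 1:k
        members = data(clusters == j, :);
        if ~isempty(members)              % keep the old centroid if no points
            newCentroid(j, :) = mean(members, 1);
        end
    end

    fprintf('Iteration %d: clusters = [%s]\n', iter, num2str(clusters'));

    if isequal(newCentroid, centroid)
        break;                            % centroids unchanged: converged
    end
    centroid = newCentroid;
end
```

Run as written, the loop stops in the third pass with clusters [1 2 1 2] and centroids (2.5, 3) and (-0.5, -4.5), matching the tables above.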

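MATLAB implementation:
clear all; close all; clc
% Initialization: one data point per row
data = [3 3; -1 -4; 2 3; 0 -5];
% Number of clusters
k = 2;
nRows = size(data, 1);
% Pick k distinct data points at random as the initial centroids
r = randperm(nRows);
centroid = data(r(1:k), :);
while true
    tempCentroid = centroid;
    % Euclidean distance from every point to every centroid
    % (pdist2 requires the Statistics and Machine Learning Toolbox)
    dist = pdist2(data, centroid);
    % Assign each point to its nearest centroid
    [~, clusters] = min(dist, [], 2);
    % Move each centroid to the mean of its points; the explicit dimension
    % keeps a single-point cluster from being averaged into a scalar
    for i = 1:k
        centroid(i, :) = mean(data(clusters == i, :), 1);
    end
    % Stop when the centroids no longer change
    if isequal(tempCentroid, centroid)
        break;
    end
end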


Fig.2 K-means clustering
