Mid-Semester Test
(EC-2 Makeup)
Q.1. For the data set given in Table 1 below, fill in the missing values. [5]
Table 1
SL NO  Fet1  Fet2  Fet3  Fet4
1      3     4     4     3
2      0     12    13    13
3      10    15    12    18
4      0     23    17    0
5      4     7     6     2
6      2     14    7     12
Upper Quartile = 13 or Lower Quartile= 2 or
Mode = 0 Mean=12.5 12 3
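The filled-in table can be sanity-checked against the summary hints with a short script. This is only a sketch: pairing the hints with Fet1 (mode), Fet2 (mean), and Fet4 (quartiles) is an assumption, as is the "median of each half" quartile convention, and the helper name `quartiles_by_halves` is made up for illustration.

```python
import statistics

# Table 1 as it appears above, one list per feature column.
table = {
    "Fet1": [3, 0, 10, 0, 4, 2],
    "Fet2": [4, 12, 15, 23, 7, 14],
    "Fet3": [4, 13, 12, 17, 6, 7],
    "Fet4": [3, 13, 18, 0, 2, 12],
}

def quartiles_by_halves(values):
    """Lower/upper quartile as the medians of the lower and upper halves.

    Assumed convention; other quartile definitions give other numbers.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = statistics.median(ordered[:half])
    upper = statistics.median(ordered[-half:])
    return lower, upper

# Assumed pairing of hints to columns (not stated in the source).
fet1_mode = statistics.mode(table["Fet1"])
fet2_mean = statistics.mean(table["Fet2"])
fet4_quartiles = quartiles_by_halves(table["Fet4"])

print("Fet1 mode:", fet1_mode)
print("Fet2 mean:", fet2_mean)
print("Fet4 lower/upper quartile:", fet4_quartiles)
```

Under these assumptions the completed columns agree with the stated mode, mean, and quartiles.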
Q.2. Assume the following dataset is given: (1,2,1), (4,4,2), (3,5,1), (1,6,2), (8,8,2), (7,9,2),
(0,4,2), (4,0,2). K-means is used with k=3 to cluster the dataset, and Euclidean
distance is used as the distance function between centroids and
objects in the dataset. The initial clusters C1, C2, and C3 are as
follows:
C1: {(1,2,1), (4,4,2), (1,6,2)}
C2: {(3,5,1), (0,4,2),(4,0,2)}
C3: {(8,8,2), (7,9,2)}
Now K-means is run for a single iteration; what are the new clusters and what are
their centroids? [5]
Centroids of the initial clusters:
C1: (2, 4, 1.667)
C2: (2.333, 3, 1.667)
C3: (7.5, 8.5, 2)
Distance Matrix
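New clusters after reassigning each object to its nearest centroid, with their centroids:
C1: {(3,5,1), (1,6,2), (0,4,2)}, centroid (1.333, 5, 1.667)
C2: {(1,2,1), (4,4,2), (4,0,2)}, centroid (3, 2, 1.667)
C3: {(8,8,2), (7,9,2)}, centroid (7.5, 8.5, 2)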
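The single K-means iteration in Q.2 can be reproduced with a few lines of Python. The sketch below uses only the data and initial clusters given in the question; the function names `centroid` and `assign` are illustrative choices, not part of the original solution.

```python
from math import dist  # Euclidean distance, Python 3.8+

points = [(1, 2, 1), (4, 4, 2), (3, 5, 1), (1, 6, 2),
          (8, 8, 2), (7, 9, 2), (0, 4, 2), (4, 0, 2)]

initial_clusters = {
    "C1": [(1, 2, 1), (4, 4, 2), (1, 6, 2)],
    "C2": [(3, 5, 1), (0, 4, 2), (4, 0, 2)],
    "C3": [(8, 8, 2), (7, 9, 2)],
}

def centroid(members):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(members)
    return tuple(sum(coords) / n for coords in zip(*members))

def assign(all_points, centroids):
    """Put every point into the cluster whose centroid is nearest."""
    clusters = {name: [] for name in centroids}
    for p in all_points:
        nearest = min(centroids, key=lambda name: dist(p, centroids[name]))
        clusters[nearest].append(p)
    return clusters

# Step 1: centroids of the initial clusters.
old_centroids = {name: centroid(m) for name, m in initial_clusters.items()}
# Step 2: reassign every object to its nearest centroid.
new_clusters = assign(points, old_centroids)
# Step 3: recompute the centroids of the new clusters.
new_centroids = {name: centroid(m) for name, m in new_clusters.items()}

for name, members in new_clusters.items():
    print(name, members, [round(c, 3) for c in new_centroids[name]])
```

Printing `dist(p, old_centroids[name])` for every point and cluster name would give the distance matrix used in the reassignment step.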
Value  Frequency
10     8
16     2
30     1
33     1
35     1
39     1
80     1
96     1
Mode: 10
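B. Calculate the standard deviation for the above dataset.
26.30 (sample standard deviation, dividing by n - 1)
C. Calculate the variance for the above dataset.
691.60 (sample variance, dividing by n - 1)
D. Calculate the 25th, 50th, and 75th percentiles for the above dataset.
25% = 10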
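These summary statistics can be checked with Python's standard `statistics` module. The sketch assumes the two-column listing above is read as value and frequency (16 observations in total) and that the sample (n - 1) formulas are intended, which is the reading that reproduces the stated variance. The 75th percentile depends on the interpolation convention, so it is printed rather than asserted.

```python
import statistics

# Assumed reading of the listing: value -> number of occurrences.
value_counts = {10: 8, 16: 2, 30: 1, 33: 1, 35: 1, 39: 1, 80: 1, 96: 1}

# Expand the frequency table into the raw list of observations.
data = [value for value, count in value_counts.items() for _ in range(count)]

mode = statistics.mode(data)
variance = statistics.variance(data)  # sample variance, n - 1 denominator
std_dev = statistics.stdev(data)      # sample standard deviation

# Quartile cut points; the default method is "exclusive", and other
# conventions can shift the 75% value in particular.
q1, q2, q3 = statistics.quantiles(data, n=4)

print("mode:", mode)
print("variance:", round(variance, 2))
print("std dev:", round(std_dev, 2))
print("25% / 50% / 75%:", q1, q2, q3)
```

Switching to `statistics.pvariance` and `statistics.pstdev` would give the population versions instead, which are smaller for this dataset.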
Q.5 (b) Prove that the mean of n1=(a+b)/2 and n2=(a-b)/2 is a/2, and the variance is a*b/4
Mean(n1,n2)=(n1+n2)/2=a/2
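Var(n1,n2) = ((n1 - a/2)^2 + (n2 - a/2)^2)/2 = ((b/2)^2 + (-b/2)^2)/2 = b^2/4
Not possible: the (population) variance is b^2/4, which equals a*b/4 only when b = 0 or a = b.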
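A quick numeric check illustrates why the variance claim in Q.5 (b) cannot be proved in general. The values of `a` and `b` below are arbitrary picks for illustration, and the population variance (dividing by n) is assumed.

```python
import statistics

def mean_and_pvariance(a, b):
    """Mean and population variance of n1 = (a+b)/2 and n2 = (a-b)/2."""
    n1 = (a + b) / 2
    n2 = (a - b) / 2
    values = [n1, n2]
    return statistics.mean(values), statistics.pvariance(values)

a, b = 6.0, 2.0  # arbitrary illustrative values
mean, var = mean_and_pvariance(a, b)

print("mean:", mean, "expected a/2 =", a / 2)
print("variance:", var, "b^2/4 =", b * b / 4, "a*b/4 =", a * b / 4)
```

For these values the mean matches a/2, while the variance matches b^2/4 rather than a*b/4.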
1. It does not assume a particular value of k, as k-means clustering requires.
2. The generated tree may correspond to a meaningful taxonomy.
3. Only a distance or "proximity" matrix is needed to compute the hierarchical clustering.
{O1} {O2} {O3} {O4} {O5}
{O1, O2} {O3} {O4} {O5}
{O1, O2, O3} {O4} {O5}
{O1, O2, O3, O5} {O4}
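The merge sequence above (O1 with O2 first, then O3, then O5, leaving O4 for last) is what an agglomerative procedure produces from a proximity matrix. The sketch below shows the mechanics with single linkage. The proximity matrix from the question is not available here, so the distances are made-up stand-ins chosen only to yield the same merge order, and single linkage itself is an assumption.

```python
from itertools import combinations

# HYPOTHETICAL pairwise distances (not from the original question).
pair_distances = {
    ("O1", "O2"): 1, ("O1", "O3"): 3, ("O1", "O4"): 9, ("O1", "O5"): 5,
    ("O2", "O3"): 2, ("O2", "O4"): 8, ("O2", "O5"): 6,
    ("O3", "O4"): 7, ("O3", "O5"): 4,
    ("O4", "O5"): 10,
}
distances = {frozenset(pair): d for pair, d in pair_distances.items()}

def single_link(c1, c2):
    """Single linkage: smallest distance between members of two clusters."""
    return min(distances[frozenset({x, y})] for x in c1 for y in c2)

def agglomerate(objects):
    """Merge the two closest clusters repeatedly; return every level."""
    clusters = [(o,) for o in objects]
    history = [sorted(clusters)]
    while len(clusters) > 1:
        left, right = min(combinations(clusters, 2),
                          key=lambda pair: single_link(*pair))
        clusters = [c for c in clusters if c not in (left, right)]
        clusters.append(tuple(sorted(left + right)))
        history.append(sorted(clusters))
    return history

history = agglomerate(["O1", "O2", "O3", "O4", "O5"])
for level in history:
    print(level)
```

Only the pairwise distances are consulted, never the objects' coordinates, which is the point made in item 3 of the list above.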