Quality of Clustering

Introduction
Basic thoughts
Cluster quality statistics
Examples
Discussion
Measurement of quality in cluster analysis

Christian Hennig
July 24, 2013
Which clustering is better?

Why datasets without known truth?
1. Introduction
IFCS task force for cluster benchmarking
(Nema Dean, Iven van Mechelen, Fritz Leisch, Doug Steinley,
Bernd Bischl, Isabelle Guyon, Christian Hennig)
Data repository for systematic comparison of quality
of different cluster analysis algorithms
In this presentation: compare quality of clusterings
based on clustering and data alone,
without reference to known truth.


(Old Faithful geyser data)
[Figure: two scatterplots with axes waiting and duration; panels: pam and mclust.]

[Figure: two scatterplots; axes xy[,1] and xy[,2].]

Which tower is better?


Benchmarking approaches:
Real datasets with known classes
Simulated datasets from mixture distributions
Datasets with intuitive classes by fiat
Real datasets without known classes
With known truth, misclassification rates or the Rand index
are (more or less) straightforward.
So why use datasets without known truth?
What's wrong with knowing the truth?

Disclaimer: knowing the truth is not evil.
There is definitely a role for datasets
with known truth in cluster benchmarking.
Measuring cluster quality
ignoring the truth can be of use
even if truth is known.
(May explain which truths a method can discover.)


But...
In datasets with known classes,
clustering is not of real scientific interest.
(Or one may want to find different clusterings.)
Such datasets deviate systematically from real clustering problems.
The fact that we know certain true classes
doesn't preclude other legitimate/true clusterings.
Classes in supervised classification problems
may not qualify as data analytic clusters.
So there could be better truths than the known one.

[Figure: 10-d vowel data (Hastie, Tibshirani and Friedman, ESL); points plotted by class label (0-9, a) on axes dc 1 and dc 2.]

Identifying mixture components with clusters
is problematic.
Mixture of two components may be unimodal.
Observations in tails may rather be outliers
than cluster members (t-distributions).
Clustering aims may deviate from
finding intuitive clusters or mixture components.

[Figure: two scatterplots; axes xy[,1] and xy[,2].]
Cluster validation indexes

General philosophy
Typical clustering aims
2. Basic thoughts
There is a range of cluster validation indexes
measuring clustering quality, such as
Average silhouette width (ASW)
(Kaufman and Rousseeuw 1990)
\[ s_w(i, C) = \frac{b(i,C) - a(i,C)}{\max(a(i,C), b(i,C))}, \]
\[ a(i, C) = \frac{1}{|C_j| - 1} \sum_{x \in C_j} d(x_i, x), \qquad
   b(i, C) = \min_{x_i \notin C_l} \frac{1}{|C_l|} \sum_{x \in C_l} d(x_i, x), \]
where C_j is the cluster containing x_i.
Maximum average s_w ⇒ good C.
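The silhouette definition above can be sketched in a few lines of Python. This is only an illustration under assumptions not stated on the slide: Euclidean distance is used for d, singleton clusters get width 0, and the function names are hypothetical. It needs at least two clusters and no cluster of identical points.

```python
from math import dist  # Euclidean distance; any dissimilarity d could be swapped in


def silhouette_widths(points, labels):
    """Return the silhouette width s_w(i, C) of every point."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    widths = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            widths.append(0.0)  # assumed convention: singletons get width 0
            continue
        # a: average distance to the other members of the own cluster
        a = sum(dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest average distance to the members of any other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        widths.append((b - a) / max(a, b))
    return widths


def average_silhouette_width(points, labels):
    """ASW: the mean of the pointwise silhouette widths."""
    widths = silhouette_widths(points, labels)
    return sum(widths) / len(widths)


# Hypothetical toy data: two tight pairs that are far apart.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
asw_good = average_silhouette_width(points, [0, 0, 1, 1])
asw_bad = average_silhouette_width(points, [0, 1, 0, 1])
```

On this toy data the clustering that keeps the tight pairs together gets the larger ASW, as the definition suggests it should.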

General philosophy
Most such indexes balance within-cluster homogeneity
against between-cluster separation.
"One size fits all" approach.
Homogeneity will always dominate here:
[Figure: two scatterplots; axes xy[,1] and xy[,2].]

There are various different aims of clustering.
Depending on application,
these aims carry different weights.
Measure them separately to characterise
what a method does best,
instead of producing a single ranking.
Can piece together overall quality
as weighted mean of separate statistics.
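As a minimal sketch of the weighted-mean idea, assuming every statistic has already been scaled so that larger values are better: the function name, statistic names, values and weights below are all hypothetical and not taken from the talk.

```python
def overall_quality(stats, weights):
    """Weighted mean of separate quality statistics (larger is better)."""
    total_weight = sum(weights[name] for name in stats)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(weights[name] * value for name, value in stats.items()) / total_weight


# Hypothetical statistics and application-specific weights.
stats = {"separation": 0.2, "homogeneity": 0.8, "entropy": 0.9}
weights = {"separation": 1.0, "homogeneity": 1.0, "entropy": 0.5}
quality = overall_quality(stats, weights)  # (0.2 + 0.8 + 0.45) / 2.5
```

Changing the weights changes which clustering comes out on top, which is the point: the weights encode the aims of the application.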

Typical clustering aims
Between-cluster separation
Within-cluster homogeneity (low distances)
Within-cluster homogeneous distributional shape
Good representation of data by centroids
Good representation of dissimilarity
by clustering-induced metric
Clusters are regions of high density
without within-cluster gaps
Uniform cluster sizes
Stability (requires knowledge of method)
E.g., pattern recognition in images
requires separation,
clustering for information reduction requires
good representation by centroids,
groups in social network analysis shouldn't have
large within-cluster gaps,
underlying true classes (biological species)
may cause homogeneous distributional shapes.
Principle of direct interpretation

Measuring between-cluster separation
Other statistics
3. Cluster quality statistics

Aim: measure all that's of interest
by statistics in [0, 1] (1 is good)
(so that different statistics are comparable
and weighted means make sense).
Principle of direct interpretation:
Aim at translating requirements directly into formulae;
that's not optimisation, not estimation of any truth.
Warning: requires bold subjective tuning decisions.
And it's work in progress.

Measuring between-cluster separation
There are several ways of measuring separation (as for other aims).
Straightforward: min distance between any two clusters,
or distance between centroids (e.g., k-means).
[Figure: Old Faithful geyser data; axes waiting and duration.]
These measure quite different concepts of separation.
(Min distance relies on only two points;
centroid distance ignores what goes on at the border.)
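The two straightforward separation measures can be contrasted with a short sketch. It assumes Euclidean distance and uses hypothetical function names and toy data: two elongated clusters that almost touch at one end.

```python
from math import dist


def min_between_distance(points, labels):
    """Smallest distance between two points that lie in different clusters."""
    return min(
        dist(p, q)
        for i, p in enumerate(points)
        for j, q in enumerate(points)
        if i < j and labels[i] != labels[j]
    )


def centroid_distance(points, labels, first, second):
    """Distance between the centroids (coordinate-wise means) of two clusters."""
    def centroid(label):
        members = [p for p, lab in zip(points, labels) if lab == label]
        return [sum(coord) / len(members) for coord in zip(*members)]
    return dist(centroid(first), centroid(second))


# Hypothetical data: two elongated clusters whose nearest members are 1 apart.
points = [(0.0, 0.0), (4.0, 0.0), (8.0, 0.0), (9.0, 0.0), (13.0, 0.0), (17.0, 0.0)]
labels = [0, 0, 0, 1, 1, 1]
closest = min_between_distance(points, labels)     # 1.0, decided by two points only
centres = centroid_distance(points, labels, 0, 1)  # 9.0, blind to the near contact
```

The minimum distance sees only the two touching points, while the centroid distance reports a wide gap that does not exist at the border.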

p-separation index:
More stable version of min distance:
Average distance to nearest point in different cluster for
p = 10% "border" points in any cluster.
(ASW averages 100% to all in neighbouring cluster.)
[Figure: Old Faithful geyser data; axes waiting and duration, with the selected points marked X.]
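One possible reading of the p-separation index is sketched below. The slide does not spell out how "border" points are selected, so this assumes that, within each cluster, they are the fraction p of points closest to any point of another cluster (rounded, at least one per cluster). Euclidean distance, the function name and the toy data are also assumptions, and no standardisation is applied.

```python
from math import dist


def p_separation_index(points, labels, p=0.1):
    """Average nearest-other-cluster distance over the assumed border points."""
    nearest_other = {}
    for x, lab in zip(points, labels):
        # distance from x to the closest point of any different cluster
        d = min(dist(x, y) for y, other in zip(points, labels) if other != lab)
        nearest_other.setdefault(lab, []).append(d)

    border = []
    for distances in nearest_other.values():
        # assumed rule: round p * cluster size, but keep at least one point
        k = max(1, int(p * len(distances) + 0.5))
        border.extend(sorted(distances)[:k])
    return sum(border) / len(border)


# Hypothetical data: two clusters on a line whose nearest members are 1 apart.
points = [(0.0,), (4.0,), (8.0,), (9.0,), (13.0,), (17.0,)]
labels = [0, 0, 0, 1, 1, 1]
sep_border = p_separation_index(points, labels, p=0.1)  # one border point per cluster
sep_all = p_separation_index(points, labels, p=1.0)     # every point counts
```

With p = 1 every point contributes, which is closer in spirit to what ASW averages over; a small p focuses on what happens at the border.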

p-separation index:
Problems: choice of p, standardisation.
May standardise by maximum distance;
range then is [0, 1], but values may be very small,
max distance may be outlying,
implicit downweighting if used in
overall quality weighted mean.
Could use average/median distance etc. and bound by 1
(separation larger than ave distance ⇒ perfect).
Probably not fully satisfactory.
May use nonlinear transformation to [0, 1]
pronouncing differences between lower values,
taking into account whether
max distance >> ave/median distance.
Stick to max distance standardisation here.
p = 0.1 intuitive; sensitivity?
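The two standardisation options mentioned above can be compared on a small example. This is a sketch with hypothetical function names and made-up distances; it assumes the raw separation value is itself no larger than the maximum pairwise distance.

```python
def standardise_by_max(separation, distances):
    """Divide by the largest dissimilarity; small values result if the maximum is outlying."""
    return separation / max(distances)


def standardise_by_average(separation, distances):
    """Divide by the average dissimilarity and bound the result by 1."""
    average = sum(distances) / len(distances)
    return min(1.0, separation / average)


# Hypothetical pairwise distances with one outlying value.
distances = [1.0, 2.0, 3.0, 4.0, 40.0]
by_max = standardise_by_max(2.0, distances)          # 2 / 40
by_average = standardise_by_average(2.0, distances)  # 2 / 10
capped = standardise_by_average(25.0, distances)     # bounded by 1
```

The single outlying distance pushes the max-standardised value towards 0, which illustrates the implicit downweighting noted on the slide.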

Alternative concept:
Distance-based knn density index
Measures whether border points have lowest density,
highest density is within clusters i.
Border points here: n_i^B points that have points
from other clusters among their k = 4 nearest neighbours,
n_i^I interior points.
Pointwise density: k/(2 × mean distance to k-nn).
[Figure: Old Faithful geyser data; axes waiting and duration, with border points marked B.]

Clusterwise density index r_i:
(mean border density)/(mean interior density),
0 if n_i^B = 0, 1 if n_i^I = 0.
Aggregation (∈ [0, 1]):
\[ I_D = 1 - \big((1-q) r_1 + q r_2\big)\, 1\big((1-q) r_1 + q r_2 \le 1\big), \]
\[ r_1 = \sum_i w_i r_i, \qquad r_2 = \frac{\bar b}{\bar i}, \qquad
   q = 0.5 - \left| \frac{\sum_i n_i^I}{n} - 0.5 \right|, \qquad
   w_i = \frac{n_i^I}{\sum_i n_i^I}. \]
Overall: \bar b mean border density, \bar i mean interior density.
Idea: r_1 measures cluster-relative density ratio,
r_2 overall.
Both of interest, but for \sum_i n_i^I ≈ n or 0,
one side of r_2 relies on very weak information.
Although r_1 downweights clusters with n_i^I small,
outlier one-point clusters still produce too good I_D.
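The border/interior split and the clusterwise ratio r_i can be sketched as follows. The sketch assumes Euclidean distance, more than k distinct points, and a hypothetical function name; it stops at the clusterwise ratios and does not attempt the aggregation into I_D.

```python
from math import dist


def knn_density_ratios(points, labels, k=4):
    """Clusterwise ratio r_i = (mean border density) / (mean interior density)."""
    border, interior = {}, {}
    for i, x in enumerate(points):
        # the k nearest neighbours of point i as (distance, index) pairs
        neighbours = sorted(
            (dist(x, y), j) for j, y in enumerate(points) if j != i
        )[:k]
        mean_knn_distance = sum(d for d, _ in neighbours) / len(neighbours)
        density = k / (2 * mean_knn_distance)
        # border point: some neighbour among the k nearest is in another cluster
        is_border = any(labels[j] != labels[i] for _, j in neighbours)
        (border if is_border else interior).setdefault(labels[i], []).append(density)

    ratios = {}
    for lab in set(labels):
        if lab not in border:
            ratios[lab] = 0.0  # no border points (n_i^B = 0)
        elif lab not in interior:
            ratios[lab] = 1.0  # no interior points (n_i^I = 0)
        else:
            mean_border = sum(border[lab]) / len(border[lab])
            mean_interior = sum(interior[lab]) / len(interior[lab])
            ratios[lab] = mean_border / mean_interior
    return ratios


# Hypothetical data: two groups of five points on a line, far apart.
points = [(float(v),) for v in (0, 1, 2, 3, 4, 100, 101, 102, 103, 104)]
labels = [0] * 5 + [1] * 5
separated = knn_density_ratios(points, labels, k=4)  # nobody has a foreign neighbour
crowded = knn_density_ratios(points, labels, k=5)    # everybody has one
```

With k = 4 every neighbour lies in the own cluster, so no cluster has border points; with k = 5 each point must reach into the other cluster, so no cluster has interior points.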

Other statistics
Within-cluster average distance
Aggregated within-cluster similarity
(Kolmogorov etc.) to normal/uniform
Within-cluster (squared) distance to centroid
Correlation (distance, cluster-induced distance) (Hubert's Γ)
Entropy of cluster sizes
Within-cluster nearest neighbour distances
coefficient of variation
Average largest within-cluster gap
All need standardisation/transformation.
Most are dissimilarity-based,
allow flexible use with non-Euclidean data,
given meaningful distance measure.
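Two of the statistics listed above are easy to sketch. The normalisation of the entropy by log K and the use of Pearson correlation for Hubert's Γ are assumptions made for this illustration, as are the function names and the Euclidean distance.

```python
from math import dist, log, sqrt


def normalised_entropy(labels):
    """Entropy of the cluster proportions divided by log(K); 1 means equal sizes."""
    n = len(labels)
    sizes = {}
    for lab in labels:
        sizes[lab] = sizes.get(lab, 0) + 1
    if len(sizes) < 2:
        return 0.0
    entropy = -sum((m / n) * log(m / n) for m in sizes.values())
    return entropy / log(len(sizes))


def hubert_gamma(points, labels):
    """Pearson correlation between pairwise distances and 0/1 'different cluster' flags."""
    d, c = [], []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d.append(dist(points[i], points[j]))
            c.append(1.0 if labels[i] != labels[j] else 0.0)
    mean_d, mean_c = sum(d) / len(d), sum(c) / len(c)
    cov = sum((x - mean_d) * (y - mean_c) for x, y in zip(d, c))
    var_d = sum((x - mean_d) ** 2 for x in d)
    var_c = sum((y - mean_c) ** 2 for y in c)
    return cov / sqrt(var_d * var_c)


# Hypothetical data: two tight pairs on a line.
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
gamma_good = hubert_gamma(points, [0, 0, 1, 1])
gamma_bad = hubert_gamma(points, [0, 1, 0, 1])
balanced = normalised_entropy([0, 0, 1, 1])
lopsided = normalised_entropy([0] * 99 + [1])
```

A clustering with one tiny cluster gets an entropy near 0, and a clustering that cuts across the tight pairs gets a negative correlation.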

Data set submission to benchmarking repository
requires filling in questionnaire, e.g.
Should clusters be similar or dissimilar in size?
Are there requirements on what should be the discriminating ground for elements to belong to different clusters? Large between-cluster dissimilarities (and, if yes, in which respect)? Separation (and, if yes, of which kind)? Other (and, if yes, what form do these requirements take)?
Are there requirements on the between-cluster heterogeneity, that is, the structure of between-cluster differences (e.g., should lie in low-dimensional space, other)?
Are there requirements on what should be the unifying/common ground for elements to belong to the same cluster? Small within-cluster dissimilarities (and, if yes, in which respect)?
Please indicate the importance of those criteria selected by filling in a numerical weight.
(Some other; not all yet formalised by indexes)
4. Examples
[Figure: two scatterplots; axes xy[,1] and xy[,2].]

               3-means  mclust-3
ave within     0.811    0.643
sep index      0.163    0.306
density index  0.460    0.876
within gap     0.927    0.949
[Figure: Old Faithful geyser data; axes waiting and duration; panels include ave.linkage, single linkage, pdfCluster (3), spectral, pam and mclust.]
            mclust  pam    spect  ave.l  sing.l  comp.l  pdf3
ave within  0.783   0.797  0.792  0.794  0.666   0.779   0.875
sep index   0.127   0.045  0.127  0.096  0.175   0.103   0.065
density     0.910   0.733  0.864  0.903  0.969   0.874   0.719
gap         0.888   0.891  0.891  0.891  0.929   0.891   0.906
coef var    0.541   0.567  0.554  0.564  0.573   0.545   0.554
gamma       0.679   0.708  0.709  0.711  0.064   0.664   0.767
normality   0.880   0.838  0.854  0.841  0.786   0.882   0.856
entropy     0.923   0.974  0.941  0.952  0.023   0.913   0.999
Weighted mean:
full weight: ave within, sep index
0.8 weight: entropy
half weight: within nn cov, gap, min separation,
density index, hubert gamma, normality, uniformity

pdf3    0.624
spect   0.622
ave.l   0.622
mclust  0.619
kmeans  0.618
comp.l  0.610
pam     0.601
sing.l  0.460
Problem with pam captured.
Single linkage: useless (entropy 0.023; one-point cluster),
good values in some indexes (careful!), bad in others.
Comparison 2-cluster vs. 3-cluster (pdfCluster):
individual indexes unfair;
ave within better, separation worse with larger k (etc.)
Depends on proper weighting.
Could add parsimony index.
mclust not best in normality,
ave.l not best in ave within!
Individual indexes may favour certain methods,
but not as obvious as it seems.
European land snails data (Hausdorf, Hennig 2003, 2006)
Presence-absence (0-1) data for species in regions;
geographical Kulczynski dissimilarity;
clustering for biotic elements (natural history),
originally clustered with mclust after MDS.
Can compare with distance-based methods.
[Figure: two scatterplots; axes datanp[,1] and datanp[,2].]

                  mclust  ave.l
ave within        0.619   0.645
gap               0.766   0.771
density index     0.503   0.852
sep index         0.055   0.126
entropy           0.929   0.717
normality (MDS)   0.805   0.781
uniformity (MDS)  0.393   0.302

mclust only better in entropy
and MDS-distribution based indexes.
Perturbed by noise cluster;
better cluster with noise component.
How to use indexes with unclustered data?
Could just ignore them, but that gives
clustering with noise an unfair advantage.
[Figure: vowel data; points plotted by label on axes dc 1 and dc 2, several panels.]

              true   11-means  spectral
ave within    0.691  0.734     0.692
sep index     0.069  0.093     0.130
hubert gamma  0.224  0.411     0.400
entropy       1.000  0.983     0.739
ARI           1.000  0.205     0.142

"true" is not a good clustering.
Ave within, entropy are only indexes
positively correlated with ARI!
Good ARI needs good ave within and nothing else here.
Use to explain results in data with known classes.
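The ARI column compares each clustering with the known classes. As a reminder of what that number measures, here is a sketch of the adjusted Rand index computed from the cross-tabulation of two labellings. The function name is hypothetical, and the sketch assumes at least two observations.

```python
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two partitions of the same observations."""
    n = len(labels_a)
    pair_counts, sizes_a, sizes_b = {}, {}, {}
    for a, b in zip(labels_a, labels_b):
        pair_counts[(a, b)] = pair_counts.get((a, b), 0) + 1
        sizes_a[a] = sizes_a.get(a, 0) + 1
        sizes_b[b] = sizes_b.get(b, 0) + 1

    # numbers of point pairs that are put together by both, by a, and by b
    together_both = sum(comb(m, 2) for m in pair_counts.values())
    together_a = sum(comb(m, 2) for m in sizes_a.values())
    together_b = sum(comb(m, 2) for m in sizes_b.values())

    expected = together_a * together_b / comb(n, 2)
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both partitions are a single cluster
    return (together_both - expected) / (maximum - expected)


# Hypothetical labellings of four observations.
same_up_to_names = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
one_split = adjusted_rand_index([0, 0, 1, 1], [0, 0, 1, 2])
```

Relabelling clusters does not change the value, and splitting one of two true classes already lowers it noticeably.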
Crabs data (2 species, m/f):
[Figure: scatterplot matrices of the variables FL, RW, CL, CW and BD; one for the spectral clustering, one for the mclust clustering.]
              true   mclust  spectral
ave within    0.761  0.828   0.908
density       0.167  0.027   0.246
hubert gamma  0.060  0.291   0.591
ARI           1.000  0.316   0.023

"true" is worst according to most indexes.
But there is a visible pattern!
All indexes (except entropy) are
negatively correlated with ARI.
mclust has best ARI out of 8 methods
but quite bad index values.
Indexes fail to capture what goes on here :-(
5. Discussion
Clustering quality is multidimensional.
Provide multidimensional evaluation,
characterising a method's behaviour.
Can aggregate criteria by weighted mean
given well justified weights.
Can use to explain performance
in data with known truth.
Designers of new methods should specify
what aspects of clustering they aim at,
so that it can be tested.
Open problems
Dependence on subjective tuning hurts
(weights, k-nn, percentage, standardisation).
But it's honest; such decisions are needed
to define a "good" clustering in practice.
Proper behaviour of criteria
(standardisation, transformation,
different numbers of clusters)
for fair aggregation and comparability?
Index choice vs. method definition
(average linkage not always optimal for ave. distance)
With given weights, optimise quality?
