K-Means: Step-By-Step Example

As a simple illustration of a k-means algorithm, consider the following data set consisting of the scores of two variables on each of seven individuals:

Subject    A      B
1          1.0    1.0
2          1.5    2.0
3          3.0    4.0
4          5.0    7.0
5          3.5    5.0
6          4.5    5.0
7          3.5    4.5

This data set is to be grouped into two clusters. As a first step in finding a sensible initial partition, let the A and B values of the two individuals furthest apart (using the Euclidean distance measure) define the initial cluster means, giving:

           Individual    Mean vector (centroid)
Group 1    1             (1.0, 1.0)
Group 2    4             (5.0, 7.0)
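
The seed selection described above can be sketched in a few lines of Python. This is only an illustration: the names `data`, `furthest_pair` and `initial_means` are made up for this example and do not come from the site's own code.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+

# Scores (A, B) of the seven individuals, keyed by subject number.
data = {
    1: (1.0, 1.0),
    2: (1.5, 2.0),
    3: (3.0, 4.0),
    4: (5.0, 7.0),
    5: (3.5, 5.0),
    6: (4.5, 5.0),
    7: (3.5, 4.5),
}

def furthest_pair(points):
    """Return the two keys whose points are furthest apart (hypothetical helper)."""
    return max(
        combinations(points, 2),
        key=lambda pair: dist(points[pair[0]], points[pair[1]]),
    )

seed_1, seed_2 = furthest_pair(data)
# The seeds' own scores become the initial cluster means.
initial_means = {1: data[seed_1], 2: data[seed_2]}
print(seed_1, seed_2, initial_means)
```

Checking every pair is fine for seven individuals; for a large data set this exhaustive search grows quadratically and a cheaper seeding rule would normally be preferred.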

The remaining individuals are now examined in sequence and allocated to the cluster to which they are closest, in terms of Euclidean distance to the cluster mean. The mean vector is recalculated each time a new member is added. This leads to the following series of steps:

        Cluster 1                                 Cluster 2
Step    Individuals    Mean vector (centroid)     Individuals    Mean vector (centroid)
1       1              (1.0, 1.0)                 4              (5.0, 7.0)
2       1, 2           (1.2, 1.5)                 4              (5.0, 7.0)
3       1, 2, 3        (1.8, 2.3)                 4              (5.0, 7.0)
4       1, 2, 3        (1.8, 2.3)                 4, 5           (4.2, 6.0)
5       1, 2, 3        (1.8, 2.3)                 4, 5, 6        (4.3, 5.7)
6       1, 2, 3        (1.8, 2.3)                 4, 5, 6, 7     (4.1, 5.4)
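
The sequential allocation in the table can be reproduced with a short Python sketch. The function names `mean` and `sequential_allocation` are hypothetical, and the seeds (individuals 1 and 4) are passed in directly rather than computed. The code keeps unrounded means, so the values it prints carry more decimals than the one-decimal figures in the table.

```python
from math import dist  # Euclidean distance, Python 3.8+

data = {
    1: (1.0, 1.0), 2: (1.5, 2.0), 3: (3.0, 4.0), 4: (5.0, 7.0),
    5: (3.5, 5.0), 6: (4.5, 5.0), 7: (3.5, 4.5),
}

def mean(members, points):
    """Mean vector (centroid) of the listed members."""
    n = len(members)
    return tuple(sum(points[m][i] for m in members) / n for i in range(2))

def sequential_allocation(points, seeds):
    """Assign each non-seed point to the nearest mean, updating that mean at once."""
    clusters = [[seed] for seed in seeds]
    means = [points[seed] for seed in seeds]
    for key in points:
        if key in seeds:
            continue
        nearest = min(range(len(means)), key=lambda c: dist(points[key], means[c]))
        clusters[nearest].append(key)
        means[nearest] = mean(clusters[nearest], points)  # recalculated after every addition
        print(key, clusters, means)
    return clusters, means

clusters, means = sequential_allocation(data, seeds=(1, 4))
```

Because each mean is updated immediately, the outcome of this pass can depend on the order in which the individuals are examined.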

Now the initial partition has changed, and the two clusters at this stage have the following characteristics:

             Individuals    Mean vector (centroid)
Cluster 1    1, 2, 3        (1.8, 2.3)
Cluster 2    4, 5, 6, 7     (4.1, 5.4)

But we cannot yet be sure that each individual has been assigned to the right cluster. So, we compare each individual's distance to its own cluster mean and to that of the opposite cluster. We find:

Individual    Distance to mean (centroid)    Distance to mean (centroid)
              of Cluster 1                   of Cluster 2
1             1.5                            5.4
2             0.4                            4.3
3             2.1                            1.8
4             5.7                            1.8
5             3.2                            0.7
6             3.8                            0.6
7             2.8                            1.1
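
A minimal Python sketch of this check follows. It assumes the rounded means (1.8, 2.3) and (4.1, 5.4) from the table, since those appear to be the values the distances were computed from; using unrounded means shifts some figures slightly. The names `assignment` and `misplaced` are illustrative only.

```python
from math import dist  # Euclidean distance, Python 3.8+

data = {
    1: (1.0, 1.0), 2: (1.5, 2.0), 3: (3.0, 4.0), 4: (5.0, 7.0),
    5: (3.5, 5.0), 6: (4.5, 5.0), 7: (3.5, 4.5),
}
# Partition after the sequential pass: individual -> cluster number.
assignment = {1: 1, 2: 1, 3: 1, 4: 2, 5: 2, 6: 2, 7: 2}
# Assumption: rounded cluster means as printed in the table.
means = {1: (1.8, 2.3), 2: (4.1, 5.4)}

misplaced = []
for key, point in data.items():
    distances = {c: dist(point, m) for c, m in means.items()}
    nearest = min(distances, key=distances.get)
    if nearest != assignment[key]:
        misplaced.append(key)  # closer to the other cluster's mean
    print(f"{key}  {distances[1]:.1f}  {distances[2]:.1f}")

print("misplaced:", misplaced)
```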

Each individual's distance to its own cluster mean should be smaller than its distance to the other cluster's mean. Only individual 3 fails this test: it is nearer to the mean of the opposite cluster (Cluster 2) than to the mean of its own (Cluster 1). Thus, individual 3 is relocated to Cluster 2, resulting in the new partition:

             Individuals      Mean vector (centroid)
Cluster 1    1, 2             (1.3, 1.5)
Cluster 2    3, 4, 5, 6, 7    (3.9, 5.1)
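
The effect of the relocation on the two means can be checked with a small Python sketch. The helper `centroid` is a hypothetical name. The exact mean of A for Cluster 1 is 1.25, which the table shows to one decimal place.

```python
data = {
    1: (1.0, 1.0), 2: (1.5, 2.0), 3: (3.0, 4.0), 4: (5.0, 7.0),
    5: (3.5, 5.0), 6: (4.5, 5.0), 7: (3.5, 4.5),
}

def centroid(members):
    """Mean vector of the listed individuals."""
    rows = [data[m] for m in members]
    return tuple(sum(axis) / len(rows) for axis in zip(*rows))

clusters = {1: [1, 2, 3], 2: [4, 5, 6, 7]}

# Relocate individual 3 from Cluster 1 to Cluster 2.
clusters[1].remove(3)
clusters[2].insert(0, 3)

new_means = {c: centroid(members) for c, members in clusters.items()}
print(clusters, new_means)
```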

The iterative relocation would now continue from this new partition until no more relocations occur. However, in this example each individual is now nearer to its own cluster mean than to that of the other cluster, so the iteration stops and the latest partition is taken as the final cluster solution.
It is also possible that the k-means algorithm will not settle on a final solution. In that case it is a good idea to stop the algorithm after a pre-chosen maximum number of iterations.
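
The relocation loop with an iteration cap might look like the following Python sketch. Note that it uses a batch variant (recompute all means, then reassign every individual at once) rather than moving one individual at a time; for this data set both reach the same partition. The function name `relocate` and the default `max_iter=100` are assumptions made for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+

data = {
    1: (1.0, 1.0), 2: (1.5, 2.0), 3: (3.0, 4.0), 4: (5.0, 7.0),
    5: (3.5, 5.0), 6: (4.5, 5.0), 7: (3.5, 4.5),
}

def relocate(points, assignment, max_iter=100):
    """Reassign points to their nearest mean until stable or max_iter (>= 1) is hit.

    Assumes every cluster keeps at least one member throughout.
    """
    assignment = dict(assignment)
    labels = sorted(set(assignment.values()))
    for iteration in range(1, max_iter + 1):
        # Recompute each cluster mean from its current members.
        means = {}
        for c in labels:
            members = [points[k] for k, a in assignment.items() if a == c]
            means[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
        # Move every point to the cluster with the nearest mean.
        new_assignment = {
            k: min(labels, key=lambda c: dist(p, means[c]))
            for k, p in points.items()
        }
        if new_assignment == assignment:
            return assignment, means, iteration  # no relocations: converged
        assignment = new_assignment
    # Cap reached; means reflect the partition before the last reassignment.
    return assignment, means, max_iter

start = {1: 1, 2: 1, 3: 1, 4: 2, 5: 2, 6: 2, 7: 2}
final, final_means, iterations = relocate(data, start)
print(final, final_means, iterations)
```

Returning the iteration count lets the caller tell a converged run from one that was cut off by the cap.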

Copyright 2009-2012 John McCullock. All Rights Reserved.