Date: Jan 30, 2008
Medians and Order Statistics
Minimum and maximum
How many comparisons are necessary to determine the minimum of a set of $n$
elements? We can easily obtain an upper bound of $n - 1$ comparisons: examine
each element of the set in turn and keep track of the smallest element seen so
far. In the following procedure, we assume that the set resides in array A, where
length[A] = n.
MINIMUM(A)
    min ← A[1]
    for i ← 2 to length[A]
        do if min > A[i]
            then min ← A[i]
    return min
Finding the minimum (or, symmetrically, the maximum) takes a single linear pass, so the order for min and max is $O(n)$.
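The MINIMUM procedure can be sketched in Python as follows. The function name `minimum_with_count` and the comparison counter are illustrative additions, included only to make the $n - 1$ bound visible.

```python
def minimum_with_count(values):
    """Return (smallest element, number of element comparisons made)."""
    if not values:
        raise ValueError("values must be non-empty")
    smallest = values[0]
    comparisons = 0
    for candidate in values[1:]:
        comparisons += 1          # one comparison per remaining element
        if smallest > candidate:
            smallest = candidate
    return smallest, comparisons


sample = [25, 4, 44, 2, 54]
print(minimum_with_count(sample))  # prints: (2, 4)
```

For five elements the loop makes four comparisons, matching the $n - 1$ upper bound.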
Mean
The mean is the average of all the numbers.
Mean = avg(A)
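Median
The median is exactly the middle of all the numbers.
The median is more stable and robust than the mean because no squared term is involved: the mean minimizes the sum of squared deviations, so a single outlier can pull it far, whereas the median minimizes the sum of absolute deviations.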
Algorithms for finding the median.
1. Simple
 a. Sort A[]
 The order for the sort is $O(n \log n)$.
 b. Median = A[size/2]
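A minimal sketch of the simple algorithm, assuming 0-based indexing (so for an odd number of elements `size // 2` is the middle position). The name `median_by_sorting` is illustrative.

```python
def median_by_sorting(values):
    """Sort a copy of the input and return the element at index size // 2."""
    ordered = sorted(values)            # O(n log n)
    return ordered[len(ordered) // 2]   # middle element for odd sizes


print(median_by_sorting([9, 87, 21, 26, 47]))  # prints: 26
```

For an even number of elements this returns the upper of the two middle values; averaging the two would be a different convention.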
2. Randomized algorithm
Input: a1, a2, ..., an
step 1: Order (form groups)
step 2: Find the median in each group
step 3: Find the median of the medians (m1, m2, m3, ..., m5) = M
step 4: Partition A by M into L and R
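Let $k = n/2$ be the rank of the median.
If $|L| = n/2 - 1$, then $k = n/2$ and the middle term M is the median.
If $|L| < n/2 - 1$, then recurse with median(R, k - |L| - 1), since the $|L|$ elements of L and M itself are discarded.
If $|L| \ge n/2$, then recurse with median(L, k).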
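Steps 1 to 4 and the three cases can be sketched as below. This is a minimal sketch under stated assumptions rather than the lecture's exact procedure: the pivot is chosen deterministically as the median of the group medians, `k` is a 1-based rank, and elements equal to the pivot are collected in a separate `equal` list (an addition that keeps duplicates from stalling the recursion). The names `select` and `median` are illustrative.

```python
def select(values, k):
    """Return the k-th smallest element of values (k is 1-based)."""
    if len(values) <= 5:
        return sorted(values)[k - 1]
    # Steps 1 and 2: form groups of 5 and take the median of each group.
    groups = [values[i:i + 5] for i in range(0, len(values), 5)]
    medians = [sorted(g)[(len(g) - 1) // 2] for g in groups]
    # Step 3: median of the medians.
    pivot = select(medians, (len(medians) + 1) // 2)
    # Step 4: partition by the pivot.
    left = [v for v in values if v < pivot]
    equal = [v for v in values if v == pivot]
    right = [v for v in values if v > pivot]
    if k <= len(left):
        return select(left, k)
    if k <= len(left) + len(equal):
        return pivot
    return select(right, k - len(left) - len(equal))


def median(values):
    """Middle element for odd sizes, lower middle for even sizes."""
    return select(values, (len(values) + 1) // 2)


A = [2, 54, 44, 4, 25, 5, 5, 32, 18, 39, 9, 87, 21, 26, 47,
     19, 9, 13, 16, 56, 24, 10, 2, 19, 71]
print(median(A))  # prints: 19
```

Building new lists keeps the sketch short; an in-place partition would avoid the extra memory.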
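Example:
Consider A[] = {2, 54, 44, 4, 25, 5, 5, 32, 18, 39, 9, 87, 21, 26, 47, 19, 9, 13, 16, 56, 24, 10, 2, 19, 71}
Form groups of 5 numbers:
 2  54  44   4  25
 5   5  32  18  39
 9  87  21  26  47
19   9  13  16  56
24  10   2  19  71
Find the median of each row:
 2  54  44   4  25   ->  25
 5   5  32  18  39   ->  18
 9  87  21  26  47   ->  26
19   9  13  16  56   ->  16
24  10   2  19  71   ->  19
Find the median of the medians:
Median of 25, 18, 26, 16, 19 = 19
Therefore M = 19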
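Partition A[] by M. In each step below, the elements $\le M$ seen so far are to the left of the bar and the elements $> M$ seen so far are to the right:
2 54 44 4 25 5 5 32 18 39 9 87 21 26 47 19 9 13 16 56 24 10 2 19 71
2 | 54
2 4 | 44 54 25
2 4 5 | 54 25 44
2 4 5 5 | 25 44 54 32
2 4 5 5 18 | 44 54 32 25 39
2 4 5 5 18 9 | 54 32 25 39 44 87 21 26 47
2 4 5 5 18 9 19 | 32 25 39 44 87 21 26 47 54
2 4 5 5 18 9 19 9 | 25 39 44 87 21 26 47 54 32
2 4 5 5 18 9 19 9 13 | 39 44 87 21 26 47 54 32 25
2 4 5 5 18 9 19 9 13 16 | 44 87 21 26 47 54 32 25 39 56 24
2 4 5 5 18 9 19 9 13 16 10 | 87 21 26 47 54 32 25 39 56 24 44
2 4 5 5 18 9 19 9 13 16 10 2 | 21 26 47 54 32 25 39 56 24 44 87
2 4 5 5 18 9 19 9 13 16 10 2 19 | 26 47 54 32 25 39 56 24 44 87 21 71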
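The left part holds 13 of the 25 elements and its largest value is M = 19, so M is the middle (13th) term and the median is 19.
This example shows that if there are $k$ elements equal to the pivot, these $k$ elements are not necessarily adjacent in the final result (here the two 19s end up in positions 7 and 13).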
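The partition in this example behaves like a single left-to-right scan that swaps every element $\le M$ to the end of the left part. The sketch below is one such scan, written under the assumption that this is the scheme the example follows; the name `partition_by_value` is illustrative.

```python
def partition_by_value(values, pivot):
    """Move elements <= pivot to the front, in place; return how many there are."""
    boundary = 0  # values[:boundary] holds the elements <= pivot seen so far
    for j in range(len(values)):
        if values[j] <= pivot:
            values[boundary], values[j] = values[j], values[boundary]
            boundary += 1
    return boundary


A = [2, 54, 44, 4, 25, 5, 5, 32, 18, 39, 9, 87, 21, 26, 47,
     19, 9, 13, 16, 56, 24, 10, 2, 19, 71]
split = partition_by_value(A, 19)
print(A[:split])  # prints: [2, 4, 5, 5, 18, 9, 19, 9, 13, 16, 10, 2, 19]
print(A[split:])  # prints: [26, 47, 54, 32, 25, 39, 56, 24, 44, 87, 21, 71]
```

Because equal elements are treated like smaller ones, the two copies of 19 land wherever the scan happens to place them, which is why they are not adjacent.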
Date: Feb 4th, 2008
Runtime analysis of QuickSort:
$T(n) = T(n/10) + T(9n/10) + cn = O(n \log_2 n)$
Average Case:
The split of the subarrays in the average case would be $n/10$ and $9n/10$.
Best Case: (LUCKY)
For the best case, the array will be split exactly in the middle, i.e. the pivot element will be exactly in the
middle of the array.
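For the best case, the recurrence can be worked out as below. The notes do not write this recurrence down; the sketch assumes the same partition cost $cn$ as in the other recurrences and that $n$ is a power of 2.

```latex
\begin{aligned}
T(n) &= 2\,T(n/2) + cn \\
     &= 4\,T(n/4) + 2cn \\
     &= 2^{k}\,T(n/2^{k}) + kcn \\
     &= n\,T(1) + cn\log_2 n \qquad (k = \log_2 n) \\
     &= O(n \log_2 n)
\end{aligned}
```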
Runtime Analysis of Median Find:
Let's take an example.
Here the entire array is split into data chunks of size 5, i.e. $n/5$ chunks, one chunk per column:
 2  54  44   4  25
 5   5  32  18  39
 9  87  21  26  47
19   9  13  16  56
24  10   2  19  71
After sorting each and every column, we will get
 2   5   2   4  25
 5   9  13  16  39
 9  10  21  18  47
19  54  32  19  56
24  87  44  26  71
In the above figure, the middle row (circled) holds the median of each column.
Now we need to find the median of medians, so the middle row needs to be sorted. After sorting the
middle row, its middle element will be the median of medians.
 2   5   4   2  25
 5   9  16  13  39
 9  10  18  21  47
19  54  19  32  56
24  87  26  44  71
18 is the median of medians. In the above figure, column 3 and column 4 have been swapped.
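The column steps of this example (sort each column, read off the middle row, order the columns by their medians) can be reproduced with a short sketch. The variable names are illustrative; the groups are the five columns of the example.

```python
groups = [
    [2, 5, 9, 19, 24],
    [54, 5, 87, 9, 10],
    [44, 32, 21, 13, 2],
    [4, 18, 26, 16, 19],
    [25, 39, 47, 56, 71],
]

# Sort every column, then read the medians from the middle position.
sorted_columns = [sorted(group) for group in groups]
medians = [column[len(column) // 2] for column in sorted_columns]
print(medians)  # prints: [9, 10, 21, 18, 47]

# Order the columns by their medians; the middle column's median is M.
ordered = sorted(sorted_columns, key=lambda column: column[len(column) // 2])
median_of_medians = ordered[len(ordered) // 2][2]
print(median_of_medians)  # prints: 18

# Print the rearranged grid row by row.
for row in zip(*ordered):
    print(row)
```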
Generally the partition will be like
[Figure: the grid above with its $n/5$ columns, the median of medians 18, two boxed regions, and the labels $3n/10 \le L$ and $3n/10 \le R < 7n/10$.]
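The actual array is split into 10 half-lines (each of the 5 columns is cut in two at its median), which are shown in the boxes in the diagram above. Each box covers 3 of these half-lines, i.e. about $3n/10$ elements. The black box holds the elements that are definitely no greater than the median of medians. The brown box holds the elements that are definitely no smaller than the median of medians. After partitioning, the recursive call therefore sees at most $7n/10$ elements, and finding the median of the $n/5$ medians costs $T(n/5)$:
$T(n) \le T(n/5) + T(7n/10) + cn$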
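A quick check of the $3n/10$ guarantee on the example data, as a sketch: it counts how many elements are no greater than, and no smaller than, the median of medians $M = 18$. The variable names are illustrative.

```python
groups = [
    [2, 5, 9, 19, 24],
    [54, 5, 87, 9, 10],
    [44, 32, 21, 13, 2],
    [4, 18, 26, 16, 19],
    [25, 39, 47, 56, 71],
]
values = [v for group in groups for v in group]
n = len(values)
M = 18  # median of medians from the example

not_greater = sum(v <= M for v in values)   # candidates for the left side
not_smaller = sum(v >= M for v in values)   # candidates for the right side
print(not_greater, not_smaller, 3 * n / 10)  # prints: 11 15 7.5
```

Both counts exceed $3n/10 = 7.5$, so neither side of the partition can hold more than about $7n/10$ of the elements.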
We are going to solve two recurrence relations in this class:
1. For Quicksort
 $T(n) = T(n/10) + T(9n/10) + cn$
 We need to prove this is of order $O(n \log n)$.
2. For median finding
 $T(n) = T(n/5) + T(7n/10) + cn$
 We need to prove this is of order $O(n)$.
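Recurrence 1 (QuickSort)
$$
\begin{aligned}
T(n) &= T\left(\tfrac{1}{10}n\right) + T\left(\tfrac{9}{10}n\right) + cn \\
&= T\left(\left(\tfrac{1}{10}\right)^{2} n\right) + T\left(\tfrac{9}{10}\cdot\tfrac{1}{10}\,n\right) + \tfrac{cn}{10}
 + T\left(\tfrac{1}{10}\cdot\tfrac{9}{10}\,n\right) + T\left(\left(\tfrac{9}{10}\right)^{2} n\right) + \tfrac{9cn}{10} + cn \\
&= T\left(\left(\tfrac{1}{10}\right)^{2} n\right) + 2\,T\left(\tfrac{9}{10}\cdot\tfrac{1}{10}\,n\right) + T\left(\left(\tfrac{9}{10}\right)^{2} n\right) + 2cn
\end{aligned}
$$
Generalizing the above equation to $k$ expansions, we get
$$
T(n) = T\left(\left(\tfrac{1}{10}\right)^{k} n\right)
 + \binom{k}{1} T\left(\left(\tfrac{1}{10}\right)^{k-1}\tfrac{9}{10}\,n\right)
 + \binom{k}{2} T\left(\left(\tfrac{1}{10}\right)^{k-2}\left(\tfrac{9}{10}\right)^{2} n\right)
 + \dots
 + \binom{k}{1} T\left(\tfrac{1}{10}\left(\tfrac{9}{10}\right)^{k-1} n\right)
 + \binom{k}{0} T\left(\left(\tfrac{9}{10}\right)^{k} n\right)
 + kcn
$$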
The last $T$ term, the one with coefficient $\binom{k}{0}$, is the largest $T$ term in the above equation: $(9/10)^k$ is the largest factor multiplying $n$, and $(1/10)^k$ is the smallest. Assuming $T$ is nondecreasing, every other $T$ term is therefore at most $T((9/10)^k n)$.
We need to reduce the above equation to closed form, i.e. down to $T(1)$.
Therefore we set
$(9/10)^k n = 1$
$n = (10/9)^k$
$\log n = k \log(10/9)$
$k = \log n / \log(10/9)$
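Substituting the value of $k$ and taking the largest factor outside in the above equation, we get (with logarithms to base 2)
$$
\begin{aligned}
T(n) &\le T\left(\left(\tfrac{9}{10}\right)^{k} n\right)\left[1 + \binom{k}{1} + \binom{k}{2} + \dots + \binom{k}{1} + 1\right] + kcn \\
&\le T\left(\left(\tfrac{9}{10}\right)^{k} n\right) 2^{k} + kcn \qquad \text{(using the Binomial Theorem)} \\
&\le T(1)\, 2^{\log n / \log(10/9)} + \frac{cn \log n}{\log(10/9)} \\
&\le T(1) \left[2^{\log n}\right]^{1/\log(10/9)} + \frac{cn \log n}{\log(10/9)} \\
&\le T(1)\, n^{1/\log(10/9)} + \frac{cn \log n}{\log(10/9)}
\end{aligned}
$$
The factor $2^k$ overcounts the subproblems of size 1: the subproblem sizes at every level of the recursion add up to at most $n$, so there are at most $n$ such subproblems and they contribute only $O(n)$. The second term dominates.
Hence $T(n) = O(n \log_2 n)$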
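As a numerical sanity check of Recurrence 1, the sketch below evaluates the recurrence with integer subproblem sizes and compares it with the $kcn$ term, $k = \log n / \log(10/9)$. The choices $c = 1$, $T(1) = 0$, the rounding of $n/10$, and the names `quicksort_cost` and `derived_bound` are assumptions made for illustration.

```python
import math
from functools import lru_cache


@lru_cache(maxsize=None)
def quicksort_cost(n):
    """Evaluate T(n) = T(n/10) + T(9n/10) + n with integer sizes, c = 1, T(1) = 0."""
    if n <= 1:
        return 0
    small = max(1, n // 10)  # keep both parts non-empty so the recursion shrinks
    return quicksort_cost(small) + quicksort_cost(n - small) + n


def derived_bound(n):
    """The k*c*n term of the derivation, with k = log n / log(10/9) and c = 1."""
    return n * math.log2(n) / math.log2(10 / 9)


for n in (10**3, 10**4, 10**5, 10**6):
    ratio = quicksort_cost(n) / (n * math.log2(n))
    print(n, round(ratio, 3), quicksort_cost(n) <= derived_bound(n))
```

The printed ratio $T(n) / (n \log_2 n)$ stays bounded as $n$ grows, which is what $O(n \log n)$ predicts.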
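Recurrence 2 (Median Finding)
$$
\begin{aligned}
T(n) &= T\left(\tfrac{1}{5}n\right) + T\left(\tfrac{7}{10}n\right) + cn \\
&= T\left(\left(\tfrac{1}{5}\right)^{2} n\right) + T\left(\tfrac{1}{5}\cdot\tfrac{7}{10}\,n\right) + \tfrac{cn}{5}
 + T\left(\tfrac{7}{10}\cdot\tfrac{1}{5}\,n\right) + T\left(\left(\tfrac{7}{10}\right)^{2} n\right) + \tfrac{7cn}{10} + cn \\
&= T\left(\left(\tfrac{1}{5}\right)^{2} n\right) + 2\,T\left(\tfrac{1}{5}\cdot\tfrac{7}{10}\,n\right) + T\left(\left(\tfrac{7}{10}\right)^{2} n\right) + \left(\tfrac{9}{10} + 1\right)cn \\
&= T\left(\left(\tfrac{1}{5}\right)^{k} n\right)
 + \binom{k}{1} T\left(\left(\tfrac{1}{5}\right)^{k-1}\tfrac{7}{10}\,n\right)
 + \binom{k}{2} T\left(\left(\tfrac{1}{5}\right)^{k-2}\left(\tfrac{7}{10}\right)^{2} n\right)
 + \dots \\
&\quad + \binom{k}{1} T\left(\tfrac{1}{5}\left(\tfrac{7}{10}\right)^{k-1} n\right)
 + \binom{k}{0} T\left(\left(\tfrac{7}{10}\right)^{k} n\right)
 + \left[\left(\tfrac{9}{10}\right)^{k-1} + \left(\tfrac{9}{10}\right)^{k-2} + \dots + 1\right]cn
\end{aligned}
$$
The bracket is a geometric series with $k$ terms:
$$
\left[\left(\tfrac{9}{10}\right)^{k-1} + \dots + 1\right]cn = \frac{1 - (9/10)^{k}}{1 - 9/10}\,cn
$$
To reduce the above equation to closed form, we set
$(7/10)^k n = 1$
$n = (10/7)^k$
Taking the log on both sides,
$\log n = k \log(10/7)$
$k = \log n / \log(10/7)$
Substituting the value of $k$ in the above equation, we get
$$
T(n) \le T(1)\, n^{1/\log(10/7)} + \frac{1 - (9/10)^{\log n / \log(10/7)}}{1 - 9/10}\,cn
$$
9/10 is less than 1, so as $\log n / \log(10/7)$ grows higher and higher, the power $(9/10)^{\log n / \log(10/7)}$ grows smaller and smaller and can be neglected. The first term comes from bounding the number of size-1 subproblems by $2^k$; since the subproblem sizes at every level add up to at most $n$, there are in fact at most $n$ of them, and they contribute only $O(n)$.
Therefore $T(n) = O\left(\frac{cn}{1 - 9/10}\right) = O(10cn) = O(n)$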
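The $10cn$ bound can be checked numerically. The sketch below evaluates the median-finding recurrence with integer subproblem sizes; $c = 1$, $T(n) = 0$ for $n \le 1$, and the name `select_cost` are assumptions made for illustration. With these choices the bound also follows by induction: $10(n/5) + 10(7n/10) + n = 10n$.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def select_cost(n):
    """Evaluate T(n) = T(n/5) + T(7n/10) + n with integer sizes, c = 1."""
    if n <= 1:
        return 0
    return select_cost(n // 5) + select_cost(7 * n // 10) + n


for n in (10, 1000, 10**6):
    print(n, select_cost(n), select_cost(n) / n)
```

The last column, $T(n)/n$, never exceeds 10.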
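Normally we partition the array into groups of 5, i.e. $n/5$ groups.
What if we partition the array into groups of 3, i.e. $n/3$ groups? (Sorting will be easier for 3 elements.)
[Figure: the elements arranged in $n/3$ columns of 3.]
There are 6 half-lines in total in the above diagram.
$\frac{2}{6}n$ elements are guaranteed to be lower than the median of medians Y. The other part is $\left(1 - \frac{2}{6}\right)n = \frac{4}{6}n = \frac{2}{3}n$.
So the runtime analysis gives
$T(n) \le T(n/3) + T(2n/3) + O(n)$
Adding the fractions, $n/3 + 2n/3 = n$, i.e. the total subproblem size does not decrease from one level to the next.
Theoretically, for groups of 3, the order is $O(n \log n)$.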
What if we partition by 7, i.e. $n/7$ groups?
[Figure: a grid of elements X arranged in $n/7$ columns of 7, with the median of medians M at the center.]
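There are 14 half-lines in total, out of which 4 are guaranteed.
$\frac{4}{14}n = \frac{2}{7}n$
The other part is $\left(1 - \frac{4}{14}\right)n = \frac{10}{14}n = \frac{5}{7}n$.
$T(n) \le T(n/7) + T(5n/7) + cn$
Theoretically $n/7 + 5n/7 = 6n/7$, which is $< n$. So the total subproblem size decreases from level to level and the
order is $O(n)$.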
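So the recurrence relations are
$$
\begin{aligned}
T_{g=7}(n) &\le T(1)\, n^{1/\log(7/5)} + \frac{Cn}{1 - 6/7} \\
&\le T(1)\, n^{1/\log(7/5)} + 7Cn \\
T_{g=5}(n) &\le T(1)\, n^{1/\log(10/7)} + \frac{Cn}{1 - 9/10} \\
&\le T(1)\, n^{1/\log(10/7)} + 10Cn
\end{aligned}
$$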
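The three group sizes can be compared numerically. The sketch below evaluates the recurrences for groups of 3, 5 and 7 with integer subproblem sizes; $C = 1$, $T(n) = 0$ for $n \le 1$, and the names `LARGER_PART` and `cost` are assumptions made for illustration.

```python
from functools import lru_cache

# (numerator, denominator) of the larger recursive call for each group size,
# taken from the three recurrences in these notes.
LARGER_PART = {3: (2, 3), 5: (7, 10), 7: (5, 7)}


@lru_cache(maxsize=None)
def cost(n, group):
    """T(n) = T(n/group) + T(fraction * n) + n with integer sizes, C = 1."""
    if n <= 1:
        return 0
    num, den = LARGER_PART[group]
    return cost(n // group, group) + cost(num * n // den, group) + n


for n in (10**3, 10**4, 10**5, 10**6):
    print(n, [round(cost(n, g) / n, 2) for g in (3, 5, 7)])
```

Each printed row shows $T(n)/n$ for groups of 3, 5 and 7. The first value keeps growing with $n$, in line with $O(n \log n)$, while the other two stay below 10 and 7 respectively, in line with the $10Cn$ and $7Cn$ terms.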