CS521 CSE IITG 11/23/2012

A Sahu

Multiprocessors are likely to be cost/power-effective solutions
  Because they share lots of resources
  A personal room is costlier than a dormitory
Sharing resources gives rise to many other problems
Coherence (part of the ACA course @ IITG)
  Shared data should be the same at all places
Critical sections
  Lock and barrier design
Consistency
  Order should be similar to serial (ROB)
One processor interfering with others
  Share efficiently using some policy

A Sahu, slide 3

Processes run on different processors independently
At some point they need to know the status of each other for
  communication, mutual exclusion, etc.
Simple locking
  Lock(L)
  Critical Section(C)
  Unlock(L)

A Sahu, slide 4

Simple Locking
lock:   ld  reg, loc    // copy location to reg
        cmp reg, #0     // compare with 0
        bnz lock        // if not zero, try again
        st  loc, #1     // store 1 at loc to mark it locked
        return
unlock: st  loc, #0
        return

Suppose two processors are contending to acquire the lock
Both read the lock at the same time, see the value 0, and pass the branch
Both lock the variable and enter the CS
This contradicts the meaning of a LOCK

A Sahu, slide 5

The load/compare/store sequence is not atomic
A hardware primitive for atomic read+write is required, e.g.
  Test&Set          // test for unlocked (0), then set the lock (1)
  Exchange
  Fetch&Increment

A Sahu, slide 6


Lock: 0 indicates free and 1 indicates locked
Code to lock X:
        r2 <- 1
lockit: r2 <-> X              ; atomic exchange
        if (r2 != 0) lockit   ; already locked
Spinning: try to test and acquire the lock in a tight loop
Locks are cached for efficiency; coherence is used
Better code to lock X:
lockit: r2 <- X               ; read lock
        if (r2 != 0) lockit   ; not available
        r2 <- 1
        r2 <-> X              ; atomic exchange
        if (r2 != 0) lockit   ; already locked

A Sahu, slide 7

Load linked / Store conditional (in time order)
  LL r1, X      reads from location X
  ...           // do some operation
  SC r3, X      stores r3 to location X; if unsuccessful, r3 == 0
The store will be unsuccessful if the value of X is
altered/changed by another processor between
the time of the LL and the time of the SC

A Sahu, slide 8
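The "better code to lock X" (spin on a plain read, then try the atomic exchange) can be sketched with threads. This is a minimal illustration under assumptions: `ExchangeCell` is a hypothetical class that models the hardware exchange with a hidden mutex, and `time.sleep(0)` is added only so that spinning Python threads yield to each other.

```python
import threading
import time

class ExchangeCell:
    """A memory word with an atomic exchange (atomicity modelled by a mutex)."""
    def __init__(self, value=0):
        self._value = value
        self._guard = threading.Lock()    # stands in for the hardware primitive

    def read(self):
        return self._value

    def exchange(self, new):
        with self._guard:
            old, self._value = self._value, new
            return old

def lock(x):
    while True:
        while x.read() != 0:              # r2 <- X: spin on a read only
            time.sleep(0)
        if x.exchange(1) == 0:            # r2 <-> X: lock looked free, try to take it
            return

def unlock(x):
    x.exchange(0)

def count_with_lock(num_threads=4, iterations=200):
    """Increment a shared counter inside the critical section; return the total."""
    x = ExchangeCell()
    total = [0]
    def worker():
        for _ in range(iterations):
            lock(x)
            total[0] += 1                 # critical section
            unlock(x)
    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]
```

Spinning on the read first means waiting processors hit their cached copy and only issue the exchange when the lock looks free.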

Simpler to implement
Atomic exchange using LL and SC
try:    r3 <- r2              ; move exchange value
        LL r1, X              ; load linked
        SC r3, X              ; store conditional
        if (r3 = 0) try       ; branch if store fails
        r2 <- r1              ; put loaded value in r2
Fetch&increment using LL and SC
try:    LL r1, X              ; load linked
        r3 <- r1 + 1          ; increment
        SC r3, X              ; store conditional
        if (r3 = 0) try       ; branch if store fails

A Sahu, slide 9

lockit: LL r2, X              ; load linked
        if (r2 != 0) lockit   ; not available
        r2 <- 1
        SC r2, X              ; store conditional
        if (r2 = 0) lockit    ; branch if store fails

Spin lock with exponential backoff reduces contention

A Sahu, slide 10
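The LL/SC retry loops above can be tried out against a small software model. `LLSCCell` is a hypothetical class written for this sketch: it tracks which processors hold a valid link and breaks every link on a successful store. Real hardware tracks links per cache block and may also fail an SC spuriously.

```python
import threading

class LLSCCell:
    """Model of one memory word with load-linked / store-conditional."""
    def __init__(self, value=0):
        self.value = value
        self._links = set()                 # processors whose link is still valid
        self._guard = threading.Lock()      # stands in for hardware atomicity

    def ll(self, pid):
        with self._guard:
            self._links.add(pid)
            return self.value

    def sc(self, pid, new):
        """Return 1 on success, 0 if the word was written since this pid's LL."""
        with self._guard:
            if pid not in self._links:
                return 0
            self.value = new
            self._links.clear()             # a successful store breaks all links
            return 1

def fetch_and_increment(x, pid):
    while True:
        r1 = x.ll(pid)                      # LL r1, X
        if x.sc(pid, r1 + 1):               # SC r3, X with r3 = r1 + 1
            return r1                       # retry if the store failed

def atomic_exchange(x, pid, r2):
    while True:
        r1 = x.ll(pid)                      # LL r1, X
        if x.sc(pid, r2):                   # SC r3, X with r3 = r2
            return r1                       # r2 <- r1: the old value
```

A store conditional issued after another processor's successful store returns 0, which is what sends the loop back to `try`.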

Spinning wastes time
Recall the MAC protocol
  Non-persistent CSMA protocol
  Wait a random time if the medium is busy, then send
Spin lock with exponential backoff reduces contention
  Wait k amount of time for the 1st attempt
  Wait k*c^i amount of time for the ith attempt
[Figure: time vs. number of threads for the TAS lock and the backoff lock]

A Sahu, slide 11

DoWork(){
    Do Phase I work
    Barrier();
    Do Phase II work
    Barrier();
    Do Phase III work
    Barrier();
}
for (i=0; i<NumProc; i++){
    P[i].Start(DoWork);
}
Print result();
[Figure: Initialize, then Phase I, Phase II and Phase III, each run by threads 1, 2, 3, ..., N, then Print Result]

A Sahu, slide 12
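The backoff schedule k*c^i can be sketched as follows. Several choices here are assumptions made for the example: attempts are numbered from 0 so that the first wait is k, the delay is capped at `cap`, and `threading.Lock.acquire(blocking=False)` stands in for a hardware test-and-set. The names `backoff_delays`, `BackoffLock` and `count_with_backoff` are hypothetical.

```python
import threading
import time

def backoff_delays(k, c, attempts):
    """Wait before retry i (counting from 0) is k * c**i."""
    return [k * c**i for i in range(attempts)]

class BackoffLock:
    def __init__(self, k=1e-5, c=2, cap=1e-3):
        self._flag = threading.Lock()      # non-blocking acquire acts as test-and-set
        self.k, self.c, self.cap = k, c, cap

    def acquire(self):
        delay = self.k
        while not self._flag.acquire(blocking=False):
            time.sleep(delay)              # back off instead of hammering the lock
            delay = min(delay * self.c, self.cap)

    def release(self):
        self._flag.release()

def count_with_backoff(num_threads=4, iterations=100):
    """Increment a shared counter under the backoff lock; return the total."""
    lk = BackoffLock()
    total = [0]
    def worker():
        for _ in range(iterations):
            lk.acquire()
            total[0] += 1                  # critical section
            lk.release()
    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]
```

A random component in the delay, as in the CSMA analogy, would further reduce the chance that two waiters retry at the same instant. It is left out here to keep the sketch short.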


Barrier(){
    lock(X)
    if (count = 0) release <- 0
    count++
    unlock(X)
    if (count = total) {count <- 0; release <- 1}
    else spin until (release == 1)
}
If all processors have passed/completed, then they go on to the next phase

A Sahu, slide 13

Sum of all finished processors
[Figure: one Lock shared by P1, P2, P3, ..., PN: everyone accesses the same lock]
[Figure: Lock at the root, Lock1 and Lock2 below it, each shared by P1, ..., PM: lock contention is distributed in a tree fashion]

A Sahu, slide 14

Do Phase I work
Barrier(bar1, p);
// After this, release = 1, but this may not yet be visible to all:
// one process may not have seen it and is still waiting
// while another has already entered the NEXT barrier
Do Phase II work
Barrier(bar1, p);
// Some will enter this barrier but not all, so the
// barrier will never end.

A Sahu, slide 15

local_sense <- !local_sense    // toggle
lock(X)
count++
unlock(X)
if (count = total)
    {count <- 0; release <- local_sense}
else
    {spin until (release == local_sense)}

A Sahu, slide 16
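The sense-reversing barrier can be exercised across several phases with a short sketch. It differs from the slide in one labeled way: the `count == total` test is made while the lock is still held, which avoids reading `count` outside the lock. `SenseBarrier` and `run_phases` are hypothetical names, and `time.sleep(0)` is there only so spinning Python threads yield.

```python
import threading
import time

class SenseBarrier:
    def __init__(self, total):
        self.total = total
        self.count = 0
        self.release = False
        self._x = threading.Lock()
        self._local = threading.local()       # holds each thread's local_sense

    def wait(self):
        sense = not getattr(self._local, "sense", False)   # toggle local_sense
        self._local.sense = sense
        with self._x:
            self.count += 1
            last = (self.count == self.total)  # assumption: tested under the lock
            if last:
                self.count = 0                 # reset for the next use
        if last:
            self.release = sense               # let everyone in this episode go
        else:
            while self.release != sense:       # spin until the last arrival
                time.sleep(0)

def run_phases(num_proc=4, phases=3):
    """Each thread logs its phase number, then waits at the barrier."""
    bar = SenseBarrier(num_proc)
    log = []
    def do_work():
        for phase in range(1, phases + 1):
            log.append(phase)                  # "Do Phase <phase> work"
            bar.wait()
    threads = [threading.Thread(target=do_work) for _ in range(num_proc)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

Because the release value alternates between episodes, a slow thread still waiting in one barrier cannot be confused by a fast thread that has already entered the next one. If the barrier works, the log contains all phase 1 entries before any phase 2 entry, and so on.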

When must a processor see the value that has been written by another processor?
Atomicity of operations system-wide?
Can memory operations be re-ordered?
Various models: highly recommended by the HP book
http://rsim.cs.uiuc.edu/~sadve/Publications/models_tutorial.ps

A Sahu, slide 17

P1: A = 0              P2: B = 0
    ...                    ...
    A = 1                  B = 1
L1: if (B = 0) S1      L2: if (A = 0) S2

Which statements among S1 and S2 are done?
Both S1 and S2 may be done if writes are delayed

A Sahu, slide 18
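The effect of delayed writes on the S1/S2 example can be reproduced with a toy model. The sketch assumes one particular schedule (both writes issued, then both tests, then the buffers drained) and a per-processor write buffer. The function name `run_example` is hypothetical.

```python
def run_example(delay_writes):
    """Return the list of statements (S1, S2) that get executed."""
    memory = {"A": 0, "B": 0}
    buffers = {"P1": [], "P2": []}

    def write(p, var, val):
        if delay_writes:
            buffers[p].append((var, val))     # write waits in p's buffer
        else:
            memory[var] = val                 # write reaches memory at once

    def read(p, var):
        for v, val in reversed(buffers[p]):   # a processor sees its own pending writes
            if v == var:
                return val
        return memory[var]

    def drain(p):
        for var, val in buffers[p]:
            memory[var] = val
        buffers[p].clear()

    done = []
    write("P1", "A", 1)                       # P1: A = 1
    write("P2", "B", 1)                       # P2: B = 1
    if read("P1", "B") == 0:                  # L1: if (B = 0) S1
        done.append("S1")
    if read("P2", "A") == 0:                  # L2: if (A = 0) S2
        done.append("S2")
    drain("P1")
    drain("P2")
    return done
```

With `delay_writes=True` both tests read the old value 0, so S1 and S2 both run. With `delay_writes=False` this schedule runs neither.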


S1: X = 10    S2: Y = 10
L1: R1 = Y    L2: R2 = X

Possible orders (time runs downward):
S1: X = 10    S2: Y = 10    S1: X = 10    L1: R1 = Y
L1: R1 = Y    L2: R2 = X    S2: Y = 10    L2: R2 = X
S2: Y = 10    S1: X = 10    L1: R1 = Y    S1: X = 10
L2: R2 = X    L1: R1 = Y    L2: R2 = X    S2: Y = 10
SC            SC            SC            not SC
In the last order the loads are preferred over the stores (the stores are buffered)

A Sahu, slide 19

Sequential consistency (SC)
The result of any execution is the same as if the
operations of all processors were executed in
some sequential order
Operations of each processor occur in the
order specified by its program
It requires all memory operations to be atomic
  Too restrictive, high overheads

A Sahu, slide 20
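The sequentially consistent orders of this example can be enumerated mechanically. The sketch assumes X and Y are initially 0, which the slide does not state. `interleavings` and `sc_outcomes` are hypothetical names.

```python
def interleavings(a, b):
    """All merges of a and b that keep each sequence's own (program) order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

# Each operation is (kind, variable, value-or-register).
P1 = [("st", "X", 10), ("ld", "Y", "R1")]     # S1: X = 10 ; L1: R1 = Y
P2 = [("st", "Y", 10), ("ld", "X", "R2")]     # S2: Y = 10 ; L2: R2 = X

def sc_outcomes():
    """Set of (R1, R2) pairs reachable under sequential consistency."""
    results = set()
    for order in interleavings(P1, P2):
        mem = {"X": 0, "Y": 0}                # assumption: both start at 0
        regs = {}
        for kind, var, arg in order:
            if kind == "st":
                mem[var] = arg
            else:
                regs[arg] = mem[var]
        results.add((regs["R1"], regs["R2"]))
    return results
```

There are six program-order-preserving merges, and none of them leaves both R1 and R2 at 0. That outcome needs both loads to run before both stores, which breaks program order.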

X -> Y: operation X must complete before operation Y is done
Sequential consistency requires: R -> W, R -> R, W -> R, W -> W
Relax W -> R
  Total store ordering
Relax W -> W
  Partial store order
Relax R -> W and R -> R
  Weak ordering and release consistency
The consistency model is multiprocessor specific
Programmers will often implement explicit synchronization

A Sahu, slide 21

Loads are allowed to overtake stores
  Write buffering is permitted
1. Total Store Ordering: writes are atomic
2. Processor Consistency: writes need not be atomic
   Invalidations may gradually propagate

S1: X = 10    S2: Y = 10
FENCE         FENCE         // no write buffering at this point
L1: R1 = Y    L2: R2 = X

A Sahu, slide 22
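The list of relaxations can be written down as a lookup table. The sketch below encodes, for each model, which X -> Y orders between ordinary operations it still enforces, and treats a fence between two operations as forbidding any reordering. `REQUIRED`, `may_reorder` and the short model names are hypothetical labels for this example.

```python
# Orders (first, second) that each model still enforces between ordinary
# reads (R) and writes (W) of one processor.
REQUIRED = {
    "SC":  {("R", "W"), ("R", "R"), ("W", "R"), ("W", "W")},
    "TSO": {("R", "W"), ("R", "R"), ("W", "W")},   # relax W -> R
    "PC":  {("R", "W"), ("R", "R"), ("W", "W")},   # same orders as TSO; differs in write atomicity
    "PSO": {("R", "W"), ("R", "R")},               # also relax W -> W
    "WO":  set(),                                  # also relax R -> W and R -> R
}

def may_reorder(model, first, second, fence_between=False):
    """May `second` complete before `first` under the given model?"""
    if fence_between:
        return False                               # a fence orders everything across it
    return (first, second) not in REQUIRED[model]
```

For example, `may_reorder("TSO", "W", "R")` is true, which is the load overtaking the buffered store, while adding `fence_between=True` rules it out, as in the FENCE example.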

Partial Store Ordering
Loads are allowed to overtake stores
Writes can be re-ordered
Memory barriers or fences are used to
explicitly order any operations
Further improves the performance

A Sahu, slide 23

P1              P2
A = 1;          while (flag = 0);
flag = 1;       print A;
SC ensures that 1 is printed
TSO and PC also do so
PSO does not

P1              P2
A = 1;          print B;
B = 1;          print A;
SC ensures that if B is printed as 1 then A is also printed as 1
TSO and PC also do so
PSO does not

1. TSO: writes are atomic
2. PC: writes need not be atomic
   Invalidations may gradually propagate
3. PSO: writes can be reordered

A Sahu, slide 24
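The flag example can be checked with a small enumeration. The model is an assumption made for illustration: P1's two writes reach memory either in program order (SC, TSO, PC) or in any order (PSO), and P2 may read A at any point after it has seen flag = 1. `printed_values` is a hypothetical name.

```python
from itertools import permutations

def printed_values(model):
    """Values P2 can print for  P1: A = 1; flag = 1   P2: while (flag == 0); print A"""
    writes = [("A", 1), ("flag", 1)]            # P1's program order
    if model in ("SC", "TSO", "PC"):
        orders = [tuple(writes)]                # writes become visible in order
    elif model == "PSO":
        orders = list(permutations(writes))     # writes may be reordered
    else:
        raise ValueError("unknown model: " + model)
    results = set()
    for order in orders:
        mem = {"A": 0, "flag": 0}
        for var, val in order:
            mem[var] = val
            if mem["flag"] == 1:                # P2 can leave its loop from here on
                results.add(mem["A"])           # and print whatever A holds now
    return results
```

Under PSO the order (flag, A) lets P2 leave the loop while A is still 0, which is why a fence between the two writes is needed there.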


P1          P2                 P3
A = 1;      while (A = 0);     while (B = 0);
            B = 1;             print A;
SC ensures that 1 is printed. TSO and PSO also do that, but PC does not
// PSO does, as there is only one write per process

P1          P2
A = 1;      B = 1;
print B;    print A;
SC ensures that both can't be printed as 0. TSO, PC and PSO do not,
as loads are allowed to overtake stores

1. TSO: writes are atomic
2. PC: writes need not be atomic
   Invalidations may gradually propagate
3. PSO: writes can be reordered

A Sahu, slide 25

Weak Ordering or Weak Consistency
Loads and stores are not restricted to follow an order
Explicit synchronization primitives are used
Synchronization primitives follow a strict order
  Easy to achieve
  Low overhead

A Sahu, slide 26
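The three-processor example turns on whether a write becomes visible to all processors at once. The sketch below is a toy model under assumptions: each processor has its own view of memory, and a non-atomic write of A reaches P2 before it reaches P3. It follows one schedule only, and `p3_prints` is a hypothetical name.

```python
def p3_prints(atomic_writes):
    """P1: A = 1   P2: while (A == 0); B = 1   P3: while (B == 0); print A"""
    # One private view of memory per processor; a write is delivered to the
    # views one by one (invalidations propagate gradually).
    view = {p: {"A": 0, "B": 0} for p in ("P1", "P2", "P3")}

    def deliver(var, val, targets):
        for p in targets:
            view[p][var] = val

    if atomic_writes:
        deliver("A", 1, ["P1", "P2", "P3"])     # A = 1 visible to all at once
    else:
        deliver("A", 1, ["P1", "P2"])           # P3 has not seen A = 1 yet
    assert view["P2"]["A"] == 1                 # P2 leaves its loop
    deliver("B", 1, ["P1", "P2", "P3"])         # P2's write B = 1 reaches everyone
    assert view["P3"]["B"] == 1                 # P3 leaves its loop
    printed = view["P3"]["A"]                   # print A
    deliver("A", 1, ["P3"])                     # A = 1 finally arrives at P3
    return printed
```

With atomic writes P3 prints 1. When the write of A is allowed to reach P3 late, P3 sees B = 1 first and prints 0, which is the behaviour the slide attributes to PC.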

Further relaxation of weak ordering
Synch primitives are divided into acquire and release operations
R/W operations after an acquire cannot move before it, but those before it can be moved after
R/W operations before a release cannot move after it, but those after it can be moved before

A Sahu, slide 27

[Figure: WC vs. RC. WC: R/W block 1, synch, R/W block 2, synch, R/W block 3. RC: R/W block 1, acquire, R/W block 2, release, R/W block 3.]

A Sahu, slide 28
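The acquire/release movement rules fit in a few lines of code. The function below is a hypothetical helper written for this sketch: it answers whether an ordinary R/W operation on one side of a synchronization operation may be moved across it, with weak consistency (WC) treating every synch as a two-way fence.

```python
def can_move_across(model, sync, op_side):
    """May an ordinary R/W located 'before' or 'after' `sync` be moved across it?"""
    if op_side not in ("before", "after"):
        raise ValueError("op_side must be 'before' or 'after'")
    if model == "WC":
        return False                       # nothing crosses a synch operation
    if model == "RC":
        if sync == "acquire":
            return op_side == "before"     # earlier ops may move after an acquire
        if sync == "release":
            return op_side == "after"      # later ops may move before a release
        raise ValueError("RC synch must be 'acquire' or 'release'")
    raise ValueError("unknown model: " + model)
```

Under RC the critical section between an acquire and a release can therefore absorb operations from both sides, but nothing inside it can leak out.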
