MCC Iitg Sahu MPLec09

CS521 CSE IITG 11/23/2012
ASahu 1 ASahu 2
Multiprocessors are likely to be cost/power-effective solutions
  Because they share many resources
  A personal room is costlier than a dormitory
Sharing resources gives rise to many other problems
  Coherence (part of the ACA course @ IITG)
    Shared data should be the same in all places
  Critical sections
    Lock and barrier design
  Consistency
    Order should be similar to serial order (ROB)
  One processor interferes with others
    Share efficiently using some policy

Processes run on different processors independently
At some point they need to know each other's status for
  Communication, mutual exclusion, etc.

Simple Locking
  Lock(L)
  Critical Section(C)
  Unlock(L)
Simple Locking
lock:   ld  reg, loc    // copy location to reg
        cmp reg, #0     // compare with 0
        bnz lock        // if not zero, try again
        st  loc, #1     // store 1 at loc to mark it locked
        return
unlock: st  loc, #0     // store 0 at loc to mark it free
        return
Suppose two processors are contending to acquire the lock
  Both read the lock at the same time, see the value 0, and pass the branch
  Both mark the variable as locked and enter the CS
  This contradicts the meaning of a LOCK

The instructions of the lock sequence are not executed atomically
A hardware primitive for atomic read+write is required, e.g.
  Test&Set          // test for unlocked (0), then set the lock (1)
  Exchange
  Fetch&Increment
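The race can be replayed step by step. The sketch below is a toy simulation, not real hardware: `run_naive_lock` and `run_test_and_set` are hypothetical names, and a "schedule" is simply a list saying which of two processors executes its next instruction. It assumes the lock word starts at 0 (free).

```python
def run_naive_lock(schedule):
    """Step two processors through ld / cmp+bnz / st under a given interleaving.

    Each entry of `schedule` (0 or 1) lets that processor execute its next
    instruction. Returns the processors that ended up in the critical section.
    """
    loc = 0                      # shared lock word: 0 = free, 1 = locked
    pc = [0, 0]                  # per-processor program counter
    reg = [None, None]           # per-processor register
    in_cs = []
    for p in schedule:
        if pc[p] == 0:           # ld reg, loc
            reg[p] = loc
            pc[p] = 1
        elif pc[p] == 1:         # cmp reg, #0 ; bnz lock
            pc[p] = 2 if reg[p] == 0 else 0
        elif pc[p] == 2:         # st loc, #1 ; return
            loc = 1
            pc[p] = 3
            in_cs.append(p)
    return in_cs


def run_test_and_set(schedule):
    """Same experiment, but reading and writing loc is one indivisible step."""
    loc = 0
    in_cs = []
    for p in schedule:
        if p in in_cs:
            continue
        old, loc = loc, 1        # atomic test&set: fetch old value, write 1
        if old == 0:
            in_cs.append(p)
    return in_cs


interleaved = [0, 1, 0, 1, 0, 1]
print(run_naive_lock(interleaved))      # [0, 1]  both processors enter the CS
print(run_test_and_set(interleaved))    # [0]     only one processor enters
```

With a serial schedule such as `[0, 0, 0, 1, 1, 1]` the naive lock behaves correctly; it only breaks when the load of one processor falls between the load and the store of the other.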
Lock: 0 indicates free and 1 indicates locked
Code to lock X:
        r2 <- 1
lockit: r2 <-> X            ; atomic exchange
        if (r2 != 0) lockit ; already locked
Spinning: try to test and acquire the lock in a tight loop
Locks are cached for efficiency; coherence is used
Better code to lock X:
lockit: r2 <- X             ; read lock
        if (r2 != 0) lockit ; not available
        r2 <- 1
        r2 <-> X            ; atomic exchange
        if (r2 != 0) lockit ; already locked

Load linked / Store conditional
        LL r1, X            ; read from location X
        ...                 ; do some operation
        SC r3, X            ; store r3 to location X; r3 == 0 if unsuccessful
The store is unsuccessful if the value of X is altered by another
processor between the time of the LL and the time of the SC
Simpler to implement
Atomic exchange using LL and SC
try:    r3 <- r2            ; move exchange value
        LL r1, X            ; load locked
        SC r3, X            ; store conditional
        if (r3 == 0) try    ; branch if store fails
        r2 <- r1            ; put loaded value in r2
Fetch&increment using LL and SC
try:    LL r1, X            ; load locked
        r3 <- r1 + 1        ; increment
        SC r3, X            ; store conditional
        if (r3 == 0) try    ; branch if store fails

lockit: LL r2, X            ; load locked
        if (r2 != 0) lockit ; not available
        r2 <- 1
        SC r2, X            ; store conditional
        if (r2 == 0) lockit ; branch if store fails
Spin lock with exponential backoff reduces contention
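The retry loop of fetch&increment can be sketched with a toy memory model. `LLSCMemory` and `fetch_and_increment` are hypothetical names, and the link tracking (one linked address per processor, broken by any successful store to that address) is a simplifying assumption rather than a description of a specific machine.

```python
class LLSCMemory:
    """Toy memory with one link per processor (illustrative model only)."""

    def __init__(self):
        self.mem = {}
        self.link = {}           # processor id -> address it has linked

    def ll(self, proc, addr):
        """Load linked: read addr and remember the link."""
        self.link[proc] = addr
        return self.mem.get(addr, 0)

    def sc(self, proc, addr, value):
        """Store conditional: return 1 on success, 0 if the link was broken."""
        if self.link.get(proc) != addr:
            return 0
        self.mem[addr] = value
        # A successful store breaks every link held on this address.
        self.link = {p: a for p, a in self.link.items() if a != addr}
        return 1


def fetch_and_increment(m, proc, addr, interfere=None):
    """LL, add 1, SC, and retry while SC reports failure.

    `interfere`, if given, runs once between the first LL and SC to play
    the role of another processor writing the same location.
    """
    attempts = 0
    while True:
        attempts += 1
        r1 = m.ll(proc, addr)
        if interfere is not None and attempts == 1:
            interfere()
        if m.sc(proc, addr, r1 + 1):
            return r1, attempts


m = LLSCMemory()
m.mem["X"] = 5
print(fetch_and_increment(m, 0, "X"))   # (5, 1): no contention, one attempt
print(fetch_and_increment(
    m, 0, "X", interfere=lambda: fetch_and_increment(m, 1, "X")))  # (7, 2)
print(m.mem["X"])                       # 8
```

In the second call processor 1 increments X between processor 0's LL and SC, so processor 0's first SC fails and the loop reads the fresh value on the retry. No increment is lost.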
Spinning wastes time
TAS lock
  Recall the MAC protocol
  Non-persistent CSMA protocol
    Wait a random time if the medium is busy, then send
Backoff lock
  Spin lock with exponential backoff reduces contention
  Wait k amount of time for the 1st attempt
  Wait k*c^i amount of time for the ith attempt

DoWork(){
    Do Phase I work
    Barrier();
    Do Phase II work
    Barrier();
    Do Phase III work
    Barrier();
}
for (i=0;i<NumProc;i++){
    P[i].Start(DoWork);
}
Print result();
[Figure: timeline. Initialize; Phase I, threads 1 2 3 ... N; Phase II, threads 1 2 3 ... N; Phase III, threads 1 2 3 ... N; Print Result]
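For the backoff lock, the waiting rule (k for the first attempt, k*c^i for the ith) can be tabulated with a few lines. This is a sketch under two assumptions: attempts are indexed from i = 0 so that the first wait equals k, and the optional `cap` argument (an upper bound on the wait) is an addition that is not on the slide.

```python
def backoff_delays(k, c, attempts, cap=None):
    """Return the wait before each attempt: k * c**i for i = 0, 1, 2, ..."""
    delays = []
    for i in range(attempts):
        delay = k * c ** i
        if cap is not None:
            delay = min(delay, cap)   # optional upper bound (assumption)
        delays.append(delay)
    return delays


print(backoff_delays(1, 2, 5))            # [1, 2, 4, 8, 16]
print(backoff_delays(1, 2, 5, cap=10))    # [1, 2, 4, 8, 10]
```

Each failed attempt multiplies the wait by c, so contending processors spread their retries out instead of hitting the lock location in the same tight loop.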
Barrier(){
    lock(X)
    if (count == 0) release <- 0
    count++                  // sum of all finished processors
    unlock(X)
    if (count == total) {count <- 0; release <- 1}
    else spin until (release == 1)
}
If all processors have completed the phase, then they go on to the next phase

[Figure: one Lock shared by P1 P2 P3 ... PN] Everyone accesses the same lock
[Figure: root Lock over Lock1 and Lock2, each shared by P1 ... PM] Lock contention is distributed in a tree fashion
Do Phase I work
Barrier(bar1, p);
// After this, release = 1, but this may not yet be visible to all:
// one process may not have seen it and is still waiting
// while others have already entered the NEXT barrier
Do Phase II work
Barrier(bar1, p);
// Some processes will enter this barrier but not all, so the
// barrier will never complete.

local_sense <- !local_sense  // toggle
lock(X)
count++
unlock(X)
if (count == total)
    {count <- 0; release <- local_sense}
else
    {spin until (release == local_sense)}
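The local_sense version can be tried out with ordinary threads. The following is a minimal sketch, not a tuned implementation: `SenseBarrier` and `run_phases` are hypothetical names, `threading.local()` holds each thread's local_sense, and, unlike the slide code, the `count == total` test is done while the lock is still held so that exactly one thread sees itself as the last arriver.

```python
import threading
import time


class SenseBarrier:
    """Reusable barrier where each thread toggles a private sense flag."""

    def __init__(self, total):
        self.total = total
        self.count = 0
        self.release = False
        self.lock = threading.Lock()
        self.local = threading.local()       # per-thread local_sense

    def wait(self):
        sense = not getattr(self.local, "sense", False)   # toggle
        self.local.sense = sense
        with self.lock:
            self.count += 1
            last = self.count == self.total
            if last:
                self.count = 0               # reset before releasing others
        if last:
            self.release = sense
        else:
            while self.release != sense:
                time.sleep(0)                # spin, yielding to other threads


def run_phases(num_threads=4, num_phases=3):
    """Each thread logs its phase number, then waits at the barrier."""
    barrier = SenseBarrier(num_threads)
    log, log_lock = [], threading.Lock()

    def work():
        for phase in range(num_phases):
            with log_lock:
                log.append(phase)
            barrier.wait()

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


log = run_phases()
print(log == sorted(log))    # True: no thread starts phase k+1 early
```

Because the release value alternates between phases, a fast thread that re-enters the barrier waits for the opposite value and cannot disturb a slow thread that is still leaving the previous barrier.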
When must a processor see the value that has been written by another processor?
Atomicity of operations system wide?
Can memory operations be reordered?
Various models: see the tutorial below, highly recommended by the HP book
  http://rsim.cs.uiuc.edu/~sadve/Publications/models_tutorial.ps

P1: A = 0              P2: B = 0
    ...                    ...
    A = 1                  B = 1
L1: if (B == 0) S1     L2: if (A == 0) S2
Which statements among S1 and S2 are done?
Both S1 and S2 may be done if writes are delayed
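P1               P2
S1: X = 10       S2: Y = 10
L1: R1 = Y       L2: R2 = X

Possible orderings (time runs downward)
  (1)          (2)          (3)          (4)
  S1: X=10     S2: Y=10     S1: X=10     L1: R1=Y
  L1: R1=Y     L2: R2=X     S2: Y=10     L2: R2=X
  S2: Y=10     S1: X=10     L1: R1=Y     S1: X=10
  L2: R2=X     L1: R1=Y     L2: R2=X     S2: Y=10
  SC           SC           SC           not SC
Ordering (4) runs each load before the store that precedes it in program order, so SC does not permit it

Sequential consistency (SC)
  The result of any execution is the same as if the operations of all processors were executed in some sequential order
  Operations of each processor occur in the order specified by its program
  It requires all memory operations to be atomic
  Too restrictive, high overheads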
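The set of results that SC allows for this example can be checked by brute force. The sketch below enumerates every interleaving that keeps each processor's program order; it assumes X and Y are initially 0, and `PROGRAMS` and `sc_outcomes` are names chosen for illustration.

```python
from itertools import permutations

# Each processor: a store followed by a load, as in S1/L1 and S2/L2.
PROGRAMS = {
    "P1": [("store", "X", 10), ("load", "Y", "R1")],
    "P2": [("store", "Y", 10), ("load", "X", "R2")],
}


def sc_outcomes():
    """Return every (R1, R2) pair reachable under sequential consistency."""
    ops = [(p, i) for p, prog in PROGRAMS.items() for i in range(len(prog))]
    outcomes = set()
    for order in permutations(ops):
        # Keep only orders that respect each processor's program order.
        if any(order.index((p, 0)) > order.index((p, 1)) for p in PROGRAMS):
            continue
        mem = {"X": 0, "Y": 0}       # assumption: both start at 0
        regs = {}
        for p, i in order:
            kind, addr, arg = PROGRAMS[p][i]
            if kind == "store":
                mem[addr] = arg
            else:
                regs[arg] = mem[addr]
        outcomes.add((regs["R1"], regs["R2"]))
    return outcomes


print(sorted(sc_outcomes()))   # [(0, 10), (10, 0), (10, 10)]
```

The pair (0, 0) never appears: it would need both loads to run before both stores, which breaks program order on each processor.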
X -> Y: operation X must complete before operation Y is done
Sequential consistency requires: R -> W, R -> R, W -> R, W -> W
Relax W -> R
  Total store ordering
Relax W -> W
  Partial store order
Relax R -> W and R -> R
  Weak ordering and release consistency
Consistency model is multiprocessor specific
Programmers will often implement explicit synchronization

Total store ordering
  Loads are preferred over stores: stores are buffered
  Loads are allowed to overtake stores
  Write buffering is permitted
  1. Total Store Ordering: writes are atomic
  2. Processor Consistency: writes need not be atomic
     Invalidations may propagate gradually
S1: X = 10       S2: Y = 10
FENCE            FENCE          // no write buffering past this point
L1: R1 = Y       L2: R2 = X
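The effect of write buffering and of the FENCE can be explored with a small exhaustive search. This is a toy model under stated assumptions: each processor has a FIFO store buffer, a load first checks its own buffer and otherwise reads memory, a fence can only execute once the buffer is empty, and X and Y start at 0. The name `tso_outcomes` is hypothetical.

```python
def tso_outcomes(programs):
    """Explore every schedule of a store-buffer model; return (R1, R2) pairs."""
    outcomes = set()

    def explore(pcs, bufs, mem, regs):
        if all(pc == len(prog) for pc, prog in zip(pcs, programs)):
            outcomes.add((regs["R1"], regs["R2"]))
            return
        for p, prog in enumerate(programs):
            if bufs[p]:                          # drain oldest buffered store
                addr, val = bufs[p][0]
                drained = bufs[:p] + (bufs[p][1:],) + bufs[p + 1:]
                explore(pcs, drained, {**mem, addr: val}, regs)
            if pcs[p] < len(prog):
                op = prog[pcs[p]]
                nxt = pcs[:p] + (pcs[p] + 1,) + pcs[p + 1:]
                if op[0] == "store":             # store goes into the buffer
                    entry = ((op[1], op[2]),)
                    filled = bufs[:p] + (bufs[p] + entry,) + bufs[p + 1:]
                    explore(nxt, filled, mem, regs)
                elif op[0] == "load":            # own buffer first, then memory
                    val = mem[op[1]]
                    for addr, v in bufs[p]:
                        if addr == op[1]:
                            val = v
                    explore(nxt, bufs, mem, {**regs, op[2]: val})
                elif op[0] == "fence" and not bufs[p]:
                    explore(nxt, bufs, mem, regs)

    explore((0, 0), ((), ()), {"X": 0, "Y": 0}, {})
    return outcomes


plain = [[("store", "X", 10), ("load", "Y", "R1")],
         [("store", "Y", 10), ("load", "X", "R2")]]
fenced = [[("store", "X", 10), ("fence",), ("load", "Y", "R1")],
          [("store", "Y", 10), ("fence",), ("load", "X", "R2")]]

print((0, 0) in tso_outcomes(plain))     # True: loads overtake buffered stores
print((0, 0) in tso_outcomes(fenced))    # False: fence drains the buffer first
```

Without the fence both loads can read memory while both stores still sit in their buffers, giving R1 = R2 = 0. With the fence the model is left with exactly the three results that SC allows.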
Partial Store Ordering
  Loads are allowed to overtake stores
  Writes can be reordered
  Memory barriers or fences are used to explicitly order any operations
  Further improves the performance

P1               P2
A = 1;           while (flag == 0);
flag = 1;        print A;
SC ensures that 1 is printed
TSO, PC also do so
PSO does not

P1               P2
A = 1;           print B;
B = 1;           print A;
SC ensures that if B is printed as 1 then A is also printed as 1
TSO, PC also do so
PSO does not

1. TSO: writes are atomic
2. PC: writes need not be atomic
   Invalidations may propagate gradually
3. PSO: writes can be reordered
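The A/flag example comes down to one question: while P1's two buffered writes drain to memory, can P2 ever observe flag == 1 together with A == 0? Since A is written only once, seeing A == 0 after flag == 1 implies such a memory state existed. The sketch below lists the memory states visible during draining, assuming A and flag start at 0 and that P2 reads straight from memory. `drain_states` and `stale_read_possible` are hypothetical names.

```python
from itertools import permutations


def drain_states(stores, fifo):
    """Memory states another processor can see while `stores` drain.

    fifo=True  models TSO: the buffer drains in program order.
    fifo=False models PSO: only stores to the same address stay ordered.
    """
    n = len(stores)
    states = []
    for order in permutations(range(n)):
        if fifo and list(order) != list(range(n)):
            continue
        same_addr_in_order = all(
            order.index(i) < order.index(j)
            for i in range(n) for j in range(i + 1, n)
            if stores[i][0] == stores[j][0]
        )
        if not same_addr_in_order:
            continue
        mem = {addr: 0 for addr, _ in stores}    # assumption: all start at 0
        states.append(dict(mem))
        for idx in order:
            addr, val = stores[idx]
            mem[addr] = val
            states.append(dict(mem))
    return states


def stale_read_possible(fifo):
    """True if P2 can leave the flag loop and still print A as 0."""
    stores = [("A", 1), ("flag", 1)]             # P1: A = 1; flag = 1;
    return any(s["flag"] == 1 and s["A"] == 0
               for s in drain_states(stores, fifo))


print(stale_read_possible(fifo=True))    # False: TSO keeps write order
print(stale_read_possible(fifo=False))   # True: PSO may expose flag first
```

Under PSO a fence between `A = 1` and `flag = 1` would be needed to restore the guarantee.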
P1               P2                   P3
A = 1;           while (A == 0);      while (B == 0);
                 B = 1;               print A;
SC ensures that 1 is printed. TSO and PSO also do that, but PC does not
// PSO does, as there is only one write per process

P1               P2
A = 1;           B = 1;
print B;         print A;
SC ensures that both cannot be printed as 0. TSO, PC and PSO do not,
as loads are allowed to overtake stores

1. TSO: writes are atomic
2. PC: writes need not be atomic
   Invalidations may propagate gradually
3. PSO: writes can be reordered

Weak Ordering or Weak Consistency
  Loads and stores are not restricted to follow an order
  Explicit synchronization primitives are used
  Synchronization primitives follow a strict order
  Easy to achieve
  Low overhead
Release consistency (RC)
  Further relaxation of weak ordering
  Synch primitives are divided into acquire and release operations
  R/W operations after an acquire cannot move before it, but those before it can be moved after
  R/W operations before a release cannot move after it, but those after it can be moved before
[Figure: WC vs RC. WC: R/W block 1, synch, R/W block 2, synch, R/W block 3. RC: R/W block 1, acquire, R/W block 2, release, R/W block 3]
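The two movement rules for acquire and release can be written as a small legality check. This is a sketch under simplifying assumptions: each operation is a unique string, there is one "acquire" and one "release", and data dependences between ordinary reads and writes are ignored. `rc_allows` is a hypothetical name.

```python
def rc_allows(program, reordered):
    """Return True if `reordered` is a permitted RC reordering of `program`.

    "acquire" and "release" are the synch operations; any other string
    stands for an ordinary read or write.
    """
    if sorted(program) != sorted(reordered):
        return False
    pos = {op: i for i, op in enumerate(reordered)}
    for i, a in enumerate(program):
        for b in program[i + 1:]:
            if pos[a] < pos[b]:
                continue             # original order kept, nothing to check
            if a == "acquire":       # b followed the acquire: may not move up
                return False
            if b == "release":       # a preceded the release: may not move down
                return False
    return True


program = ["w1", "acquire", "w2", "release", "w3"]
print(rc_allows(program, ["acquire", "w1", "w2", "w3", "release"]))  # True
print(rc_allows(program, ["w2", "acquire", "w1", "release", "w3"]))  # False
print(rc_allows(program, ["w1", "acquire", "release", "w2", "w3"]))  # False
```

In the first reordering w1 sinks below the acquire and w3 rises above the release, both of which RC permits. The other two pull w2 out of the acquire/release section, which it forbids.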