
Shared Memory Consistency of Shared Variables

The ideal picture of shared memory:
[Figure: CPU0, CPU1, CPU2 and CPU3 all read and write a single Shared Memory.]

The actual architecture of shared memory systems:
[Figure, Symmetric Multi-Processor (SMP): CPU0 to CPU3, each with a Local Cache, in front of one Shared Memory. Cache misses are read from or written to the shared memory; cached copies are invalidated.]
[Figure, Distributed Shared Memory (DSM): CPU0 to CPU3, each with a Local Memory Module, connected by a Network.]
The Million-Dollar Question:
How/When Does One Process Read Another Process's Writes?
CPUi: W X,x
  Writes value x to its local copy of the shared variable X.
CPUj: R X,0? R X,x?
  Reads X from its local copy.
Assumption: the initial value of shared variables is always 0.
Why is this a question? Because temporal order relations like "before/after" do not necessarily hold in a distributed system.
Why Memory Model
Initially: a=0, b=0
Thread 1: Print(b); a=1
Thread 2: Print(a); b=1
Printed: 0,0? Printed: 1,0? Printed: 1,1?
A memory model answers the question: "Which writes by a process are seen by which reads of the other processes?"
Memory Consistency Models
A consistency/memory model is an "agreement" between the execution environment (H/W, OS, middleware) and the processes. The runtime guarantees to the application certain properties of the way values written to shared variables become visible to reads. This determines the memory model: what is valid and what is not.
Example program:
Pi: R X; W X,7; R X; R X
Pj: R X; W X,13; R X; R X
Example execution:
Pi: R X,0; W X,7; R X,7; R X,13
Pj: R X,0; W X,13; R X,13; R X,7
Order of writes to X as seen by Pi: (1) W X,7; (2) W X,13
Order of writes to X as seen by Pj: (1) W X,13; (2) W X,7
Memory Model: Coherence
Coherence is the memory model in which (the runtime guarantees to the program that) the writes performed by the processes to each specific variable are viewed by all processes in the same full order.
The Register Property: the view of a process consists of the values it "sees" in its reads, and the writes it performs. If an R X in P that is later than W X,x in P sees a value different from x, then a later R X cannot see x.
Example program:
Pi: W V,7; R V; R V
Pj: W V,13; R V; R V
All valid executions under Coherence:
1. Pi: W V,7; R V,7; R V,7      Pj: W V,13; R V,13; R V,7
2. Pi: W V,7; R V,7; R V,7      Pj: W V,13; R V,7; R V,7
3. Pi: W V,7; R V,7; R V,13     Pj: W V,13; R V,13; R V,13
4. Pi: W V,7; R V,13; R V,13    Pj: W V,13; R V,13; R V,13
5. Pi: W V,7; R V,7; R V,7      Pj: W V,13; R V,13; R V,13
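The list of valid executions can be cross-checked by brute force. The following is a minimal sketch, not part of the original slides: it assumes reads may only return 0 (the initial value), 7 or 13, encodes operations as hypothetical `("W", value)` / `("R", value)` tuples on the single variable V, and accepts an execution if some interleaving of Pi and Pj is a legal serialization.

```python
from itertools import combinations, product


def interleavings(a, b):
    """Yield every merge of lists a and b that keeps each list's own order."""
    n = len(a) + len(b)
    for slots in combinations(range(n), len(a)):
        ia, ib = iter(a), iter(b)
        yield [next(ia) if k in slots else next(ib) for k in range(n)]


def is_legal(serialization, initial=0):
    """Every read must return the value of the latest preceding write."""
    current = initial
    for kind, value in serialization:
        if kind == "W":
            current = value
        elif value != current:
            return False
    return True


def coherent_executions(values=(0, 7, 13)):
    """Read results of Pi and Pj for which a legal serialization exists."""
    valid = []
    for i1, i2, j1, j2 in product(values, repeat=4):
        pi = [("W", 7), ("R", i1), ("R", i2)]
        pj = [("W", 13), ("R", j1), ("R", j2)]
        if any(is_legal(s) for s in interleavings(pi, pj)):
            valid.append(((i1, i2), (j1, j2)))
    return valid


if __name__ == "__main__":
    for pi_reads, pj_reads in coherent_executions():
        print("Pi reads", pi_reads, "| Pj reads", pj_reads)
```

Under these assumptions the search finds exactly the five executions listed above; since the program touches one variable only, coherence and the existence of a legal serialization coincide here.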
Formal definition of Coherence
Program Order: The order in which instructions appear in each process. This is a partial order on all the instructions in the program.
A serialization: A full order on all the instructions (reads/writes) of all the processes, which is consistent with the program order.
A legal serialization: A serialization in which each read X returns the value written by the latest write X in the full order.
Let P be a program; let P_X be the "sub-program" of P which contains all the read X/write X operations on X only.
Coherence: P is said to be coherent if for every variable X there exists a legal serialization of P_X. (Note: a process cannot distinguish one such serialization from another for a given execution.)
Examples
Example 1:
Process 1: write x,1; write x,2
Process 2: read x,2; read x,1
Not Coherent. Cannot be serialized.
Example 2:
Process 1: read x,1; write y,1
Process 2: read y,1; write x,1
Coherent. Serializations:
x: write x,1, read x,1
y: write y,1, read y,1
Example 3:
Process 1: read x,1; write x,2
Process 2: read x,2; write x,1
Not Coherent. Cycle of dependencies. Cannot be serialized.
Sequential Consistency [Lamport 1979]
Sequential Consistency is the memory model in which all reads/writes performed by the processes are viewed by all processes in the same full order.
Example 1:
Process 1: write x,1; write y,1
Process 2: read y,1; read x,0
Coherent. Not Sequentially consistent.
Example 2:
Process 1: read x,1; write y,1
Process 2: read y,1; write x,1
Coherent. Not Sequentially consistent.
Strict (Strong) Memory Models
Initially: a=0, b=0
Thread 1: Print(b); a=1
Thread 2: Print(a); b=1
Sequential Consistency: Given an execution, there exists an order of reads/writes which is consistent with all program orders.
Printed: 0,0 or 0,1 or 1,0
Coherence: For any variable x, there exists an order of read x/write x consistent with all program orders (p.o.s).
Printed: 1,1 is also possible
Formal definition of Sequential Consistency
Let P be a program.
Sequential Consistency: P is said to be sequentially consistent if there exists a legal serialization of all reads/writes in P.
Observation:
Every program which is sequentially consistent is also coherent.
Conclusion:
Sequential Consistency has stronger requirements and we thus say that it is stronger than Coherence.
In general:
A consistency model A is said to be (strictly) stronger than B if all executions which are valid under A are also valid under B.
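The two definitions can be compared mechanically. The sketch below is illustrative only: the tuple encoding `(kind, variable, value)` and the function names are assumptions, not part of the slides. It searches for a legal serialization of the whole program (sequential consistency) and of each per-variable sub-program P_X (coherence), with all variables initially 0.

```python
def legal_serialization(procs, initial=0):
    """Return one legal serialization as a list of (process, op), or None."""
    def search(pos, mem, order):
        if all(p == len(ops) for p, ops in zip(pos, procs)):
            return order
        for i, ops in enumerate(procs):
            if pos[i] == len(ops):
                continue
            kind, var, val = ops[pos[i]]
            # A read is only allowed if it returns the latest written value.
            if kind == "read" and mem.get(var, initial) != val:
                continue
            new_mem = {**mem, var: val} if kind == "write" else mem
            nxt = pos[:i] + (pos[i] + 1,) + pos[i + 1:]
            found = search(nxt, new_mem, order + [(i, ops[pos[i]])])
            if found is not None:
                return found
        return None
    return search((0,) * len(procs), {}, [])


def sequentially_consistent(procs):
    """A legal serialization of all reads/writes exists."""
    return legal_serialization(procs) is not None


def coherent(procs):
    """A legal serialization exists for every per-variable sub-program."""
    variables = {var for ops in procs for _, var, _ in ops}
    return all(
        legal_serialization(
            [[op for op in ops if op[1] == v] for ops in procs]
        ) is not None
        for v in variables
    )


if __name__ == "__main__":
    example = [
        [("write", "x", 1), ("write", "y", 1)],   # Process 1
        [("read", "y", 1), ("read", "x", 0)],     # Process 2
    ]
    print("coherent:", coherent(example))
    print("sequentially consistent:", sequentially_consistent(example))
```

For the example in the `__main__` part, taken from the Sequential Consistency slide, the check reports coherent but not sequentially consistent, which matches the observation that SC is the stronger model.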
The problem of strong consistency models
The runtime system should ensure the existence of a legal serialization, and the same consistent view for all processes.
This requires lots of expensive coordination, which degrades performance!
P1: Print(U); Write X,1
P2: Print(X); Write U,1
SC: Hardware cannot reorder locally in each thread, for this will result in a possible printing of 1,1.
HW may reorder anyway and postpone writes, but then why reorder in the first place?
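The effect of local reordering on this program can be checked by enumerating interleavings. This is a minimal sketch under simple assumptions: X and U start at 0, every operation is atomic, and "reordering" is modeled by swapping the two operations of each thread. The function name `outcomes` and the tuple encoding are hypothetical.

```python
def outcomes(p1, p2):
    """All (printed by P1, printed by P2) pairs over every interleaving."""
    results = set()

    def run(i, j, mem, out):
        if i == len(p1) and j == len(p2):
            results.add((out["P1"], out["P2"]))
            return
        for who, ops, k in (("P1", p1, i), ("P2", p2, j)):
            if k == len(ops):
                continue
            kind, var = ops[k]
            new_mem, new_out = dict(mem), dict(out)
            if kind == "write":
                new_mem[var] = 1          # Write var,1
            else:
                new_out[who] = mem[var]   # Print(var)
            run(i + (who == "P1"), j + (who == "P2"), new_mem, new_out)

    run(0, 0, {"X": 0, "U": 0}, {})
    return results


# Program order as written on the slide.
in_order = outcomes([("print", "U"), ("write", "X")],
                    [("print", "X"), ("write", "U")])
# Each thread's write moved ahead of its print.
reordered = outcomes([("write", "X"), ("print", "U")],
                     [("write", "U"), ("print", "X")])

if __name__ == "__main__":
    print("in program order:", sorted(in_order))
    print("after local reordering:", sorted(reordered))
```

In program order the pair 1,1 never appears; it becomes reachable only once each thread's write is moved ahead of its print, which is why SC rules out that reordering.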
Coherence Forbids Reordering:
q.x is aliased to p.x.
Left thread: p.x=0; p.x=1
Right thread: a=p.x; b=q.x; assert(a<=b)
- Once a thread sees an update, it cannot "forget" it has seen it.
- Cannot reorder two reads of the same memory location.
Reordering may make the assignment to b early (seeing 0) and that to a late (seeing 1). The right thread then sees an order of writes different from the left thread.
Coherence makes reads prevent common compiler optimizations
p and q might point to the same object.
Left thread: p.x=0; p.x=1
Right thread: a=p.x; b=q.x; c=p.x; assert(p==q implies a<=b<=c)
Cannot replace c=p.x with c=a: reads can make a process see writes by another process. The read "kills" later reuse of local values.
Release Consistency
[Gharachorloo et al. 1990, DASH]
- Introduces a special type of variables, called synchronization variables or locks.
- Locks cannot be read or written. They can be acquired and released, denoted acquire(L) and release(L) for a lock L.
- A process that acquired a lock L but has not released it, holds it.
- No more than one process can hold a lock L, while others wait.
K. Gharachorloo, D. Lenoski, J. Laudon, P. Gibbons, A. Gupta, and J.L. Hennessy. Memory consistency and event ordering in scalable shared-memory multiprocessors. In Proceedings of the 17th Annual International Symposium on Computer Architecture, pages 15-26. IEEE, May 1990.
Using release and acquire to define execution-flow synchronization primitives
- Let a set of processes release tokens by reaching the operation release in their program order.
- Let another set (possibly with overlap) acquire those tokens by performing an acquire operation, where acquire can proceed only when all tokens have already arrived from all releasing processes.
- 2-way synchronization = lock-unlock, 1 release, 1 acquire
- n-way synchronization = barrier, n releases, n acquires
- PARC's synch = k-way synchronization
(Almost Formal) Definition of Release Consistency
Assuming atomic read/write/acquire/release (never the case; for simplicity only):
(A) Before a read or write access is allowed to perform, all preceding (program order) acquire accesses must be performed, and
(B) Before a release access is allowed to perform, all preceding (program order) read or write accesses must be performed, and
(C) acquire and release accesses are sequentially consistent.
Understanding RC
[Figure: timelines over time t.
A: w(x)1, rel(L1)
B: r(x)0, r(x)?, acq(L1), r(x)1, r(x)1]
Annotations in the figure:
- From this point in this execution all processes must see the value 1 in x.
- r(x)?: It is undefined what value is read here. It can be any value written by some process. Here it can be 0 or 1.
- According to rule (B): 1 is read in the current execution. However, the programmer cannot be sure 1 will be read in all executions.
- According to rules (A) and (C), the programmer knows that in all executions this read returns 1.
Acquire and Release
- release serves as a memory-synch operation, or a flush of the local modifications to all other processes.
- acquire and release are not only used for synchronization of execution, but also for synchronization of memory, i.e. for propagation of writes from/to other processes.
  - This allows the two expensive types of synchronization to overlap.
  - This also turns out to be simpler for the programmer from a semantic point of view.
Acquire and Release (cont.)
- A release followed by an acquire of the same lock guarantees to the programmer that all writes previous to the release will be seen by all reads following the acquire.
- The idea is to let the programmer decide which blocks of operations need to be synchronized, and put them between a matching pair of acquire-release operations.
- In the absence of release/acquire pairs, there is no assurance that modifications will ever propagate between processes.
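Happened-Before relation induced by acquire/release
[Figure: timelines of processes A, B and C over time t, with acquire and release operations on locks L1 and L2 and the accesses w(x), r(x), w(y), r(y), showing the happened-before order induced by matching release/acquire pairs.]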
Data Races in RC
Release Consistency does not guarantee anything about ordered propagation of updates.
Initially: grades = oldDatabase; updated = false;
Thread T.A.:
grades = newDatabase;
updated = true;
Thread Lecturer:
while (updated == false);
x = grades.gradeOf(lecturersSon);
- If the modification of variable updated is passed to Lecturer while the modification of grades is not, then Lecturer looks at the old database!
- This is possible in Release Consistency, but not in Sequential Consistency.
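One way to remove the race is to put both the updates and the check between acquire and release of the same lock. The sketch below is only an illustration in Python: `threading.Lock` stands in for the lock L, the dictionary `shared` and the function names are hypothetical, and a Python interpreter is not itself a Release Consistency system, so this shows the program structure rather than the memory-model effect.

```python
import threading
import time

L = threading.Lock()  # stands in for the lock L
shared = {"grades": "oldDatabase", "updated": False}


def teaching_assistant():
    with L:  # acquire(L)
        shared["grades"] = "newDatabase"
        shared["updated"] = True
    # leaving the with-block is release(L)


def lecturer(seen):
    while True:
        with L:  # acquire(L) ... release(L) around every check
            if shared["updated"]:
                seen.append(shared["grades"])
                return
        time.sleep(0.001)  # give the other thread a chance to take the lock


def run_once():
    shared["grades"], shared["updated"] = "oldDatabase", False
    seen = []
    threads = [
        threading.Thread(target=lecturer, args=(seen,)),
        threading.Thread(target=teaching_assistant),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen[0]


if __name__ == "__main__":
    print(run_once())
```

Because the Lecturer only reads grades after acquiring the lock that the T.A. released after writing both variables, seeing updated == true implies seeing the new database.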
Expressiveness of Release Consistency
[Gharachorloo et al. 1990]
Theorem: RC = SC for programs having no data-races.
Given a properly-labeled program P, the set of valid executions for P is the same on systems providing RC and those providing SC.
Conclusion (OpenMP, Java, C, etc.):
- The system provides RC (performance)
- The programmer avoids data-races (program verification)
Best of both worlds!
Lazy Release Consistency
[Keleher et al., Treadmarks 1994]
- Postpone modifications until a remote process "really" needs them
- More relaxed than RC
P. Keleher, A. L. Cox, S. Dwarkadas, and W. Zwaenepoel. Treadmarks: Distributed shared memory on standard workstations and operating systems. In Proceedings of the 1994 Winter Usenix Conference, pages 115-131, Jan. 1994.
Formal Definition of Lazy Release Consistency
(A) Before a read or write access is allowed to perform with respect to any other process, all previous acquire accesses must be performed with respect to that other process, and
(B) Before a release access is allowed to perform with respect to any other process, all previous read or write accesses must be performed with respect to that other process, and
(C) acquire and release accesses are sequentially consistent.
Understanding the LRC Memory Model
[Figure: timelines over time t.
A: w(x)1, rel(L1)
B: r(x)0, r(x)?, acq(L1), r(x)?, r(x)1
C: r(x)0, r(x)?, acq(L2), r(x)?, r(x)?]
- It is guaranteed that the acquirer of the same lock sees the modifications that precede the release in program order.
Understanding the LRC Memory Model: Transitivity
- The process C sees the modification of x by A.
[Figure: timelines over time t. A performs w(x)1 and rel(L1). B's timeline contains acq(L1), acq(L2), w(y)1, rel(L2) and rel(L1). C performs acq(L2), then r(x)1 and r(y)1.]
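The transitivity argument can be sketched as a transitive closure over program order plus release-to-acquire edges. The trace below is a simplified assumption based on the figure (A: w(x)1, rel(L1); B: acq(L1), w(y)1, rel(L2); C: acq(L2), r(x)1, r(y)1) and may omit operations shown on the original slide; the function name and the event encoding are hypothetical.

```python
def happened_before(traces, sync_edges):
    """Transitive closure of program order plus release->acquire edges."""
    edges = set(sync_edges)
    # Program order: each operation precedes the next one of the same process.
    for proc, ops in traces.items():
        edges |= {((proc, a), (proc, b)) for a, b in zip(ops, ops[1:])}
    changed = True
    while changed:
        changed = False
        for a, b in list(edges):
            for c, d in list(edges):
                if b == c and (a, d) not in edges:
                    edges.add((a, d))
                    changed = True
    return edges


traces = {
    "A": ["w(x)1", "rel(L1)"],
    "B": ["acq(L1)", "w(y)1", "rel(L2)"],
    "C": ["acq(L2)", "r(x)1", "r(y)1"],
}
# A release happens before the matching acquire of the same lock.
sync_edges = {
    (("A", "rel(L1)"), ("B", "acq(L1)")),
    (("B", "rel(L2)"), ("C", "acq(L2)")),
}
hb = happened_before(traces, sync_edges)

if __name__ == "__main__":
    print((("A", "w(x)1"), ("C", "r(x)1")) in hb)
```

A's write of x is ordered before C's read of x even though A and C never use the same lock; the order goes through B's acquire of L1 and release of L2.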
Implementation of LRC
- Satisfying the happened-before relation between all operations is enough to satisfy LRC.
  - Maintenance and usage of such a detailed ordering would be expensive.
- Instead, the ordering is applied to process intervals.
  - Intervals are segments of time in the execution of a single process.
  - A new interval begins each time a process executes a synchronization operation.
Intervals
[Figure: timelines of processes P1, P2 and P3 over time t. Each acquire or release on the locks L1, L2 and L3 starts a new interval; intervals are numbered consecutively per process (up to 5 in the figure).]
Happened-before of Intervals
A happened-before partial order is defined between intervals.
An interval i_1 precedes an interval i_2 according to happened-before of intervals, if all accesses in i_1 precede accesses in i_2 according to the happened-before of accesses.
Vector Timestamps
- An interval is said to be performed at a process if all the interval's accesses have been performed at that process.
- Each process p has a vector timestamp V_p that tracks which intervals have been performed at that process.
  - A vector timestamp consists of a set of interval indices, one per process in the system.
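A vector timestamp of this kind can be sketched in a few lines. The update rules below are assumptions chosen for illustration, not taken from the slides: each process numbers its own intervals starting from 1, a release passes along the timestamp of the interval it closes, and an acquire merges that timestamp with an element-wise maximum before starting a new interval. The class and function names are hypothetical.

```python
class Process:
    """Keeps V_p: for every process q, the latest interval of q known at p."""

    def __init__(self, pid, n):
        self.pid = pid
        self.vt = [0] * n
        self.vt[pid] = 1  # the process starts in its own interval 1

    def stamp(self):
        return tuple(self.vt)

    def release(self):
        """Close the current interval; return its timestamp for the acquirer."""
        closed = self.stamp()
        self.vt[self.pid] += 1
        return closed

    def acquire(self, releaser_stamp):
        """Merge the releaser's knowledge, then begin a new interval."""
        self.vt = [max(a, b) for a, b in zip(self.vt, releaser_stamp)]
        self.vt[self.pid] += 1


def precedes(u, v):
    """Interval stamped u happened before interval stamped v."""
    return u != v and all(a <= b for a, b in zip(u, v))


if __name__ == "__main__":
    p0, p1, p2 = Process(0, 3), Process(1, 3), Process(2, 3)
    first_p0 = p0.release()      # p0 releases a lock at the end of interval 1
    p1.acquire(first_p0)         # p1 acquires it
    p2.acquire(p1.release())     # p1 releases another lock, p2 acquires it
    print(first_p0, p1.stamp(), p2.stamp())
    print(precedes(first_p0, p2.stamp()))
```

Comparing two timestamps element-wise then decides whether one interval happened before another without tracking individual accesses.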
