Introduction
20 June 2011
All-prefix-sums
Definition (All-prefix-sums)
The all-prefix-sums operation (scan) takes a binary associative
operator $\oplus$ and an array of $n$ elements
$[a_0, a_1, \ldots, a_{n-1}]$,
and returns
$[a_0, (a_0 \oplus a_1), \ldots, (a_0 \oplus a_1 \oplus \cdots \oplus a_{n-1})]$.
Sequential Algorithm
import random

# Ten random digits as input
inp = [random.randint(0, 9) for x in range(10)]
out = []
out.append(inp[0])  # the first output is the first input
for i in range(1, len(inp)):
    # each output adds the current input to the previous prefix sum
    out.append(out[-1] + inp[i])
print(inp)
print(out)
Example
input:    6   1   5  10   1   7   2   5   7
output:   6   7  12  22  23  30  32  37  44
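The example above can be reproduced for any associative operator. The following is a minimal sketch, assuming Python's standard `itertools.accumulate` as the scan primitive; the helper name `inclusive_scan` is illustrative and does not come from the slides.

```python
from itertools import accumulate
import operator

def inclusive_scan(values, op=operator.add):
    """Return [a0, a0 op a1, ..., a0 op ... op a(n-1)]."""
    return list(accumulate(values, op))

inp = [6, 1, 5, 10, 1, 7, 2, 5, 7]
print(inclusive_scan(inp))       # [6, 7, 12, 22, 23, 30, 32, 37, 44]
print(inclusive_scan(inp, max))  # running maximum: [6, 6, 6, 10, 10, 10, 10, 10, 10]
```

Any associative operator can be passed as `op`; addition and `max` are only two convenient choices.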
Exclusive Scan
Example
input:    6   1   5  10   1   7   2   5   7
output:   0   6   7  12  22  23  30  32  37
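In the exclusive variant, output element i combines only the inputs before position i, so the first output is the operator's identity. A minimal sequential sketch, assuming addition with identity 0 by default; the function name and its parameters are illustrative.

```python
def exclusive_scan(values, identity=0, op=lambda a, b: a + b):
    """Element i of the result combines values[0..i-1]; element 0 is the identity."""
    out = []
    running = identity
    for v in values:
        out.append(running)        # emit the prefix that excludes the current element
        running = op(running, v)
    return out

inp = [6, 1, 5, 10, 1, 7, 2, 5, 7]
print(exclusive_scan(inp))  # [0, 6, 7, 12, 22, 23, 30, 32, 37]
```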
Notes
Uses
Evaluate polynomials
Solve recurrences
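Both listed uses can be phrased as scans. The sketch below is one possible formulation, not necessarily the one intended in the slides: powers of x come from a multiplicative scan, and a first-order linear recurrence y[i] = a[i]*y[i-1] + b[i] (assuming y[-1] = 0) comes from a scan over pairs (a, b) under composition of affine maps. The function names are hypothetical, and `accumulate(..., initial=...)` assumes Python 3.8 or later.

```python
from itertools import accumulate
import operator

def eval_polynomial(coeffs, x):
    """Evaluate c0 + c1*x + c2*x^2 + ... using a scan with multiplication."""
    # Exclusive multiplicative scan of [x, x, ...] yields [1, x, x^2, ...].
    powers = list(accumulate([x] * (len(coeffs) - 1), operator.mul, initial=1))
    return sum(c * p for c, p in zip(coeffs, powers))

def solve_recurrence(a, b):
    """Solve y[i] = a[i]*y[i-1] + b[i] with y[-1] = 0 via a scan over pairs."""
    def combine(f, g):
        # f and g stand for the maps y -> f[0]*y + f[1]; the result is "f, then g".
        return (f[0] * g[0], g[0] * f[1] + g[1])
    return [pair[1] for pair in accumulate(zip(a, b), combine)]

print(eval_polynomial([1, 2, 3], 2))           # 1 + 2*2 + 3*4 = 17
print(solve_recurrence([1, 2, 3], [1, 1, 1]))  # [1, 3, 10]
```

The pair operator is associative because composing affine maps is associative, which is what makes the recurrence a valid scan.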
Basic concepts
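Definition (Work-efficient)
A parallel algorithm is work-efficient if it performs no more operations
(work) than the sequential version, that is, if the two implementations
have the same work complexity.
Definition (Step complexity)
The number of steps that the algorithm executes, where all operations
performed in parallel count as a single step.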
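The two definitions can be told apart on a simpler operation than scan. The sketch below uses summation as an assumed illustration (it is not taken from the slides): a left-to-right sum and a pairwise tree sum perform the same number of additions, so the tree version is work-efficient, yet it needs far fewer steps when each level runs in parallel.

```python
def sequential_sum(xs):
    """Left-to-right sum: n-1 additions, one per step."""
    total, work = xs[0], 0
    for v in xs[1:]:
        total += v
        work += 1
    return total, work, work        # steps equal work when nothing runs in parallel

def tree_sum(xs):
    """Pairwise (tree) sum; all additions of one level count as one parallel step."""
    level, work, steps = list(xs), 0, 0
    while len(level) > 1:
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        work += len(nxt)
        if len(level) % 2:          # an odd element carries over unchanged
            nxt.append(level[-1])
        level = nxt
        steps += 1
    return level[0], work, steps

xs = list(range(8))
print(sequential_sum(xs))  # (28, 7, 7)
print(tree_sum(xs))        # (28, 7, 3)
```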
Example
Naive parallel scan of 8 elements. Each row is the array after one parallel step; Σ(xi..xj) denotes xi + ... + xj.
input:         x0         x1         x2         x3         x4         x5         x6         x7
after step 1:  Σ(x0..x0)  Σ(x0..x1)  Σ(x1..x2)  Σ(x2..x3)  Σ(x3..x4)  Σ(x4..x5)  Σ(x5..x6)  Σ(x6..x7)
after step 2:  Σ(x0..x0)  Σ(x0..x1)  Σ(x0..x2)  Σ(x0..x3)  Σ(x1..x4)  Σ(x2..x5)  Σ(x3..x6)  Σ(x4..x7)
after step 3:  Σ(x0..x0)  Σ(x0..x1)  Σ(x0..x2)  Σ(x0..x3)  Σ(x0..x4)  Σ(x0..x5)  Σ(x0..x6)  Σ(x0..x7)
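The partial sums in the example can be traced symbolically. This is a small sketch, assuming n is a power of two and representing each slot by the (first, last) input indices it currently covers; `trace_ranges` is an illustrative name.

```python
def trace_ranges(n):
    """Track which input range each slot holds after every step (n a power of two)."""
    ranges = [(k, k) for k in range(n)]       # slot k starts as the sum over x_k..x_k
    history = [ranges]
    offset = 1                                # offset doubles every step: 1, 2, 4, ...
    while offset < n:
        prev = ranges
        # Slot k >= offset absorbs the range stored `offset` slots to its left.
        ranges = [(prev[k - offset][0], prev[k][1]) if k >= offset else prev[k]
                  for k in range(n)]
        history.append(ranges)
        offset *= 2
    return history

for step, row in enumerate(trace_ranges(8)):
    print(step, " ".join(f"(x{lo}..x{hi})" for lo, hi in row))
```

After the last step every slot k covers x0..xk, which is the inclusive scan.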
for d = 1 to log2(n) do
    for all k < n in parallel do
        if k >= 2^(d-1) then
            x[k] = x[k - 2^(d-1)] + x[k]
        end if
    end for
end for
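The parallel loop can be simulated sequentially as long as every element reads the values from before the current step. A minimal sketch under that assumption; the name `naive_parallel_scan` is illustrative, and `offset` plays the role of 2^(d-1).

```python
def naive_parallel_scan(x):
    """Simulate the parallel loop step by step on a copy of x."""
    x = list(x)
    n = len(x)
    offset = 1                      # offset = 2^(d-1) for d = 1, 2, ...
    while offset < n:
        old = x                     # every "thread" reads the values from before this step
        x = [old[k - offset] + old[k] if k >= offset else old[k]
             for k in range(n)]
        offset *= 2
    return x

print(naive_parallel_scan([6, 1, 5, 10, 1, 7, 2, 5]))
# [6, 7, 12, 22, 23, 30, 32, 37]
```

Building a fresh list per step stands in for the simultaneous reads of a real parallel machine.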
Work complexity
Theorem
The algorithm performs
$$\sum_{d=1}^{\log_2 n} \left(n - 2^{d-1}\right) = \sum_{d=1}^{\log_2 n} n - \sum_{d=1}^{\log_2 n} 2^{d-1} = n \log_2 n - (n - 1) = O(n \log n)$$
additions.
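The count can be checked numerically. The sketch below assumes n is a power of two, adds up the n - 2^(d-1) additions of each step, and compares the total with the closed form n log2 n - (n - 1) and with the n - 1 additions of the sequential algorithm; `count_additions` is an illustrative helper.

```python
from math import log2

def count_additions(n):
    """Count additions done by the naive parallel scan on n elements (n a power of two)."""
    work, offset = 0, 1
    while offset < n:
        work += n - offset          # slots k >= offset each perform one addition
        offset *= 2
    return work

for n in (8, 1024):
    closed_form = n * int(log2(n)) - (n - 1)
    print(n, count_additions(n), closed_form, n - 1)
# 8 17 17 7
# 1024 9217 9217 1023
```

The last column is the sequential work, so the naive parallel scan is not work-efficient.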
Notes
for d = 1 to log2(n) do
    for all k < n in parallel do
        if k >= 2^(d-1) then
            x[out][k] = x[in][k - 2^(d-1)] + x[in][k]
        else
            x[out][k] = x[in][k]
        end if
    end for
    swap(in, out)
end for
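The two buffers matter because an in-place update may read an element that another iteration has already overwritten. The sketch below contrasts the two variants in a sequential simulation; updating in ascending k is just one assumed interleaving that exposes the problem, and both function names are illustrative.

```python
def scan_in_place_unsafe(x):
    """Sequential in-place update in ascending k: reads values already overwritten."""
    x, n, offset = list(x), len(x), 1
    while offset < n:
        for k in range(offset, n):
            x[k] = x[k - offset] + x[k]
        offset *= 2
    return x

def scan_double_buffered(x):
    """Two buffers: read only from buf[inp], write only to buf[out], then swap."""
    n = len(x)
    buf = [list(x), [0] * n]
    inp, out, offset = 0, 1, 1
    while offset < n:
        for k in range(n):          # iterations are independent, so their order is irrelevant
            if k >= offset:
                buf[out][k] = buf[inp][k - offset] + buf[inp][k]
            else:
                buf[out][k] = buf[inp][k]
        inp, out = out, inp
        offset *= 2
    return buf[inp]

data = [1, 1, 1, 1]
print(scan_double_buffered(data))  # [1, 2, 3, 4]
print(scan_in_place_unsafe(data))  # [1, 2, 4, 6]
```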