Benes Networks Generalizare 2013

CS 498 : Theory of Parallel Computing
Marc Snir
Lecture 13 Oct 7th
Switching Networks, Permutations and FFT Graphs
• A switching network (SN) is a DAG with N inputs and N outputs. Each node has indegree = outdegree = d. A node is a switch that can connect each input to each output (in an arbitrary 1-to-1 connection).
• Figure shows an SN with 2x2 switches; each switch has 2 possible states (straight and crossed).
• SN is configured to connect 0 → 1, 1 → 2, 2 → 3, 3 → 2.
• An SN is a permutation network if it can connect any permutation.
[Figure: SN with inputs and outputs 0 to 3 built from 2x2 switches; the straight and crossed switch states are shown]
Benes Network
• A network with N inputs/outputs (N a power of 2) is built recursively from two lines of 2x2 switches and two N/2 x N/2 Benes networks.
• Each switch connects to the top and bottom subnetworks.
[Figure: two lines of 2x2 switches around a top and a bottom N/2 x N/2 Benes network]
Benes Networks Permute
• Construct a bipartite graph: the nodes are the left and right rows of switches. Nodes u and v are joined by an edge if an input at switch u has to reach an output at switch v.
• Color the edges with two colors so that the two edges at each node get different colors; this is always possible (all cycles have even length).
• Route the connections of one color through the top half subnetwork and the connections of the other color through the bottom half subnetwork.
• Each switch routes one connection to the top subnetwork and one connection to the bottom subnetwork.
• Use the same approach recursively to set up each subnetwork.
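The routing procedure above can be sketched in Python. This is a minimal illustration and not the lecture's own code: the function names (`two_color`, `route`, `apply_network`), the port numbering (switch k holds ports 2k and 2k+1, and its upper port connects to the top subnetwork when the switch is straight) and the nested-tuple representation of the settings are all assumptions made for the example.

```python
def two_color(perm):
    """Assign each input to a subnetwork: 0 = top, 1 = bottom.

    Inputs 2k and 2k+1 share an input switch; the inputs routed to
    outputs 2k and 2k+1 share an output switch. Each such pair must
    get different colors, so we alternate colors along each cycle.
    """
    n = len(perm)
    inv = [0] * n
    for i, o in enumerate(perm):
        inv[o] = i
    color = [None] * n
    for start in range(n):
        i = start
        while color[i] is None:
            color[i] = 0
            j = inv[perm[i] ^ 1]   # input that shares i's output switch
            color[j] = 1
            i = j ^ 1              # input that shares j's input switch
    return color


def route(perm):
    """Return switch settings for a Benes network realizing perm."""
    n = len(perm)
    if n == 2:
        return perm[0] == 1        # a single switch: True means crossed
    color = two_color(perm)
    top, bottom = [0] * (n // 2), [0] * (n // 2)
    for i, o in enumerate(perm):
        (top if color[i] == 0 else bottom)[i // 2] = o // 2
    in_sw = [color[2 * k] == 1 for k in range(n // 2)]
    out_sw = [color[perm.index(2 * k)] == 1 for k in range(n // 2)]
    return in_sw, route(top), route(bottom), out_sw


def apply_network(cfg, data):
    """Simulate the configured network on a list of input values."""
    n = len(data)
    if n == 2:
        return data[::-1] if cfg else list(data)
    in_sw, top_cfg, bot_cfg, out_sw = cfg
    top_in, bot_in = [], []
    for k in range(n // 2):
        a, b = data[2 * k], data[2 * k + 1]
        if in_sw[k]:               # crossed: upper input goes to the bottom
            a, b = b, a
        top_in.append(a)
        bot_in.append(b)
    top_out = apply_network(top_cfg, top_in)
    bot_out = apply_network(bot_cfg, bot_in)
    out = []
    for k in range(n // 2):
        a, b = top_out[k], bot_out[k]
        if out_sw[k]:              # crossed: upper output comes from the bottom
            a, b = b, a
        out += [a, b]
    return out
```

With `perm[i]` giving the output that input i must reach, `apply_network(route(perm), data)` should return a list `out` with `out[perm[i]] == data[i]` for every i. The length of `perm` is assumed to be a power of 2 and at least 2.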
Example
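• Permute: 0->4, 1->2, 2->0, 3->7, 4->5, 5->3, 6->1, 7->6
• Coloring: 0->4, 1->2, 2->0, 3->7, 4->5, 5->3, 6->1, 7->6 (each connection drawn in one of the two colors)
[Figure: "Coloring" panel with the bipartite graph on the four input switches and four output switches; "Network Setup" panel with the resulting 8-input Benes network, inputs and outputs labeled 0 to 7]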
FFT Network
The network with the topology of an FFT computation: $N = 2^n$ inputs and outputs, n = lg N stages, N/2 nodes at each stage. Nodes at stage i are connected to inputs that differ in bit i of the binary representation.
[Figure: 8-input FFT network with rows labeled 000 through 111]
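The stage structure can be made concrete with a small sketch. It assumes bits are numbered from 0 at the least significant position and stages from 0 to n - 1; the slides do not fix this numbering, and the helper name `fft_stage_pairs` is hypothetical.

```python
def fft_stage_pairs(n, i):
    """Pairs of rows joined by a node at stage i (0 <= i < n).

    Each pair consists of two n-bit addresses that differ only in bit i.
    """
    N = 1 << n
    return [(a, a | (1 << i)) for a in range(N) if not a & (1 << i)]


n = 3
for i in range(n):
    pairs = fft_stage_pairs(n, i)
    # each stage has N/2 nodes, one per pair
    print(i, [(format(a, f"0{n}b"), format(b, f"0{n}b")) for a, b in pairs])
```

Every stage yields N/2 pairs, matching the N/2 nodes per stage stated above.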
Homework 1
• Prove: A Benes network consists of a reverse FFT network followed by an FFT network (the middle stage is common).
Homework 2
• Prove: an FFT SN, with all switches straight, performs the bit reversal permutation ($a_n \ldots a_1 \to a_1 \ldots a_n$).
[Figure: 8-input FFT network with rows labeled 000 through 111]
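For reference, the bit reversal permutation itself is easy to tabulate. The sketch below only computes the permutation on n-bit addresses; it does not simulate the network, and the function name `bit_reverse` is an assumption for the example.

```python
def bit_reverse(x, n):
    """Reverse the n-bit binary representation of x."""
    r = 0
    for _ in range(n):
        r = (r << 1) | (x & 1)   # move the lowest bit of x onto the end of r
        x >>= 1
    return r


perm = [bit_reverse(x, 3) for x in range(8)]
# perm == [0, 4, 2, 6, 1, 5, 3, 7]
```

Applying `bit_reverse` twice returns the original address, so the permutation is its own inverse.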
Homework 3
• Prove: A reverse FFT graph is isomorphic to a regular FFT graph; the isomorphism maps output $a_1 \ldots a_n$ of the reverse FFT to input $a_n \ldots a_1$ of the regular FFT graph.
Conclusion
• An SN that consists of the concatenation of 3 FFT graphs is a permutation network.
• The middle FFT graph is set with all switches straight. The combination is isomorphic to a Benes network (if we glue the outputs of the first FFT to the inputs of the third FFT that they are connected to).
• An algorithm that evaluates the FFT graph takes at least
\[ \Omega\!\left( \frac{N \lg N}{P \max(B \lg M,\ \lg(N/P))} \right) \]
steps.
• The algorithms can be modified to evaluate (pebble) 3 FFT graphs in a row. The same communication steps can be used to emulate a Benes network and, hence, to perform any permutation.
Transpose
• Specific, hard permutation: transpose of a $\sqrt{N} \times \sqrt{N}$ matrix.
• Assume $N = 2^n$, n even. Transpose is $X_{a_n \ldots a_1} \leftrightarrow X_{a_{n/2} \ldots a_1 a_n \ldots a_{n/2+1}}$ (rotate binary address).
• Algorithm, for $B^2 < M$: processors read B × B submatrices, transpose them in memory and store back.
• $T = N/(PB)$
[Figure: $\sqrt{N} \times \sqrt{N}$ matrix with a B × B submatrix marked]
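The address rotation and the tiled algorithm can be checked on a small matrix. This is a sequential sketch under stated assumptions: the matrix is stored in row-major order, a block is B consecutive words of one row, B divides the matrix side, and a single processor does all the work. The names `transpose_address` and `blocked_transpose` are hypothetical.

```python
def transpose_address(x, n):
    """Rotate an n-bit address by n/2 bits (n even): row and column bits swap."""
    h = n // 2
    return ((x & ((1 << h) - 1)) << h) | (x >> h)


def blocked_transpose(a, s, B):
    """Transpose an s x s row-major matrix using B x B tiles.

    Returns the transposed matrix and the number of block reads,
    where a block is B consecutive words of one row.
    """
    out = [None] * (s * s)
    reads = 0
    for ti in range(0, s, B):
        for tj in range(0, s, B):
            # read the B row segments of the tile whose corner is (ti, tj)
            tile = []
            for r in range(B):
                start = (ti + r) * s + tj
                tile.append(a[start:start + B])
                reads += 1
            # write the transposed tile at corner (tj, ti)
            for c in range(B):
                start = (tj + c) * s + ti
                out[start:start + B] = [tile[r][c] for r in range(B)]
    return out, reads


n, B = 4, 2                      # N = 16 words, a 4 x 4 matrix
s = 1 << (n // 2)
a = list(range(s * s))
out, reads = blocked_transpose(a, s, B)
```

In this sketch every block is read once, so `reads` equals N/B; dividing that work among P processors gives the N/(PB) figure above.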
Cont.
• $B^2 < M$. Assume $M = 2^m$, $B = 2^b$.
• Basic operation: read $2^{m-b}$ aligned blocks of $2^b$ words, permute in cache, write back. Can permute
\[ X_{a_n \ldots a_1} \leftrightarrow X_{a_n \ldots a_{n/2+m-b+1}\; a_{m-b} \ldots a_1\; a_{n/2} \ldots a_{m-b+1}\; a_{n/2+m-b} \ldots a_{n/2+1}} \]
(permute M/B × B submatrices) in one pass, in time N/(BP).
• Problem essentially solved when each block (of size B) contains the right set of elements; the blocks can then be moved to the right place in one pass.
• Can complete transposition in b/(2(m − b)) passes:
\[ T = O\!\left( \frac{N}{PB} \cdot \frac{\lg B}{\lg(M/B)} \right) \]
Lower Bound
• Need to “gather” words from distinct lines into each line. Show this cannot be done too fast.
• “Step”: one I/O operation done by one processor.
• $x^t_{i,j}$ is the number of words in line i that have to go to line j at end of step t (0 < i, j ≤ N/B).
• $y^t_{i,j}$ is the number of words in cache i that have to go to line j at end of step t (0 < i ≤ P, 0 < j ≤ N/B).
• $\Phi^t = \sum_{i,j} x^t_{i,j} \lg x^t_{i,j} + \sum_{i,j} y^t_{i,j} \lg y^t_{i,j}$ (entropy-like function, taking 0 lg 0 = 0)
• Initially, $x^0_{i,j} = 0$ or $x^0_{i,j} = 1$, and $y^0_{i,j} = 0$, so $\Phi^0 = 0$.
• Finally, $x^T_{i,i} = B$, $x^T_{i,j} = 0$ if $i \ne j$, and $y^T_{i,j} = 0$, so $\Phi^T = (N/B) \cdot (B \lg B) = N \lg B$.
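The initial and final values of the potential can be checked numerically for a small transpose. The sketch assumes a row-major $\sqrt{N} \times \sqrt{N}$ matrix whose side is a multiple of B, lines of B consecutive words, and empty caches (all y values zero) at the start and at the end. The helper names `potential` and `line_counts` are hypothetical.

```python
from math import log2


def potential(counts):
    """Sum of c * lg c over positive counts (0 lg 0 is taken as 0)."""
    return sum(c * log2(c) for c in counts if c > 0)


def line_counts(s, B, transposed):
    """Nonzero x[i][j] values for an s x s row-major matrix with lines of B words.

    Before the transpose, the word at (r, c) sits in line (r*s + c) // B and
    must end up in line (c*s + r) // B. After it, each word is in its target line.
    """
    counts = {}
    for r in range(s):
        for c in range(s):
            dst = (c * s + r) // B
            src = dst if transposed else (r * s + c) // B
            counts[(src, dst)] = counts.get((src, dst), 0) + 1
    return list(counts.values())


s, B = 8, 4                      # N = 64 words, lines of B = 4 words
N = s * s
phi_start = potential(line_counts(s, B, transposed=False))
phi_end = potential(line_counts(s, B, transposed=True))
```

For these parameters `phi_start` is 0 and `phi_end` equals N lg B, in agreement with the two bullets above.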
I/O
• Easy to check that a write does not increase Φ.
• Read by processor i from line k at step t + 1: let $y_j = y^t_{i,j}$ and $x_j = x^t_{k,j}$. Then $\sum_j y_j = M - B$ and $\sum_j x_j = B$. The change in potential is
\[ \nabla\Phi = \sum_j \big( (y_j + x_j) \lg(y_j + x_j) - y_j \lg y_j - x_j \lg x_j \big) \]
• ∇Φ is maximized when $x_1 = \cdots = x_{N/B} = B/(N/B)$ and $y_1 = \cdots = y_{N/B} = (M - B)/(N/B)$. Thus
\[ \nabla\Phi \le \frac{N}{B} \left( \frac{M}{N/B} \lg \frac{M}{N/B} - \frac{M-B}{N/B} \lg \frac{M-B}{N/B} - \frac{B}{N/B} \lg \frac{B}{N/B} \right) = M\,H(B/M) \]
• Here $H(\alpha) = -\alpha \lg \alpha - (1 - \alpha) \lg(1 - \alpha)$. For α < 0.5, $H(\alpha) = O(-\alpha \lg \alpha)$, so that $\nabla\Phi = O(B \lg(M/B))$. It follows that the number of I/O steps is $\Omega((N \lg B)/(B \lg(M/B)))$ and
\[ T = \Omega\!\left( \frac{N \lg B}{PB \lg(M/B)} \right) \]
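The closed form M H(B/M) for the uniform case can be confirmed numerically. The sketch below uses arbitrary sample values of N, M and B chosen only for illustration; `delta_phi` is a hypothetical helper that evaluates the sum in the change-of-potential formula.

```python
from math import log2, isclose


def H(alpha):
    """Binary entropy function, for 0 < alpha < 1."""
    return -alpha * log2(alpha) - (1 - alpha) * log2(1 - alpha)


def delta_phi(xs, ys):
    """Sum over j of (y+x) lg (y+x) - y lg y - x lg x, with 0 lg 0 = 0."""
    def f(v):
        return v * log2(v) if v > 0 else 0.0
    return sum(f(x + y) - f(x) - f(y) for x, y in zip(xs, ys))


N, M, B = 1 << 16, 1 << 8, 1 << 4    # sample sizes, assumed for illustration
lines = N // B
uniform = delta_phi([B / lines] * lines, [(M - B) / lines] * lines)
bound = M * H(B / M)

# Each read raises the potential by at most `bound`, and the potential must
# grow from 0 to N lg B, so at least this many reads are needed in total.
min_reads = N * log2(B) / bound
```

Here `uniform` and `bound` agree up to floating-point rounding. When the words read and the words already in cache are destined for disjoint lines, `delta_phi` returns 0.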
