Minimum Spanning Tree Parallel

Design and Analysis of Parallel Algorithms
Harsh Kumar 106113033
Problem Statement:
To find the minimum spanning tree of an undirected graph using
1. Sequential model
2. Shared Memory model (EREW, CRCW)
Algorithms:
Sequential Algorithm:
Assumption:
G = (V, E) denotes a graph G whose vertex set is V and whose edge set
is E. A matrix representation can be used for computer storage and
manipulation of a graph. Let G be a graph whose vertex set is
V = {v_0, v_1, ..., v_{n-1}}. This graph can be uniquely represented by an
n x n adjacency matrix A whose entries a_ij, 0 <= i, j <= n-1, are defined
as follows:
$a_{ij} = \begin{cases} 1, & \text{if } v_i \text{ is connected to } v_j \\ 0, & \text{otherwise} \end{cases}$
For a weighted graph the entries hold the edge weights dist(v_i, v_j)
instead, as in the weight matrix W used by the parallel algorithm.
procedure Sequential MST (A)
Step 1: Include vertex v_0 in the MST and let c(v_i) = v_0 for
i = 1, 2, ..., n-1.
Step 2: This step is repeated as long as there are vertices not yet in
the MST:
(2.1) Include in the tree the closest vertex not yet in the tree; that is, for
all v_i not in the MST find the edge (v_i, c(v_i)) for which
dist(v_i, c(v_i)) is smallest and add it to the tree.
(2.2) For all v_i not in the MST, update c(v_i); that is, assuming that v_j
was the most recently added vertex to the tree, c(v_i) can be updated by
determining the smaller of dist(v_i, c(v_i)) and dist(v_i, v_j).
Analysis:
Step 1 requires n constant-time operations. Step 2 is executed once for
each of the remaining n - 1 vertices. If there are already k vertices in
the tree, then steps 2.1 and 2.2 consist of n - k - 1 and n - k
comparisons, respectively. Thus step 2, and hence the algorithm, requires
time proportional to $\sum_{k=1}^{n-1} (n - k)$, which is $O(n^2)$.
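The bound can be checked by evaluating the sum in closed form. The short derivation below is a sketch that simply adds the comparison counts of steps 2.1 and 2.2 stated above; the symbol T(n) for the total number of comparisons is introduced here for illustration.

```latex
\begin{align*}
T(n) &= \sum_{k=1}^{n-1} \bigl[(n-k-1) + (n-k)\bigr]
      = 2\sum_{k=1}^{n-1} (n-k) \;-\; (n-1) \\
     &= 2 \cdot \frac{n(n-1)}{2} - (n-1)
      = (n-1)^2 = O(n^2).
\end{align*}
```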
Parallel Algorithms:
1. Shared Memory Model:
Assumptions:
The model used is an EREW (Exclusive Read, Exclusive Write) shared
memory model following the SIMD (Single Instruction stream, Multiple
Data stream) model of execution.
For the CRCW model, the same algorithm is used, except that the
MINIMUM and BROADCAST procedures are not required. The time and cost
are the same for both models.
The input is the weight matrix W, of order n x n, which resides in shared
memory, where n is the number of vertices in the graph. V is the vertex
set.
The total number of processors used is $N = n^{1-x}$, where 0 < x < 1.
Each processor P_i is assigned a distinct subsequence V_i of V of size
$n^x$. In other words, P_i is "in charge" of the vertices in V_i. Note that
P_i needs only to store the indices of the first and last vertices in V_i.
During the process of constructing the MST, and for each vertex v_p in V_i
that is not yet in the tree, P_i also keeps track of the closest vertex in
the tree, denoted c(v_p).
The weight matrix W of G is stored in shared memory, where
w_ij = dist(v_i, v_j) for i, j = 0, 1, ..., n-1.
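The assignment of vertices to processors described above can be sketched in a few lines of Java. This is only an illustration, not part of the original implementation: the class name BlockPartition and the method blocks are hypothetical, N is taken as the rounded value of n^(1-x), and the last processor is assumed to absorb any remainder when the blocks do not divide evenly.

```java
import java.util.Arrays;

final class BlockPartition {
    /**
     * Splits vertices 0..n-1 into N = round(n^(1-x)) contiguous blocks.
     * Returns an N x 2 array holding {first index, last index} per processor,
     * which is all a processor needs to store about its block.
     */
    static int[][] blocks(int n, double x) {
        if (n <= 0 || x <= 0.0 || x >= 1.0) {
            throw new IllegalArgumentException("need n > 0 and 0 < x < 1");
        }
        int processors = (int) Math.max(1, Math.round(Math.pow(n, 1.0 - x)));
        processors = Math.min(processors, n);
        int size = n / processors;            // roughly n^x vertices per processor
        int[][] range = new int[processors][2];
        int first = 0;
        for (int i = 0; i < processors; i++) {
            // Assumption: the last processor takes whatever is left over.
            int last = (i == processors - 1) ? n - 1 : first + size - 1;
            range[i][0] = first;
            range[i][1] = last;
            first = last + 1;
        }
        return range;
    }

    public static void main(String[] args) {
        // n = 16 and x = 0.5 give N = 4 processors with 4 vertices each.
        System.out.println(Arrays.deepToString(blocks(16, 0.5)));
        // prints [[0, 3], [4, 7], [8, 11], [12, 15]]
    }
}
```

Choosing x closer to 1 gives fewer processors with larger blocks; choosing x closer to 0 gives more processors with smaller blocks.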
procedure EREW MST (W, TREE)
Step 1:
(1.1) Vertex v_0 in V_0 is labelled as a vertex already in the tree
(1.2) for i = 0 to N-1 do in parallel
          for each vertex v_j in V_i do
              c(v_j) <- v_0
          end for
      end for
Step 2: for i = 1 to n-1 do
(2.1) for j = 0 to N-1 do in parallel
          (i) P_j finds the smallest of the quantities dist(v_p, c(v_p)),
              where v_p is a vertex in V_j that is not yet in the tree
          (ii) Let the smallest quantity found in (i) be dist(v_r, v_t). P_j
               delivers a triple (d_j, a_j, b_j), where
                   d_j = dist(v_r, v_t),
                   a_j = v_r,
                   b_j = v_t
      end for
(2.2) Using procedure MINIMUM, the smallest of the distances d_j and its
      associated vertices a_j and b_j are found; from here on, v_r and v_t
      denote the vertices of this smallest triple
(2.3) P_0 assigns (v_r, v_t) to TREE(i), the ith entry of array TREE
(2.4) Using BROADCAST, v_r is made known to all N processors
(2.5) for j = 0 to N-1 do in parallel
          (i) if v_r is in V_j
              then P_j labels v_r as a vertex already in the tree
              end if
          (ii) for each vertex v_p in V_j that is not yet in the tree do
                   if dist(v_p, v_r) < dist(v_p, c(v_p))
                   then c(v_p) <- v_r
                   end if
               end for
      end for
end for
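Procedures MINIMUM and BROADCAST are used above without being spelled out. One common way to organise them on an EREW machine is as a binary tree of pairwise steps, so that no memory cell is read or written by two processors in the same round. The sketch below simulates those rounds sequentially; the class ErewPrimitives and its methods are hypothetical names, and the layout is an assumption about how such procedures may be built, not the definition used by the text.

```java
import java.util.Arrays;

final class ErewPrimitives {
    /** Tree reduction: returns the index of the smallest d[j] (lowest index on ties). */
    static int minimumIndex(int[] d) {
        int n = d.length;
        int[] best = new int[n];              // best[j] = index of the smallest value known to P_j
        for (int j = 0; j < n; j++) best[j] = j;
        for (int stride = 1; stride < n; stride *= 2) {
            // One simulated parallel round: only P_j with j a multiple of 2*stride is active,
            // and it alone touches slots j and j+stride, so accesses stay exclusive.
            for (int j = 0; j + stride < n; j += 2 * stride) {
                if (d[best[j + stride]] < d[best[j]]) best[j] = best[j + stride];
            }
        }
        return best[0];
    }

    /** Doubling broadcast: each processor that already holds the value copies it to one new cell. */
    static int[] broadcast(int value, int n) {
        int[] cell = new int[n];
        cell[0] = value;
        for (int have = 1; have < n; have *= 2) {
            // One simulated parallel round: P_j (j < have) writes cell j + have; targets are distinct.
            for (int j = 0; j < have && j + have < n; j++) cell[j + have] = cell[j];
        }
        return cell;
    }

    /** Number of rounds either procedure needs for n processors, i.e. ceil(log2 n). */
    static int rounds(int n) {
        int r = 0;
        for (int s = 1; s < n; s *= 2) r++;
        return r;
    }

    public static void main(String[] args) {
        int[] d = {7, 3, 9, 2, 8};
        System.out.println(minimumIndex(d));                   // prints 3
        System.out.println(Arrays.toString(broadcast(42, 5))); // prints [42, 42, 42, 42, 42]
        System.out.println(rounds(5));                         // prints 3
    }
}
```

In the MST procedure the reduction would carry the whole triple (d_j, a_j, b_j) rather than a single index, but the round structure is the same.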
Analysis:
Step 1.1 is done in constant time. Since each processor is in charge of
$n^x$ vertices, step 1.2 requires $n^x$ assignments. Therefore, step 1 runs
in $O(n^x)$ time. In step 2.1, a processor finds the smallest of $n^x$
quantities (sequentially) using $n^x - 1$ comparisons.
Procedures MINIMUM and BROADCAST both involve $O(\log N)$ constant-time
operations. Since $N = n^{1-x}$, steps 2.2 and 2.4 are done in $O(\log n)$
time. Clearly steps 2.3 and 2.5 require constant time and $O(n^x)$ time,
respectively. Because $\log n = O(n^x)$ for any fixed x > 0, each iteration
of step 2 takes $O(n^x)$ time. Since this step is iterated n - 1 times, it
is completed in $O(n^{1+x})$ time.
Consequently, the overall running time of the procedure is $O(n^{1+x})$.

Its cost is
$c(n) = p(n) \times t(n) = n^{1-x} \times O(n^{1+x}) = O(n^2)$
This matches the $O(n^2)$ running time of the sequential algorithm, which
means that the procedure is cost optimal.
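A small numeric instance may make the trade-off concrete. The values n = 1024 and x = 1/2 are chosen here purely for illustration, and constant factors are ignored throughout.

```latex
\begin{align*}
N = p(n) &= n^{1-x} = 1024^{1/2} = 32 \text{ processors}, \\
|V_i| &= n^{x} = 32 \text{ vertices per processor}, \\
t(n) &\propto n^{1+x} = 1024^{3/2} = 32768, \\
c(n) &= p(n) \times t(n) \propto 32 \times 32768 = 1048576 = 1024^2 = n^2 .
\end{align*}
```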
Implementation of the Algorithms:

The algorithm is implemented in Java.
Java threads are used to simulate processors. Each thread is given a
unique id, which it uses to locate the vertices it is responsible for. The
weight matrix is held in a shared field and is therefore accessible to all
threads.
Each thread accesses only the vertices assigned to its id, so exclusive
access is maintained. The threads are created and started. Each waits at
a barrier until all the threads have been created and have reached the
same point. They then all execute the same function, which ensures
lockstep parallel execution and preserves the SIMD property.
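The barrier-based lockstep described above can be illustrated in isolation. The following is a minimal sketch, assuming java.util.concurrent.CyclicBarrier as the synchronisation mechanism; the class LockstepDemo and its method run are hypothetical and exist only for this example. Each thread writes its own cell, waits until every thread has done so, and only then reads a single neighbouring cell.

```java
import java.util.Arrays;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

final class LockstepDemo {
    /** Runs n threads in two phases separated by a barrier and returns what each thread read. */
    static int[] run(int n) {
        int[] written = new int[n];
        int[] seen = new int[n];
        CyclicBarrier barrier = new CyclicBarrier(n);
        Thread[] workers = new Thread[n];
        for (int id = 0; id < n; id++) {
            final int me = id;
            workers[id] = new Thread(() -> {
                try {
                    written[me] = 10 * (me + 1);       // phase 1: exclusive write to own cell
                    barrier.await();                    // nobody continues until all writes are done
                    seen[me] = written[(me + 1) % n];   // phase 2: each cell is read by one thread only
                } catch (InterruptedException | BrokenBarrierException e) {
                    throw new IllegalStateException(e);
                }
            });
            workers[id].start();
        }
        for (Thread t : workers) {
            try {
                t.join();                               // wait for every simulated processor to finish
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run(4)));    // prints [20, 30, 40, 10]
    }
}
```

Without the barrier a thread could read its neighbour's cell before it was written; with it, the two phases behave like two synchronous steps of a shared memory machine.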
The CRCW algorithm is also implemented in C++ using OpenMP (Open
Multi-Processing), an application programming interface (API) that
supports multi-platform shared memory multiprocessing programming in
C and C++.
Implemented Code:
Java Code (Sequential and EREW)
import java.util.LinkedList;
import java.util.List;
import java.util.Scanner;
import java.util.concurrent.*;

public class minSpanningTree {
    // Sentinel weight meaning "no edge between these two vertices".
    public static final int MAX_VALUE = 99999;

    public static void main(String[] args) {
        int adjacency_matrix[][];
        int number_of_vertices, p;
        Scanner scan = new Scanner(System.in);
        System.out.println("Enter the number of vertices");
        number_of_vertices = scan.nextInt();
        // Vertices are numbered 1..n, so row 0 and column 0 stay unused.
        adjacency_matrix = new int[number_of_vertices + 1][number_of_vertices + 1];
        System.out.println("Enter the Weighted Matrix for the graph");
        for (int i = 1; i <= number_of_vertices; i++) {
            for (int j = 1; j <= number_of_vertices; j++) {
                adjacency_matrix[i][j] = scan.nextInt();
                // A diagonal entry or an input of 0 means "no edge".
                if (i == j) {
                    adjacency_matrix[i][j] = MAX_VALUE;
                    continue;
                }
                if (adjacency_matrix[i][j] == 0) {
                    adjacency_matrix[i][j] = MAX_VALUE;
                }
            }
        }
        System.out.println("Enter the number of processors(<=n)");
        p = scan.nextInt();
        System.out.println("Using the sequential algorithm");
        System.out.println("Src Dest Weight");
        sequential sequentialAlgorithm = new sequential(number_of_vertices);
        sequentialAlgorithm.mstAlgo(adjacency_matrix);
        System.out.println("Using the parallel algorithm");
        System.out.println("Src Dest Weight");
        parallel parallelAlgorithm = new parallel(number_of_vertices, p);
        parallelAlgorithm.mstAlgo(adjacency_matrix);
        scan.close();
    }
}
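class sequential {
    private int v;                     // number of vertices
    private List<Edge> spanning_tree;  // edges chosen so far
    private int closest[];             // closest[i] = tree vertex nearest to vertex i
    private boolean visited[];         // visited[i] is true once vertex i is in the tree

    public sequential(int ver) {
        this.v = ver;
        spanning_tree = new LinkedList<Edge>();
        closest = new int[ver + 1];
        visited = new boolean[ver + 1];
    }

    public void mstAlgo(int adjacencyMatrix[][]) {
        // Step 1: start the tree at vertex 1; every other vertex is closest to it.
        visited[1] = true;
        for (int i = 2; i <= v; i++)
            closest[i] = 1;
        // Step 2: add one vertex to the tree per iteration.
        for (int i = 1; i <= v - 1; i++) {
            int mn = minSpanningTree.MAX_VALUE;
            Edge e = new Edge();
            // (2.1) Find the vertex outside the tree that is nearest to the tree.
            for (int j = 1; j <= v; j++) {
                if (visited[j])
                    continue;
                if (adjacencyMatrix[j][closest[j]] < mn) {
                    mn = adjacencyMatrix[j][closest[j]];
                    e.source = j;
                    e.dest = closest[j];
                    e.weight = mn;
                }
            }
            if (mn == minSpanningTree.MAX_VALUE)
                break; // no edge reaches the remaining vertices (disconnected graph)
            visited[e.source] = true;
            spanning_tree.add(e);
            // (2.2) Update closest[] with respect to the vertex just added.
            for (int k = 1; k <= v; k++) {
                if (!visited[k]) {
                    int d1 = adjacencyMatrix[k][closest[k]];
                    int d2 = adjacencyMatrix[k][e.source];
                    if (d2 < d1)
                        closest[k] = e.source;
                }
            }
        }
        for (Edge edge : spanning_tree) {
            System.out.println(edge.source + " " + edge.dest + " " + edge.weight);
        }
    }
}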
class parallel {
    private int v;                    // number of vertices
    Edge edges[];                     // edges[j] = best candidate edge held by processor j
    private int closest[];            // closest[i] = tree vertex nearest to vertex i
    private boolean visited[];        // visited[i] is true once vertex i is in the tree
    private int processors;
    private int adjacencyMatrix[][];
    private int cur[];                // cur[j] = processor j's copy of the vertex added last
    private int selected[];           // selected[j] = number of tree edges, as seen by processor j
    CyclicBarrier cb;

    public parallel(int ver, int n) {
        this.v = ver;
        edges = new Edge[n + 1];
        cur = new int[n + 1];
        closest = new int[ver + 1];
        visited = new boolean[ver + 1];
        processors = n;
        adjacencyMatrix = new int[v + 1][v + 1];
        selected = new int[n];
        for (int i = 0; i < n; i++)
            selected[i] = 0;
        cb = new CyclicBarrier(n);
    }

    public void mstAlgo(int adjMatrix[][]) {
        adjacencyMatrix = adjMatrix;
        int p = v / processors;  // vertices per processor
        int i = 1;               // first vertex of the next block
        for (int k = 0; k <= v; k++)
            closest[k] = 1;
        visited[1] = true;
        for (int j = 0; j < processors; j++) {
            if (j == processors - 1)
                p = v - i + 1;   // the last processor takes the remainder
            edges[j] = new Edge();
            Thread t = new Thread(new processor(j, i, p, cb));
            t.start();
            i = i + p;
        }
    }

    class processor implements Runnable {
        private int index;         // first vertex of this processor's block
        private int noOfElements;  // number of vertices in the block
        private int id;
        private final CyclicBarrier cc;

        public processor(int id, int i, int n, CyclicBarrier cb) {
            this.id = id;
            this.index = i;
            this.noOfElements = n;
            this.cc = cb;
        }

        public void run() {
            while (true) {
                int mn = minSpanningTree.MAX_VALUE;
                // FINDING THE MINIMUM DIST(VR,VT) PAIR IN VERTICES ASSIGNED TO THIS PROCESSOR
                edges[id].weight = mn;
                for (int j = index; j <= index + noOfElements - 1; j++) {
                    if (visited[j])
                        continue;
                    if (adjacencyMatrix[j][closest[j]] < mn) {
                        mn = adjacencyMatrix[j][closest[j]];
                        edges[id].source = j;
                        edges[id].dest = closest[j];
                        edges[id].weight = mn;
                    }
                }
                ruk();
                // MINIMUM FUNCTION THAT TAKES O(log n) TIME TO FIND THE MINIMUM
                for (int k = 0; k <= log2(processors); k++) {
                    int d = id + (int) Math.pow(2, k);
                    int s = 0, t = 0, w = minSpanningTree.MAX_VALUE;
                    if (d < processors) {
                        // Read phase: slot d is read by processor d - 2^k only.
                        s = edges[d].source;
                        t = edges[d].dest;
                        w = edges[d].weight;
                    }
                    ruk(); // all reads of this round finish before any slot is overwritten
                    if (w < edges[id].weight) {
                        edges[id].source = s;
                        edges[id].dest = t;
                        edges[id].weight = w;
                    }
                    ruk();
                }
                // ADDING THE EDGE TO THE SPANNING TREE
                if (id == 0) {
                    if (edges[id].weight != minSpanningTree.MAX_VALUE) {
                        System.out.println(edges[id].source + " " + edges[id].dest + " " + edges[id].weight);
                        visited[edges[id].source] = true;
                        cur[id] = edges[id].source;
                        for (int k = 0; k < processors; k++)
                            selected[k]++;
                    } else {
                        // No edge leaves the tree (disconnected graph): make every processor stop.
                        for (int k = 0; k < processors; k++)
                            selected[k] = v - 1;
                    }
                }
                ruk();
                // BROADCAST FUNCTION THAT TAKES O(log n) TIME TO BROADCAST THE NEW VERTEX
                for (int k = 0; k <= log2(processors); k++) {
                    int r = (int) (Math.pow(2, k) - 1);
                    // Processors 0..r already hold the value; each passes it to one new processor.
                    if (id <= r && id + r + 1 < processors) {
                        cur[id + r + 1] = cur[id];
                    }
                    ruk();
                }
                ruk();
                // UPDATING THE CLOSEST VARIABLE
                for (int i = index; i <= index + noOfElements - 1; i++) {
                    if (!visited[i] && cur[id] != i) {
                        int d1 = adjacencyMatrix[i][closest[i]];
                        int d2 = adjacencyMatrix[i][cur[id]];
                        // d2 == 0 only for the unused row/column 0, so it is ignored.
                        if (d2 < d1 && d2 != 0) {
                            closest[i] = cur[id];
                        }
                    }
                }
                if (selected[id] == v - 1)
                    break;
                ruk();
            }
        }

        // ruk = "wait": block at the barrier until every processor has arrived.
        public void ruk() {
            try {
                cc.await();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }

        // soja = "sleep": pause this thread for i milliseconds (unused helper).
        public void soja(int i) {
            try {
                Thread.sleep(i);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

    // Floor of the base-2 logarithm of n.
    public static int log2(int n) {
        if (n <= 0) throw new IllegalArgumentException();
        return 31 - Integer.numberOfLeadingZeros(n);
    }
}

class Edge {
    int source;
    int dest;
    int weight;
}
C++ Code (CRCW)

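// Compile with OpenMP enabled, e.g. g++ -fopenmp.
#include <climits>
#include <iostream>
using namespace std;
typedef long long ll;
ll v;                    // number of vertices
bool visited[1005];      // visited[i] is true once vertex i is in the tree
ll closest[1005];        // closest[i] = tree vertex nearest to vertex i
ll adj[1005][1005];      // weight matrix; INT_MAX means "no edge"
ll src, dest, weight;    // edge selected in the current iteration
int main(int argc, char *argv[])
{
    int n;
    cout << "\nEnter the number of processors: ";
    cin >> n;
    cout << "\nEnter the number of vertices: ";
    cin >> v;
    cout << "\nEnter the weighted adjacency matrix: ";
    for (ll i = 0; i < v; i++)
    {
        for (ll j = 0; j < v; j++)
        {
            cin >> adj[i][j];
            if (adj[i][j] == 0)
                adj[i][j] = INT_MAX;
        }
    }
    visited[0] = true;
    for (ll i = 1; i < v; i++)
        closest[i] = 0;
    for (ll j = 0; j < v - 1; j++)
    {
        ll mn = INT_MAX;
        #pragma omp parallel for num_threads(n)
        for (ll i = 1; i < v; i++)
        {
            if (visited[i] == true)
                continue;
            // Writes to the shared minimum are serialized so that the
            // smallest candidate wins and the triple stays consistent.
            #pragma omp critical
            {
                if (adj[i][closest[i]] < mn)
                {
                    mn = adj[i][closest[i]];
                    src = i;
                    dest = closest[i];
                    weight = mn;
                }
            }
        }
        cout << src + 1 << " " << dest + 1 << " " << weight << "\n";
        visited[src] = true;
        // Each thread updates closest[] for its own share of the vertices.
        #pragma omp parallel for num_threads(n)
        for (ll i = 0; i < v; i++)
        {
            if (adj[i][src] < adj[i][closest[i]])
                closest[i] = src;
        }
    }
    return 0;
}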
Result:
The following is the result of execution of the algorithm.
Enter the number of vertices
6
Enter the Weighted Matrix for the graph
0 6 8 6 0 0
6 0 0 5 10 0
8 0 0 7 5 3
6 5 7 0 0 0
0 10 5 0 0 3
0 0 3 0 3 0
Enter the number of processors
3
Using the sequential algorithm
Src Dest Weight
2 1 6
4 2 5
3 4 7
6 3 3
5 6 3
Using the parallel algorithm
Src Dest Weight
2 1 6
4 2 5
3 4 7
6 3 3
5 6 3
Comparison:
Attribute       | EREW SM SIMD model                        | Sequential model
----------------+-------------------------------------------+------------------------------------------
Input format    | Number of vertices, weight matrix of a    | Number of vertices and weight matrix of
                | graph and the number of processors        | a graph
Processors used | $N = n^{1-x}$, 0 < x < 1                  | 1
Time complexity | $O(n^{1+x})$                              | $O(n^2)$
Total cost      | $O(n^2)$                                  | $O(n^2)$
Output format   | The edges belonging to the spanning tree  | The edges belonging to the spanning tree
                | are printed                               | are printed
Conclusion:
The minimum spanning tree problem can be solved in $O(n^2)$ time in
sequential execution. For a parallel algorithm, the shared memory approach
on the EREW SM SIMD model gives a running time of $O(n^{1+x})$ using
$n^{1-x}$ processors, for a cost of $O(n^2)$. Thus the algorithm is cost
optimal.
