
Design and Analysis of Parallel Algorithms
Harsh Kumar 106113033

Problem Statement:
To find the minimum spanning tree of an undirected graph using
1. Sequential model
2. Shared Memory model (EREW, CRCW)

Algorithms:
Sequential Algorithm:
Assumption:
G = (V, E) denotes a graph G whose vertex set is V and whose edge set
is E. A matrix representation can be used for computer storage and
manipulation of a graph. Let G be a graph whose vertex set is
$V = \{v_0, v_1, \ldots, v_{n-1}\}$. This graph can be uniquely represented
by an n x n adjacency matrix A whose entries $a_{ij}$, $0 \le i, j \le n-1$,
are defined as follows:

$a_{ij} = 1$ if $v_i$ is connected to $v_j$,
$a_{ij} = 0$ otherwise.

For the MST problem the graph is weighted, so each nonzero entry holds
the edge weight $dist(v_i, v_j)$ instead of 1.

procedure Sequential MST (A)

Step 1: Include vertex $v_0$ in the MST and let $c(v_i) = v_0$ for
i = 1, 2, . . ., n - 1.
Step 2: This step is repeated as long as there are vertices not yet in
the MST:
(2.1) Include in the tree the closest vertex not yet in the tree; that is, for
all $v_i$ not in the MST find the edge $(v_i, c(v_i))$ for which
$dist(v_i, c(v_i))$ is smallest and add it to the tree.
(2.2) For all $v_i$ not in the MST, update $c(v_i)$; that is, assuming that
$v_j$ was the most recently added vertex to the tree, $c(v_i)$ is updated to
whichever vertex gives the smaller of $dist(v_i, c(v_i))$ and $dist(v_i, v_j)$.

Analysis:
Step 1 requires n constant time operations. Step 2 is executed once for
each of n - 1 vertices. If there are already k vertices in the tree, then
steps 2.1 and 2.2 consist of n - k - 1 and n - k comparisons, respectively.
Thus step 2, and hence the algorithm, requires time proportional to
$\sum_{k=1}^{n-1} (n - k)$, which is $O(n^2)$.
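
As a worked check of this bound, the comparison counts stated above (n - k - 1 for step 2.1 and n - k for step 2.2) can be summed in closed form. This is only an illustration that takes those counts at face value:

```latex
\begin{align*}
\sum_{k=1}^{n-1}\bigl[(n-k-1)+(n-k)\bigr]
  &= 2\sum_{k=1}^{n-1}(n-k) \;-\; (n-1) \\
  &= n(n-1) - (n-1) \\
  &= (n-1)^2 \;=\; O(n^2).
\end{align*}
```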

Parallel Algorithms:
1. Shared Memory Model:
Assumptions:
The model used is an EREW (Exclusive Read Exclusive Write) shared
memory model following the SIMD (Single Instruction Stream, Multiple Data
Stream) model of execution.
For the CRCW model, the same algorithm is used, except that the
MINIMUM and BROADCAST procedures are not required. The time and cost are
the same for both models.
The input is the weight matrix W of order $n \times n$, which resides in
shared memory, where n is the number of vertices in the graph (called v in
the code) and V is the vertex set.
The total number of processors used is $N = n^{1-x}$, where 0 < x < 1. Each
processor $P_i$ is assigned a distinct subsequence $V_i$ of V of size $n^x$.
In other words, $P_i$ is "in charge" of the vertices in $V_i$. Note that
$P_i$ needs only to store the indices of the first and last vertices in
$V_i$. During the process of constructing the MST, and for each vertex
$v_p$ in $V_i$ that is not yet in the tree, $P_i$ also keeps track of the
closest vertex in the tree, denoted $c(v_p)$.
The weight matrix W of G is stored in shared memory, where
$w_{ij} = dist(v_i, v_j)$ for i, j = 0, 1, . . ., n - 1.
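
The block assignment described above can be sketched as follows. This is a minimal illustration, not the report's implementation: the class name `BlockPartition` and both helper methods are hypothetical, the rounding of $n^{1-x}$ to an integer is an assumption, and the last processor is assumed to absorb any remainder.

```java
import java.util.Arrays;

class BlockPartition {
    /**
     * Splits vertices 0..n-1 into `processors` contiguous blocks.
     * Returns ranges[i] = {first, last} (inclusive) for processor i.
     * Assumes 1 <= processors <= n; the last block takes the remainder.
     */
    static int[][] blockRanges(int n, int processors) {
        int[][] ranges = new int[processors][2];
        int size = n / processors;
        int first = 0;
        for (int i = 0; i < processors; i++) {
            int count = (i == processors - 1) ? n - first : size;
            ranges[i][0] = first;
            ranges[i][1] = first + count - 1;
            first += count;
        }
        return ranges;
    }

    /** Hypothetical helper: N = n^(1-x), rounded to the nearest integer, at least 1. */
    static int processorCount(int n, double x) {
        return Math.max(1, (int) Math.round(Math.pow(n, 1.0 - x)));
    }

    public static void main(String[] args) {
        int n = 16;
        int processors = processorCount(n, 0.5); // 16^0.5 = 4
        System.out.println(Arrays.deepToString(blockRanges(n, processors)));
    }
}
```

Each processor only needs its own {first, last} pair, which matches the remark that storing the two boundary indices is enough.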
procedure EREW MST (W, TREE)
Step 1:
  (1.1) Vertex $v_0$ in $V_0$ is labelled as a vertex already in the tree
  (1.2) for i = 0 to N - 1 do in parallel
          for each vertex $v_j$ in $V_i$ do
            $c(v_j) \leftarrow v_0$
          end for
        end for
Step 2: for i = 1 to n - 1 do
  (2.1) for j = 0 to N - 1 do in parallel
          (i) $P_j$ finds the smallest of the quantities $dist(v_p, c(v_p))$,
              where $v_p$ is a vertex in $V_j$ that is not yet in the tree
          (ii) Let the smallest quantity found in (i) be $dist(v_r, v_t)$.
               $P_j$ delivers a triple $(d_j, a_j, b_j)$ where
                 $d_j = dist(v_r, v_t)$,
                 $a_j = v_r$,
                 $b_j = v_t$
        end for
  (2.2) Using procedure MINIMUM, the smallest of the distances $d_j$ and
        its vertices $a_j$, $b_j$ are found; from here on $(v_r, v_t)$
        denotes this overall winning pair
  (2.3) $P_0$ assigns $(v_r, v_t)$ to TREE(i), the ith entry of array TREE
  (2.4) Using BROADCAST, $v_r$ is made known to all N processors
  (2.5) for j = 0 to N - 1 do in parallel
          (i) if $v_r$ is in $V_j$
                then $P_j$ labels $v_r$ as a vertex already in the tree
              end if
          (ii) for each vertex $v_p$ in $V_j$ that is not yet in the tree do
                 if $dist(v_p, v_r) < dist(v_p, c(v_p))$
                   then $c(v_p) \leftarrow v_r$
                 end if
               end for
        end for
end for
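
Steps 2.2 and 2.4 rely on MINIMUM and BROADCAST each finishing in about log N rounds. The sketch below simulates both sequentially, round by round, to show the access pattern. It is an assumption-laden illustration: the class `TreeReduce`, its methods, and the `rounds` counter are hypothetical names, and only the distances $d_j$ are reduced (the report's procedure carries the full triple).

```java
import java.util.Arrays;

class TreeReduce {
    /** Number of simulated parallel rounds used by the last minimumIndex call. */
    static int rounds;

    /**
     * Simulates MINIMUM over d[0..N-1]. In each round, "processor" j
     * (j a multiple of 2*stride) compares its current best with the best
     * held by processor j+stride, so no cell is read by two processors
     * in the same round. Returns the index of the smallest distance;
     * ties keep the lower index.
     */
    static int minimumIndex(int[] d) {
        int n = d.length;
        int[] best = new int[n];
        for (int j = 0; j < n; j++) best[j] = j;
        rounds = 0;
        for (int stride = 1; stride < n; stride *= 2) {
            for (int j = 0; j + stride < n; j += 2 * stride) {
                if (d[best[j + stride]] < d[best[j]]) best[j] = best[j + stride];
            }
            rounds++;
        }
        return best[0];
    }

    /**
     * Simulates BROADCAST by doubling: the set of cells holding the value
     * doubles every round, and each holder writes to one distinct cell.
     */
    static int[] broadcast(int value, int n) {
        int[] cell = new int[n];
        cell[0] = value;
        for (int have = 1; have < n; have *= 2) {
            for (int j = 0; j < have && j + have < n; j++) {
                cell[j + have] = cell[j];
            }
        }
        return cell;
    }

    public static void main(String[] args) {
        int[] d = {7, 3, 9, 5, 4};
        System.out.println("winner index = " + minimumIndex(d) + ", rounds = " + rounds);
        System.out.println(Arrays.toString(broadcast(42, 5)));
    }
}
```

For N = 5 the reduction takes 3 rounds, which is the ceiling of log2(5).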

Analysis:
Step 1.1 is done in constant time. Since each processor is in charge of
$n^x$ vertices, step 1.2 requires $n^x$ assignments. Therefore, step 1 runs
in $O(n^x)$ time. In step 2.1, a processor finds the smallest of $n^x$
quantities (sequentially) using $n^x - 1$ comparisons.
Procedures MINIMUM and BROADCAST both involve $O(\log N)$ constant
time operations. Since $N = n^{1-x}$, steps 2.2 and 2.4 are done in
$O(\log n)$ time. Steps 2.3 and 2.5 require constant time and $O(n^x)$ time,
respectively. Because $\log n = O(n^x)$ for any fixed x > 0, each iteration
of step 2 takes $O(n^x)$ time. Since this step is iterated n - 1 times, it
is completed in $O(n^{1+x})$ time.

Consequently, the overall running time of the procedure is $O(n^{1+x})$.

Its cost is
$c(n) = p(n) \times t(n) = n^{1-x} \times O(n^{1+x}) = O(n^2)$.
This matches the $O(n^2)$ running time of the sequential algorithm, so the
procedure is also cost optimal.
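
To make the trade-off concrete, here is the same arithmetic for one hypothetical instance, n = 1024 and x = 1/2. The values are chosen only for illustration and constant factors are ignored:

```latex
\begin{align*}
n &= 1024, \qquad x = \tfrac{1}{2} \\
N &= n^{1-x} = 1024^{1/2} = 32 \quad \text{(processors)} \\
|V_i| &= n^{x} = 1024^{1/2} = 32 \quad \text{(vertices per processor)} \\
t(n) &\sim n^{1+x} = 1024^{3/2} = 32768 \\
c(n) &= N \cdot t(n) = 32 \cdot 32768 = 1048576 = 1024^{2} = n^{2}
\end{align*}
```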

Implementation of the Algorithms:


The algorithm is implemented in Java.
Java threads are used to simulate processors. Each thread is given a
unique id, which it uses to locate the vertices it is responsible for. The
weight matrix is stored in a field shared by all threads, so it is
accessible to all of them.
Each thread accesses only the block of vertices derived from its id, which
is how exclusive access to the per-vertex data is maintained. The threads
are created and started, and they synchronize at a barrier (a CyclicBarrier
in the code): no thread proceeds to the next phase until all threads have
reached the same point. All threads then execute the same function in
lock-step, which gives parallel execution and preserves the SIMD property.
The CRCW algorithm is also implemented in C++ using OpenMP (Open
Multi-Processing, an application programming interface (API) that
supports multi-platform shared memory multiprocessing programming in
C and C++).
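
The barrier idea can be seen in isolation in the following minimal sketch. It is not taken from the report's code: the class `LockStepDemo` and the method `runLockStep` are hypothetical, and the "write own cell, then read the neighbour's cell" pattern is just an assumed example of two lock-step phases with exclusive access per cell.

```java
import java.util.Arrays;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

class LockStepDemo {
    /**
     * Runs p threads in two lock-step phases separated by a barrier.
     * Phase 1: thread id writes 10*id into its own cell.
     * Phase 2: thread id reads the cell of thread (id+1) mod p.
     * In each phase every cell is touched by exactly one thread.
     */
    static int[] runLockStep(int p) {
        int[] shared = new int[p];
        int[] seen = new int[p];
        CyclicBarrier barrier = new CyclicBarrier(p);
        Thread[] workers = new Thread[p];
        for (int t = 0; t < p; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                try {
                    shared[id] = 10 * id;             // phase 1: exclusive write
                    barrier.await();                  // wait until all writes are done
                    seen[id] = shared[(id + 1) % p];  // phase 2: exclusive read
                } catch (InterruptedException | BrokenBarrierException e) {
                    throw new IllegalStateException(e);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();                             // make results visible to the caller
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(runLockStep(4)));
    }
}
```

Without the `await()` call a thread could read its neighbour's cell before it had been written, which is exactly the situation the barrier in the implementation is meant to rule out.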

Implemented Code:
Java Code (Sequential and EREW)
import java.util.LinkedList;
import java.util.List;
import java.util.Scanner;
import java.util.concurrent.*;

public class minSpanningTree {
    public static final int MAX_VALUE = 99999;

    public static void main(String[] args) {
        int adjacency_matrix[][];
        int number_of_vertices, p;
        Scanner scan = new Scanner(System.in);
        System.out.println("Enter the number of vertices");
        number_of_vertices = scan.nextInt();
        // Vertices are numbered 1..n, so row 0 and column 0 are unused.
        adjacency_matrix = new int[number_of_vertices + 1][number_of_vertices + 1];
        System.out.println("Enter the Weighted Matrix for the graph");
        for (int i = 1; i <= number_of_vertices; i++) {
            for (int j = 1; j <= number_of_vertices; j++) {
                adjacency_matrix[i][j] = scan.nextInt();
                // The diagonal and every 0 entry mean "no edge".
                if (i == j) {
                    adjacency_matrix[i][j] = MAX_VALUE;
                    continue;
                }
                if (adjacency_matrix[i][j] == 0) {
                    adjacency_matrix[i][j] = MAX_VALUE;
                }
            }
        }
        System.out.println("Enter the number of processors(<=n)");
        p = scan.nextInt();
        System.out.println("Using the sequential algorithm");
        System.out.println("Src Dest Weight");
        sequential sequentialAlgo = new sequential(number_of_vertices);
        sequentialAlgo.mstAlgo(adjacency_matrix);
        System.out.println("Using the parallel algorithm");
        System.out.println("Src Dest Weight");
        parallel parallelAlgo = new parallel(number_of_vertices, p);
        parallelAlgo.mstAlgo(adjacency_matrix);
        scan.close();
    }
}
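class sequential {
    private int v;
    public static final int MAX_VALUE = 99999;
    private List<Edge> spanning_tree;
    private int closest[];     // closest[j]: tree vertex nearest to non-tree vertex j
    private boolean visited[]; // visited[j]: true once j is in the tree

    public sequential(int ver) {
        this.v = ver;
        spanning_tree = new LinkedList<Edge>();
        closest = new int[ver + 1];
        visited = new boolean[ver + 1];
    }

    public void mstAlgo(int adjacencyMatrix[][]) {
        // Step 1: vertex 1 starts the tree and is everyone's closest tree vertex.
        visited[1] = true;
        for (int i = 2; i <= v; i++)
            closest[i] = 1;
        // Step 2: add one vertex per iteration.
        for (int i = 1; i <= v - 1; i++) {
            int mn = MAX_VALUE;
            Edge e = new Edge();
            // (2.1) Find the non-tree vertex closest to the tree.
            for (int j = 1; j <= v; j++) {
                if (visited[j])
                    continue;
                if (adjacencyMatrix[j][closest[j]] < mn) {
                    mn = adjacencyMatrix[j][closest[j]];
                    e.source = j;
                    e.dest = closest[j];
                    e.weight = mn;
                }
            }
            if (e.source == 0)
                break; // no reachable vertex left: the graph is disconnected
            visited[e.source] = true;
            spanning_tree.add(e);
            // (2.2) Update closest[] against the vertex just added.
            for (int k = 1; k <= v; k++) {
                if (!visited[k]) {
                    int d1 = adjacencyMatrix[k][closest[k]];
                    int d2 = adjacencyMatrix[k][e.source];
                    if (d2 < d1)
                        closest[k] = e.source;
                }
            }
        }
        for (Edge edge : spanning_tree) {
            System.out.println(edge.source + " " + edge.dest + " " + edge.weight);
        }
    }
}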
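class parallel {
    private int v;
    Edge edges[];                    // edges[j]: candidate triple delivered by processor j
    public static final int MAX_VALUE = 99999;
    private int closest[];
    private boolean visited[];
    private int processors;
    private int adjacencyMatrix[][];
    private int cur[];               // cur[j]: processor j's copy of the broadcast vertex
    private int selected[];          // selected[j]: number of tree edges chosen so far
    CyclicBarrier cb;

    public parallel(int ver, int n) {
        this.v = ver;
        edges = new Edge[n + 1];
        cur = new int[n + 1];
        closest = new int[ver + 1];
        visited = new boolean[ver + 1];
        processors = n;
        adjacencyMatrix = new int[v + 1][v + 1];
        selected = new int[n];
        for (int i = 0; i < n; i++)
            selected[i] = 0;
        cb = new CyclicBarrier(n);
    }

    public void mstAlgo(int adjMatrix[][]) {
        adjacencyMatrix = adjMatrix;
        int p = v / processors;      // vertices per processor
        int i = 1;                   // first vertex of the current block
        // Step 1: vertex 1 starts the tree.
        for (int k = 0; k <= v; k++)
            closest[k] = 1;
        visited[1] = true;
        // Allocate every slot before any thread starts, so no thread can see a null entry.
        for (int j = 0; j < processors; j++)
            edges[j] = new Edge();
        for (int j = 0; j < processors; j++) {
            if (j == processors - 1)
                p = v - i + 1;       // the last processor takes the remainder
            Thread t = new Thread(new processor(j, i, p, cb));
            t.start();
            i = i + p;
        }
    }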
    class processor extends Thread {
        private int index;        // first vertex owned by this processor
        private int noOfElements; // number of vertices owned
        private int id;
        private final CyclicBarrier cc;

        public processor(int id, int i, int n, CyclicBarrier cb) {
            this.id = id;
            this.index = i;
            this.noOfElements = n;
            this.cc = cb;
        }

        // Assumes a connected graph; otherwise selected[] never reaches v-1.
        public void run() {
            while (true) {
                int mn = MAX_VALUE;
                // (2.1) FINDING THE MINIMUM DIST(VR,VT) PAIR IN VERTICES ASSIGNED TO THIS PROCESSOR
                edges[id].weight = mn;
                for (int j = index; j <= index + noOfElements - 1; j++) {
                    if (visited[j])
                        continue;
                    if (adjacencyMatrix[j][closest[j]] < mn) {
                        mn = adjacencyMatrix[j][closest[j]];
                        edges[id].source = j;
                        edges[id].dest = closest[j];
                        edges[id].weight = mn;
                    }
                }
                ruk();
                // (2.2) MINIMUM FUNCTION THAT TAKES O(log n) Time to FIND THE MINIMUM
                for (int k = 0; k <= log2(processors); k++) {
                    int c = (int) Math.pow(2, k);
                    int d = Math.min(id + c, processors - 1);
                    // Read the partner's triple first, then synchronize, then write,
                    // so no slot is read and written in the same phase.
                    int s = edges[d].source, t = edges[d].dest, w = edges[d].weight;
                    ruk();
                    if (edges[id].weight > w && w != 0) {
                        edges[id].source = s;
                        edges[id].dest = t;
                        edges[id].weight = w;
                    }
                    ruk();
                }
                // (2.3) ADDING THE EDGE TO THE SPANNING TREE
                if (id == 0 && edges[id].weight != MAX_VALUE) {
                    System.out.println(edges[id].source + " " + edges[id].dest + " " + edges[id].weight);
                    visited[edges[id].source] = true;
                    cur[id] = edges[id].source;
                    for (int k = 0; k < processors; k++)
                        selected[k]++;
                }
                ruk();
                // (2.4) BROADCAST FUNCTION THAT TAKES O(log n) Time to BROADCAST MINIMUM
                for (int k = 0; k <= log2(processors); k++) {
                    int l = 0;
                    int r = (int) (Math.pow(2, k) - 1);
                    // Processors 0..2^k-1 already hold the value and pass it 2^k cells on.
                    if (id <= r && id >= l) {
                        cur[Math.min(id + r + 1, processors - 1)] = cur[id];
                    }
                    ruk();
                }
                ruk();
                // (2.5) UPDATING THE CLOSEST VARIABLE
                for (int i = index; i <= index + noOfElements - 1; i++) {
                    if (!visited[i] && cur[id] != i) {
                        int d1 = adjacencyMatrix[i][closest[i]];
                        int d2 = adjacencyMatrix[i][cur[id]];
                        if (d2 < d1 && d2 != 0) {
                            closest[i] = cur[id];
                        }
                    }
                }
                // All threads see the same count, so they all leave in the same iteration.
                if (selected[id] == v - 1)
                    break;
                ruk();
            }
        }

        // Wait at the barrier until every processor thread has arrived.
        public void ruk() {
            try {
                cc.await();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }

        // Sleep for i milliseconds (debugging helper, not used by run()).
        public void soja(int i) {
            try {
                Thread.sleep(i);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
    // floor(log2(n)) for n > 0
    public static int log2(int n) {
        if (n <= 0) throw new IllegalArgumentException();
        return 31 - Integer.numberOfLeadingZeros(n);
    }
}

class Edge {
    int source;
    int dest;
    int weight;
}

C++ Code (CRCW)


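#include<bits/stdc++.h>
using namespace std;
typedef long long ll;
// Build with OpenMP enabled (for example g++ -fopenmp). Supports up to 1005 vertices.
ll v;
bool visited[1005];
ll closest[1005];
ll adj[1005][1005];
ll src,dest,weight;
int main(int argc, char *argv[])
{
    int n;
    cout<<"\nEnter the number of processors: ";
    cin>>n;
    cout<<"\nEnter the number of vertices: ";
    cin>>v;
    cout<<"\nEnter the weighted adjacency matrix: ";
    for(ll i=0;i<v;i++)
    {
        for(ll j=0;j<v;j++)
        {
            cin>>adj[i][j];
            // A 0 entry means "no edge".
            if(adj[i][j]==0)
                adj[i][j]=INT_MAX;
        }
    }
    // Vertex 0 starts the tree and is every other vertex's closest tree vertex.
    visited[0]=true;
    for(ll i=1;i<v;i++)
        closest[i]=0;

    for(ll j=0;j<v-1;j++)
    {
        ll mn=INT_MAX;
        src=-1;
        // OpenMP threads do not get CRCW semantics for free: unsynchronized
        // writes to mn and src would be a data race. Each thread therefore
        // keeps a private minimum and merges it in a critical section.
        #pragma omp parallel num_threads(n)
        {
            ll localMn=INT_MAX, localSrc=-1;
            #pragma omp for nowait
            for(ll i=1;i<v;i++)
            {
                if(visited[i]==true)
                    continue;
                if(adj[i][closest[i]]<localMn)
                {
                    localMn=adj[i][closest[i]];
                    localSrc=i;
                }
            }
            #pragma omp critical
            {
                // Ties go to the lower vertex index, as in a sequential scan.
                if(localSrc!=-1 && (localMn<mn || (localMn==mn && localSrc<src)))
                {
                    mn=localMn;
                    src=localSrc;
                }
            }
        }
        if(src==-1)
            break; // no reachable vertex left: the graph is disconnected
        dest=closest[src];
        weight=mn;

        cout<<src+1<<" "<<dest+1<<" "<<weight<<"\n";

        visited[src]=true;
        // Each iteration writes only closest[i], so this loop has no conflicts.
        #pragma omp parallel for num_threads(n)
        for(ll i=0;i<v;i++)
        {
            if(adj[i][src]<adj[i][closest[i]])
                closest[i]=src;
        }
    }
    return 0;
}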

Result:
The following is the result of execution of the algorithm.
Enter the number of vertices
6
Enter the Weighted Matrix for the graph
0 6 8 6 0 0
6 0 0 5 10 0
8 0 0 7 5 3
6 5 7 0 0 0
0 10 5 0 0 3
0 0 3 0 3 0
Enter the number of processors
3
Using the sequential algorithm
Src Dest Weight
2 1 6
4 2 5
3 4 7
6 3 3
5 6 3
Using the parallel algorithm
Src Dest Weight
2 1 6
4 2 5
3 4 7
6 3 3
5 6 3
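
As a sanity check on the output above, the sketch below confirms that the five printed edges form a spanning tree of the six vertices and adds up their weights. It is only a verification aid under stated assumptions: the class `MstCheck` and its methods are hypothetical, and the check covers the spanning-tree property and the total weight, not minimality.

```java
class MstCheck {
    /** Union-find root lookup with path halving. */
    static int find(int[] parent, int a) {
        while (parent[a] != a) {
            parent[a] = parent[parent[a]];
            a = parent[a];
        }
        return a;
    }

    /**
     * Returns the total weight if `edges` (rows of {src, dest, weight})
     * form a spanning tree on vertices 1..n, and -1 otherwise.
     * n-1 edges without a cycle necessarily connect all n vertices.
     */
    static int spanningTreeWeight(int n, int[][] edges) {
        if (edges.length != n - 1) return -1;
        int[] parent = new int[n + 1];
        for (int i = 0; i <= n; i++) parent[i] = i;
        int total = 0;
        for (int[] e : edges) {
            int a = find(parent, e[0]);
            int b = find(parent, e[1]);
            if (a == b) return -1; // this edge would close a cycle
            parent[a] = b;
            total += e[2];
        }
        return total;
    }

    public static void main(String[] args) {
        // Edges exactly as printed in the Result section.
        int[][] reported = {{2, 1, 6}, {4, 2, 5}, {3, 4, 7}, {6, 3, 3}, {5, 6, 3}};
        System.out.println(spanningTreeWeight(6, reported)); // prints 24
    }
}
```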

Comparison:

Attribute         EREW SM SIMD model                  Sequential model
---------------   ---------------------------------   ---------------------------------
Input format      Number of vertices, weight          Number of vertices and weight
                  matrix of a graph and the           matrix of a graph
                  number of processors
Processors used   $N = n^{1-x}$, 0 < x < 1            1
Time complexity   $O(n^{1+x})$                        $O(n^2)$
Total cost        $O(n^2)$                            $O(n^2)$
Output format     The edges belonging to the          The edges belonging to the
                  spanning tree are printed           spanning tree are printed

Conclusion:
The minimum spanning tree problem can be solved in $O(n^2)$ time by
sequential execution. The parallel algorithm on the EREW SM SIMD model
runs in $O(n^{1+x})$ time but requires $n^{1-x}$ processors, for a cost
of $O(n^2)$. Thus the parallel algorithm is cost optimal.
