
Project Topics (PT) for CENG577 Parallel Computing - Fall 2011

Deadline: January 6th, 2012 @ 23:55

General notes:

In every project (when possible) you are expected to:
* use nonblocking operations
* overlap communication with computation
* use efficient kernels

In addition to the above, grading will be based on the correctness of your implementation and the final report (5-10 pages long), which should contain the algorithm description, scalability plots, comparisons, and interpretation of results.

What to submit: Submit (on cow) the final report and the source code as a zip file before the deadline.

Important deadlines:
October 26th, 2011 - email the names of the people in your group to: manguoglu@ceng.metu.edu.tr
November 16th, 2011 - choose your project topic
January 6th, 2012 - submit the final reports and files

PT#1: Parallel sparse matrix vector multiplication with graph and hypergraph partitioning
Given a sparse input matrix, apply a graph (METIS or other) and a hypergraph (hMETIS and/or PaToH or others) partitioning algorithm to the matrix, and implement parallel sparse matrix vector multiplication using MPI and the partitioning computed for a given number of MPI processes. Time the parts of the code (such as the graph partitioning algorithm, the reordering/partitioning operation itself, and the actual sparse matrix vector multiplication). Provide parallel scalability plots as a function of the number of MPI processes using ~10 large square matrices from the University of Florida Sparse Matrix Collection. Compare the graph and hypergraph partitioning algorithms for the same set of matrices. (A minimal sketch of a row-partitioned sparse matrix vector product that overlaps a nonblocking exchange with computation, as asked for in the general notes, follows PT#2.)

PT#2: Parallel banded lower (or upper) triangular solver
Read the paper: "Practical Parallel Band Triangular System Solvers", ACM Transactions on Mathematical Software, Volume 4, Issue 3, Sept. 1978, S. C. Chen, D. J. Kuck, and A. H. Sameh. Implement the algorithm described for solving lower (or upper) triangular systems in the paper (and in the class) using MPI. The input should be a banded matrix stored in the LAPACK banded storage format (a short indexing sketch follows below) and a right hand side which could be a single vector or a group of vectors (i.e. a dense matrix). Measure the parallel scalability as a function of the number of MPI processes and compare the results to the banded solver provided in ScaLAPACK as the matrix size and the lower (or upper) bandwidth are varied.
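The general notes (nonblocking operations, overlapping communication with computation) and PT#1 can be illustrated together. The code below is only a minimal sketch under simplifying assumptions, not the required implementation: it uses the simplest 1D row-block distribution in which every rank gathers the full vector with the MPI-3 nonblocking collective MPI_Iallgatherv and overlaps that exchange with the part of the product that needs only locally owned entries of x. A graph- or hypergraph-partitioned code would instead exchange only the x entries its off-diagonal block actually references, typically with nonblocking point-to-point messages. All types and names (csr_t, par_spmv, ...) are illustrative.

    /* Sketch: 1D row-block parallel SpMV that overlaps the exchange of x with
     * computation on the purely local block.  The local rows are assumed to be
     * split into a "diagonal" part (columns owned by this rank, local column
     * indices) and an "off-diagonal" part (global column indices), both in CSR. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    typedef struct {
        int nrows;
        int *rowptr;            /* size nrows+1                                 */
        int *colind;            /* local indices (diag) or global (off-diag)    */
        double *val;
    } csr_t;

    static void spmv_csr(const csr_t *A, const double *x, double *y, int accumulate)
    {
        for (int i = 0; i < A->nrows; ++i) {
            double s = accumulate ? y[i] : 0.0;
            for (int k = A->rowptr[i]; k < A->rowptr[i + 1]; ++k)
                s += A->val[k] * x[A->colind[k]];
            y[i] = s;
        }
    }

    /* y = A*x.  xloc holds the locally owned entries of x; counts/displs describe
     * how the full vector xglob is assembled from all ranks (as for Allgatherv). */
    static void par_spmv(const csr_t *Adiag, const csr_t *Aoffd,
                         const double *xloc, int nloc,
                         double *xglob, const int *counts, const int *displs,
                         double *y, MPI_Comm comm)
    {
        MPI_Request req;

        /* Post the nonblocking gather of the full x vector ...                  */
        MPI_Iallgatherv(xloc, nloc, MPI_DOUBLE,
                        xglob, counts, displs, MPI_DOUBLE, comm, &req);

        /* ... and overlap it with the part that needs only local entries of x.  */
        spmv_csr(Adiag, xloc, y, 0);

        /* Wait for the remote entries, then add the off-diagonal contribution.  */
        MPI_Wait(&req, MPI_STATUS_IGNORE);
        spmv_csr(Aoffd, xglob, y, 1);
    }

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank, p;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &p);

        /* Tiny p x p example: row r holds 2 on the diagonal (local block) and
         * 1 in column (r+1) mod p (off-diagonal block); each rank owns one row. */
        int dptr[2] = {0, 1}, dcol[1] = {0};              double dval[1] = {2.0};
        int optr[2] = {0, 1}, ocol[1] = {(rank + 1) % p}; double oval[1] = {1.0};
        csr_t Adiag = {1, dptr, dcol, dval};
        csr_t Aoffd = {1, optr, ocol, oval};

        double xloc[1] = {rank + 1.0}, y[1];
        double *xglob  = malloc(p * sizeof *xglob);
        int    *counts = malloc(p * sizeof *counts);
        int    *displs = malloc(p * sizeof *displs);
        for (int i = 0; i < p; ++i) { counts[i] = 1; displs[i] = i; }

        par_spmv(&Adiag, &Aoffd, xloc, 1, xglob, counts, displs, y, MPI_COMM_WORLD);
        printf("rank %d: y = %g\n", rank, y[0]);

        free(xglob); free(counts); free(displs);
        MPI_Finalize();
        return 0;
    }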
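PT#2 through PT#5 all take their input in the LAPACK banded storage format. As a reminder of the layout (a minimal sketch with example data only, not tied to any particular project): a matrix with kl subdiagonals and ku superdiagonals is packed column by column into a (kl+ku+1) x n array, so that with 0-based indices A(i,j) sits at ab[(ku + i - j) + j*ldab]. Note that LAPACK factorization routines such as DGBTRF/DGBSV additionally require kl extra rows for fill-in, i.e. ldab >= 2*kl+ku+1 with the matrix stored starting at row kl.

    /* Sketch of LAPACK-style general band storage: pack a banded matrix and
     * read one entry back to check the indexing.  Values are example data.  */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        int n = 6, kl = 2, ku = 1;          /* bandwidths (example values)      */
        int ldab = kl + ku + 1;             /* leading dimension of band array  */
        double *ab = calloc((size_t)ldab * n, sizeof *ab);

        /* Store A(i,j) = 10*i + j for every entry inside the band.             */
        for (int j = 0; j < n; ++j)
            for (int i = (j - ku > 0 ? j - ku : 0);
                 i <= (j + kl < n - 1 ? j + kl : n - 1); ++i)
                ab[(ku + i - j) + (size_t)j * ldab] = 10.0 * i + j;

        /* Read one entry back: i-j = 1 <= kl, so (4,3) is inside the band.     */
        int i = 4, j = 3;
        printf("A(%d,%d) = %g\n", i, j, ab[(ku + i - j) + (size_t)j * ldab]);

        free(ab);
        return 0;
    }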

PT#3: Parallel banded linear system solver (Spike algorithm)
Read the paper: "On Stable Parallel Linear System Solvers", Sameh, A. H.; Kuck, D. J., Journal of the ACM 25(1), pp. 81-91, 1978. Implement the algorithm described for solving banded linear systems in the paper (and in the class) using MPI. The input should be a banded matrix stored in the LAPACK banded storage format and a right hand side which could be a single vector or a group of vectors (i.e. a dense matrix). Measure the parallel scalability as a function of the number of MPI processes and compare the results to the banded solver provided in ScaLAPACK as the matrix size and the bandwidth are varied.
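Every topic asks for scalability measurements as a function of the number of MPI processes. One possible (not prescribed) way to time each phase consistently is sketched below: the ranks are synchronized before the phase, each rank measures its own elapsed time with MPI_Wtime, and the maximum over ranks is reported, since the slowest rank determines the parallel time. dummy_solve is a placeholder for the phase being measured (partitioning, factorization, solve, ...).

    #include <mpi.h>
    #include <stdio.h>

    /* Time one phase: all ranks start together; report the slowest rank's time. */
    static double timed_phase(void (*phase)(void), const char *name, MPI_Comm comm)
    {
        MPI_Barrier(comm);
        double t0 = MPI_Wtime();
        phase();
        double tloc = MPI_Wtime() - t0, tmax;
        MPI_Reduce(&tloc, &tmax, 1, MPI_DOUBLE, MPI_MAX, 0, comm);

        int rank;
        MPI_Comm_rank(comm, &rank);
        if (rank == 0)
            printf("%-12s %10.6f s\n", name, tmax);
        return tmax;
    }

    static void dummy_solve(void)       /* placeholder for the real computation */
    {
        volatile double s = 0.0;
        for (int i = 0; i < 1000000; ++i) s += 1.0 / (i + 1);
    }

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        timed_phase(dummy_solve, "solve", MPI_COMM_WORLD);
        MPI_Finalize();
        return 0;
    }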

PT#4: Parallel banded linear system solver (Cyclic reduction algorithm)
Read the papers: (1) R. W. Hockney. 1965. A Fast Direct Solution of Poisson's Equation Using Fourier Analysis. J. ACM 12, 1 (January 1965), 95-113. (2) Harold S. Stone. 1975. Parallel Tridiagonal Equation Solvers. ACM Trans. Math. Softw. 1, 4 (December 1975). Implement the algorithm described for solving banded linear systems in the papers (and in the class) using MPI. The input should be a banded matrix stored in the LAPACK banded storage format and a right hand side which could be a single vector or a group of vectors (i.e. a dense matrix). Measure the parallel scalability as a function of the number of MPI processes and compare the results to the banded solver provided in ScaLAPACK as the matrix size and the bandwidth are varied. (A sequential sketch of cyclic reduction for the tridiagonal case follows PT#5.)

PT#5: Parallel banded linear system solver (Balance algorithm)
Read the paper: G. H. Golub, A. H. Sameh, V. Sarin, A parallel balance scheme for banded linear systems, Numerical Linear Algebra with Applications, 8(5), pp. 297-316, 2001. Implement the algorithm described for solving banded linear systems in the paper (and in the class) using MPI. The input should be a banded matrix stored in the LAPACK banded storage format and a right hand side which could be a single vector or a group of vectors (i.e. a dense matrix). Measure the parallel scalability as a function of the number of MPI processes and compare the results to the banded solver provided in ScaLAPACK as the matrix size and the bandwidth are varied.
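For PT#4, the structure of cyclic reduction is easiest to see on a tridiagonal system. The following is a purely sequential sketch assuming n = 2^k - 1 unknowns, meant only to make the reduction and back-substitution pattern concrete; the project itself must distribute the independent updates of each level across MPI processes and handle the general banded case described in the papers. The example coefficients are arbitrary.

    /* Sequential cyclic reduction for a tridiagonal system with n = 2^k - 1:
     *   a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i],   a[0] = c[n-1] = 0.   */
    #include <stdio.h>

    #define K 3
    #define N ((1 << K) - 1)            /* n = 2^k - 1 = 7                    */

    int main(void)
    {
        double a[N], b[N], c[N], d[N], x[N];

        /* Example system: -x[i-1] + 4*x[i] - x[i+1] = i+1.                   */
        for (int i = 0; i < N; ++i) {
            a[i] = (i == 0)     ? 0.0 : -1.0;
            c[i] = (i == N - 1) ? 0.0 : -1.0;
            b[i] = 4.0;
            d[i] = i + 1.0;
        }

        /* Forward reduction: at each level, eliminate the odd-position
         * unknowns so the remaining equations couple unknowns twice as far
         * apart.  All updates within a level are independent (parallel).     */
        for (int half = 1; 2 * half < N + 1; half *= 2) {
            int stride = 2 * half;
            for (int i = stride - 1; i < N; i += stride) {
                double alpha = a[i] / b[i - half];
                double gamma = c[i] / b[i + half];
                b[i] -= alpha * c[i - half] + gamma * a[i + half];
                d[i] -= alpha * d[i - half] + gamma * d[i + half];
                a[i]  = -alpha * a[i - half];
                c[i]  = -gamma * c[i + half];
            }
        }

        /* One decoupled equation is left in the middle.                       */
        x[N / 2] = d[N / 2] / b[N / 2];

        /* Back substitution, level by level.                                  */
        for (int half = (N + 1) / 4; half >= 1; half /= 2) {
            int stride = 2 * half;
            for (int i = half - 1; i < N; i += stride) {
                double xm = (i - half >= 0) ? x[i - half] : 0.0;
                double xp = (i + half <  N) ? x[i + half] : 0.0;
                x[i] = (d[i] - a[i] * xm - c[i] * xp) / b[i];
            }
        }

        for (int i = 0; i < N; ++i)
            printf("x[%d] = %.6f\n", i, x[i]);
        return 0;
    }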
