In the introductory course, we used nano as our text editor. In the intermediate
course, we used vim. Finally, in this advanced course, we'll provide an
introduction to Emacs, which can be described as the most advanced text editor
and is particularly popular among programmers.
The big feature of Emacs is its extremely high number of built-in commands, customisation options, and extensions, so extensive that those explored here only begin to touch the extraordinarily diverse world that is Emacs. Indeed, Eric Raymond notes: "[i]t is a common joke, both among fans and detractors of Emacs, to describe it as an operating system masquerading as an editor".
Among other things, Emacs includes an email client and an ftp client. It provides file difference, merging, and version control, a text-based adventure game, and even a Rogerian psychotherapist.
This all said, Emacs is not easy for beginners to learn. The level of customisation and the detailed use of meta- and control-characters do serve as a barrier to immediate entry.
Emacs is launched by simply typing 'emacs' on the command line. Commands are invoked by a combination of the Control (Ctrl) key and a character key (C-<chr>), or the Meta key (Alt, or Esc) and a character key (M-<chr>). Of course, if you're using a terminal emulator, as is often the case, the Alt key will probably be intercepted by the emulator itself, so you'll want to use Esc instead. Note however that the Esc key is not a shift key; rather than hold it down, type it distinctly.
To quit Emacs use C-x C-c; you'll use this a lot as a beginner! Note that, as with all Emacs commands, this represents two sets of keystrokes; the space is not actually typed.
If an Emacs session crashed recently, M-x recover-session can recover the files
that were being edited.
The help files are accessed with C-h and the manual with C-h r.
Emacs has three main data structures, files, buffers, and windows, which are essential to understand.
A file is the actual file on disk. Strictly, when using Emacs one does not actually edit a file. Rather, the file is copied into a buffer, then edited, and then saved. Buffers can be deleted without deleting the file on disk.
The buffer is a data space within Emacs for editing a copy of the file. Emacs can handle many buffers simultaneously, the effective limit being the maximum buffer size, determined by the integer capacity of the processor and by memory (e.g., for 64-bit machines, this maximum buffer size is 2^61 - 2 bytes). A buffer has a name, usually taken from the file from which it has copied the data.
A window is the user's view of a buffer. Not all buffers may be visible to the user
at once due to the limits of screen size. A user may split the screen into multiple
windows. Windows can be created and deleted, without deleting the buffer
associated with the window.
Emacs also has a blank line below the mode line to display messages, and to take input for prompts from Emacs. This is called the minibuffer, or echo area.
Cursor keys can be used to move around the text, along with Page Up and Page Down, if the terminal supports them. However, Emacs aficionados will recommend the use of the control key for speed. Common commands include the following; you may notice a pattern in the command logic:
C-q (prefix command; use when you want to enter a control key into the buffer, e.g., C-q ESC inserts an Escape)
As with the page-up and page-down keys on a standard keyboard, you will discover that Emacs also interprets the Backspace and Delete keys as expected.
A selection can be cut (or 'killed' in Emacs lingo) by marking the beginning of the selected text with C-SPC (space), marking the end of it with standard cursor movements, and entering C-w. Text that has been cut can be pasted ('yanked') by moving the cursor to the appropriate location and entering C-y.
Emacs commands also accept a numeric input for repetition, in the form of C-u, the number of times the command is to be repeated, followed by the command (e.g., C-u 8 C-n moves eight lines down the screen).
There are only three main file manipulation commands that a user needs to know: how to find a file, how to save a file from a buffer, and how to save all.
The first command is C-x C-f, shorthand for "find-file". This command first prompts for the name of the file. If it is already copied into a buffer, Emacs will switch to that buffer. If it is not, it will create a new buffer with the name requested.
For the second command, to save a buffer to a file with the buffer's name, use C-x C-s, shorthand for "save-buffer".
The third command is C-x s. This is shorthand for "save-some-buffers" and will cycle through each open buffer and prompt the user for an action (save, don't save, check and maybe save, etc.).
There are four main commands relating to buffer management that a user needs to know: how to switch to a buffer, how to list existing buffers, how to kill a buffer, and how to read a buffer in read-only mode.
To switch to a buffer, use C-x b. This will prompt for a buffer name, and switch the buffer of the current window to that buffer. It does not change your existing windows. If you type a new name, it will create a new empty buffer.
To list the currently active buffers, use C-x C-b. This will provide a new window which lists the current buffers' names, whether they have been modified, their size, and the file that they are associated with.
To kill a buffer, use C-x k. This will prompt for the buffer name, and then remove
the data for that buffer from Emacs, with an opportunity to save it. This does not
delete any associated files.
Emacs has its own windowing system, consisting of several areas of framed text. The behaviour is similar to a tiling window manager; none of the windows overlap with each other.
C-x 0 (delete the current window)
C-x 1 (delete all windows except the current one)
C-x 2 (split the current window in two, one above the other)
C-x 3 (split the current window in two, side by side)
C-x o (move the cursor to the other window)
C-x ^ (make the current window taller)
C-x } (make the current window wider)
C-x { (make the current window narrower)
A common use is to bring up other documents or menus. For example, the key sequence C-h usually calls the help files. If this is followed by k, it will open a new vertical window, and with C-f, it will display the help information for the command C-f (i.e., C-h k C-f). This new window can be closed with C-x 1.
Emacs is notable for having a very large undo sequence, limited by system resources rather than application resources. This undo sequence is invoked with C-_ (control underscore), or with C-x u. However, it has a special feature: by engaging in a simple navigation command (e.g., C-f) the undo action is pushed to the top of the stack, and therefore the user can undo an undo command.
Emacs can make it easier to read C and C++ by colour-coding such files; this is done through the ~/.emacs configuration file, by adding (global-font-lock-mode t). Programmers also appreciate being able to run the GNU debugger (GDB) from within Emacs. The command M-x gdb will start up gdb. If there's a breakpoint, Emacs automatically pulls up the appropriate source file, which gives better context than standard GDB.
Good knowledge of scripting is required for any advanced Linux user, and especially for those who find that they have regular tasks, such as the processing of data through a program. Shell scripting is not terribly difficult, although sometimes some austere syntax bugs may prove frustrating - but the machine is just doing what you asked it to. Despite their often under-rated utility, shell scripts are not the answer to everything. They are not great at resource-intensive tasks (e.g., extensive file operations) where speed is important. They are not recommended for heavy-duty maths operations (use C, C++, or Fortran instead). Nor are they recommended in situations where data structures, multi-dimensional arrays (it's not a database!), or port/socket I/O are important.
The basic use of scripts was also illustrated in the Intermediate course. In this Advanced course we will revisit these concepts but with more sophisticated and complex examples. In addition there will be a close look at internal commands and filters, process substitution, functions, arrays, and debugging.
The simplest script is simply one that runs a list of system commands. At the least this saves the time of retyping the sequence each time it is used, and reduces the possibility of error. For example, in the Intermediate course, the following script was recommended to calculate the disk use in a directory. It's a good script, very handy, but how often would you want to type it? Instead, enter it once and keep it. You will recall, of course, that a script starts with an invocation of the shell, followed by commands.
emacs diskuse.sh
#!/bin/bash
du -sk * | sort -nr | cut -f2 | xargs -d "\n" du -sh > diskuse.txt
chmod +x diskuse.sh
Making the script a little more complex, variables are usually better than hard-coded values. There are two potential variables in this script, the wildcard '*' and the exported filename "diskuse.txt". In the former case, we'll keep the wildcard as it allows a certain portability of the script - it can run in any directory it is invoked from. For the latter case however, we'll use the date command so that a history of disk use can be created and reviewed for changes. It's also good practice to alert the user when the script is completed and, although it is not always necessary, it is also good practice to cleanly finish any script with 'exit'.
emacs diskuse.sh
#!/bin/bash
DU=diskuse$(date +%Y%m%d).txt
du -sk * | sort -nr | cut -f2 | xargs -d "\n" du -sh > $DU
echo "Disk summary completed and sorted."
exit
The following script searches through any specified text file for text before and after the ubiquitous email "@" symbol and outputs these as a csv file through use of grep, sed, and sort (for neatness). If the input or the output file is not specified, it exits after echoing the error.
emacs findemails.sh
#!/bin/bash
# Search for email addresses in a file, extract, and turn into a csv with a designated file name
INPUT=${1}
OUTPUT=${2}
if [ ! -f "$INPUT" ] || [ -z "$OUTPUT" ]; then
    echo "Input file not found, or output file not specified. Exiting script."
    exit 1
fi
grep --only-matching -E '[.[:alnum:]]+@[.[:alnum:]]+' "$INPUT" > "$OUTPUT"
sed -i 's/$/,/g' "$OUTPUT"
sort -u "$OUTPUT" -o "$OUTPUT"
sed -i '{:q;N;s/\n/ /g;t q}' "$OUTPUT"
echo "Data file extracted to $OUTPUT"
exit
chmod +x findemails.sh
Test this file with hidden.txt as the input text and found.csv as the output text.
The output will include a final comma on the last line, but this is potentially useful if one wants to run the script with several input files and append to the same output file (simply change the single redirection in the grep statement to a double, appending redirection).
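That change amounts to one character. As a minimal sketch, gathering addresses from several input files into one cumulative output (the names input1.txt and input2.txt are illustrative, not from the course files):

```shell
#!/bin/bash
# Append matches from several input files to a single output file.
# '>>' appends rather than truncating, so earlier matches are kept.
for INPUT in input1.txt input2.txt; do
    grep --only-matching -E '[.[:alnum:]]+@[.[:alnum:]]+' "$INPUT" >> found.csv
done
```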
A serious weakness of the script (so far) is that it will gather any string with the '@' symbol in it, regardless of whether it's a well-formed email address or not. So it's not quite suitable for screen-scraping usenet for email addresses to turn into a spammer's list. But it's getting close.
1.2.3 Reads
The read command simply reads a line from standard input. By applying the -n option it can read in a number of characters, rather than a whole line, so -n1 is "read a single character". The use of the -r option reads the input as raw input, so that the backslash (for example) doesn't act as a newline escape character, and the -p option displays a prompt. Plus, a -t timeout-in-seconds option can also be added. Combined, these can be used to the effect of "press any key to continue", with a limited timeframe.
emacs findemails.sh
#!/bin/bash
# Search for email addresses in a file, extract, and turn into a csv with a designated file name
..
..
read -t5 -n1 -r -p "Press any key to see the list, sorted and with unique records..."
if [ $? -eq 0 ]; then
echo A key was pressed.
else
echo No key was pressed.
exit 0
fi
less $OUTPUT | \
# Output file, piped through sort and uniq.
sort | uniq
exit
Any text following a # (with the exception of #!) is a comment and will not be executed. Comments may begin at the start of a line, follow whitespace, follow the end of a command, and may even be embedded within a piped command (as above in section 3).
A comment ends at the end of the line, and as a result a command may not follow a comment on the same line. A quoted or an escaped # in an echo statement does not begin a comment.
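These rules can be seen in a short sketch:

```shell
#!/bin/bash
# A comment at the beginning of a line.
echo "first"                # A comment following the end of a command.
echo "# not a comment"      # The quoted # above is printed, not interpreted.
```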
..
case $1 in
    *.tar.bz2) tar xvjf $1
    ;;
    *.tar.gz) tar xvzf $1
    ;;
    *.bz2) bunzip2 $1
    ;;
    ..
    ..
esac
In contrast, the colon acts as a null command. Whilst this obviously has a variety of uses (e.g., as an alternative to the touch command), a really practical advantage is that it comes with a true exit status, and as such it can be used as a placeholder in if/then tests. An example from the Intermediate course;
for i in *.plot.dat; do
    if [ -f $i.tmp ]; then
        : # do nothing and exit if-then
    else
        touch $i.tmp
    fi
done
The use of the null command as the test at the beginning of a loop will cause it to run endlessly (e.g., while :; do ... done) as the test always evaluates as true. Note that the colon is also used as a field separator in /etc/passwd and in the $PATH variable.
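In practice such an endless loop is paired with a break condition; a minimal sketch:

```shell
#!/bin/bash
count=0
while : ; do                    # ':' always succeeds, so the loop never ends on its own
    count=$((count + 1))
    if [ "$count" -ge 3 ]; then
        break                   # escape the otherwise endless loop
    fi
done
echo "$count"                   # prints 3
```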
for a in {1..10}
do
echo -n "$a "
done
Like the dot, the comma operator has multiple uses. Usually it is used to link multiple arithmetic calculations. This is typically used in for loops with a C-like syntax, e.g.:
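As a sketch of this C-like syntax, with the comma linking two arithmetic updates in one expression (the variable names are illustrative):

```shell
#!/bin/bash
# 'a=1, b=10' initialises both counters; 'a++, b--' updates both each pass
for (( a=1, b=10; a <= 3; a++, b-- )); do
    echo "a=$a b=$b"
done
```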
Enclosing a referenced value in double quotes (" ... ") does not interfere with variable substitution. This is called partial quoting, sometimes referred to as "weak quoting". Using single quotes (' ... ') causes the variable name to be used literally, and no substitution will take place. This is full quoting, sometimes referred to as "strong quoting". Quoting can also be used to combine strings.
for file in /{,usr/}bin/*sh
do
if [ -x "$file" ]
then
echo $file
fi
done
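The difference between weak and strong quoting, and the combining of quoted strings, can be seen directly (the variable name is illustrative):

```shell
#!/bin/bash
name="train01"
echo "Hello, $name"     # weak quoting: substitution occurs
echo 'Hello, $name'     # strong quoting: the literal text $name is printed
combined="Hello, "'$name'" (user $name)"   # quoted strings combined into one
echo "$combined"
```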
Related to quoting is the use of the backslash (\) to escape single characters. Do not confuse it with the forward slash (/), which has multiple uses as both the separator in pathnames (e.g., /home/train01) and as the division operator.
In some scripts backticks (`) are used for command substitution, where the output of a command can be assigned to a variable. Whilst this is not a POSIX standard, it does exist for historical reasons. Nesting commands with backticks also requires escape characters; the deeper the nesting, the more escape characters are required (e.g., echo `echo \`echo \\\`pwd\\\`\``). The preferred and POSIX-standard method is to use the dollar sign and parentheses, e.g., echo "Hello, $(whoami)." rather than echo "Hello, `whoami`.".
Single Instruction, Single Data (SISD) is the simplest and, until recently, the most common processor architecture on desktop computer systems. Also known as a uniprocessor system, it offers a single instruction stream and a single data stream. Uniprocessors could however simulate or include concurrency through a number of different methods:
a) It is possible for a uniprocessor system to run processes
concurrently by switching between one and another.
d) Pipelines, on the instruction level or the graphics level, can also serve as an example of concurrent activity. An instruction pipeline (e.g., RISC) allows multiple instructions on the same circuitry by dividing the task into stages. A graphics pipeline implements different stages of rendering operations to different arithmetic units.
SIMD was also used especially in the 1970s, and notably on the various Cray systems. For example, the Cray-1 (1976) had eight "vector registers", which held sixty-four 64-bit words each (long vectors), with instructions applied to the registers. Pipeline parallelism was used to implement vector instructions, with separate pipelines for different instructions, which themselves could be run in batch and pipelined (vector chaining). As a result the Cray-1 could have a peak performance of 240 megaflops - extraordinary for the day, and even acceptable in the early 2000s.
In a sequential system, each datum (e.g., a pixel) would be fetched, have an instruction applied to it, and then be returned. With SIMD the same instruction is applied to all the data, depending on the availability of cores, i.e., get n pixels, apply instruction, return. The main disadvantage of SIMD, within the limitations of the process itself, is that it does require additional registers, power consumption, and heat.
Multiple Instruction, Single Data (MISD) occurs when different operations are performed on the same data. This is quite rare, and indeed debatable, as it is reasonable to claim that once an instruction has been performed on the data, it's not the same data anymore. If one doesn't take this definition, and allows for a variety of instructions to be applied to the same data which can change, then various pipeline architectures can be considered MISD.
Systolic arrays are another form of MISD. They are different to pipelines because they have a non-linear array structure, they have multidirectional data flow, and each processing element may even have its own local memory. In this situation a matrix pipe network arrangement of processing units computes data and stores it independently of each other. Matrix multiplication is an example of such an array in an algorithmic form, where one matrix is introduced one row at a time from the top of the array, whereas another matrix is introduced one column at a time. MISD machines are rare; the Cisco PXF processor is an example. They can be fast and scalable, as they do operate in parallel, but they are really difficult to build.
With distributed memory systems, each processor has its own memory. Finally,
another combination is distributed shared memory, where the (physically
separate) memories can be addressed as one (logically shared) address space. A
variant combined method is to have shared memory within each multiprocessor
node, and distributed between them.
A single-core processor carries out the usual functions of a CPU, according to the instruction set: data handling instructions (set register values, move data, read and write), arithmetic and logic functions (add, subtract, multiply, divide, bitwise operations for conjunction and disjunction, negate, compare), and control-flow functions (conditionally branch to another section of a program, indirectly branch and return). A multicore processor carries out the same functions, but with independent central processing units (note lower case) called 'cores'. Manufacturers integrate the multiple cores onto a single integrated circuit die or onto multiple dies in a single chip package.
Concurrency also introduces the potential for race conditions (e.g., deadlocks, data integrity issues, resource starvation).
New multicore systems are being developed all the time. Using RISC CPUs, Tilera released 64-core processors in 2009, followed by a one-hundred-core processor. As of 2012, Tilera founder Dr. Agarwal is leading a new MIT effort dubbed the Angstrom Project. It is one of four DARPA-funded efforts aimed at building exascale supercomputers. The goal is to design a chip with 1,000 cores.
Linear, or ideal, speedup is when S(p) = p; for example, doubling the number of processors results in double the speedup.
"When two trains approach each other at a crossing, both shall come to a full
stop and neither shall start up again until the other has gone."
Acquiring locks in different places and orders can lead to deadlocks. Manual lock insertion is error-prone, tedious, and difficult to maintain. And does the programmer know which parts of a program will benefit from parallelisation? To ensure that parallel execution is safe, a task's effects must not interfere with the execution of another task.
However, it is not necessarily the case that the ratio of the parallel and serial parts of a job and the number of processors generate the same result, as the execution time of the specific serial and parallel implementations of a task can vary. An example is what is called "embarrassingly parallel", so named because it is a very simple task to split up into parallel tasks, as they have little communication between each other - for example, the use of GPUs for projection, where each pixel is rendered independently. Such tasks are often called "pleasingly parallel". To give an example using the R programming language, the SNOW (Simple Network of Workstations) package allows for embarrassingly parallel computations (yes, we have this installed).
Whilst originally expressed by Gene Amdahl in 1967, it wasn't until over twenty years later, in 1988, that an alternative by John L. Gustafson and Edwin H. Barsis was proposed. Gustafson noted that Amdahl's Law assumed a computation problem of fixed data set size. Gustafson and Barsis observed that programmers tend to set the size of their computational problems according to the available equipment; therefore, as faster and more parallel equipment becomes available, larger problems can be solved. Thus scaled speedup occurs: although Amdahl's Law is correct in a fixed sense, it can be circumvented in practice by increasing the scale of the problem.
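In standard textbook notation (the symbols here are not from the original text: p is the fraction of the workload that can be parallelised, s = 1 - p the serial fraction, and N the number of processors), the two laws can be written as:

```latex
% Amdahl's Law: speedup for a fixed problem size
S_{\mathrm{Amdahl}}(N) = \frac{1}{(1 - p) + \dfrac{p}{N}}

% Gustafson-Barsis: scaled speedup, problem size grows with N
S_{\mathrm{Gustafson}}(N) = N - s\,(N - 1)
```

As N grows, Amdahl's speedup is bounded above by 1/s, whereas the Gustafson-Barsis scaled speedup grows without bound.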
If the problem size is allowed to grow with P, then the sequential fraction of the workload becomes less and less important. A common metaphor is based on driving (computation), time, and distance (computational task). Under Amdahl's Law, if a car has been travelling at 40 km/h for an hour and needs to reach a point 80 km from the point of origin, then no matter how fast the vehicle travels it can only attain a maximum average of 80 km/h by the time it reaches the 80 km point, even if it travelled at infinite speed, as the first hour has already passed. With the Gustafson-Barsis Law it doesn't matter that the first hour has been at a plodding 40 km/h; the average can be increased without limit given enough time and distance. Just make the problem bigger!
3.1 The Story of the Message Passing Interface (MPI) and OpenMPI
(Image from Lawrence Livermore National Laboratory, U.S.A.)
Using MPI is a matter of some common sense. It is the only message passing library which can really be considered a standard. It is supported on virtually all HPC platforms, and has replaced all previous message passing libraries, such as PVM, PARMACS, EUI, NX, and Chameleon, to name a few predecessors. Programmers like it because there is no need to modify their source code when it is ported to a different system, as long as that system also supports the MPI standard (there may be other reasons, however, to modify the code!). MPI has excellent performance, with vendors able to exploit hardware features for optimisation.
The core principle is that many processors should be able to cooperate to solve a problem by passing messages to each other through a common communications network. The flexible architecture does overcome serial bottlenecks, but it also does require explicit programmer effort (the "questing beast" of automatic parallelisation remains somewhat elusive). The programmer is responsible for identifying opportunities for parallelism and implementing algorithms for parallelisation using MPI.
MPI programming is at its best where there are not too many small communications, and where coarse-level breakup of tasks or data is possible.
"In cases where the data layout is fairly simple, and the communications
patterns are regular this [data-parallel] is an excellent approach. However, when
dealing with dynamic, irregular data structures, data parallel programming can
be difficult, and the end result may be a program with sub-optimal performance."
(Warren, Michael S., and John K. Salmon. "A portable parallel particle program." Computer Physics
Communications 87.1 (1995): 266-290.)
For the purposes of this course, copy a number of files to the home directory:
cd ~
cp -r /common/advcourse .
#include <stdio.h>
#include "mpi.h"
int main( argc, argv )
int argc;
char **argv;
{
    int rank, size;
    MPI_Init( &argc, &argv );
    MPI_Comm_size( MPI_COMM_WORLD, &size );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    printf( "Hello world from process %d of %d\n", rank, size );
    MPI_Finalize();
    return 0;
}
This is the text for the batch file pbs-helloworld, which is launched with qsub and reviewed with less.
qsub pbs-helloworld
less pbs-helloworld
MPI compiler wrappers are used to compile MPI programs; they perform basic error checking, integrate the MPI include files, link to the MPI libraries, and pass switches to the underlying compiler. The wrappers are as follows:
mpif77
mpif90
mpicc
mpicxx
Open MPI comprises three software layers: OPAL (Open Portable Access Layer), ORTE (Open Run-Time Environment), and OMPI (Open MPI). Each layer provides its own wrapper compilers:
OPAL: opalcc and opalc++
ORTE: ortecc and ortec++
OMPI: mpicc, mpic++, mpicxx, mpif77, and mpif90
The distinction between Fortran and C routines in MPI is fairly minimal. All the names of MPI routines and constants in both C and Fortran begin with the same MPI_ prefix. The main differences are:
* Error codes are returned in a separate argument for Fortran, as opposed to the return value for C functions.
! A comment
program hello                                        ! Program name
include 'mpif.h'
integer rank, size, ierror                           ! Variables
call MPI_INIT(ierror)                                ! Start MPI
call MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierror)     ! Number of processors
call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierror)     ! Process IDs
print *, 'Hello world from process', rank, 'of', size
call MPI_FINALIZE(ierror)                            ! Finish MPI
end
Compile this with mpif90 (the Fortran 90 wrapper) and submit with qsub:
The same instruction stream (print hello world) is used multiple times. It is perhaps best described as Single Program Multiple Data, as it obtains the effect of running the same program multiple times or, if you like, different programs with the same instructions.
The core theoretical concept in MPI programming is the move from a model where the processor and memory act in sequence with each other to a model where the memory and processor act in parallel and pass information through a communications network.
MPI has been described as both small and large. It is large, insofar as there are well over a hundred different routines. But since most of these are only called when one is engaging in advanced MPI programming (beyond the level of this course), it is perhaps fair to say that MPI is small, as there are only a handful of basic routines that are usually needed, of which we've seen four. There are two others (MPI_Send, MPI_Recv) which can also be considered "basic routines".
3.3.1 MPI_Init()
This routine initializes the MPI execution environment. Every MPI program must call this routine once, only once, and before any other MPI routines; subsequent calls to MPI_Init will produce an error. With MPI_Init() processes are spawned and ranked, with communication channels established and the default communicator, MPI_COMM_WORLD, created. Communicators are considered analogous to the mail or telephone system; every message travels in a communicator, with every message-passing call having a communicator argument.
The input parameters are argc, a pointer to the number of arguments, and argv, the argument vector. These are for C and C++ only. The Fortran-only output parameter is IERROR, an integer.
C Syntax
#include <mpi.h>
int MPI_Init(int *argc, char ***argv)
Fortran Syntax
INCLUDE mpif.h
MPI_INIT(IERROR)
INTEGER IERROR
C++ Syntax
#include <mpi.h>
void MPI::Init(int& argc, char**& argv)
void MPI::Init()
3.3.2 MPI_Comm_size()
This routine determines the number of processes within a communicator. The input parameter is comm, the communicator handle; the output parameters are size, the integer number of processes in the communicator's group, and Fortran's IERROR.
C Syntax
#include <mpi.h>
int MPI_Comm_size(MPI_Comm comm, int *size)
Fortran Syntax
INCLUDE mpif.h
MPI_COMM_SIZE(COMM, SIZE, IERROR)
INTEGER COMM, SIZE, IERROR
C++ Syntax
#include <mpi.h>
int Comm::Get_size() const
3.3.3 MPI_Comm_rank()
This routine indicates the rank number of the calling process within the pool of MPI communicator processes. The input parameter is comm, the communicator handle, and the output parameters are rank, the rank of the calling process expressed as an integer, and the ever-present IERROR error status for Fortran. It is common for MPI programs to be written in a manager/worker model, where one process (typically rank 0) acts in a supervisory role, and the other processes act in a computational role.
C Syntax
#include <mpi.h>
int MPI_Comm_rank(MPI_Comm comm, int *rank)
Fortran Syntax
INCLUDE mpif.h
MPI_COMM_RANK(COMM, RANK, IERROR)
INTEGER COMM, RANK, IERROR
C++ Syntax
#include <mpi.h>
int Comm::Get_rank() const
3.3.4 MPI_Send()
This routine performs a standard-mode blocking send. The input parameters include buf, the initial address of the send buffer; count, an integer count of the number of elements; datatype, a handle for the datatype of each send buffer element; dest, the integer rank of the destination; tag, an integer message tag; and comm, the communicator handle. The only output parameter is Fortran's IERROR.
C Syntax
#include <mpi.h>
int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest,
int tag, MPI_Comm comm)
Fortran Syntax
INCLUDE mpif.h
MPI_SEND(BUF, COUNT, DATATYPE, DEST, TAG, COMM, IERROR)
<type>    BUF(*)
INTEGER   COUNT, DATATYPE, DEST, TAG, COMM, IERROR
C++ Syntax
#include <mpi.h>
void Comm::Send(const void* buf, int count, const Datatype&
datatype, int dest, int tag) const
3.3.5 MPI_Recv()
As what is sent should be received, the MPI_Recv routine provides a standard-mode, blocking receive. A message can be received only if it is addressed to the receiving process, and if its source, tag, and communicator (comm) values match the source, tag, and comm values specified. After a matching send has been initiated, a receive will block until that send has completed. The length of the received message must be less than or equal to the length of the receive buffer, otherwise an overflow error will be returned.
The input parameters include count, the maximum integer number of elements to receive; datatype, a handle for the datatype of each receive buffer entry; source, the integer rank of the source; tag, an integer message tag; and comm, the communicator handle. The output parameters are buf, the initial address of the receive buffer; status, a status object; and the ever-present Fortran IERROR.
C Syntax
#include <mpi.h>
int MPI_Recv(void *buf, int count, MPI_Datatype datatype,
int source, int tag, MPI_Comm comm, MPI_Status *status)
Fortran Syntax
INCLUDE mpif.h
MPI_RECV(BUF, COUNT, DATATYPE, SOURCE, TAG, COMM, STATUS, IERROR)
<type>    BUF(*)
INTEGER   COUNT, DATATYPE, SOURCE, TAG, COMM
INTEGER   STATUS(MPI_STATUS_SIZE), IERROR
C++ Syntax
#include <mpi.h>
void Comm::Recv(void* buf, int count, const Datatype& datatype,
int source, int tag, Status& status) const
3.3.6 MPI_Finalize()
This routine should be called when all communications are completed. Whilst it cleans up MPI data structures and the like, it does not cancel continuing communications, which the programmer should look out for. Once called, no other MPI routines can be called (with some minor exceptions), not even MPI_Init. There are no input parameters. The only output parameter is Fortran's IERROR.
C Syntax
#include <mpi.h>
int MPI_Finalize()
Fortran Syntax
INCLUDE mpif.h
MPI_FINALIZE(IERROR)
INTEGER IERROR
C++ Syntax
#include <mpi.h>
void Finalize()
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
int main(argc,argv)
int argc;
char *argv[];
{
int myid, numprocs;
int tag,source,destination,count;
int buffer;
MPI_Status status;
MPI_Init(&argc,&argv);
MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
MPI_Comm_rank(MPI_COMM_WORLD,&myid);
tag=1;
source=0;
destination=1;
count=1;
if(myid == source){
buffer=1234;
MPI_Send(&buffer,count,MPI_INT,destination,tag,MPI_COMM_WORLD);
printf("processor %d sent %d\n",myid,buffer);
}
if(myid == destination){
MPI_Recv(&buffer,count,MPI_INT,source,tag,MPI_COMM_WORLD,&status);
printf("processor %d received %d\n",myid,buffer);
}
MPI_Finalize();
}
program sendrecv
include "mpif.h"
integer myid, ierr,numprocs
integer tag,source,destination,count
integer buffer
integer status(MPI_STATUS_SIZE)
call MPI_INIT( ierr )
call MPI_COMM_RANK( MPI_COMM_WORLD, myid, ierr )
call MPI_COMM_SIZE( MPI_COMM_WORLD, numprocs, ierr )
tag=1
source=0
destination=1
count=1
if(myid .eq. source)then
buffer=1234
Call MPI_Send(buffer, count, MPI_INTEGER,destination,&
tag, MPI_COMM_WORLD, ierr)
write(*,*)"processor ",myid," sent ",buffer
endif
if(myid .eq. destination)then
Call MPI_Recv(buffer, count, MPI_INTEGER,source,&
tag, MPI_COMM_WORLD, status,ierr)
write(*,*)"processor ",myid," received ",buffer
endif
call MPI_FINALIZE(ierr)
stop
end
The following provides a summary use of the six core routines in C and Fortran (the Fortran forms assume INCLUDE 'mpif.h').

Purpose: Initialize MPI
    C:       int MPI_Init(int *argc, char ***argv)
    Fortran: INTEGER IERROR
             CALL MPI_INIT(IERROR)

Purpose: Determine number of processes within a communicator
    C:       int MPI_Comm_size(MPI_Comm comm, int *size)
    Fortran: INTEGER COMM, SIZE, IERROR
             CALL MPI_COMM_SIZE(COMM, SIZE, IERROR)

Purpose: Determine processor rank within a communicator
    C:       int MPI_Comm_rank(MPI_Comm comm, int *rank)
    Fortran: INTEGER COMM, RANK, IERROR
             CALL MPI_COMM_RANK(COMM, RANK, IERROR)

Purpose: Send a message
    C:       int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)
    Fortran: <TYPE> BUF(*)
             INTEGER COUNT, DATATYPE, DEST, TAG
             INTEGER COMM, IERROR
             CALL MPI_SEND(BUF, COUNT, DATATYPE, DEST, TAG, COMM, IERROR)

Purpose: Receive a message
    C:       int MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status)
    Fortran: <TYPE> BUF(*)
             INTEGER COUNT, DATATYPE, SOURCE, TAG
             INTEGER COMM, STATUS, IERROR
             CALL MPI_RECV(BUF, COUNT, DATATYPE, SOURCE, TAG, COMM, STATUS, IERROR)

Purpose: Exit MPI
    C:       int MPI_Finalize()
    Fortran: CALL MPI_FINALIZE(IERROR)
Like C and Fortran (and indeed almost every programming language that comes
to mind), MPI has datatypes, a classification for identifying different types of
data (such as int, float, char, etc.). In the introductory MPI program there
wasn't much complexity in these types; as one delves deeper, however,
more will be encountered. Forewarned is forearmed, so the following provides a
handy comparison chart between MPI, C, and Fortran.
MPI Datatype              Fortran Datatype
MPI_INTEGER               INTEGER
MPI_REAL                  REAL
MPI_DOUBLE_PRECISION      DOUBLE PRECISION
MPI_COMPLEX               COMPLEX
MPI_LOGICAL               LOGICAL
MPI_CHARACTER             CHARACTER
MPI_BYTE                  (no equivalent)
MPI_PACKED                (no equivalent)
MPI Datatype              C Datatype
MPI_CHAR                  signed char
MPI_SHORT                 signed short int
MPI_INT                   signed int
MPI_LONG                  signed long int
MPI_UNSIGNED_CHAR         unsigned char
MPI_UNSIGNED_SHORT        unsigned short int
MPI_UNSIGNED              unsigned int
MPI_UNSIGNED_LONG         unsigned long int
MPI_FLOAT                 float
MPI_DOUBLE                double
MPI_LONG_DOUBLE           long double
MPI_BYTE                  (no equivalent)
MPI_PACKED                (no equivalent)
In the Intermediate course, one of the last exercises involved the submission of
mpi-ping and mpi-pong. The first simply tested whether a connection existed
between multiple processors. The second program tested different packet sizes,
asynchronous and bi-directional. Compile with
mpif90 -o mpi-pingpong mpi-pingpong.f90
or
mpicc -o mpi-pingpong mpi-pingpong.c
and submit with
qsub pbs-pingpong
For this course, however, the interesting component is what is inside the code in
terms of the MPI routines. As previously, there is the mpi.h include file, the
initialisation routine, the establishment of a communications world, and so forth.
In addition, however, there are some new routines, specifically MPI_Wtime,
MPI_Abort, and MPI_Ssend.
4.2.1 MPI_Wtime()
MPI_Wtime() returns the elapsed wall-clock time in seconds (as a double) since
some arbitrary point in the past. It takes no arguments and is typically called
twice, the difference giving the duration of the code in between:
{
    double starttime, endtime;
    starttime = MPI_Wtime();
    ...  /* stuff to be timed */
    endtime = MPI_Wtime();
}
C Syntax
#include <mpi.h>
double MPI_Wtime()
Fortran Syntax
INCLUDE mpif.h
DOUBLE PRECISION MPI_WTIME()
C++ Syntax
#include <mpi.h>
double MPI::Wtime()
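Putting this together, a minimal self-contained sketch might look as follows (the summation loop is just a hypothetical stand-in for real work to be timed):

```c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    double starttime = MPI_Wtime();
    /* placeholder work to be timed */
    double sum = 0.0;
    long i;
    for (i = 0; i < 1000000; i++)
        sum += (double)i;
    double endtime = MPI_Wtime();

    printf("elapsed: %f seconds (sum=%f)\n", endtime - starttime, sum);
    MPI_Finalize();
    return 0;
}
```

Compile with mpicc and launch with mpirun; each process reports its own elapsed time.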
4.2.2 MPI_Abort()
MPI_Abort() aborts (or at least tries to) all tasks in the group of a communicator.
All associated processes are sent a SIGTERM. The input parameters include
comm, the communicator of taks to abort and errorcode, the error code to return
to invoking the environment. The only output parameter is Fortran's , IERROR.
C Syntax
#include <mpi.h>
int MPI_Abort(MPI_Comm comm, int errorcode)
Fortran Syntax
INCLUDE mpif.h
MPI_ABORT(COMM, ERRORCODE, IERROR)
INTEGER
COMM, ERRORCODE, IERROR
C++ Syntax
#include <mpi.h>
void Comm::Abort(int errorcode)
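As a sketch of typical usage (the file name input.dat is hypothetical), a rank that detects an unrecoverable error can tear down the whole job rather than leave the other ranks waiting:

```c
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    /* Hypothetical failure condition: a required input file is missing. */
    FILE *fp = fopen("input.dat", "r");
    if (fp == NULL) {
        fprintf(stderr, "input.dat not found; aborting all tasks\n");
        MPI_Abort(MPI_COMM_WORLD, EXIT_FAILURE);  /* terminates every rank */
    }
    fclose(fp);

    MPI_Finalize();
    return 0;
}
```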
4.2.3 MPI_Ssend()
MPI_Ssend() performs a synchronous send: it does not complete until a matching
receive has been posted and has begun receiving the message. Its parameters are
the same as those of MPI_Send().
C Syntax
#include <mpi.h>
int MPI_Ssend(void *buf, int count, MPI_Datatype datatype, int dest,
int tag, MPI_Comm comm)
Fortran Syntax
INCLUDE mpif.h
MPI_SSEND(BUF, COUNT, DATATYPE, DEST, TAG, COMM, IERROR)
<type> BUF(*)
INTEGER COUNT, DATATYPE, DEST, TAG, COMM, IERROR
C++ Syntax
#include <mpi.h>
void Comm::Ssend(const void* buf, int count, const Datatype&
datatype, int dest, int tag) const
Although not used in the specific program just illustrated, there are actually a
number of other send options in Open MPI. These include MPI_Bsend,
MPI_Rsend, MPI_Isend, MPI_Ibsend, MPI_Issend, and MPI_Irsend. The non-blocking
MPI_Isend, for example, indicates to the system to start copying data out of the send
buffer. A send request can be determined to be complete by calling MPI_Wait,
MPI_Waitany, MPI_Test, or MPI_Testany with the request returned by this function.
The send buffer cannot be reused until one of these conditions is successful, or an
MPI_Request_free indicates that the buffer is available.
Although MPI_Send and MPI_Ssend are typical, there may be occasions when
some of these other routines are preferred. If non-blocking routines are necessary,
for example, then look at MPI_Isend or MPI_Irecv.
MPI_Isend()
The input parameters are buf, the initial address of the send buffer; count, an
integer number of elements in the send buffer; datatype, a datatype handle
for each send buffer element; dest, the integer rank of the destination; tag, an
integer message tag; and comm, the communicator handle. The output parameters
are request, the communication request handle, and Fortran's integer IERROR.
C Syntax
#include <mpi.h>
int MPI_Isend(void *buf, int count, MPI_Datatype datatype, int dest,
    int tag, MPI_Comm comm, MPI_Request *request)
Fortran Syntax
INCLUDE mpif.h
MPI_ISEND(BUF, COUNT, DATATYPE, DEST, TAG, COMM, REQUEST, IERROR)
<type> BUF(*)
INTEGER COUNT, DATATYPE, DEST, TAG, COMM, REQUEST, IERROR
C++ Syntax
#include <mpi.h>
Request Comm::Isend(const void* buf, int count, const
Datatype& datatype, int dest, int tag) const
MPI_Irecv()
The input parameters are buf, the initial address of the receive buffer; count, an
integer number of elements in the receive buffer; datatype, a datatype handle
for each receive buffer element; source, the integer rank of the source; tag,
an integer message tag; and comm, the communicator handle. The output
parameters are request, the communication request handle, and Fortran's integer
IERROR.
C Syntax
#include <mpi.h>
int MPI_Irecv(void *buf, int count, MPI_Datatype datatype,
int source, int tag, MPI_Comm comm, MPI_Request *request)
Fortran Syntax
INCLUDE mpif.h
MPI_IRECV(BUF, COUNT, DATATYPE, SOURCE, TAG, COMM, REQUEST, IERROR)
<type> BUF(*)
INTEGER COUNT, DATATYPE, SOURCE, TAG, COMM, REQUEST, IERROR
C++ Syntax
#include <mpi.h>
Request Comm::Irecv(void* buf, int count, const Datatype&
datatype, int source, int tag) const
MPI_Wait()
MPI_Wait() waits for an MPI send or receive to complete. It returns when the operation
identified by request is complete. If the communication object was created by a
non-blocking send or receive call, then the object is deallocated by the call to
MPI_Wait and the request handle is set to MPI_REQUEST_NULL. The input
parameter is request, the request handle. The output parameters are status, the
status object, and Fortran's integer IERROR.
C Syntax
#include <mpi.h>
int MPI_Wait(MPI_Request *request, MPI_Status *status)
Fortran Syntax
INCLUDE mpif.h
MPI_WAIT(REQUEST, STATUS, IERROR)
INTEGER REQUEST, STATUS(MPI_STATUS_SIZE), IERROR
C++ Syntax
#include <mpi.h>
void Request::Wait(Status& status)
void Request::Wait()
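The non-blocking routines and MPI_Wait() are typically used together. A minimal sketch, assuming two processes (run with mpirun -np 2):

```c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, value = 0;
    MPI_Request request;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 1234;
        MPI_Isend(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &request);
        /* other work could overlap with the send here */
        MPI_Wait(&request, &status);   /* buffer reusable only after this */
    } else if (rank == 1) {
        MPI_Irecv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &request);
        MPI_Wait(&request, &status);
        printf("rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}
```

The point of the non-blocking pair is the gap between the initiation call and MPI_Wait(), where useful computation can overlap the communication.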
MPI_Send(): Standard send; may be synchronous or buffered.
  Benefits: Flexible trade-off: automatically uses a buffer if available, but
  falls back to synchronous if not.
  Problems: Can hide deadlocks; the uncertainty of mode makes debugging harder.

MPI_Ssend(): Synchronous send; doesn't return until the matching receive has
also completed.
  Benefits: Safest mode; confident that the message has been received.
  Problems: Lower performance, especially without non-blocking variants.

MPI_Bsend(): Buffered send; copies data to a buffer, so the program is free to
continue whilst the message is delivered later.
  Benefits: Good performance.
  Problems: Need to be aware of buffer space; buffer management issues.

MPI_Rsend(): Ready send; the matching receive must already be posted, or the
message is lost.
  Benefits: Slight performance increase, since there's no handshake.
  Problems: Risky and difficult to design.
As described previously, the arguments dest and source in the various modes of
send are the ranks of the receiving and sending processes. MPI also allows
source to be a "wildcard" through the predefined constants MPI_ANY_SOURCE
(to receive from any source) and MPI_ANY_TAG (to receive with any tag).
There is no wildcard for dest. Again using the postal analogy, a recipient may be
ready to receive a message from anyone, but they can't send a message to just
anywhere!
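A sketch of the wildcard receive, assuming at least two processes: the root accepts messages in whatever order they arrive and uses the status object to discover the actual source and tag:

```c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, numprocs, value;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);

    if (rank != 0) {
        value = rank * 100;
        MPI_Send(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    } else {
        int i;
        /* accept one message per non-root rank, in arrival order */
        for (i = 1; i < numprocs; i++) {
            MPI_Recv(&value, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
                     MPI_COMM_WORLD, &status);
            printf("got %d from rank %d (tag %d)\n",
                   value, status.MPI_SOURCE, status.MPI_TAG);
        }
    }

    MPI_Finalize();
    return 0;
}
```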
A serial version of the code is provided (serial-gametheory.c, serial-gametheory.f90).
Review it and then attempt a parallel version from the skeleton
versions (mpi-skel-gametheory.c, mpi-skel-gametheory.f90). Each process
must run one player's decision-making; then both have to transmit
their decision to the other, and then update their own tally of the result. Consider
using MPI_Send(), MPI_Irecv(), and MPI_Wait(). On completion, review against the
solutions provided (mpi-gametheory.c, mpi-gametheory.f90) and submit the
tasks with qsub.
The basic principle and motivation is that whilst collective communications may
provide a performance improvement, they will certainly provide clearer code.
Consider the following C snippet of a root processor sending to all the others:
if ( 0 == rank ) {
    unsigned int proc_I;
    for ( proc_I = 1; proc_I < numProcs; proc_I++ ) {
        MPI_Ssend( &param, 1, MPI_UNSIGNED, proc_I, PARAM_TAG, MPI_COMM_WORLD );
    }
}
else {
    MPI_Recv( &param, 1, MPI_UNSIGNED, 0 /*ROOT*/, PARAM_TAG, MPI_COMM_WORLD, &status );
}
This can be replaced with a single collective call:
MPI_Bcast( &param, 1, MPI_UNSIGNED, 0 /*ROOT*/, MPI_COMM_WORLD );
4.3.1 MPI_Bcast()
MPI_Bcast broadcasts a message from the process with rank "root" to all other
processes of the communicator, including itself. It is called by all members of the
group using the same arguments for comm and root, and on return the contents of
root's communication buffer have been copied to all processes.
The input parameters include count, an integer number of entries in the
buffer; datatype, the datatype handle of the buffer; root, the integer rank of the
broadcast root; comm, the communicator handle; and buf, the starting
address of the buffer (both input and output). The only other output parameter is
Fortran's IERROR.
C Syntax
#include <mpi.h>
int MPI_Bcast(void *buffer, int count, MPI_Datatype datatype,
int root, MPI_Comm comm)
Fortran Syntax
INCLUDE mpif.h
MPI_BCAST(BUFFER, COUNT, DATATYPE, ROOT, COMM, IERROR)
<type>
BUFFER(*)
INTEGER
COUNT, DATATYPE, ROOT, COMM, IERROR
C++ Syntax
#include <mpi.h>
void MPI::Comm::Bcast(void* buffer, int count,
    const MPI::Datatype& datatype, int root) const
4.3.2 MPI_Scatter()
MPI_Scatter sends data from one task to all tasks in a group; the inverse
operation of MPI_Gather. The outcome is as if the root executed n send
operations and each process executed a receive. MPI_Scatterv scatters a buffer
in parts to all tasks in a group.
The input parameters include sendbuf, the address of the send buffer;
sendcount, an integer (significant at root) of the number of elements to send to each
process; sendtype, the datatype handle (significant at root) of the send buffer elements;
recvcount, an integer number of elements in the receive buffer; recvtype,
the datatype handle of the receive buffer elements; root, the integer rank of the
sending process; and comm, the communicator handle. MPI_Scatterv also has
the input parameter displs, an integer array of group-length size, which
specifies a displacement relative to sendbuf for each process. The output parameters
are recvbuf, the address of the receive buffer, and the ever-dependable IERROR for
Fortran.
C Syntax
#include <mpi.h>
int MPI_Scatter(void *sendbuf, int sendcount, MPI_Datatype sendtype,
void *recvbuf, int recvcount, MPI_Datatype recvtype, int root,
MPI_Comm comm)
Fortran Syntax
INCLUDE mpif.h
MPI_SCATTER(SENDBUF, SENDCOUNT, SENDTYPE, RECVBUF, RECVCOUNT,
RECVTYPE, ROOT, COMM, IERROR)
<type>
SENDBUF(*), RECVBUF(*)
INTEGER
SENDCOUNT, SENDTYPE, RECVCOUNT, RECVTYPE, ROOT
INTEGER
COMM, IERROR
C++ Syntax
#include <mpi.h>
void MPI::Comm::Scatter(const void* sendbuf, int sendcount,
const MPI::Datatype& sendtype, void* recvbuf,
int recvcount, const MPI::Datatype& recvtype,
int root) const
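A minimal scatter sketch, assuming exactly four processes (the values 10 to 40 are arbitrary): each rank receives one element of root's array.

```c
#include <stdio.h>
#include <mpi.h>

#define NPROCS 4   /* assumed process count for this sketch */

int main(int argc, char *argv[])
{
    int rank, mine;
    int all[NPROCS] = { 10, 20, 30, 40 };   /* meaningful on root only */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* root's array is split one element per process */
    MPI_Scatter(all, 1, MPI_INT, &mine, 1, MPI_INT, 0, MPI_COMM_WORLD);
    printf("rank %d holds %d\n", rank, mine);

    MPI_Finalize();
    return 0;
}
```

Run with mpirun -np 4.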
4.3.3 MPI_Gather()
Gathers data and combines a partial array from each processor into one array on
the root processor. Each process, including the root process, sends the contents
of its send buffer to the root process. The root process receives the messages
and stores them in rank order. The outcome is as if each of the n processes in the
group (including the root process) had executed a call to MPI_Send() and root
had executed n calls to MPI_Recv().
The input parameters include sendbuf, the address of the send buffer;
sendcount, an integer number of elements in the send buffer; sendtype, the
datatype handle of the send buffer elements; recvcount, an integer (significant at
root) of the number of elements in the receive buffer; recvtype, the datatype handle
(significant at root) of the receive buffer elements; root, the integer rank of the
receiving process; and comm, the communicator handle. The output parameters are
recvbuf, the address of the receive buffer (significant at root), and the
ever-dependable IERROR for Fortran.
C Syntax
#include <mpi.h>
int MPI_Gather(void *sendbuf, int sendcount, MPI_Datatype sendtype,
void *recvbuf, int recvcount, MPI_Datatype recvtype, int root,
MPI_Comm comm)
Fortran Syntax
INCLUDE mpif.h
MPI_GATHER(SENDBUF, SENDCOUNT, SENDTYPE, RECVBUF, RECVCOUNT,
RECVTYPE, ROOT, COMM, IERROR)
<type>
SENDBUF(*), RECVBUF(*)
INTEGER
SENDCOUNT, SENDTYPE, RECVCOUNT, RECVTYPE, ROOT
INTEGER
COMM, IERROR
C++ Syntax
#include <mpi.h>
void MPI::Comm::Gather(const void* sendbuf, int sendcount,
    const MPI::Datatype& sendtype, void* recvbuf,
    int recvcount, const MPI::Datatype& recvtype, int root) const
4.3.4 MPI_Reduce()
MPI_Reduce combines the elements in the send buffer of each process in the
group, using the operation op, and returns the combined value in the receive
buffer of the process with rank root.
C Syntax
#include <mpi.h>
int MPI_Reduce(void *sendbuf, void *recvbuf, int count,
MPI_Datatype datatype, MPI_Op op, int root, MPI_Comm comm)
Fortran Syntax
INCLUDE mpif.h
MPI_REDUCE(SENDBUF, RECVBUF, COUNT, DATATYPE, OP, ROOT, COMM,
IERROR)
<type>
SENDBUF(*), RECVBUF(*)
INTEGER
COUNT, DATATYPE, OP, ROOT, COMM, IERROR
C++ Syntax
#include <mpi.h>
void MPI::Intracomm::Reduce(const void* sendbuf, void* recvbuf,
int count, const MPI::Datatype& datatype, const MPI::Op& op,
int root) const
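A compact reduction sketch: each process contributes its rank, and root receives the sum (for n processes this is 0 + 1 + ... + (n-1)).

```c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, numprocs, total;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);

    /* sum every rank's id onto root */
    MPI_Reduce(&rank, &total, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum of ranks 0..%d = %d\n", numprocs - 1, total);

    MPI_Finalize();
    return 0;
}
```

Swapping MPI_SUM for any of the operations in the table below changes the combination rule without touching the communication structure.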
MPI_Name      Function
MPI_MAX       Maximum
MPI_MIN       Minimum
MPI_SUM       Sum
MPI_PROD      Product
MPI_LAND      Logical AND
MPI_BAND      Bitwise AND
MPI_LOR       Logical OR
MPI_BOR       Bitwise OR
MPI_LXOR      Logical exclusive OR
MPI_BXOR      Bitwise exclusive OR
MPI_MAXLOC    Maximum value and location
MPI_MINLOC    Minimum value and location
Derived types are essentially user-defined types for MPI_Send(). They are
described as 'derived' because they are derived from existing primitive datatypes
like int and float. The main reason to use them in an MPI context is that they make
message passing more efficient and easier to code.
For example, if a program has data in double results[5][5], what does the user do
if they want to send results[0][0], results[1][0], ..., results[4][0] (the first
column)? Without a derived type, each element must be sent separately:
double results[5][5];
int i;
for ( i = 0; i < 5; i++ ) {
    MPI_Send( &(results[i][0]), 1, MPI_DOUBLE, dest, tag, comm );
}
To create a derived type there are two steps: first construct the datatype, with
MPI_Type_vector() or MPI_Type_struct(), and then commit the datatype with
MPI_Type_commit().
When all the data to send is of the same datatype, use the vector method, e.g.,
int MPI_Type_vector( int count, int blocklen, int stride, MPI_Datatype old_type,
    MPI_Datatype* newtype )
with declarations such as:
double recvData[COUNT*BLOCKLEN];
double sendData[COUNT][STRIDE];
MPI_Datatype vecType;
MPI_Status st;
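A complete sketch built from these declarations, assuming two processes: the vector type picks out the first column of sendData (one element per row, STRIDE apart), so the whole column travels in a single send.

```c
#include <stdio.h>
#include <mpi.h>

#define COUNT    5
#define BLOCKLEN 1
#define STRIDE   5

int main(int argc, char *argv[])
{
    int rank;
    double sendData[COUNT][STRIDE];
    double recvData[COUNT * BLOCKLEN];
    MPI_Datatype vecType;
    MPI_Status st;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* COUNT blocks of BLOCKLEN doubles, STRIDE elements apart */
    MPI_Type_vector(COUNT, BLOCKLEN, STRIDE, MPI_DOUBLE, &vecType);
    MPI_Type_commit(&vecType);

    if (rank == 0) {
        int i, j;
        for (i = 0; i < COUNT; i++)
            for (j = 0; j < STRIDE; j++)
                sendData[i][j] = i * STRIDE + j;
        /* the whole first column goes in a single send */
        MPI_Send(&(sendData[0][0]), 1, vecType, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(recvData, COUNT * BLOCKLEN, MPI_DOUBLE, 0, 0,
                 MPI_COMM_WORLD, &st);
        printf("received column: %g %g %g %g %g\n", recvData[0],
               recvData[1], recvData[2], recvData[3], recvData[4]);
    }

    MPI_Type_free(&vecType);
    MPI_Finalize();
    return 0;
}
```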
If you have specific parts of a struct you wish to send and the members are of
different types, use the struct datatype. For example, for a struct containing an
int, three doubles, and ten chars:
int blockLens[3] = { 1, 3, 10 };
MPI_Aint intSize, doubleSize;
MPI_Aint displacements[3];
MPI_Datatype types[3] = { MPI_INT, MPI_DOUBLE, MPI_CHAR };
MPI_Datatype myType;
MPI_Type_extent( MPI_INT, &intSize );
MPI_Type_extent( MPI_DOUBLE, &doubleSize );
displacements[0] = 0;
displacements[1] = intSize;
displacements[2] = intSize + 3 * doubleSize;
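In practice the displacements are safest taken from the actual struct layout with offsetof, since the compiler may insert padding between members. A sketch, assuming a hypothetical Particle struct matching the block lengths above (MPI_Type_struct is the MPI-1 call used throughout this course):

```c
#include <stddef.h>   /* offsetof */
#include <mpi.h>

/* hypothetical struct matching the block lengths above */
struct Particle {
    int    id;
    double coords[3];
    char   tag[10];
};

int main(int argc, char *argv[])
{
    int blockLens[3] = { 1, 3, 10 };
    MPI_Aint displacements[3];
    MPI_Datatype types[3] = { MPI_INT, MPI_DOUBLE, MPI_CHAR };
    MPI_Datatype myType;

    MPI_Init(&argc, &argv);

    /* true member offsets, padding included */
    displacements[0] = offsetof(struct Particle, id);
    displacements[1] = offsetof(struct Particle, coords);
    displacements[2] = offsetof(struct Particle, tag);

    MPI_Type_struct(3, blockLens, displacements, types, &myType);
    MPI_Type_commit(&myType);

    /* myType can now be used as the datatype argument of
       MPI_Send()/MPI_Recv() for whole Particle records */

    MPI_Type_free(&myType);
    MPI_Finalize();
    return 0;
}
```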
Other derived-type constructors include MPI_Type_contiguous,
MPI_Type_hvector, MPI_Type_indexed, and MPI_Type_hindexed.
The application can then dynamically allocate the receive buffer and call MPI_Recv().
The first example is designed to gain familiarity with the MPI_Scatter() routine
as a means of distributing global arrays among multiple processors via
collective communication. Use the skeleton code provided and determine the
number of particles to assign to each processor. Then use the function
MPI_Scatter() to spread the global particle coordinates, ids, and tags among the
processors.
For an advanced test, on the root processor only, calculate the particle with the
smallest distance from the origin (hint: MPI_Reduce()). If the particle with the
smallest distance is < 1.0 from the origin, then flip the direction of movement of
all the particles. Then modify your code to use the MPI_Scatterv() function to
allow the given number of particles to be properly distributed among a variable
number of processors.
int MPI_Scatterv( void *sendbuf, int *sendcnts, int *displs,
    MPI_Datatype sendtype, void *recvbuf, int recvcnt,
    MPI_Datatype recvtype, int root, MPI_Comm comm )
The second example is designed to gain a practical example of the use of MPI
derived data types. Implement a data type storing the particle information from
the previous exercise and use this data type for collective communications. Set
up and commit a new MPI derived data type, based on the struct below:
Then seed the random number sequence on the root processor only, and
determine how many particles are to be assigned among the respective
processors (same as for last exercise) and collectively assign their data using the
MPI derived data type you have implemented.
MPI_Group worldGroup, subGroup;
MPI_Comm  subComm;
int       *procsToExcl, numToExcl;
TAU (Tuning and Analysis Utilities) is a portable profiling and tracing toolkit for
performance analysis of parallel programs written in Java, C, C++ and Fortran.
The instrumentation of source code can be done manually or with the help of
another utility called PDT, which automatically parses source files and
instruments them with TAU macros.
It has taken many years for this essential truth to be realised, but software
equals bugs. In parallel systems the bugs are particularly difficult to diagnose,
and the core principle of parallelisation invites race conditions and deadlocks.
For example, what happens when two processors try to send a message to one
another at the same time?
When debugging MPI programs it is usually a good idea to do this in one's own
environment, i.e., install (from source) the compilers and version of Open MPI on
your own system. The reason for this is that it is quite time-prohibitive to conduct
debugging activities on a batch-processing high-performance computer. The HPC
systems that we have may run tasks fairly quickly when launched, but they can
take some time to begin whilst they are in the queue.
It is possible, for small tests, to bypass this by running small jobs interactively
(following the instructions given in the Intermediate course), e.g.,
qsub -l walltime=0:30:0,nodes=1:ppn=2 -I
module load vpac
qsub pbs-sendrecv
In general however, parallel programs are hard to program and hard to debug.
Parallelism adds a whole new abstract layer. Although the program is being
executed on N processors, it may be running in N slightly different ways on
different data.
For example, consider the simple send-recv programs shown earlier; compile
these with openmpi-gcc as follows:
mpicc -o mpi-sendrecv mpi-sendrecv.c
or
mpif90 -o mpi-sendrecv mpi-sendrecv.f90
qsub -l walltime=0:20:0,nodes=1:ppn=2 -I
module load vpac
module load valgrind/3.8.1-openmpi-gcc
Note that an interactive job starts the user in their home directory, requiring a
change of directory.
The file valgrind.out in this case will contain quite a few errors, but none of these
are critical to the operation of the program.
As with serial programs, gdb can also be used for thorough debugging. Execute
as
Where gdb.cmd is a text file of the commands that you want to send to gdb. e.g.,
This, of course, simply indicates that the program completed successfully with the
final values as listed (hooray!). Using a serial debugger like GDB with a program
that is running in parallel is slightly more difficult. A common hack (and it is a
hack) is to find out the process IDs of the job, then log in to the appropriate node
and run gdb -p PID. In order to discover those IDs, the following code snippet is
usually implemented:
{
    int i = 0;
    char hostname[256];
    gethostname(hostname, sizeof(hostname));
    printf("PID %d on %s ready for attach\n", getpid(), hostname);
    fflush(stdout);
    while (0 == i)      /* spin until a debugger sets i to non-zero */
        sleep(5);
}
Then log in to the appropriate nodes, run gdb -p 23166 and gdb -p 23167, step
through the function stack, and set the variable i to a non-zero value. Then set a
breakpoint after your block of code and continue execution until the breakpoint is
hit (e.g., by adding a break in the loops on lines 49 and 64), using the gdb
commands to display the values as they are being generated (e.g., print
loop, print value, or info locals).