
PREPARED BY
NAME: MADHURIMA PATRA
ROLL NO: 14401062011
In the simplest sense, parallel programming is the simultaneous use of multiple compute resources to solve a computational problem:

• The problem is run using multiple CPUs.
• The problem is broken into discrete parts that can be solved concurrently.
• Each part is further broken down to a series of instructions.
• Instructions from each part execute simultaneously on different CPUs.
BIT LEVEL PARALLELISM – By increasing the processor word size, we can reduce the number of instructions the processor must execute to perform an operation on variables whose sizes are greater than the length of the word.
INSTRUCTION LEVEL PARALLELISM – A measure of how many of the operations in a computer program can be performed simultaneously.
DATA PARALLELISM – Focuses on distributing the data across different parallel computing nodes. The same calculation is performed on the same or on different sets of data.
TASK PARALLELISM – Entirely different calculations can be performed on either the same or different sets of data.
AMDAHL'S LAW
Amdahl's Law states that potential program speedup is defined by the fraction of code (P) that can be parallelized:

                1
  speedup = ---------
              1 - P

• If none of the code can be parallelized, P = 0 and the speedup = 1 (no speedup).
• If all of the code is parallelized, P = 1 and the speedup is infinite (in theory).
Introducing the number of processors
performing the parallel fraction of
work, the relationship can be modeled by:

                1
  speedup = -----------
              P/N + S

where P = parallel fraction, N = number of processors, and S = serial fraction.
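As a rough illustration (the values of P and N below are made-up examples, not taken from the slides), a small C program can evaluate this formula:

```c
#include <stdio.h>

/* speedup = 1 / (P/N + S), with S = 1 - P.
   The values of P and N passed in main are illustrative only. */
static double amdahl_speedup(double P, int N)
{
    double S = 1.0 - P;
    return 1.0 / (P / N + S);
}

int main(void)
{
    printf("P = 0.95, N = 8    -> speedup = %.2f\n", amdahl_speedup(0.95, 8));
    printf("P = 0.95, N = 1000 -> speedup = %.2f\n", amdahl_speedup(0.95, 1000));
    /* Even with 95% of the code parallel, the serial 5% caps the speedup near 20. */
    return 0;
}
```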
GUSTAFSON’S LAW
Gustafson's Law (also known as Gustafson-Barsis' law)
is a law in computer science which states that any
sufficiently large problem can be efficiently parallelized.

Gustafson's Law is closely related to Amdahl's law, which gives a limit to the degree to which a program can be sped up due to parallelization. It was first described by John L. Gustafson:

  S(P) = P - α(P - 1)

where P is the number of processors, S is the speedup, and α the non-parallelizable part of the process.
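A minimal C sketch of the same formula, with an assumed example value for α:

```c
#include <stdio.h>

/* Gustafson's law: S(P) = P - alpha * (P - 1); alpha below is an assumed example. */
static double gustafson_speedup(int P, double alpha)
{
    return P - alpha * (P - 1);
}

int main(void)
{
    /* Unlike Amdahl's fixed-size bound, the scaled speedup keeps growing with P. */
    printf("P = 8,    alpha = 0.05 -> S = %.2f\n", gustafson_speedup(8, 0.05));
    printf("P = 1000, alpha = 0.05 -> S = %.2f\n", gustafson_speedup(1000, 0.05));
    return 0;
}
```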
DATA DEPENDENCY
A data dependency exists when two program statements use the same storage location and at least one of them writes to it.
No program can run more quickly than the longest chain of dependent
calculations since calculations that depend upon prior calculations in
the chain must be executed in order.
Let Pi and Pj be two program fragments. Bernstein's conditions describe when the two are independent and can be executed in parallel. For Pi, let Ii be all of the input variables and Oi the output variables, and likewise for Pj. Pi and Pj are independent if they satisfy

  Ii ∩ Oj = ∅,   Ij ∩ Oi = ∅,   and   Oi ∩ Oj = ∅.
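For instance, here is a hypothetical pair of fragments (variable names invented for illustration) that satisfies these conditions, and one that violates them:

```c
/* Two tiny program fragments, with invented variables, checked against
   Bernstein's conditions. */
int x = 1, y = 2, z = 3, a, b, c;

void independent_pair(void)
{
    a = x + y;   /* Pi: Ii = {x, y}, Oi = {a} */
    b = x * z;   /* Pj: Ij = {x, z}, Oj = {b} -> all three intersections are empty,
                    so the two statements may run in parallel */
}

void dependent_pair(void)
{
    a = x + y;   /* Pi writes a */
    c = a + 1;   /* Pj reads a: Ij ∩ Oi = {a} ≠ ∅, a flow dependency, so order matters */
}
```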
• RACE CONDITION
• MUTUAL EXCLUSION
• SYNCHRONISATION

Example: consider the following program, in which two threads each add 1 to a shared variable V:

Thread A                         Thread B
1A: Read variable V              1B: Read variable V
2A: Add 1 to variable V          2B: Add 1 to variable V
3A: Write back to variable V     3B: Write back to variable V
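This interleaving can be reproduced with POSIX threads; the sketch below is illustrative and not part of the original slides (the iteration count is arbitrary):

```c
#include <pthread.h>
#include <stdio.h>

int V = 0;                            /* shared variable V */

/* Each thread repeatedly performs: read V, add 1, write back -- with no locking. */
static void *increment(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++)
        V = V + 1;                    /* non-atomic read-modify-write */
    return NULL;
}

int main(void)
{
    pthread_t a, b;
    pthread_create(&a, NULL, increment, NULL);   /* Thread A */
    pthread_create(&b, NULL, increment, NULL);   /* Thread B */
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    printf("V = %d\n", V);   /* expected 200000, but lost updates usually leave less */
    return 0;
}
```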


RACE CONDITION
A situation in which multiple processes read and write a shared
data item and the final result depends on the relative timing of their
execution.

MUTUAL EXCLUSION
A collection of techniques for sharing resources so that different
uses do not conflict and cause unwanted interactions.
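One common realisation, sketched here with a POSIX mutex (assuming the shared variable V from the example above), serialises the read-modify-write:

```c
#include <pthread.h>

int V = 0;
pthread_mutex_t V_lock = PTHREAD_MUTEX_INITIALIZER;

/* Only one thread at a time may execute the read-modify-write of V. */
void increment_safely(void)
{
    pthread_mutex_lock(&V_lock);
    V = V + 1;                        /* critical section */
    pthread_mutex_unlock(&V_lock);
}
```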

SYNCHRONISATION

The coordination of parallel tasks in real time, very often associated with
communications. Often implemented by establishing a synchronization point
within an application where a task may not proceed further until another task(s)
reaches the same or logically equivalent point.
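One way to establish such a synchronization point with POSIX threads is a barrier; the sketch below is illustrative (the worker count is arbitrary):

```c
#include <pthread.h>
#include <stdio.h>

#define NUM_WORKERS 4

pthread_barrier_t barrier;

static void *worker(void *arg)
{
    long id = (long)arg;
    printf("task %ld: phase 1 done\n", id);
    /* No task proceeds past this point until all NUM_WORKERS have reached it. */
    pthread_barrier_wait(&barrier);
    printf("task %ld: phase 2 starts\n", id);
    return NULL;
}

int main(void)
{
    pthread_t t[NUM_WORKERS];
    pthread_barrier_init(&barrier, NULL, NUM_WORKERS);
    for (long i = 0; i < NUM_WORKERS; i++)
        pthread_create(&t[i], NULL, worker, (void *)i);
    for (int i = 0; i < NUM_WORKERS; i++)
        pthread_join(t[i], NULL);
    pthread_barrier_destroy(&barrier);
    return 0;
}
```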
PARALLEL SLOWDOWN
GRANULARITY
EMBARRASSINGLY PARALLEL

PARALLEL SLOWDOWN
When a task is split up into more and more threads, those threads spend an
ever-increasing portion of their time communicating with each other.
Eventually, the overhead from communication dominates the time spent
solving the problem, and increases the amount of time required to finish.

GRANULARITY
In parallel computing, granularity is a qualitative measure of the ratio of computation to communication.
• Coarse: relatively large amounts of computational work between communication events.
• Fine: relatively small amounts of computational work between communication events.
EMBARRASSINGLY PARALLEL
A problem is embarrassingly parallel if its subtasks rarely or never have to communicate with each other.
LOAD BALANCING
Load balancing refers to the practice of distributing work among tasks so
that all tasks are kept busy all of the time. It can be considered a
minimization of task idle time.

Load balancing is important to parallel programs for performance reasons. For example, if all tasks are subject to a barrier synchronization point, the slowest task will determine the overall performance.
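As one possible illustration (assuming OpenMP is available; the loop and chunk size are invented), a dynamic schedule hands out work in small chunks so that no thread sits idle while others finish larger pieces:

```c
#include <stdio.h>

/* Iterations have very uneven cost (the inner loop length depends on i), so a
   dynamic schedule hands out chunks of 16 iterations on demand instead of
   giving each thread one fixed slice. Compile with -fopenmp. */
int main(void)
{
    double sum = 0.0;

    #pragma omp parallel for schedule(dynamic, 16) reduction(+:sum)
    for (int i = 0; i < 100000; i++) {
        double work = 0.0;
        for (int j = 0; j < i % 1000; j++)    /* uneven amount of work per iteration */
            work += j * 0.5;
        sum += work;
    }

    printf("sum = %g\n", sum);
    return 0;
}
```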
FLYNN’S TAXONOMY

                       Single Data    Multiple Data
Single Instruction     SISD           SIMD
Multiple Instruction   MISD           MIMD


SIMD Model: Single Instruction, Multiple Data Stream

[Diagram: a single control unit (CU) issues one instruction stream (IS) to processing units PU1 … PUn; each PU works on its own data stream (DS1 … DSn) held in memory modules MM1 … MMn, connected through shared memory (SM).]
MIMD Model: Multiple Instruction, Multiple Data Stream

[Diagram: each control unit CU1 … CUn issues its own instruction stream (IS1 … ISn) to its own processing unit PU1 … PUn, which works on its own data stream (DS1 … DSn) in memory modules MM1 … MMn, connected through shared memory (SM).]
SHARED MEMORY
Shared memory parallel computers vary widely, but generally have in
common the ability for all processors to access all memory as global
address space.
Multiple processors can operate independently but share the same
memory resources.
Changes in a memory location effected by one processor are visible to
all other processors.
Shared memory machines can be divided into two main classes based
upon memory access times:

• UMA
• NUMA.
UNIFORM MEMORY ACCESS
Most commonly represented today by Symmetric Multiprocessor
(SMP) machines
Identical processors
Equal access and access times to memory

SHARED MEMORY (UMA)


NON UNIFORM MEMORY ACCESS
Often made by physically linking two or more SMPs
One SMP can directly access memory of another SMP
Not all processors have equal access time to all memories
Memory access across link is slower

SHARED MEMORY (NUMA)


DISTRIBUTED MEMORY
Distributed memory systems require a communication network to connect
inter-processor memory.
Processors have their own local memory. There is no concept of global
address space across all processors.
The concept of cache coherency does not apply.
When a processor needs access to data in another processor, it is usually the
task of the programmer to explicitly define how and when data is communicated.
Synchronization between tasks is likewise the programmer's responsibility.

DISTRIBUTED MEMORY
HYBRID DISTRIBUTED SHARED MEMORY

The shared memory component is usually a cache coherent SMP machine. Processors on a given SMP can address that machine's memory as global.
The distributed memory component is the networking of multiple SMPs.
SMPs know only about their own memory - not the memory on another
SMP. Therefore, network communications are required to move data from
one SMP to another.

HYBRID DISTRIBUTED SHARED MEMORY


Parallel programming models exist as an abstraction above
hardware and memory architectures.

There are several parallel programming models in common use:

Shared Memory
Threads
Message Passing
Data Parallel

Although it might not seem apparent, these models are NOT specific
to a particular type of machine or memory architecture. In fact, any of
these models can (theoretically) be implemented on any underlying
hardware.
SHARED MEMORY MODEL
In the shared-memory programming model, tasks share a common address space, which they read and write asynchronously.

Various mechanisms such as locks / semaphores may be used to control access to the shared memory.
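A POSIX semaphore is one such mechanism; the sketch below is illustrative (the counter, iteration count, and thread count are arbitrary):

```c
#include <semaphore.h>
#include <pthread.h>
#include <stdio.h>

int shared_counter = 0;          /* data in the common address space */
sem_t counter_sem;               /* semaphore guarding it */

static void *task(void *arg)
{
    (void)arg;
    for (int i = 0; i < 10000; i++) {
        sem_wait(&counter_sem);  /* acquire access */
        shared_counter++;        /* asynchronous reads/writes are serialized here */
        sem_post(&counter_sem);  /* release access */
    }
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    sem_init(&counter_sem, 0, 1);   /* 0 = shared between threads of this process */
    pthread_create(&t1, NULL, task, NULL);
    pthread_create(&t2, NULL, task, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("shared_counter = %d\n", shared_counter);  /* always 20000 */
    sem_destroy(&counter_sem);
    return 0;
}
```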
THREADS MODEL
In the threads model of parallel
programming, a single process can have
multiple, concurrent execution paths.
The main program a.out is scheduled to run
by the native operating system. a.out loads
and acquires all of the necessary system and
user resources to run.
a.out performs some serial work, and then
creates a number of tasks (threads) that can
be scheduled and run by the operating system
concurrently.
Each thread has local data, but also, shares
the entire resources of a.out. Each thread also
benefits from a global memory view because it
shares the memory space of a.out.
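A minimal sketch of this model with POSIX threads (the array and thread count are invented for illustration): the main program does some serial work, then creates threads that each have local data but share the process's global memory:

```c
#include <pthread.h>
#include <stdio.h>

#define NUM_THREADS 4

double shared_array[NUM_THREADS];    /* visible to every thread (shared resources of a.out) */

static void *thread_work(void *arg)
{
    long id = (long)arg;             /* local data: each thread has its own id */
    shared_array[id] = id * 2.0;     /* but all threads see the same global array */
    return NULL;
}

int main(void)
{
    pthread_t threads[NUM_THREADS];

    printf("a.out: doing some serial work first\n");

    /* Create a number of threads that the OS schedules concurrently. */
    for (long i = 0; i < NUM_THREADS; i++)
        pthread_create(&threads[i], NULL, thread_work, (void *)i);

    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);

    for (int i = 0; i < NUM_THREADS; i++)
        printf("shared_array[%d] = %.1f\n", i, shared_array[i]);
    return 0;
}
```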
MESSAGE PASSING MODEL

A set of tasks use their own local memory during computation. Multiple tasks can reside on the same physical machine as well as across an arbitrary number of machines.

Tasks exchange data through communications by sending and receiving messages.

Data transfer usually requires cooperative operations to be performed by each process. For example, a send operation must have a matching receive operation.
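A minimal sketch using MPI (assuming an MPI implementation and at least two processes, e.g. launched with mpirun -np 2): task 0 sends a value from its local memory and task 1 posts the matching receive:

```c
#include <mpi.h>
#include <stdio.h>

/* Two tasks, each with its own local memory, exchanging data by messages.
   Every send is matched by a corresponding receive. */
int main(int argc, char **argv)
{
    int rank, value;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;                                  /* lives in task 0's local memory */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("task 1 received %d from task 0\n", value);
    }

    MPI_Finalize();
    return 0;
}
```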
DATA PARALLEL MODEL

Most of the parallel work focuses on performing operations on a data set. The data set is typically organized into a common structure, such as an array or cube.

A set of tasks work collectively on the same data structure; however, each task works on a different partition of the same data structure.

Tasks perform the same operation on their partition of work, for example, "add 4 to every array element".
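A minimal sketch of the "add 4 to every array element" example, assuming OpenMP; the runtime partitions the loop iterations among the threads:

```c
#include <stdio.h>

#define N 16

int main(void)
{
    int array[N];
    for (int i = 0; i < N; i++)
        array[i] = i;

    /* The iterations are partitioned among the threads: every task performs
       the same operation ("add 4") on its own part of the array. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        array[i] += 4;

    for (int i = 0; i < N; i++)
        printf("%d ", array[i]);
    printf("\n");
    return 0;
}
```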
