
A SHORT INTRODUCTION TO PROBABILITY

Because of the stochastic nature of genetics
and evolution, we have to rely on the theory
of probability.
Terminology
The possible outcomes of a stochastic process
are called events. (A deterministic process
has only one possible outcome.)
A stochastic process may have a finite or an
infinite number of outcomes.
The probability of a particular event is the
fraction of outcomes in which the event
occurs. The probability of event A is denoted
by P(A).
Terminology
Probability values are between 0 (the event
never occurs) and 1 (the event always
occurs).
Events may or may not be mutually
exclusive.
Events that are not mutually exclusive can
occur together. Two events are called
independent if the occurrence of one does not
change the probability of the other.
The birth of a son and the birth of a
daughter are mutually exclusive events.

The birth of a daughter and the birth of a
carrier of the sickle-cell anemia allele are
not mutually exclusive (in this case they are
also independent events).
Terminology
The sum of probabilities of all mutually
exclusive events in a process is 1. For
example, if there are n possible mutually
exclusive outcomes, then

$$\sum_{i=1}^{n} P(i) = 1$$
Simple probabilities
If A and B are mutually exclusive events,
then the probability that either A or B occurs
is the probability of their union
$$P(A \cup B) = P(A) + P(B)$$
Example: The probability of a hat being red is ¼, the probability of
the hat being green is ¼, and the probability of the hat being black is
½. Then, the probability of a hat being red OR black is ¾.
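The hat example can be checked with exact fractions. This is only a minimal sketch: the dictionary name `p_hat` is an illustrative choice, not something from the slides.

```python
from fractions import Fraction

# Probabilities of the three mutually exclusive hat colours from the example.
p_hat = {
    "red": Fraction(1, 4),
    "green": Fraction(1, 4),
    "black": Fraction(1, 2),
}

# All mutually exclusive outcomes of the process together sum to 1.
total = sum(p_hat.values())

# Addition rule for mutually exclusive events: P(red or black) = P(red) + P(black).
p_red_or_black = p_hat["red"] + p_hat["black"]

print(total, p_red_or_black)
```

Using `Fraction` instead of floats keeps the sums exact, so the comparison with 1 and with ¾ is not affected by rounding.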

Simple probabilities
If A and B are independent events, then the
probability that both A and B occur is the
probability of their intersection

$$P(A \cap B) = P(A)\,P(B)$$


Simple probabilities
Example: The probability that a US president is bearded is
~14%, and the probability that a US president died in office is
~19%. If the two events are independent, the probability that
a president both had a beard and died in office is
0.14 × 0.19 ≈ 3%, so about 1.3 bearded presidents are expected
to fulfill the two conditions. In reality, 2 bearded presidents
died in office. (A close enough result.)
Presidents who died in office: Harrison, Taylor, Lincoln*, Garfield*, McKinley*, Harding, Roosevelt, Kennedy* (*assassinated)
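The product rule for independent events can be illustrated with a small simulation. This is a sketch under stated assumptions: the function `joint_frequency`, the number of trials, and the seed are hypothetical choices made for illustration, and the two events are drawn independently by construction.

```python
import random

def joint_frequency(p_a, p_b, trials=100_000, seed=0):
    """Fraction of trials in which two independently drawn events both occur."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    both = 0
    for _ in range(trials):
        a_occurs = rng.random() < p_a
        b_occurs = rng.random() < p_b  # drawn separately, hence independent of A
        if a_occurs and b_occurs:
            both += 1
    return both / trials

p_bearded, p_died = 0.14, 0.19       # approximate values quoted in the example
predicted = p_bearded * p_died       # product rule: P(A and B) = P(A) P(B)
simulated = joint_frequency(p_bearded, p_died)

print(f"predicted {predicted:.4f}, simulated {simulated:.4f}")
```

The simulated frequency should land close to the product 0.14 × 0.19, which is the roughly 3% quoted in the example.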
Conditional probabilities
What is the probability that event A occurs
given that event B did occur? The conditional
probability of A given B is
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
Example: The probability that a US president dies in office given that
he is bearded is 0.03/0.14 ≈ 21%. Thus, out of 6 bearded presidents, 21%
(or 1.3) are expected to die in office. In reality, 2 died. (Again, a
close enough result.)
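The arithmetic of the conditional-probability example can be reproduced as follows. This is a minimal sketch: the helper name `conditional` is hypothetical, and the inputs are the rounded values quoted in the example.

```python
def conditional(p_a_and_b, p_b):
    """P(A | B) = P(A and B) / P(B); undefined when P(B) is zero."""
    if p_b <= 0:
        raise ValueError("P(B) must be positive")
    return p_a_and_b / p_b

# A = died in office, B = bearded; rounded values from the example.
p_died_given_bearded = conditional(0.03, 0.14)

# Expected number of deaths in office among the 6 bearded presidents.
expected_deaths = 6 * p_died_given_bearded

print(f"{p_died_given_bearded:.3f} {expected_deaths:.2f}")
```

Note that the divisor is the probability of the conditioning event (being bearded), not of the event whose probability is being asked about.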

Permutations
A permutation is an ordered arrangement of
events. The number of possible permutations,
i.e., the number of different orders in which
r events drawn from n possible events can
occur, is
$$N_p = \frac{n!}{(n-r)!}$$
where r is the number of events in the series, n is the
number of possible events, and n! denotes the factorial of n
= the product of all the positive integers from 1 to n
(with 0! = 1).

Permutations
In how many ways can 8 CDs be
arranged on a shelf?
$$N_p = \frac{n!}{(n-r)!}, \quad n = 8, \; r = 8$$

$$N_p = \frac{8!}{(8-8)!} = \frac{8!}{1} = 40{,}320$$
Permutations
In how many ways can 4 CDs (out of a
collection of 8 CDs) be arranged on a
shelf?
$$N_p = \frac{n!}{(n-r)!}, \quad n = 8, \; r = 4$$

$$N_p = \frac{8!}{(8-4)!} = \frac{8!}{4!} = 1{,}680$$
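Both CD examples can be verified numerically. The sketch below computes the permutation count from factorials and cross-checks the second case by brute-force enumeration; the function name `n_permutations` is an illustrative choice.

```python
import math
from itertools import permutations

def n_permutations(n, r):
    """Number of ordered arrangements of r items chosen from n: n! / (n - r)!."""
    return math.factorial(n) // math.factorial(n - r)

all_eight = n_permutations(8, 8)    # arranging all 8 CDs
four_of_eight = n_permutations(8, 4)  # arranging 4 CDs out of 8

# Cross-check by listing every ordered selection of 4 items out of 8.
enumerated = sum(1 for _ in permutations(range(8), 4))

print(all_eight, four_of_eight, enumerated)
```

On Python 3.8 or later, `math.perm(n, r)` returns the same quantity directly.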

Combinations
When the order in which the events occurred
is of no interest, we are dealing with
combinations. The number of possible
combinations is
$$N_c = \binom{n}{r} = \frac{n!}{r!\,(n-r)!}$$
where r is the number of events in the series, n is the
number of possible events, and n! denotes the factorial of n
= the product of all the positive integers from 1 to n.
Combinations
How many groups of 4 CDs are there in a
collection of 8 CDs?

$$N_c = \binom{n}{r} = \frac{n!}{r!\,(n-r)!}, \quad n = 8, \; r = 4$$

$$N_c = \binom{8}{4} = \frac{8!}{4!\,(8-4)!} = \frac{8!}{4!\,4!} = 70$$
Probability Distribution
The probability distribution refers
to the frequency with which all
possible outcomes occur. There are
numerous types of probability
distribution.
The uniform distribution
A variable is said to be uniformly distributed if the
probabilities of all possible outcomes are equal to one
another. Thus, the probability P(i), where i is one of n
possible outcomes, is
$$P(i) = \frac{1}{n}$$
The binomial distribution
A process that has only two possible outcomes is called a
binomial process. In statistics, the two outcomes are
frequently denoted as success and failure. The
probabilities of a success or a failure are denoted by p and
q, respectively. Note that p + q = 1. The binomial
distribution gives the probability of exactly k successes in
n trials
$$P(k) = \binom{n}{k}\, p^k (1-p)^{n-k}$$
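As a worked illustration of the binomial formula, the sketch below computes the probability of exactly 2 daughters among 4 births. The value p = 0.5 for a daughter is an assumption made only for this example, and `binomial_pmf` is a hypothetical helper name.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Assumed for illustration: 4 births, each a daughter with probability 0.5.
p_two_daughters = binomial_pmf(2, 4, 0.5)

# The probabilities of all possible k (0 through n) must sum to 1.
total = sum(binomial_pmf(k, 4, 0.5) for k in range(5))

print(p_two_daughters, total)
```

Here the binomial coefficient counts the 6 birth orders that contain exactly two daughters, and each order has probability 0.5^4.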
The binomial distribution
The mean and variance of a binomially distributed variable
are given by

$$\mu = np$$

$$V = npq$$
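The mean and variance formulas can be checked against their definitions by summing over the whole distribution. This is a sketch only: n = 10 and p = 0.3 are arbitrary illustrative values, not taken from the slides.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.3      # arbitrary illustrative values
q = 1 - p

pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Mean and variance computed directly from their definitions.
mean = sum(k * pk for k, pk in enumerate(pmf))
variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))

print(f"mean {mean:.4f} vs np = {n * p:.4f}")
print(f"variance {variance:.4f} vs npq = {n * p * q:.4f}")
```

The directly computed values agree with np and npq up to floating-point rounding.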
The Poisson distribution
Siméon Denis Poisson, 1781-1840 (“Poisson d’April”)
When the probability of “success” is very small, e.g., the
probability of a mutation, then $p^k$ and $(1-p)^{n-k}$ become
too small to calculate exactly by the binomial distribution.
In such cases, the Poisson distribution becomes useful.
Let $\lambda$ be the expected number of successes in a process
consisting of n trials, i.e., $\lambda = np$. The probability of
observing k successes is
$$P(k) = \frac{\lambda^k e^{-\lambda}}{k!}$$
The mean and variance of a Poisson distributed
variable are given by $\mu = \lambda$ and $V = \lambda$, respectively.
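To see how closely the Poisson distribution tracks the binomial when the success probability is small, the two can be compared side by side. This is a sketch under assumed values: n = 10,000 trials and p = 0.0001 are illustrative numbers chosen so that the expected number of successes is 1, and the function names are hypothetical.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """Exact binomial probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """Poisson probability of k successes when lam successes are expected."""
    return lam**k * exp(-lam) / factorial(k)

n, p = 10_000, 1e-4   # assumed illustrative values: many trials, rare success
lam = n * p           # expected number of successes, here 1.0

# Largest pointwise difference between the two distributions for k = 0..5.
max_diff = max(
    abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(6)
)

for k in range(6):
    print(k, f"{binomial_pmf(k, n, p):.6f}", f"{poisson_pmf(k, lam):.6f}")
```

The Poisson form needs only the single parameter for the expected count, which is convenient when n is very large and p is known only through the product np.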
Normal Distribution
Gamma Distribution
