Unit 2

UNIT 2
Atul B Surwase
PROBABILITY
PROBABILITY
 Probability is a measure of how likely an
event is to occur.
PROBABILITY
 Probabilities are written as:
Fractions from 0 to 1
Decimals from 0 to 1
Percents from 0% to 100%
PROBABILITY
 If an event is certain to happen, then the
probability of the event is 1 or 100%.
 If an event will NEVER happen, then the
probability of the event is 0 or 0%.
 If an event is just as likely to happen as to
not happen, then the probability of the
event is ½, 0.5 or 50%.
PROBABILITY
Impossible Unlikely Equal Chances Likely Certain
0 0.5 1
0% 50% 100%
½
PROBABILITY
 When a meteorologist states that the chance of
rain is 50%, the meteorologist is saying that it is
equally likely to rain or not to rain.
 If the chance of rain rises to 80%, it is more likely
to rain.
 If the chance drops to 20%, then it may rain, but
it probably will not rain.
PROBABILITY
 What are some events that will never
happen and have a probability of 0%?
 What are some events that are certain to
happen and have a probability of 100%?
 What are some events that have equal
chances of happening and have a
probability of 50%?
PROBABILITY
 The probability of an event is written:
P(event) = number of ways event can occur
total number of outcomes
PROBABILITY
 An outcome is a possible result of a
probability experiment
 When rolling a number cube, the possible
outcomes are 1, 2, 3, 4, 5, and 6
PROBABILITY
 An event is a specific result of a
probability experiment
 When rolling a number cube, the event of
rolling an even number is 3 (you could roll a
2, 4 or 6).
PROBABILITY
What is the probability of getting heads
when flipping a coin?
P(heads) = number of ways = 1 head on a coin = 1
total outcomes = 2 sides to a coin = 2
P(heads)= ½ = 0.5 = 50%
Properties of Probabilities
1. The probability of any event E, P(E), must be between 0

and 1 inclusive. That is,
0 < P(E) < 1.
2. If an event is impossible, the probability of the event is

0.
3. If an event is a certainty, the probability of the event is 1.
4. If S = {e1, e2, …, en}, then
P(e1) + P(e2) + … + P(en) = 1.
13
TRY THESE:
B A 1. What is the probability that the spinner

C D will stop on part A?
2. What is the probability that the

3 1 spinner will stop on
2 (a) An even number?
(b) An odd number?
A 3. What is the probability that the

C B spinner will stop in the area
marked A?
PROBABILITY WORD
PROBLEM:
 Lawrence is the captain of his track team. The
team is deciding on a color and all eight members
wrote their choice down on equal size cards. If
Lawrence picks one card at random, what is the
probability that he will pick blue?
Number of blues = 3
Total cards = 8 3/8 or 0.375 or 37.5%
blue blue
green black
yellow
blue
red black
LET’S WORK THESE
TOGETHER
 Donald is rolling a number cube labeled 1 to 6.
What is the probability of the following?
a.) an odd number
odd numbers – 1, 3, 5
3/6 = ½ = 0.5 = 50%
total numbers – 1, 2, 3, 4, 5, 6
b.) a number greater than 5
numbers greater – 6 1/6 = 0.166 = 16.6%
total numbers – 1, 2, 3, 4, 5, 6
TRY THESE:
1 2 1. What is the probability of spinning a

3 4 number greater than 1?
2. What is the probability that a spinner

with five congruent sections numbered
1-5 will stop on an even number?
3. What is the probability of rolling a

multiple of 2 with one toss of a number
cube?
1) If you toss a die, what’s
the probability that you
roll a 3 or less?
a. ½
 Two dice are rolled and the sum of the face values is six.
What is the probability that at least one of the dice came
up a 3?
How can you get a 6 on two dice?
15, 51, 24, 42, 33
One of these five has a 3.
 1/5
Finding Probabilities of Events
You roll a six-sided die whose sides are

numbered from
1 through 6.
Find the probability of rolling a 4.

SOLUTIO
N
Only one outcome corresponds to rolling a 4.
number of ways to roll a 4 1

P (rolling a 4) = =
number of ways to roll the die 6

numbered from
1 through 6.
Find the probability of rolling an odd number.

SOLUTIO
N
Three outcomes correspond to rolling an odd number:
rolling a 1, 3, or a 5.
number of ways to roll an odd number 3 1

P (rolling odd number) = = =
number of ways to roll the die 6 2

numbered from
1 through 6.
Find the probability of rolling a number less than 7.
SOLUTIO
N
All six outcomes correspond to rolling a number less than 7.
number of ways to roll less than 7 6

P (rolling less than 7 ) = = =1
number of ways to roll the die 6
There are 52 cards in a
deck. So what are my
chances of picking an
ace?
How many aces are in a deck? 4
How many cards are in a deck? 52
So I have a
4/52 or 1/13
chance of
drawing an
ace!
PROBABLY OF EVENTS
 1 ) Favourable event
 2) Equally likely event
 3)Mutually Exclusive event
 4)Exhaustive event
 5)Independent event
 6)Dependent event
 7)Complementary event
1 ) FAVORABLE EVENT
 The number of cases favorable to an event in a
trial is the number of outcomes which entail the
happening of event.
 Ex. In tossing two coins, the number of cases to
the event of getting a head are i.e. HH,HT,TH
2) EQUALLY LIKELY EVENT
 The outcomes are said to be equally likely if none
of them is expected to occur in preference to
other.
 If each event has equal chance of happening.
 Ex. When coin is toss, he head is likely to urn up
as the tail. So H,T are equally likely
3)MUTUALLY EXCLUSIVE EVENT
 Two events are said to be mutually exclusive
when both can not happen simultaneously in a
single trial.
 Ex. If a coin is tossed head can be up or tail can
be up; but both cannot be up at the same time.
 H and T are mutually exclusive.
4)EXHAUSTIVE EVENT
 Outcomes are said o be exhaustive when they
include all possible outcomes.
 Ex. rolling a die ,the possible outcomes are
1,2,3,4,5,6. Hence exhaustive number of cases is
6.
 In Tossing two coins ,the possible outcomes are 4
i.e. HH,HT,TT,TH
5)INDEPENDENT EVENT
 Two or more events are said to be if the
occurrence of an event does not affect he
occurrence of other.
 Ex. IF a coin is thrown twice, the result of second
is no way affected by the result of the first thrown.
6)DEPENDENT EVENT
 Two events are said to be dependent , if the
occurrence of one in any trial affect the
occurrence of the other event in other trial.
 Ex. Consider he even of drawing a Card twice out
of 52 cards without replacement.
 In first case, we draw one card out of 52 cards
and the second case (without replacement), we
draw one card out of 51 cards only . Thus
outcomes of first event affect the outcomes of the
second event and they are dependent.
7)COMPLEMENTARY EVENT
 Two events are said to be complementary if they
are mutually exclusive and exhaustive.
 When a die is thrown, occurrence of an even
number(2,4,6) and odd number (1,3,5) are
complementary….
Examples on Probability
Ex.
There are 5 marbles in a bag: 4 are blue, and 1 is
red. What is the probability that a blue marble
gets picked?
 Number of ways it can happen: 4 (there are 4
blues)
 Total number of outcomes: 5 (there are 5
marbles in total)
So the probability = 4/5 = 0.8
 Suppose a coin is flipped 3 times. What is the
probability of getting two tails and one head?
 Solution: For this experiment, the sample space
consists of 8 sample points.
S = {TTT, TTH, THT, THH, HTT, HTH, HHT,
HHH}
Each sample point is equally likely to occur, so the
probability of getting any particular sample point
is 1/8. The event "getting two tails and one head"
consists of the following subset of the sample
space.
A = {TTH, THT, HTT}
The probability of Event A is the sum of the
probabilities of the sample points in A. Therefore,
P(A) = 1/8 + 1/8 + 1/8 = 3/8
 If you draw a card from a standard deck of cards,
what is the probability of drawing a face card?
 There are 4 suits with 3 face cards each. This
makes a total of 3 × 4 = 12 face cards out of 52
cards.
 An urn contains 6 red marbles and 4 black marbles.
Two marbles are drawn without replacement from
the urn. What is the probability that both of the
marbles are black?
 Solution: Let A = the event that the first marble is
black; and let B = the event that the second marble
is black. We know the following:
In the beginning, there are 10 marbles in the urn, 4 of
which are black. Therefore, P(A) = 4/10.
After the first selection, there are 9 marbles in the
urn, 3 of which are black. Therefore, P(B|A) = 3/9.
Therefore, based on the rule of multiplication:
P(A ∩ B) = P(A) P(B|A)
P(A ∩ B) = (4/10) * (3/9) = 12/90 = 2/15
 An urn contains 6 red marbles and 4 black
marbles. Two marbles are drawn with
replacement from the urn. What is the probability
that both of the marbles are black?
 Solution
 The correct answer is A. Let A = the event that the first
marble is black; and let B = the event that the second
marble is black. We know the following:
 In the beginning, there are 10 marbles in the urn, 4 of
which are black. Therefore, P(A) = 4/10.
 After the first selection, we replace the selected marble; so
there are still 10 marbles in the urn, 4 of which are black.
Therefore, P(B|A) = 4/10.
 Therefore, based on the rule of multiplication:
 P(A ∩ B) = P(A) P(B|A)
P(A ∩ B) = (4/10)*(4/10) = 16/100 = 0.16
 If you roll a die, what is the probability of it being
odd and less than 5?
There 2 numbers that are odd and less than 5: 1
and 3.

 If you roll two dice and find their sum, what is
the probability of the sum being even or greater
than 8?

There are 36 possible combinations, half (18) of
which are even; these are shaded in dark blue.
In addition, there are 6 outcomes that are greater
than 8 and not even, which are shaded in light
blue.
 This makes a total of 18 + 6 = 24 favorable
outcomes.


 If you draw a card from a deck of cards, what is
the probability of it being a face card and an ace?
 Sol: there are no cards that satisfy both
conditions, so the probability is 0
 Which of these numbers cannot be a probability?
a) 0.00001
b) 0.5
c) 1.001
d) 0
e) 1
f) 20%
 A probability is always greater than or equal to 0
and less than or equal to 1, hence
only a) and c) above cannot represent
probabilities: 0.00010 is less than 0 and 1.001 is
greater than 1.
 Two dice are rolled, find the probability that the
sum is
a) equal to 1 b) equal to 4 c) less than 13
 The sample space S of two dice is shown below.
S = { (1,1),(1,2),(1,3),(1,4),(1,5),(1,6)
(2,1),(2,2),(2,3),(2,4),(2,5),(2,6)
(3,1),(3,2),(3,3),(3,4),(3,5),(3,6)
(4,1),(4,2),(4,3),(4,4),(4,5),(4,6)
(5,1),(5,2),(5,3),(5,4),(5,5),(5,6)
(6,1),(6,2),(6,3),(6,4),(6,5),(6,6) }
 Let E be the event "sum equal to 1". There are no outcomes
which correspond to a sum equal to 1, hence
P(E) = n(E) / n(S) = 0 / 36 = 0
 b) Three possible outcomes give a sum equal to 4: E = {(1,3),
(2,2),(3,1)}, hence.
P(E) = n(E) / n(S) = 3 / 36 = 1 / 12
 c) All possible outcomes, E = S, give a sum less than 13,
hence.
P(E) = n(E) / n(S) = 36 / 36 = 1
 A die is rolled and a coin is tossed, find the
probability that the die shows an odd number
and the coin shows a head.
 The sample space S of the experiment described
in question 5 is as follows
S = { (1,H),(2,H),(3,H),(4,H),(5,H),(6,H)
(1,T),(2,T),(3,T),(4,T),(5,T),(6,T)}
 Let E be the event "the die shows an odd number
and the coin shows a head". Event E may be
described as follows
E={(1,H),(3,H),(5,H)}
 The probability P(E) is given by
P(E) = n(E) / n(S) = 3 / 12 = 1 / 4
 A card is drawn at random from a deck of cards.
Find the probability of getting the 3 of diamond.
 The sample space S of the experiment in question
6 is shwon below Let E be the event "getting the
3 of diamond". An examination of the sample
space shows that there is one "3 of diamond" so
that n(E) = 1 and n(S) = 52. Hence the probability
of event E occurring is given by P(E) = 1 / 52
 A jar contains 3 red marbles, 7 green marbles
and 10 white marbles. If a marble is drawn from
the jar at random, what is the probability that
this marble is white?
 We first construct a table of frequencies that
gives the marbles color distributions as follows
 Color frequency
Red 3
Green 7
White 10
Frequency for white color
 P(E)= ________________________________________________
Total frequencies in the above table
 = 10 / 20 = 1 / 2
 The blood groups of 200 people is distributed as
follows: 50 have type A blood, 65 have B blood type,
70 have O blood type and 15 have type AB blood. If
a person from this group is selected at random, what
is the probability that this person has O blood type?
 We construct a table of frequencies for the the blood
groups as follows
Group frequency
A 50
B 65
O 70
AB 15
Frequency for O blood
 P(E)= ________________________________________________
Total frequencies
= 70 / 200 = 0.35
 Exercises:
a) A die is rolled, find the probability that the
number obtained is greater than 4.
b) Two coins are tossed, find the probability that
one head only is obtained.
c) Two dice are rolled, find the probability that
the sum is equal to 5.
d) A card is drawn at random from a deck of
cards. Find the probability of getting the King of
heart.
 Answers to above exercises:
a) 2 / 6 = 1 / 3
b) 2 / 4 = 1 / 2
c) 4 / 36 = 1 / 9
d) 1 / 52
 Random Variables
NEED OF RANDOM VARIABLE
 Actually we know the concept of random experiment,
events, sample space and sample points. The nature of
random experiment is may be numerical or non
numerical(descriptive).
 Ex. 1) Throw a die (1,2,3,4,5,6)
 2) Toss a coin(T,H)
In nature outcome of many experiment is non numerical. It is
however inconvenient to deal with descriptive output. In
engineering we always interested and feel convenient to
deal with numerical outcomes.
So we are doing mapping from original sample space
(numerical or nonnumerical ) to numerical(real) sample
space, subject to certain constraint is called random
variable
 Random variable
 A numerical value to each outcome of a particular
experiment
S
-3 -2 -1 0 1 2 3
RANDOM VARIABLES
 A Random Variable is a set of possible
values from a random experiment.
Example: Tossing a coin: we could get
Heads or Tails.
Let's give them the
values Heads=0 and Tails=1 and we
have a Random Variable "X":
X = {0, 1}
So:
 We have an experiment (such as
tossing a coin)
 We give values to each event
 The set of values is a Random
Variable
 Sample Space
 A Random Variable's set of values
is the Sample Space.
 Example: Throw a die once
Random Variable X = "The score
shown on the top face".
 X could be 1, 2, 3, 4, 5 or 6
 So the Sample Space is {1, 2, 3, 4,
5, 6}
Random Variables can be either
Discrete or Continuous:
 Discrete Data can only take certain values (such
as 1,2,3,4,5)
 Continuous Data can take any value within a
range (such as a person's height)
 Data can be Descriptive (like "high" or "fast") or
Numerical (numbers).
 And Numerical Data can be
 Discrete or Continuous:
Discrete data is counted,
Continuous data is measured
 Discrete random variables have a countable
number of outcomes
 Examples: Dead/alive, treatment/placebo, dice,
counts, etc.
 Continuous random variables have an
infinite continuum of possible values.
 Examples: blood pressure, weight, the speed of a
car, the real numbers from 1 to 6.
DISCRETE DATA
 Discrete Data can only take certain values.
 Example: the number of students in a class (you
can't have half a student).
 Example: the results of rolling 2 dice:
 can only have the values 2, 3, 4, 5, 6, 7, 8, 9, 10,
11 and 12
PROBABILITY FUNCTIONS
 A probability function maps the possible
values of x against their respective
probabilities of occurrence, p(x)
 p(x) is a number from 0 to 1.0.
 The area under a probability function is
always 1.
DISCRETE EXAMPLE: ROLL OF A
DIE
p(x)
1/6
x
1 2 3 4 5 6
 P(x)  1
all x
CONTINUOUS DATA
 Continuous Data can take any value (within a
range)
 Examples:
 A person's height: could be any value (within the
range of human heights), not just certain fixed
heights,
 Time in a race: you could even measure it to
fractions of a second,
 A dog's weight,
 The length of a leaf,
 Lots more!
CONTINUOUS CASE
 The probability function that accompanies a

continuous random variable is a continuous
mathematical function that integrates to 1.
 For example, recall the negative exponential function
(in probability, this is called an “exponential
distribution”):
f ( x)  e  x
 This function integrates to 1:

 
e
x x
 e  0 1 1
0
0
CONTINUOUS CASE: “PROBABILITY
DENSITY FUNCTION” (PDF)
p(x)=e-x
The probability that x is any exact particular value (such as 1.9976) is 0;

we can only assign probabilities to possible ranges of x.
For example, the probability of x falling within 1 to 2:
Clinical example: Survival times

after lung transplant may roughly
follow an exponential function. p(x)=e-x
Then, the probability that a patient
will die in the second year after 1
surgery (between years 1 and 2) is
23%.
x
1 2
2 2

x x
P(1  x  2)  e  e  e  2  e 1  .135  .368  .23
1
1
EXPECTED VALUE AND VARIANCE
 All probability distributions are characterized by
an expected value (mean) and a variance
(standard deviation squared).
 Mean, Variance and Standard Deviation
 They have special notation:
 μ is the Mean of X and is also called the Expected
Value of X
 Var(X) is the Variance of X
 σ is the Standard Deviation of X
MEAN OR EXPECTED VALUE
 When we know the probability p of every
value x we can calculate the Expected Value
(Mean) of X:
 μ = Σxp
 Note: Σ is Sigma Notation, and means to sum up.
 To calculate the Expected Value:
 multiply each value by its probability
 sum them up
 It is a weighted mean: values with higher
probability have higher contribution to the mean.
EXPECTED VALUE, FORMALLY
Discrete case:
E( X )   x p(x )
all x
i i
Continuous case:
E( X )  
all x
xi p(xi )dx
SYMBOL INTERLUDE
 E(X) = µ
these symbols are used interchangeably
EXAMPLE: EXPECTED VALUE
 Recallthe following probability

distribution of ER arrivals:
x 10 11 12 13 14
P(x) .4 .2 .2 .1 .1
 x p( x)  10(.4)  11(.2)  12(.2)  13(.1)  14(.1)  11.3

i 1
i
EXPECTED VALUE
 Expected value is an extremely useful concept for
good decisionmaking!
2.3.2 EXPECTATIONS OF CONTINUOUS
RANDOM VARIABLES (2/2)
 Symmetric Random
f ( x)
Variables E( X )  m
x
 If has a p.d.f that is
f ( x) m
symmetric about a point
so that
f ( m  x)  f ( m  x)
E( X )  m
 Then, (why?)
 So that the expectation of m x
the random variable is equal
to the point of symmetry
E ( X )   xf ( x) dx
m 
 
-
xf ( x ) dx + xf ( x)dx
m
 y  2m  x
m m
 
-
xf ( x )dx +m  
-
yf ( y )dy
m
VARIANCE
 The Variance is:
 Var(X) = Σx2p − μ2
 To calculate the Variance:
 square each value and multiply by its probability
 sum them up and we get Σx2p
 then subtract the square of the Expected
Value μ2

STANDARD DEVIATION
 The Standard Deviation is the square root of the
Variance:
 σ = √Var(X)
DEFINITION AND INTERPRETATION OF

VARIANCE
 s2
Variance( )
 A positive quantity that measures the spread of the
distribution of the random variable about its mean value
 Larger values of the variance indicate that the distribution
is more spread out
 Definition: Var( X )  E (( X  E ( X )) 2 )
 E ( X 2 )  ( E ( X ))2
 Standard Deviation s
 The positive square root of the variance
 Denoted by
DEFINITION AND INTERPRETATION
OF VARIANCE
Var( X )  E (( X  E ( X )) )
2
 E ( X 2  2 XE ( X )  ( E ( X )) 2 )
 E ( X 2 )  2 E ( X ) E ( X )  ( E ( X )) 2
 E ( X 2 )  ( E ( X )) 2
f ( x)
Two distribution with
identical mean values
but different variances
x
SUMMARY
 A Random Variable is a variable whose
possible values are numerical outcomes of a
random experiment.
 The Mean (Expected Value) is: μ = Σxp
 The Variance is: Var(X) = Σx2p − μ2
 The Standard Deviation is: σ = √Var(X)
Joint Distributions
and Densities
PROBABILITY DENSITY
FUNCTION (PDF)
 In probability theory, a probability density
function (PDF), or density of a
continuous random variable, is a function that
describes the relative likelihood for this random
variable to take on a given value.
 The probability of the random variable falling within
a particular range of values is given by the integral of
this variable’s density over that range—that is, it is
given by the area under the density function but
above the horizontal axis and between the lowest and
greatest values of the range.
 The probability density function is nonnegative
everywhere, and its integral over the entire space is
equal to one.
2.2.2 PROBABILITY DENSITY FUNCTION
 Probability Density Function (p.d.f.)
 Probabilistic properties of a continuous random variable
f ( x) �0
�statespace
f ( x)dx  1
2.2.2 PROBABILITY DENSITY FUNCTION
 Example 14
 Suppose that the diameter of a metal cylinder has a p.d.f
f ( x)  1.5  6( x  50.2) 2 for 49.5 �x �50.5

f ( x)  0, elsewhere
f ( x)
49.5 50.5 x
JOINT DISTRIBUTIONS,
CONTINUOUS CASE
 In the following, X and Y are continuous random
variables. Most of the concepts and formulas
below are analogous to those for the discrete
case, with integrals replacing sums.
The principal difference between continuous lies in
the definition of the p.d.f./p.m.f. f(x, y):
The formula f(x, y) = P(X = x, Y = y) is no longer
valid, and there is no simple and direct way to
obtain f(x, y) from X and Y .
In the general formulas below, if a range of integration is not
explicitly given, the integrals are to be taken over the range in
which the density function is defined.
TWO CONTINUOUS RANDOM
VARIABLES
Example 5-12
VARIABLES
Figure 5-9 Region of integration for the probability that X <

1000 and Y < 2000 is darkly shaded.
VARIABLES
5-2.2 Marginal Probability Distributions

Definition
VARIABLES
Example 5-13
VARIABLES
Figure 5-10 Region of

integration for the
probability that Y <
2000 is darkly shaded
and it is partitioned into
two regions with x <
2000 and and x > 2000.
TWO DISCRETE RANDOM
VARIABLES
5-1.1 Joint Probability Distributions

TWO DISCRETE RANDOM
VARIABLES
Definition: Marginal Probability Mass Functions

Moments of Random
Variable
NEED OF MOMENTS
 Mean is not a sufficient criteria for classification
because mean of two different random variable may
be same.
 We have to find some more numbers other than
mean of random variable
 So we use concept of moments…
 It is possible to completely characterize the behavior
of random variable through its higher order
moments……..
 In mathematics, a moment is a specific quantitative
measure, used in both mechanics and statistics, of the
shape of a set of points.
 If the points represent probability density, then the
zeroth moment is the total probability (i.e. one),
 the first moment is the mean,
 the second central moment is the variance,
 the third moment is the skewness, and
 the fourth moment (with normalization and shift) is
the kurtosis.
 Mean
 The expectation (mean or the first moment) of a
discrete random variable X is defined to be:
 E(X)=∑x f(x)
 where the sum is taken over all possible values of X.
 E(X) is also called the mean of X or the average of X,
because it represents the longrun average value if the
experiment were repeated infinitely many times.
TYPES OF MEAN
 Arithmetic mean
 The arithmetic mean (or simply
"mean") of a sample , usually
denoted by , is the sum of the
sampled values divided by the
number of items in the sample:
 For example, the arithmetic
mean of five values: 4, 36, 45, 50,
75 is
 Geometric mean (GM)
 The geometric mean is an
average that is useful for
sets of positive numbers
that are interpreted
according to their product
and not their sum (as is
the case with the
arithmetic mean) e.g.
rates of growth.
 For example, the
geometric mean of five
values: 4, 36, 45, 50, 75 is:
 Harmonic mean (HM)
 The harmonic mean is an
average which is useful for
sets of numbers which are
defined in relation to some
unit, for example speed
(distance per unit of time).
 For example, the harmonic
mean of the five values: 4, 36,
45, 50, 75 is
 Relationship between AM, GM, and HM
 AM, GM, and HM satisfy these inequalities:
 Equality holds only when all the elements of the given
sample are equal.
VARIANCE
 In probability theory and statistics
, variance measures how far a set of numbers is
spread out.
 A variance of zero indicates that all the values
are identical.
 Variance is always nonnegative:
 a small variance indicates that the data points
tend to be very close to the mean (expected value)
and hence to each other,
 while a high variance indicates that the data
points are very spread out around the mean and
from each other.
 An equivalent measure is the square root of the
variance, called the standard deviation. The
standard deviation has the same dimension as
the data, and hence is comparable to deviations
from the mean.
 There are two distinct concepts that are both
called "variance". One variance is a characteristic
of a set of observations. The other is part of a
theoretical probability distribution and is defined
by an equation.
 The variance is one of several descriptors of
a probability distribution. In particular, the
variance is one of the moments of a distribution.
CALCULATING THE VARIANCE OF A FIXED
SET OF NUMBERS
 Suppose a population of numbers consists of 3, 4, 7,
and 10. The arithmetic mean of these numbers, often
informally called the "average", is (3+4+7+10)÷4 = 6.
The variance of these four numbers is the average
squared deviation from this average. These deviations
are (3–6) = –3, (4–6) = –2, (7–6) = 1, and (10–6) = 4.
Thus the variance of the four numbers is
 Definition
 The variance of a set of observed values that is
represented by random variable X is its second central
moment, the expected value of the squared deviation
from the meanμ = E[X]:
 Continuous random variable
 If the random variable X represents samples
generated by a continuous
distribution with probability density function f(x), then
the population variance is given by
 where is the expected value,
 Discrete random variable
 If the generator of random
variable X is discrete with probability mass
function x1 ↦ p1, ..., xn ↦ pn, then
 or equivalently
 where u is the expected value, i.e.
 Skewness
 The third central moment is a measure of the
lopsidedness of the distribution; any symmetric
distribution will have a third central moment, if
defined, of zero.
 The normalised third central moment is called
the skewness, often γ.
 A distribution that is skewed to the left (the tail of the
distribution is longer on the left) will have a negative
skewness.
 A distribution that is skewed to the right (the tail of
the distribution is longer on the right), will have a
positive skewness.
 Introduction
 Consider the two distributions in the figure just below.
Within each graph, the bars on the right side of the
distribution taper differently than the bars on the left side.
These tapering sides are called tails, and they provide a
visual means for determining which of the two kinds of
skewness a distribution has:
 negative skew: The left tail is longer; the mass of the
distribution is concentrated on the right of the figure. The
distribution is said to be leftskewed, lefttailed, or skewed to
the left.
 positive skew: The right tail is longer; the mass of the
distribution is concentrated on the left of the figure. The
distribution is said to be rightskewed, righttailed,
or skewed to the right.
EXAMPLE DISTRIBUTION WITH NONZERO
(POSITIVE) SKEWNESS. THESE DATA ARE
FROM EXPERIMENTS ON WHEAT GRASS
GROWTH.
KURTOSIS
 The fourth central moment is a measure of the
heaviness of the tail of the distribution,
compared to the normal distribution of the same
variance.
 Since it is the expectation of a fourth power, the
fourth central moment, where defined, is always
positive; and except for a point distribution, it is
always strictly positive.
 The fourth central moment of a normal
distribution is 3σ4.
 A highkurtosis distribution has a sharper peak and
fatter tails,
 while a lowkurtosis distribution has a more rounded
peak and thinner tails.
 A distribution with positive excess kurtosis is
called leptokurtic, or leptokurtotic. "Lepto"
means "slender“. In terms of shape, a
leptokurtic distribution has a more acute peak
around the mean and fatter tails.
 Examples of leptokurtic distributions include
the Student's tdistribution, Rayleigh
distribution, Laplace distribution, exponential
distribution,Poisson distribution and
the logistic distribution
 A distribution with negative excess kurtosis is
called platykurtic, or platykurtotic. "Platy" means
"broad". In terms of shape, a platykurtic distribution
has a lower, wider peak around the mean and thinner
tails
 Examples of platykurtic distributions include the
continuous or discrete uniform distributions, and
the raised cosine distribution.
PHYSICAL SIGNIFICANCE OF
MOMENTS
In communication system we deal with signals
which may be modeled as random variables. The
various moments of random variable indicated various
characteristics of RV.
1) Mean: Represent dc contents of signal.
2)Variance: Representation of intensity varying
component (ac component) of signal.
SD is measure of spread of density function about the
mean.
3) Skewness:Asymmetry of density function about mean
4) Kurtosis: Measure of peakdness or flatness of a
distribution
ESTIMATION OF
PARAMETER FROM
SAMPLES
INTRODUCTION
 Estimation theory is a branch of statistics that
deals with estimating the values
of parameters based on measured/empirical
data that has a random component.
 The parameters describe an underlying physical
setting in such a way that their value affects the
distribution of the measured data.
 Thus parameters are used for classification…
WHAT IS A PARAMETER?
 Parameters are descriptive measures of an entire
population used as the inputs for a probability
distribution function (PDF) to generate
distribution curves.
 Parameters are usually signified by Greek letters
to distinguish them from sample statistics.
 For example, the population mean is represented
by the Greek letter mu (μ) and the population
standard deviation by the Greek letter sigma ( σ).
 Parameters are fixed constants, that is, they do
not vary like variables. However, their values are
usually unknown because it is infeasible to
measure an entire population.
 Each distribution is entirely defined by several
specific parameters, usually between one and
three. The following table provides examples of
the parameters required for three distributions.
The parameter values determine the location and
shape of the curve on the plot of distribution, and
each unique combination of parameter values
produces a unique distribution curve.
 For example, a normal distribution is defined by
two parameters, the mean and standard
deviation. If these are specified, the entire
distribution is precisely known.
 The solid line represents a normal distribution
with a mean of 100 and a standard deviation of
15. The dashed line is also a normal distribution,
but it has a mean of 120 and a standard
deviation of 30.
 Parameters are descriptive measures of an entire
population. However, their values are usually
unknown because it is infeasible to measure an
entire population. Because of this, you can take a
random sample from the population to obtain
parameter estimates.
 These estimates are also known as sample

statistics. A fitted distribution line is a curve
based on the parameter estimates instead of on
the true parameter values.
PARAMETERS
1) Mean
2)Variance
3)Standard deviation
4)Moments
a) Mean
b) Variance
c) Skewness
d) Kurtosis
5) Pdf
6) Correlation
7) Cdf
8) Sample size
9) Distances
OVERVIEW OF PR
 Pattern recognition constitutes an important tool
in various application domains, but
unfortunately, that is not always an easy task to
carry out. Commonly, one can encounter four
major methodologies in PRSs; which are:
 statistical approach,
 syntactic approach,
 template matching,
 neural networks
STATISTICAL APPROACH
 Typically, statistical PRSs are based on statistics
and probabilities. In these systems, features are
converted to numbers which are placed into a vector
to represent the pattern.
 This approach is most intensively used in practice
because it is the simplest to handle
 In this approach, patterns to be classified are
represented by a set of features defining a specific
multidimensional vector: by doing so, each pattern
is represented by a point in the multidimensional
features space.
 To compare patterns, this approach uses measures
by observing distances between points in this
statistical space.
SYNTACTIC APPROACH
 Also called structural PRSs, these systems are
based on the relation between features. In this
approach, patterns are represented by structures
which can take into account more complex
relations between features than numerical
feature vectors used in statistical PRSs
TEMPLATE MATCHING
 Template matching approach is widely used in
image processing to localize and identify shapes in
an image.
 In this approach, one looks for parts in an image
which match a template (or model). In visual
pattern recognition, one compares the template
function to the input image by maximizing the
spatial crosscorrelation or by minimizing a
distance: that provides the matching rate.
NEURAL NETWORKS
 Typically, an artificial neural network (ANN) is a
selfadaptive trainable process that is able to
learn to resolve complex problems based on
available knowledge.
 A set of available data is supplied to the system
so that it finds the most adapted function among
an allowed class of functions that matches the
input.
A GENERIC SCHEME OF A
PATTERN RECOGNITION SYSTEM
PATTERN RECOGNITION
APPLICATIONS AND AN
OVERVIEW OF ADVANCES
 Pattern recognition is studied in many fields,
including psychology, ethnology, forensics,
marketing, artificial intelligence, remote sensing,
agriculture, computer science, data mining,
document classification, multimédia, biometrics,
surveillance, médical Imaging, bioinformatics and
internet search. Pattern recognition helps to resolve
various problems such as: optical character
recognition (OCR), zipcode recognition, bank check
recognition, industrial part inspection, speech
recognition, document recognition, face recognition,
gait recognition or gesture recognition, fingerprint
recognition,image indexing or retrieval, image
segmentation
3.1 PATTERN RECOGNITION IN
ROBOTICS
 In robotics, visual serving or visual tracking is of
high interest. For example visual tracking
allows, robots to extract themselves the content
of the observed scene as a human observer can do
it by changing his different perspectives and
scales of observation.
PATTERN RECOGNITION IN
BIOMETRICS
 Access control to governmental applications like
biometric passport and fight against terrorism.
 In this applications domain, one measures and
analyses human physical (or physiological or
biometric) and behavioral characteristics for
authentication (or recognition) purposes.
 Examples of biometric characteristics include
fingerprints, eye retinas and irises, facial
patterns and hand geometry measurement, DNA
(Deoxyribonucleic acid). Examples of biometric
behavioral characteristics include signature, gait
and typing patterns.
CONTENTBASED IMAGE
RETRIEVAL
 Contentbased image retrieval systems aim at
automatically describing images by using their
own content: the colour, the texture and the
shape or their combination.
 During the last decade the research on image
retrieval became of high importance.
 Colourbased features
 Colour features are based on colour distribution inside the
image.
 There are many approaches to define colourbased features:
colour space dominant, colour histogram o colour space.
Various colour representation space exist: redgreenblue
(RGB) space, huesaturation value (HSV).From these
representation features are defined based the colour
histograms.
 There are different types of colour histograms depending on
how the colour space is partitioned. The fixed binning for all
images based on scalar linear quantisation, the adaptive
binning based on an adaptive quantisation and the clustered
binning based on the concept of vector quantisation. Some
particular distances between histograms or main modes of
histograms are used to measure the similarity/dissimilarity
between colour histograms:
 Texturebased features
 For each pixel of the image, one can determine
the histogram of grey levels in predefined
 neighbouring region centred on that pixel.
Distribution of pairs of grey levels for a given
 spatial relation on pixels can be observed in co
occurrence matrix M(i,j)
SHAPEBASED FEATURES
 There are many approaches to estimates some
properties of the shapes.
 The elongation (EL), The mass deficit coefficient
(MD), The mass excess coefficient (ME), The
isotropic factor (IF), The compactness(CO)
PARAMETER ESTIMATION
METHODS
 Maximum likelihood: values of parameters are
fixed but unknown
 Bayesian estimation: parameters as random
variables having some known a priori
distribution

Unit 2

Încărcat de

Informații document

Titlu original

Drepturi de autor

Formate disponibile

Partajați acest document

Partajați sau inserați document

Opțiuni de partajare

Vi se pare util acest document?

Este necorespunzător acest conținut?

Drepturi de autor:

Formate disponibile

Unit 2

Încărcat de

Drepturi de autor:

Formate disponibile

UNIT 2

1. The probability of any event E, P(E), must be between 0

0 < P(E) < 1.

2. If an event is impossible, the probability of the event is

P(e1) + P(e2) + … + P(en) = 1.

B A 1. What is the probability that the spinner

2. What is the probability that the

A 3. What is the probability that the

1 2 1. What is the probability of spinning a

2. What is the probability that a spinner

3. What is the probability of rolling a

You roll a six-sided die whose sides are

Find the probability of rolling a 4.

number of ways to roll a 4 1

You roll a six-sided die whose sides are

Find the probability of rolling an odd number.

number of ways to roll an odd number 3 1

You roll a six-sided die whose sides are

Find the probability of rolling a number less than 7.

number of ways to roll less than 7 6

 The probability function that accompanies a

 This function integrates to 1:

The probability that x is any exact particular value (such as 1.9976) is 0;

Clinical example: Survival times

 Recallthe following probability

 x p( x)  10(.4)  11(.2)  12(.2)  13(.1)  14(.1)  11.3

f ( x)  1.5  6( x  50.2) 2 for 49.5 �x �50.5

Figure 5-9 Region of integration for the probability that X <

5-2.2 Marginal Probability Distributions

Figure 5-10 Region of

5-1.1 Joint Probability Distributions

Definition: Marginal Probability Mass Functions

 These estimates are also known as sample

S-ar putea să vă placă și