Views of Probability
The probability that this coin will land heads up is $\frac{1}{2}$.
Frequentist (sometimes called objectivist) viewpoint: This
statement means that if the experiment were repeated
many, many times, the long-run average proportion of
heads would tend to $\frac{1}{2}$.
Bayesian (sometimes called subjectivist or personal)
viewpoint: This statement means that the person making
the statement has a prior opinion about the coin toss such
that he or she would as soon guess heads or tails if the
rewards are equal.
In the frequentist viewpoint, the probability of an event $A$,
$P(A)$, represents the long-run frequency of event $A$ in
repeated experiments.
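The long-run frequency reading can be illustrated with a small simulation. This is only a sketch: the function name `proportion_of_heads`, the seed, and the toss counts are assumptions made for illustration, not part of the notes.

```python
import random

def proportion_of_heads(n_tosses, p_heads=0.5, seed=0):
    """Simulate n_tosses coin flips and return the observed proportion of heads."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    heads = sum(1 for _ in range(n_tosses) if rng.random() < p_heads)
    return heads / n_tosses

# The proportion should settle near p_heads as the number of tosses grows.
for n in (10, 1_000, 100_000):
    print(n, proportion_of_heads(n))
```

With few tosses the proportion can be far from 0.5; with many tosses it should be close to 0.5, which is what the frequentist statement asserts.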
In the Bayesian viewpoint, the probability of an event $A$,
$P(A)$, has the following meaning: for a game in which the
Bayesian will be paid \$1 if $A$ occurs, $P(A)$ is the amount
of money the Bayesian would be willing to pay to buy into
the game. Thus, if the Bayesian is willing to pay 50 cents
to buy in, $P(A) = 0.5$. Note that this concept of probability is
personal: $P(A)$ may vary from person to person depending
on their opinions.
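One way to read the buy-in price is as the price at which the person expects neither to gain nor to lose. The sketch below assumes a hypothetical helper `expected_profit` and illustrative prices; it is not taken from the notes.

```python
def expected_profit(price, prob_a, payout=1.0):
    """Expected net gain from paying `price` for a ticket that pays `payout` if A occurs."""
    return prob_a * payout - price

# For a person whose P(A) is 0.5, a price of 0.50 gives zero expected profit.
for price in (0.40, 0.50, 0.60):
    print(price, expected_profit(price, prob_a=0.5))
```

A price below $P(A)$ looks like a bargain to this person and a price above it looks like a bad deal, so $P(A)$ is the break-even price.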
In the Bayesian viewpoint, we can make probability
statements about lots of things, not just data which are
subject to random variation. For example, I might say that
the probability that Franklin D. Roosevelt had a cup of
coffee on February 21, 1935 is .68. This does not refer to
any limiting frequency. It reflects my strength of belief
that the proposition is true.
Rules for Manipulating Subjective Probabilities
All the usual rules for manipulating probabilities apply to
subjective probabilities. For example,
Theorem 11.1: If $C_1$ and $C_2$ are mutually exclusive, then
$P(C_1 \cup C_2) = P(C_1) + P(C_2)$.
Proof: Suppose a person thinks a fair price for $C_1$ is
$p_1 = P(C_1)$ and that for $C_2$ is $p_2 = P(C_2)$. However, that
person believes that the fair price for $C_1 \cup C_2$ is $p_3$, which
differs from $p_1 + p_2$. Say $p_3 < p_1 + p_2$ and let the
difference be $d = (p_1 + p_2) - p_3$. A gambler offers this
person the price $p_3 + \frac{d}{4}$ for $C_1 \cup C_2$. The person takes the
offer because it is better than $p_3$. The gambler sells $C_1$ at a
discount price of $p_1 - \frac{d}{4}$ and sells $C_2$ at a discount price of
$p_2 - \frac{d}{4}$ to the person. To a rational person holding the
prices $p_1$, $p_2$, and $p_3$, all three of these deals seem
very satisfactory. At this point, the person has received
$p_3 + \frac{d}{4}$ and paid $p_1 + p_2 - \frac{d}{2}$. Thus before any bets are
paid off, the person has
$$p_3 + \frac{d}{4} - \left(p_1 + p_2 - \frac{d}{2}\right) = p_3 - p_1 - p_2 + \frac{3d}{4} = -\frac{d}{4}.$$
That is, the person is down $\frac{d}{4}$ before any bets are settled.
We now show that no matter what event happens, the
person will pay and receive the same amount in settling the
bets:
Suppose $C_1$ happens: the gambler holds the bet on $C_1 \cup C_2$ and the
person holds the bet on $C_1$, so they exchange \$1 payments and the person is
still down $\frac{d}{4}$. The same thing occurs if $C_2$ happens.
Suppose neither $C_1$ nor $C_2$ happens: then the gambler
and the person both receive zero, and the person is still
down $\frac{d}{4}$.
$C_1$ and $C_2$ cannot occur together since they are
mutually exclusive.
Thus, we see that it is bad for the person to assign
$$p_3 = P(C_1 \cup C_2) < p_1 + p_2 = P(C_1) + P(C_2),$$
because the gambler can put the person in a position to lose
$(p_1 + p_2 - p_3)/4$ no matter what happens. This is
sometimes referred to as a Dutch book.
The argument when $p_3 > p_1 + p_2$ is similar and
can also lead to a Dutch book. Thus $p_3$ must equal $p_1 + p_2$
to avoid a Dutch book; that is,
$P(C_1 \cup C_2) = P(C_1) + P(C_2)$.
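The bookkeeping in the proof can be checked numerically. The following is a minimal sketch for the case $p_3 < p_1 + p_2$ only; the function name `dutch_book_outcomes` and the example prices are assumptions chosen for illustration.

```python
def dutch_book_outcomes(p1, p2, p3):
    """Net position of the person in each possible outcome, for p3 < p1 + p2."""
    d = (p1 + p2) - p3
    if d <= 0:
        raise ValueError("this sketch covers only the case p3 < p1 + p2")

    # Cash flows before any bet is settled.
    received = p3 + d / 4                 # gambler buys the bet on (C1 or C2)
    paid = (p1 - d / 4) + (p2 - d / 4)    # person buys the bets on C1 and on C2
    upfront = received - paid             # works out to -d/4

    outcomes = {}
    cases = (("C1 occurs", 1, 0), ("C2 occurs", 0, 1), ("neither occurs", 0, 0))
    for label, c1, c2 in cases:
        union = 1 if (c1 or c2) else 0
        # Person collects $1 on each winning bet held and pays $1 if the union occurs.
        settlement = c1 + c2 - union
        outcomes[label] = upfront + settlement
    return outcomes

# Hypothetical incoherent prices: p1 = 0.3, p2 = 0.4, p3 = 0.5, so d = 0.2.
for outcome, net in dutch_book_outcomes(0.3, 0.4, 0.5).items():
    print(outcome, round(net, 6))
```

In every outcome the settlement term is zero, so the person's net position stays at the up-front loss of $d/4$, as the proof claims.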
Note that $h(\theta \mid x) \propto f(x \mid \theta)\,\pi(\theta)$ as $\theta$ varies, so that the
posterior distribution is proportional to the likelihood times
the prior.
Based on the posterior distribution, we can get a point
estimate, an interval estimate and carry out hypothesis tests
as we shall discuss below.
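The proportionality of the posterior to likelihood times prior can be sketched numerically on a grid. Everything specific here is an assumption for illustration: the helper name `grid_posterior`, the coin data (7 heads in 10 tosses), and the flat prior.

```python
def grid_posterior(likelihood, prior, grid):
    """Normalize likelihood(theta) * prior(theta) over an evenly spaced grid."""
    unnormalized = [likelihood(t) * prior(t) for t in grid]
    step = grid[1] - grid[0]
    total = sum(unnormalized) * step  # Riemann-sum approximation of the integral
    return [u / total for u in unnormalized]

# Hypothetical data: 7 heads in 10 tosses, with a flat prior on theta in [0, 1].
grid = [i / 1000 for i in range(1001)]
step = grid[1] - grid[0]
post = grid_posterior(lambda t: t**7 * (1 - t)**3, lambda t: 1.0, grid)

# A point estimate from the posterior: the posterior mean.
post_mean = sum(t * h for t, h in zip(grid, post)) * step
print(round(post_mean, 3))
```

For this particular example the exact posterior is a Beta(8, 4) distribution, so the grid posterior mean should be close to 8/12.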
Bayesian inference for the normal distribution
Suppose that we observe a single observation $x$ from a
normal distribution with unknown mean $\theta$ and known
variance $\sigma^2$. Suppose that our prior distribution for $\theta$
is $N(\mu_0, \sigma_0^2)$.
The posterior distribution of $\theta$ is
$$h(\theta \mid x) = \frac{f(x \mid \theta)\,\pi(\theta)}{\int f(x \mid \theta)\,\pi(\theta)\,d\theta} \propto f(x \mid \theta)\,\pi(\theta).$$
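Now
$$\begin{aligned}
f(x \mid \theta)\,\pi(\theta)
&= \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{1}{2\sigma^2}(x - \theta)^2\right] \cdot \frac{1}{\sqrt{2\pi}\,\sigma_0} \exp\left[-\frac{1}{2\sigma_0^2}(\theta - \mu_0)^2\right] \\
&= \frac{1}{2\pi\sigma\sigma_0} \exp\left[-\frac{1}{2\sigma^2}(x - \theta)^2 - \frac{1}{2\sigma_0^2}(\theta - \mu_0)^2\right] \\
&= \frac{1}{2\pi\sigma\sigma_0} \exp\left[-\frac{1}{2}\left\{\theta^2\left(\frac{1}{\sigma^2} + \frac{1}{\sigma_0^2}\right) - 2\theta\left(\frac{x}{\sigma^2} + \frac{\mu_0}{\sigma_0^2}\right) + \left(\frac{x^2}{\sigma^2} + \frac{\mu_0^2}{\sigma_0^2}\right)\right\}\right].
\end{aligned}$$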
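Let $a$, $b$, and $c$ be the coefficients in the quadratic
polynomial in $\theta$ inside the exponent, $a\theta^2 - 2b\theta + c$, so that
$$a = \frac{1}{\sigma^2} + \frac{1}{\sigma_0^2}, \qquad b = \frac{x}{\sigma^2} + \frac{\mu_0}{\sigma_0^2}, \qquad c = \frac{x^2}{\sigma^2} + \frac{\mu_0^2}{\sigma_0^2}.$$
Completing the square gives
$$f(x \mid \theta)\,\pi(\theta) = \frac{1}{2\pi\sigma\sigma_0} \exp\left[-\frac{a}{2}\left(\theta - \frac{b}{a}\right)^2\right] \exp\left[-\frac{1}{2}\left(c - \frac{b^2}{a}\right)\right].$$
The second term does not depend on $\theta$, so the posterior
distribution of $\theta$ is normal with mean
$$\mu_1 = \frac{b}{a} = \frac{\mu_0/\sigma_0^2 + x/\sigma^2}{1/\sigma_0^2 + 1/\sigma^2}$$
and variance
$$\sigma_1^2 = \frac{1}{a} = \frac{1}{1/\sigma_0^2 + 1/\sigma^2}.$$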
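The posterior mean and variance for a single normal observation with a normal prior can be computed directly from the precisions. This is a minimal sketch; the function name `normal_posterior`, its argument names, and the example numbers are assumptions for illustration.

```python
def normal_posterior(x, sigma2, mu0, sigma0_2):
    """Posterior mean and variance of theta for one observation x ~ N(theta, sigma2)
    with prior theta ~ N(mu0, sigma0_2)."""
    prior_precision = 1.0 / sigma0_2
    data_precision = 1.0 / sigma2
    post_var = 1.0 / (prior_precision + data_precision)
    # Precision-weighted combination of the prior mean and the observation.
    post_mean = post_var * (prior_precision * mu0 + data_precision * x)
    return post_mean, post_var

# Hypothetical numbers: x = 2, sigma^2 = 1, prior N(0, 1).
print(normal_posterior(x=2.0, sigma2=1.0, mu0=0.0, sigma0_2=1.0))  # (1.0, 0.5)
```

With equal prior and data variances the posterior mean falls halfway between the prior mean and the observation, and the posterior variance is half of either variance.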
Comments about role of prior in the posterior distribution:
The posterior mean is a weighted average of the prior mean
and the data, with weights proportional to the respective
precisions of the prior and the data, where the precision is
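equal to 1/variance. If we assume that the experiment (the
observation of $X$) is much more informative than the prior
distribution in the sense that $\sigma^2 \ll \sigma_0^2$, then the posterior
variance satisfies $\sigma_1^2 \approx \sigma^2$ and the posterior mean satisfies
$\mu_1 \approx x$.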
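Thus, the posterior distribution of $\theta$ is approximately
$N(x, \sigma^2)$: when $\sigma^2 \ll \sigma_0^2$, the exact posterior mean
$$\frac{\mu_0/\sigma_0^2 + x/\sigma^2}{1/\sigma_0^2 + 1/\sigma^2}$$
is close to $x$.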
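The weighted-average reading of the posterior mean can be made concrete by computing the weight placed on the prior mean as the prior variance grows. This is a sketch under assumed values; `prior_weight` and the numbers below are hypothetical, not from the notes.

```python
def prior_weight(sigma2, sigma0_2):
    """Weight on the prior mean in the posterior mean: prior precision / total precision."""
    prior_precision = 1.0 / sigma0_2
    data_precision = 1.0 / sigma2
    return prior_precision / (prior_precision + data_precision)

x, mu0, sigma2 = 3.0, 0.0, 1.0  # hypothetical observation, prior mean, data variance
for sigma0_2 in (1.0, 10.0, 100.0, 10_000.0):
    w = prior_weight(sigma2, sigma0_2)
    post_mean = w * mu0 + (1.0 - w) * x           # weighted average of prior mean and data
    post_var = 1.0 / (1.0 / sigma0_2 + 1.0 / sigma2)
    print(f"sigma0^2={sigma0_2:>8}: weight on prior={w:.4f}, "
          f"mean={post_mean:.4f}, var={post_var:.4f}")
```

As the prior variance increases relative to the data variance, the weight on the prior mean shrinks toward zero, and the posterior mean and variance approach the observation and the data variance.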