
Chapter 3
No Data Decision Problems
Decision problems are called statistical when there are
data, or observations on the state of nature, that hopefully
contain information which can be used to make a better
decision.
It is useful to consider problems of making decisions in
the absence of data, not only because these problems are
simpler, but also because one approach to handling problems
involving data is to convert them to no-data problems.

3.1 Introduction
The ingredients of a no-data decision problem are the
triple (Θ, A, L) where
Θ : the set of states of nature;
A : the set of all available actions;
L : a real-valued function defined on Θ × A, in
which L(θ, a) represents the loss incurred
when one takes action a and the state of nature
is θ.
Θ will be referred to as the state space; A as the action
space; and L as the loss function.

Whenever we are given a decision problem with a finite
action space and a state space consisting of two
elements, we can plot the loss point of each action in the
plane.
Example 3.1.1
Consider a decision problem with the following loss
table:

L(θ, a)    a1   a2   a3
θ1          1    4    3
θ2          3    1    5

Example 3.1.2
Consider a decision problem with the following loss
table:

L(θ, a)    a1   a2   a3
θ1          5    3    5
θ2          0    3    4

Example 3.1.3
Consider a decision problem with the following loss
table:

L(θ, a)    a1   a2   a3
θ1          5    3    4
θ2          0    2    4

3.2 Regret
If one knew the state of nature, one would immediately
know what action to take, namely the action for which the
loss is a minimum. But if one takes an action which does
not produce this minimum, one would regret not having
chosen the action that produces the minimum.

The amount of loss one could have saved by knowing the
state of nature is called the regret. It is defined for each
state θ and action ai as follows:

$L_r(\theta, a_i) = L(\theta, a_i) - \min_{a \in A} L(\theta, a)$
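As a quick illustration, here is a minimal Python sketch (the table layout and names are my own, not the text's) of turning a loss table into a regret table by subtracting each row's minimum:

```python
# Sketch: build a regret table from a loss table
# (rows = states of nature, columns = actions).

def regret_table(loss):
    """L_r(theta, a_i) = L(theta, a_i) - min_a L(theta, a), row by row."""
    return [[x - min(row) for x in row] for row in loss]

# Loss table of Example 3.1.1: rows theta_1, theta_2; columns a_1, a_2, a_3.
loss = [[1, 4, 3],
        [3, 1, 5]]
print(regret_table(loss))  # [[0, 3, 2], [2, 0, 4]] -- the table of Example 3.2.1
```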

Example 3.2.1 ( Example 3.1.1 continued )

Consider a decision problem with the following loss
table:

L(θ, a)    a1   a2   a3
θ1          1    4    3
θ2          3    1    5

Regret table:

Lr(θ, a)   a1   a2   a3
θ1          0    3    2
θ2          2    0    4

Example 3.2.2 ( Example 3.1.2 continued )

Consider a decision problem with the following loss
table:

L(θ, a)    a1   a2   a3
θ1          5    3    5
θ2          0    3    4

Regret table:

Lr(θ, a)   a1   a2   a3
θ1          2    0    2
θ2          0    3    4

Example 3.2.3 ( Example 3.1.3 continued )

Consider a decision problem with the following loss
table:

L(θ, a)    a1   a2   a3
θ1          5    3    4
θ2          0    2    4

Regret table:

Lr(θ, a)   a1   a2   a3
θ1          2    0    1
θ2          0    2    4

In order to keep the course at a moderate level, most of
the problems taken up in this chapter will be those in
which both Θ and A are finite.

3.3 Mixed Actions

Most people at some point have made a decision by tossing
a coin. Introducing an extraneous random device turns
out to be useful for the purpose of discussing the general
theory of making decisions, and actually provides decision
rules that under some criteria are better than those that
use only pure actions.
Using a random device to select an action from the set of
all possible actions is called a mixed action.

Mixed Action
A mixed action for a problem with action space
A = { a1, …, an } is a probability vector
$\tilde p = (p_1, \ldots, p_n)$, $0 \le p_i \le 1$, $p_1 + \cdots + p_n = 1$.

It will be useful to denote a mixed action as

$\tilde p = \begin{pmatrix} a_1 & \cdots & a_n \\ p_1 & \cdots & p_n \end{pmatrix} = (p_1, \ldots, p_n)$

To carry out a mixed action, one conducts a random
experiment with sample space Ω = {ω1, …, ωn} having
the probability structure

$P(\{\omega_i\}) = p_i$, $i = 1, \ldots, n$.

One then performs the experiment, and if the outcome is
ωi, takes action ai.

We shall denote the mixed action simply by the
probability vector $\tilde p = (p_1, \ldots, p_n)$, with the understanding
that the action space consists of n actions.
The original actions a1, …, an are called pure actions. A
pure action can be regarded as a degenerate randomized
action, in the sense that

$a_1 = \begin{pmatrix} a_1 & a_2 & \cdots & a_n \\ 1 & 0 & \cdots & 0 \end{pmatrix} = (1, 0, 0, \ldots, 0)$

We denote the set of all mixed actions by A*. Note that
A can be embedded in A* and be considered as a subset
of A*.

Example 3.3.1
Suppose that the action space of the decision problem
consists of only two actions, say A = { a1, a2 }. The mixed
action

$\tilde p = (p, 1 - p)$, $0 \le p \le 1$,

can be carried out by tossing a coin with probability p of
heads. If heads is observed, then action a1 is taken;
otherwise one takes action a2.
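A minimal sketch of carrying out such a mixed action with a random device (Python's standard library; the function name is illustrative):

```python
import random

# Sketch: carry out the mixed action p~ = (p, 1 - p) of Example 3.3.1
# by "tossing a biased coin" with probability p of the first action.

def carry_out(actions, probs):
    """Select one pure action at random according to the probability vector."""
    return random.choices(actions, weights=probs, k=1)[0]

print(carry_out(["a1", "a2"], [0.3, 0.7]))  # a1 with probability 0.3, else a2
```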

In a decision problem with a given loss function, the use
of a mixed action makes the loss a random variable.

Loss of Mixed Action

The loss of the mixed action

$\tilde p = \begin{pmatrix} a_1 & \cdots & a_n \\ p_1 & \cdots & p_n \end{pmatrix} = (p_1, \ldots, p_n)$

in a decision problem with loss function L(θ, a)
is defined to be the expected loss

$L(\theta, \tilde p) = \sum_{i=1}^{n} p_i L(\theta, a_i)$,  θ ∈ Θ.

In a decision problem with m states and n actions, there are
m losses L(θ, p̃) corresponding to each mixed action.
That is,

$L(\theta_1, \tilde p) = L(\theta_1, a_1) p_1 + L(\theta_1, a_2) p_2 + \cdots + L(\theta_1, a_n) p_n$
$L(\theta_2, \tilde p) = L(\theta_2, a_1) p_1 + L(\theta_2, a_2) p_2 + \cdots + L(\theta_2, a_n) p_n$
  ⋮
$L(\theta_m, \tilde p) = L(\theta_m, a_1) p_1 + L(\theta_m, a_2) p_2 + \cdots + L(\theta_m, a_n) p_n$

These relations can be written in vector form as

$\begin{pmatrix} L(\theta_1, \tilde p) \\ L(\theta_2, \tilde p) \\ \vdots \\ L(\theta_m, \tilde p) \end{pmatrix} = p_1 \begin{pmatrix} L(\theta_1, a_1) \\ L(\theta_2, a_1) \\ \vdots \\ L(\theta_m, a_1) \end{pmatrix} + p_2 \begin{pmatrix} L(\theta_1, a_2) \\ L(\theta_2, a_2) \\ \vdots \\ L(\theta_m, a_2) \end{pmatrix} + \cdots + p_n \begin{pmatrix} L(\theta_1, a_n) \\ L(\theta_2, a_n) \\ \vdots \\ L(\theta_m, a_n) \end{pmatrix}$

which suggests the interpretation of the vector of losses
$(L(\theta_1, \tilde p), \ldots, L(\theta_m, \tilde p))$ as a convex combination of the
loss points $(L(\theta_1, a_i), \ldots, L(\theta_m, a_i))$, $i = 1, 2, \ldots, n$.
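In code, this convex combination is just a matrix–vector product. A minimal numpy sketch (layout and names are my own):

```python
import numpy as np

# Sketch: loss vector of a mixed action as a convex combination of the
# columns of the loss matrix (rows = states, columns = pure actions).

def mixed_loss(loss_matrix, p):
    """Return (L(theta_1, p~), ..., L(theta_m, p~)) = loss_matrix @ p."""
    return np.asarray(loss_matrix) @ np.asarray(p)

loss = np.array([[1, 4, 3],
                 [3, 1, 5]])                 # Example 3.1.1
print(mixed_loss(loss, [0.5, 0.25, 0.25]))   # L(theta_1, p~) = 2.25, L(theta_2, p~) = 3.0
```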

Example 3.3.2 ( Example 3.1.1 continued )

Consider a decision problem with the following loss
table:

L(θ, a)    a1   a2   a3
θ1          1    4    3
θ2          3    1    5

Suppose p̃ = (p1, p2, p3) is a mixed action. Then the
loss function of p̃ is given by

$\begin{pmatrix} L(\theta_1, \tilde p) \\ L(\theta_2, \tilde p) \end{pmatrix} = \begin{pmatrix} p_1 + 4 p_2 + 3 p_3 \\ 3 p_1 + p_2 + 5 p_3 \end{pmatrix} = p_1 \begin{pmatrix} 1 \\ 3 \end{pmatrix} + p_2 \begin{pmatrix} 4 \\ 1 \end{pmatrix} + p_3 \begin{pmatrix} 3 \\ 5 \end{pmatrix}$

The loss points of all mixed actions fill up the interior
(and the boundary) of the triangle with the pure loss
points as its vertices.

Example 3.3.3
Consider a decision problem with the following loss
table:

L(θ, a)    a1   a2   a3
θ1          5    3    5
θ2          0    3    4

The (expected) loss function of a mixed action
p̃ = (p1, p2, p3) is

$\begin{pmatrix} L(\theta_1, \tilde p) \\ L(\theta_2, \tilde p) \end{pmatrix} = p_1 \begin{pmatrix} 5 \\ 0 \end{pmatrix} + p_2 \begin{pmatrix} 3 \\ 3 \end{pmatrix} + p_3 \begin{pmatrix} 5 \\ 4 \end{pmatrix}$

Example 3.3.4
Consider a decision problem with the following loss
matrix:

L(θ, a)    a1   a2
θ1          0    1
θ2          6    5

The loss points of the two pure actions are the end points
of the line segment joining (0, 6) and (1, 5).

The loss point of a mixed action is a point lying on the
line segment joining the loss points of the pure actions.

Example 3.3.5
Consider the decision problem with the following loss
table:

L(θ, a)    a1   a2   a3   a4   a5
θ1          2    4    3    5    3
θ2          3    0    3    2    5

The loss points of the pure actions are the vertices of
a polygon; the loss point of one of the pure actions
(namely a3) happens to fall inside the polygon.
The set of all the loss points of mixed actions fills
up the convex set generated by the five loss points.

Convex Set
A set of points is said to be convex if the
line segment joining each pair of its points is
contained entirely in the set. The convex
hull of a set A is the smallest convex set
containing A.

The set of all loss points of the mixed actions is the
convex hull of the loss points of the pure actions.
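A minimal sketch of computing this hull numerically (assuming SciPy is available; the points are those of Example 3.3.5):

```python
import numpy as np
from scipy.spatial import ConvexHull  # assumes SciPy is installed

# Sketch: the loss set of all mixed actions is the convex hull of the
# pure loss points (L(theta_1, a), L(theta_2, a)) of Example 3.3.5.
points = np.array([[2, 3], [4, 0], [3, 3], [5, 2], [3, 5]])
hull = ConvexHull(points)
print(points[hull.vertices])  # polygon vertices; a3 = (3, 3) is interior
```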

3.4 Minimax Principle

The fundamental difficulty of decision problems has
already emerged, namely, the fact that actions are not
usually comparable in a natural way. In other words,
there is no linear order defined for actions. An action
with the smallest loss under one state of nature might not be
best under another state of nature.
The minimax principle places a value on each action
according to the worst that can happen with that action;
one takes an action for which the maximum loss is a
minimum.

Minimax Action
An action a′ ∈ A is said to be a pure minimax
action if
$\max_{\theta \in \Theta} L(\theta, a') = \min_{a \in A} \max_{\theta \in \Theta} L(\theta, a)$
A mixed action p̃* is said to be a
minimax mixed action if
$\max_{\theta \in \Theta} L(\theta, \tilde p^*) = \min_{\tilde p \in A^*} \max_{\theta \in \Theta} L(\theta, \tilde p)$

Example 3.4.1
Consider a decision problem with the following loss
table:

L(θ, a)        a1   a2   a3
θ1              4    5    2
θ2              4    0    5
max L(θ, a)     4    5    5

Action a1 is the minimax pure action.
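A minimal sketch of this computation (column maxima, then the smallest one; layout and names are my own):

```python
# Sketch: find the pure minimax action by minimizing the column maxima
# of the loss table (rows = states, columns = actions).

def minimax_pure(loss):
    worst = [max(col) for col in zip(*loss)]   # worst-case loss of each action
    return min(range(len(worst)), key=worst.__getitem__)

loss = [[4, 5, 2],
        [4, 0, 5]]                             # Example 3.4.1
print(minimax_pure(loss))  # 0, i.e. a1, whose worst-case loss is 4
```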

Graphical analysis of the process of determining the
minimax action:

Move the wedge whose vertex is on the 45° line, and
whose sides are parallel to the coordinate axes, up to
the set of loss points of the actions.
The first loss point of a pure action encountered gives
the pure minimax action.

To determine the minimax action among the set of all
mixed actions is generally more complicated. However,
there are two cases that can be handled at this point:

(a) The state space consists of only two elements.
(b) The action space consists of only two elements.

Example 3.4.2 ( Example 3.4.1 continued )

Consider a decision problem with the following loss
table:

L(θ, a)    a1   a2   a3
θ1          4    5    2
θ2          4    0    5

It follows immediately from the above figure that the
minimax mixed action is a mixture of actions a2 and
a3. Thus the minimax mixed action is of the form

$\tilde p = (0, p, 1 - p)$,  $0 < p < 1$.

The loss point of the minimax mixed action lies on the
line segment joining the loss points of a2 and a3.
The loss point of the minimax mixed action lies on the
bisector (the 45° line), and hence

$L(\theta_1, \tilde p) = L(\theta_2, \tilde p)$.   (*)

Condition (*) implies that

$5p + 2(1 - p) = 0 \cdot p + 5(1 - p)$,

or p = 3/8.

The figure shows that the minimax mixed action
is a mixture of actions a2 and a3 with more weight put
on action a3.
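A minimal sketch of solving this equalizer condition exactly (the helper and layout are my own; loss points taken from Example 3.4.1):

```python
from fractions import Fraction

# Sketch: for two states, the minimax mixture p*a2 + (1-p)*a3 equalizes
# the two coordinates of the loss point.
a2 = (5, 0)   # (L(theta_1, a2), L(theta_2, a2))
a3 = (2, 5)
p = Fraction(a3[1] - a3[0], (a2[0] - a3[0]) - (a2[1] - a3[1]))
print(p)  # 3/8, so p~ = (0, 3/8, 5/8) puts more weight on a3
```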

Example 3.4.3
Consider a decision problem with loss table given by

L(θ, a)        a1   a2   a3   a4   a5
θ1              2    4    3    5    3
θ2              3    0    3    2    5
max L(θ, a)     3    4    3    5    5

There are two minimax pure actions, namely actions a1
and a3.

Question

Between actions a1 and a3, which action do you prefer?
Why?

It should be pointed out that the application of the
minimax principle to regrets in the above problem
produces a different solution.
Note that the regret table of the above decision problem is

Lr(θ, a)        a1   a2   a3   a4   a5
θ1               0    2    1    3    1
θ2               3    0    3    2    5
max Lr(θ, a)     3    2    3    3    5

so the minimax regret pure action is a2.

When there are two states of nature, a graphical solution
to the problem of determining a minimax mixed action
can be carried out in precisely the same manner as for the
pure actions.

Historical Note
The minimax regret criterion was developed by the
statistician L. J. Savage (1917–1971).
Example 3.4.4
Consider a decision problem with loss table given by

L(θ, a)    a1   a2   a3   a4   a5
θ1          2    4    3    5    3
θ2          3    0    3    2    5

The loss points of all the mixed actions form a polygon
with the loss points of the pure actions as its vertices.

By moving the wedge with vertex on the 45° line up to
the loss set, it is clear that the loss point of the minimax
mixed action lies on the segment joining the loss points
of pure actions a1 and a2. This implies that the
minimax mixed action is of the form

$\tilde p = (p, 1 - p, 0, 0, 0)$

Moreover,

$L(\theta_1, \tilde p) = L(\theta_2, \tilde p)$

That is, $2p + 4(1 - p) = 3p + 0 \cdot (1 - p)$,
or
p = 4/5.
In the minimax mixed action, more weight has been
allocated to action a1.

Now we look for the minimax mixed regret action.

Lr(θ, a)   a1   a2   a3   a4   a5
θ1          0    2    1    3    1
θ2          3    0    3    2    5

The regret point of the minimax mixed regret action is at
point B in the figure. Intuitively, the minimax regret action
we are looking for puts more weight on action a2 instead of
action a1. In fact, let $\tilde q = (q, 1 - q, 0, 0, 0)$ be the minimax
mixed regret action. Then

$L_r(\theta_1, \tilde q) = 2(1 - q) = 3q = L_r(\theta_2, \tilde q)$

gives q = 2/5. So the minimax mixed regret action is

$\tilde q = (2/5, 3/5, 0, 0, 0)$

Note that

$\min_{a^* \in A^*} \max_{\theta \in \Theta} L(\theta, a^*) \le \min_{a \in A} \max_{\theta \in \Theta} L(\theta, a)$.

Example 3.4.5 ( Example 3.1.3 revisited )

Consider a decision problem with the following loss
table:

L(θ, a)    a1   a2   a3
θ1          5    3    4
θ2          0    2    4

The corresponding regret table is given as follows:

Lr(θ, a)   a1   a2   a3
θ1          2    0    1
θ2          0    2    4

Note that

The minimax mixed loss action is the pure action
a2 (its loss point (3, 2) is the first point of the loss
set struck by the wedge).
The minimax mixed regret action p̃ is a mixture of
actions a1 and a2. Clearly p̃ = (1/2, 1/2, 0).

Because the loss points of all possible mixed
actions, when there are only two states of nature,
form a convex set, the minimax procedure of moving
a wedge with its vertex on the 45° line up until it
first strikes the loss set yields a minimax action.

The graphical method for finding a minimax mixed action
can also be applied when the action space consists of
only two actions.

Example 3.4.6
Consider a decision problem with the following loss
table:

        θ1   θ2   θ3
a1       0    3    5
a2       5    3    0

Let $\tilde p = (p, 1 - p)$ be a mixed action. Its loss function is

$L(\theta_1, \tilde p) = 5(1 - p)$
$L(\theta_2, \tilde p) = 3$
$L(\theta_3, \tilde p) = 5p$

These loss functions are linear functions of p and are
plotted in the plane as follows:

Every mixed action $\tilde p = (p, 1 - p)$ with 2/5 ≤ p ≤ 3/5 is
minimax; in particular, both

$\tilde p_1 = (2/5, 3/5)$ and $\tilde p_2 = (3/5, 2/5)$

are minimax mixed actions.
This shows that even the minimax mixed action need not be
unique.
Example 3.4.7
Consider a decision problem in which the loss table is
given by

        θ1   θ2   θ3   θ4
a1       4    2    1   −1
a2       0   −1    5    2

In general, a mixed action is of the form

$\tilde p = (p, 1 - p)$,  $0 \le p \le 1$.

The expected loss of p̃ under the various states of nature is a
linear function of p, namely,

θ1 : $4p + 0(1 - p) = 4p$
θ2 : $2p - 1(1 - p) = 3p - 1$
θ3 : $1p + 5(1 - p) = 5 - 4p$
θ4 : $-p + 2(1 - p) = 2 - 3p$

These functions of p are shown as follows:

The figure shows that the minimax mixed action is

$\tilde p = (5/8, 3/8)$

and

$\min_{\tilde p \in A^*} \max_{\theta \in \Theta} L(\theta, \tilde p) = 4 \cdot \frac{5}{8} = \frac{5}{2} = 5 - 4 \cdot \frac{5}{8}$.
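A minimal numeric sketch of this envelope minimization (grid search over p; the table is my reconstruction above, and the names are illustrative):

```python
import numpy as np

# Sketch: with two actions, scan p in [0, 1] and minimize the upper envelope
# max_theta L(theta, p~) for the loss table of Example 3.4.7.
loss = np.array([[4, 2, 1, -1],    # L(theta_j, a1)
                 [0, -1, 5, 2]])   # L(theta_j, a2)
ps = np.linspace(0, 1, 100001)
envelope = np.max(np.outer(ps, loss[0]) + np.outer(1 - ps, loss[1]), axis=1)
i = envelope.argmin()
print(ps[i], envelope[i])  # p = 0.625 = 5/8, minimax value 2.5 = 5/2
```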

Example 3.4.8
Consider a decision problem with the following loss
table:

        θ1   θ2   θ3
a1       0    3    4
a2       4    3    0

Let $\tilde p = (p, 1 - p)$ be a mixed action. Then

θ1 : $4(1 - p)$
θ2 : $3$
θ3 : $4p$

What conclusion can you draw from the above figure?

3.5 Bayes Principle

Using the minimax principle to determine what action to
take protects against the worst that can possibly happen
for each action, even though the state of nature that
produces the worst consequence may in some sense have
only a remote chance of being the actual state.

Example 3.5.1
Consider the following decision problem with loss table
given by

L(θ, a)    a1    a2
θ1         100   101
θ2          90     0

By the minimax principle, action a1 is minimax. If the true
state of nature is θ1, taking action a2 incurs only 1% more
loss than action a1. However, if θ2 is the true state of
nature, taking action a2 is much better than using
action a1.

Some statisticians believe that it is possible and useful to
treat the state of nature as a random variable in every
decision problem. They believe that the distribution of
the state of nature is a subjective probability distribution,
in the sense that it represents an individual
experimenter's information and subjective beliefs about
the true state of nature.
Suppose that in a decision problem the state of nature is
random, represented by a random variable θ̃ taking values
in Θ. In this case, for any action a taken, the loss
L(θ̃, a) is a random variable. In a decision problem, a
probability function π(θ) assigned to each state of nature
is called a prior distribution.

Bayes Loss and Bayes Action

The Bayes loss of action a with respect to the
prior distribution π of θ̃ is defined to be

$L(\pi, a) = \sum_{\theta \in \Theta} L(\theta, a) \, P(\tilde\theta = \theta)$.

Action a′ is said to be a Bayes action with
respect to π if

$L(\pi, a') = \min_{a \in A} L(\pi, a)$

This minimum value is called the Bayes loss of
the decision problem.
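A minimal sketch of this computation (rows = states; the helper name is my own):

```python
# Sketch: Bayes loss of each action under a prior, and the Bayes action.

def bayes_action(loss, prior):
    """Return (index of a Bayes action, its Bayes loss) under the prior."""
    scores = [sum(pi * row[j] for pi, row in zip(prior, loss))
              for j in range(len(loss[0]))]
    j = min(range(len(scores)), key=scores.__getitem__)
    return j, scores[j]

loss = [[0, 1],
        [6, 5]]                         # Example 3.5.2 below
print(bayes_action(loss, [0.7, 0.3]))   # (0, 1.8): a1 is Bayes when w = 0.7
```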

Example 3.5.2
Consider a decision problem with loss table given by

L(θ, a)    a1   a2
θ1          0    1
θ2          6    5

Suppose the prior distribution of the state of nature is
given by

π : $P(\tilde\theta = \theta_1) = w = 1 - P(\tilde\theta = \theta_2)$,  $0 \le w \le 1$.

The Bayes losses of the actions are

$L(\pi, a_1) = 0 \cdot w + 6(1 - w)$
$L(\pi, a_2) = 1 \cdot w + 5(1 - w)$
Note that

$L(\pi, a_1) \le L(\pi, a_2) \iff 6(1 - w) \le w + 5(1 - w) \iff w \ge 0.5$.

Example 3.5.3
Consider a decision problem with loss table given by

L(θ, a)    a1   a2   a3   a4
θ1          6    4    2    3
θ2          1    2    5    4

Suppose the prior distribution of the state of nature is
given by

π : $P(\tilde\theta = \theta_1) = w = 1 - P(\tilde\theta = \theta_2)$,  $0 \le w \le 1$.

The Bayes losses of the actions are

a1 : $6w + 1(1 - w) = 1 + 5w$
a2 : $4w + 2(1 - w) = 2 + 2w$
a3 : $2w + 5(1 - w) = 5 - 3w$
a4 : $3w + 4(1 - w) = 4 - w$

Thus the Bayes action is a1 for 0 ≤ w ≤ 1/3, a2 for
1/3 ≤ w ≤ 3/5, and a3 for 3/5 ≤ w ≤ 1.

When the state space consists of only two elements, one
can use a graphical method to determine the Bayes action.
Question: Does a Bayes mixed action reduce the
minimum Bayes loss?

Suppose that the loss points of the pure actions are
displayed as in the figure, and let the prior probabilities of
θ̃ be π = ⟨w, 1 − w⟩. Suppose the vector joining the loss
points of a3 and a6 is perpendicular to the vector π. Then
the dot product

$\pi \cdot \overrightarrow{a_3 a_6} = 0$,

or

$\langle w, 1 - w \rangle \cdot \langle L(\theta_1, a_6) - L(\theta_1, a_3), \, L(\theta_2, a_6) - L(\theta_2, a_3) \rangle = 0$,

or $L(\pi, a_6) = L(\pi, a_3)$.

Next we consider the dot product of the vector π with the
vector $\overrightarrow{a_6 a_4}$. In the figure these two vectors are parallel
and point in the same direction, so $\pi \cdot \overrightarrow{a_6 a_4} > 0$. This
implies that

$L(\pi, a_6) < L(\pi, a_4)$.

This gives us an algorithm for finding a Bayes pure
action:

Move a line perpendicular to the vector π
until it touches a loss point of a pure
action. The action whose loss point is first touched
by this line is a Bayes action.

Since the set of all loss points of mixed actions forms a
convex polygon whose vertices are loss points of the
pure actions, there always exists a pure action
which is Bayes against a given prior probability
distribution.

Question: Is the Bayes action using regrets different
from the one obtained when losses are used?

Recall that

$L_r(\theta, a) = L(\theta, a) - \min_{a' \in A} L(\theta, a') = L(\theta, a) - k(\theta)$, say.

Therefore,

$L_r(\pi, a) = L(\pi, a) - \sum_{\theta \in \Theta} k(\theta) \, P(\tilde\theta = \theta)$.

Since $L_r(\pi, a)$ differs from $L(\pi, a)$ by a term that does
not involve the action a, there is no difference between
using loss or regret under the Bayes principle.
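A minimal sketch verifying this on a concrete table (that of Example 3.5.4 below; layout and names are my own):

```python
# Sketch: the Bayes action is unchanged when losses are replaced by regrets,
# since the two Bayes losses differ by the constant sum_theta k(theta) pi(theta).
loss = [[2, 5, 3],
        [3, 1, 5]]                              # Example 3.5.4 below
regret = [[x - min(row) for x in row] for row in loss]
prior = [0.3, 0.7]
for table in (loss, regret):
    scores = [sum(p * row[j] for p, row in zip(prior, table)) for j in range(3)]
    print(min(range(3), key=scores.__getitem__))  # prints index 1 (a2) both times
```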

Example 3.5.4
Consider the decision problem with the following loss
table:

L(θ, a)    a1   a2   a3
θ1          2    5    3
θ2          3    1    5

Let the prior probabilities of θ̃ be

π : $P(\tilde\theta = \theta_1) = w$, $P(\tilde\theta = \theta_2) = 1 - w$.

The Bayes losses of the (pure) actions are

$L(\pi, a_1) = 2w + 3(1 - w) = 3 - w$
$L(\pi, a_2) = 5w + 1(1 - w) = 1 + 4w$
$L(\pi, a_3) = 3w + 5(1 - w) = 5 - 2w$

The graphs of these lines are shown below:

It follows that the Bayes action is given by

a′ = a2 if w ≤ 2/5, and a′ = a1 if w ≥ 2/5.

Notice that for any prior probability $P(\tilde\theta = \theta_1) = w$,
$0 < w < 1$, action a3 is never Bayes against the prior
distribution of θ̃. ( Why? )
Another way to look at this problem is as follows:

Example 3.5.5
Consider again the decision problem stated in Example
3.5.4 with loss table

L(θ, a)    a1   a2   a3
θ1          2    5    3
θ2          3    1    5

The convex hull of the three loss points is a triangular
region with the loss points as its vertices. Now suppose
the prior probability of θ̃ is given by

π : $P(\tilde\theta = \theta_1) = w < 2/5$, $P(\tilde\theta = \theta_2) = 1 - w > 3/5$.

The slope of the line joining the loss points of a1 and a2 is

$m = \frac{3 - 1}{2 - 5} = -\frac{2}{3}$

Therefore the slope of a line perpendicular to it is 3/2. This
implies that the probability vector along the direction OC
is π = ⟨2/5, 3/5⟩.

3.6 Dominance and Admissibility

Recall the problem considered in Example 3.4.4, with the
following loss table:

L(θ, a)    a1   a2   a3   a4   a5
θ1          2    4    3    5    3
θ2          3    0    3    2    5

Even though action a3 is a minimax action, it would
not be used, because the loss incurred by using
action a1 is no greater than that of action a3 in either
state, and is strictly smaller under θ1.

Dominance
Action a′ is said to dominate action a if
L(θ, a′) ≤ L(θ, a) for all θ ∈ Θ.
If in addition the inequality is strict for some
θ, then action a is said to be
inadmissible.
An action which is not inadmissible is called
an admissible action.
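A minimal sketch for flagging inadmissible (dominated) pure actions (names and layout are my own; the table is the one above):

```python
# Sketch: mark actions dominated by some other pure action
# (weakly dominated everywhere, strictly somewhere).

def inadmissible(loss):
    cols = list(zip(*loss))                  # loss vector of each action
    return [any(all(x <= y for x, y in zip(other, col)) and other != col
                for other in cols)
            for col in cols]

loss = [[2, 4, 3, 5, 3],
        [3, 0, 3, 2, 5]]
print(inadmissible(loss))  # [False, False, True, True, True]:
                           # a3 and a5 are dominated by a1, and a4 by a2
```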

Example 3.6.1
Consider a decision problem with the loss points of the
actions displayed in the following figure:

Draw lines through the loss point of an action, say a,
parallel to the horizontal axis and parallel to the vertical
axis.

If there is another loss point below or on the
horizontal line, or to the left of or on the vertical
line, then action a is inadmissible.
In the above figure, action a2 is dominated
strictly by a1 and hence is inadmissible.
Several general relationships among optimal actions:
Bayes actions are usually admissible.
A minimax action is a Bayes action.
Admissible actions are Bayes against some prior
distribution.

3.7 Least Favorable Prior Distribution

Consider again the decision problem stated in Example
3.5.5, with loss table

L(θ, a)    a1   a2   a3
θ1          2    5    3
θ2          3    1    5

Again suppose that the prior distribution of θ̃ is given by

π : $P(\tilde\theta = \theta_1) = w$, $P(\tilde\theta = \theta_2) = 1 - w$.

As has been shown, the Bayes action against π is a2
if w ≤ 2/5, and is a1 otherwise. In any case, the minimum
Bayes loss is less than or equal to the minimum Bayes loss
under the prior distribution π0 given by

π0 : $P(\tilde\theta = \theta_1) = 2/5$, $P(\tilde\theta = \theta_2) = 3/5$.

From the statistician's viewpoint, this is the worst case
nature can present, and this prior distribution is called
the least favorable prior distribution. Under this prior
distribution, the minimum Bayes loss incurred by the
statistician is maximized.
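A minimal numeric sketch of locating this prior by maximizing the minimum Bayes loss over w (grid search; layout and names are my own):

```python
import numpy as np

# Sketch: scan w = P(theta_1) and find the prior maximizing the minimum
# Bayes loss for the loss table above.
loss = np.array([[2, 5, 3],
                 [3, 1, 5]])
ws = np.linspace(0, 1, 100001)
priors = np.stack([ws, 1 - ws], axis=1)
min_bayes = (priors @ loss).min(axis=1)  # minimum Bayes loss at each prior
i = min_bayes.argmax()
print(ws[i], min_bayes[i])  # w = 0.4 = 2/5, value 2.6 = 13/5
```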

3.8 Minimax, Bayes and Admissible Actions

A Bayes action with constant loss is minimax.
An inadmissible action is not Bayes against any
prior distribution which allocates positive
probability to every state of nature.
An admissible action is Bayes against some prior
distribution of θ̃.

More precisely:

If action a* has constant loss and is Bayes against a
prior distribution π, then a* is a minimax action.
If action a* is Bayes against a prior distribution π
where π(θ) = P(θ̃ = θ) > 0 for all θ, then a* is an
admissible action.
