
ELEG 5633 Detection and Estimation: Detection Theory I

Jing Yang
Department of Electrical Engineering University of Arkansas

September 24, 2013

Detection Theory
Binary Bayesian hypothesis test
Likelihood Ratio Test
Minimum probability of error
MAP: maximum a posteriori
ML: maximum likelihood

Neyman-Pearson test
Receiver Operating Characteristic (ROC) curves

M-ary hypothesis testing
Composite hypothesis testing

Hypothesis Test
Each possible scenario is a hypothesis. M hypotheses, M ≥ 2: {H_0, H_1, . . . , H_{M−1}}
Observed data: x = [x_1, x_2, . . . , x_k]^T (scalar case: k = 1)
Random hypothesis test:
H is a random variable, taking one of {H_0, H_1, . . . , H_{M−1}}
A priori probabilities: π_m := P[H = H_m]
Likelihood function: X ~ p_{X|H}(x|H_i)

Nonrandom Hypothesis test

No proper priors about H.
Given H = H_m, X is still random: X ~ p_X(x; H_m)
Performance criteria:
Generalized Bayesian risk
Neyman-Pearson (NP)

A Binary Detection Example


Block diagram: Source → Encoder → Channel → Decoder → Destination
Transmitted bit sequence: 010001000100

0 : s_0(t) = sin(ω_0 t),   1 : s_1(t) = sin(ω_1 t)
x(t) = s_0(t) + n(t) if 0 is sent;   x(t) = s_1(t) + n(t) if 1 is sent
H_0 : 0 is sent,   H_1 : 1 is sent
Ĥ(x) = H_0 or H_1?

Another Binary Detection Example

Early seizure detection triggers an intervention.
k different channels: x_i[n], i = 1, 2, . . . , k.
H_0 : no seizure,   H_1 : seizure
Ĥ(·) : R^k → {H_0, H_1}

Binary Random Hypothesis Testing: H0 vs H1


Given:
Priors: π_0 := P[H = H_0],   π_1 := P[H = H_1]
Measurement model / likelihood function:   H_0 : X ~ p_0,   H_1 : X ~ p_1

Example
Additive White Gaussian Noise (AWGN) communication channel. A bit, 0 or 1, is sent through a noisy channel. The receiver gets the bit plus noise, where the noise is modeled as a realization of a N(0, 1) random variable. Assume that both 0 and 1 are equally likely, i.e., π_0 = π_1 = 1/2. Once the receiver observes a signal x, it must decide between two hypotheses:
H_0 : X ~ N(0, 1)
H_1 : X ~ N(1, 1)
Question: How to design the decision rule Ĥ(x)?

Decision Regions
Decision rule:
Ĥ(x) = H_0 or H_1?

A decision is made by partitioning the range of X into two disjoint regions:
R_0 = {x : Ĥ(x) = H_0}
R_1 = {x : Ĥ(x) = H_1}

Example
x = [x_1, x_2]^T ∈ R²
[Figure: the (x_1, x_2) plane partitioned into decision regions R_0 and R_1]

Question: How to design the decision regions R_0 and R_1?

Expected Cost Function Metric


To optimize the choice of decision regions, we can specify a cost for decisions. Four possible outcomes in a test:

            H_0 true     H_1 true
Ĥ = H_0     correct      incorrect
Ĥ = H_1     incorrect    correct

Assign a cost c_{i,j} to each outcome (i, j), i, j ∈ {0, 1}.
Here i indexes the decision Ĥ = H_i and j indexes the true hypothesis H_j.

Example
Cost assignment examples:
1) Digital communications: c_{0,0} = 0, c_{0,1} = 1, c_{1,0} = 1, c_{1,1} = 0.
2) Seizure detection: H_0 : no seizure; H_1 : seizure. A missed seizure is far costlier than a false alarm, so c_{0,1} ≫ c_{1,0}.

Bayes Cost
Bayes Cost: the expected cost associated with a test
C = Σ_{i,j=0}^{1} c_{i,j} π_j P(decide H_i | H_j is true)

Express the Bayes cost directly in terms of the decision regions:

C = Σ_{i,j=0}^{1} c_{i,j} π_j P(decide H_i | H_j is true)
  = Σ_{i,j=0}^{1} c_{i,j} π_j P(X ∈ R_i | H_j is true)
  = Σ_{i,j=0}^{1} c_{i,j} π_j ∫_{R_i} p_j(x) dx
  = ∫_{R_0} [c_{0,0} π_0 p_0(x) + c_{0,1} π_1 p_1(x)] dx + ∫_{R_1} [c_{1,0} π_0 p_0(x) + c_{1,1} π_1 p_1(x)] dx

Optimum Decision Rule: Likelihood Ratio Test


To minimize the Bayes cost, we should design the decision regions as
R_0 := {x : c_{0,0} π_0 p_0(x) + c_{0,1} π_1 p_1(x) < c_{1,0} π_0 p_0(x) + c_{1,1} π_1 p_1(x)}
R_1 := {x : c_{0,0} π_0 p_0(x) + c_{0,1} π_1 p_1(x) > c_{1,0} π_0 p_0(x) + c_{1,1} π_1 p_1(x)}

Therefore, the optimal test takes the following form:
p_1(x) / p_0(x)   ≷_{H_0}^{H_1}   π_0 (c_{1,0} − c_{0,0}) / [π_1 (c_{0,1} − c_{1,1})]

Likelihood ratio test (LRT):
L(x) := p_1(x) / p_0(x)   ≷_{H_0}^{H_1}   η,   where η := π_0 (c_{1,0} − c_{0,0}) / [π_1 (c_{0,1} − c_{1,1})] can be precomputed.
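As a concrete illustration (a minimal sketch, not part of the original slides), the LRT can be coded directly for the earlier scalar Gaussian example H_0 : X ~ N(0, 1) vs H_1 : X ~ N(1, 1); the priors and costs below are placeholder values.

```python
# Minimal LRT sketch (illustration only): binary Bayesian test for the scalar Gaussian
# example H0: X ~ N(0,1) vs H1: X ~ N(1,1). Priors and costs below are placeholders.
from scipy.stats import norm

pi0, pi1 = 0.5, 0.5
c = {(0, 0): 0.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 0.0}   # c[(i, j)]: decide Hi, Hj true

# Precomputed threshold eta = pi0 (c10 - c00) / (pi1 (c01 - c11))
eta = pi0 * (c[(1, 0)] - c[(0, 0)]) / (pi1 * (c[(0, 1)] - c[(1, 1)]))

def lrt(x):
    """Return the decided hypothesis index (0 or 1) for observation x."""
    L = norm.pdf(x, loc=1, scale=1) / norm.pdf(x, loc=0, scale=1)   # L(x) = p1(x)/p0(x)
    return 1 if L > eta else 0

print(lrt(0.2), lrt(0.9))   # expect 0 and 1 for these sample observations
```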

Special Cases
If c_{i,j} = 1[i ≠ j], then the Bayes cost equals the probability of error P_e.
The likelihood ratio test reduces to
p_1(x) / p_0(x)  ≷_{H_0}^{H_1}  π_0 / π_1,   i.e.,   π_1 p_1(x)  ≷_{H_0}^{H_1}  π_0 p_0(x),   i.e.,   P[H = H_1 | x]  ≷_{H_0}^{H_1}  P[H = H_0 | x]
This is called the Maximum A Posteriori (MAP) detector.

If π_0 = π_1 and c_{i,j} = 1[i ≠ j], the LRT reduces to
p_1(x) / p_0(x)  ≷_{H_0}^{H_1}  1,   i.e.,   p_1(x)  ≷_{H_0}^{H_1}  p_0(x)
This is called the Maximum Likelihood (ML) detector.

Properties of LRT
L(x) is a sufficient statistic: it summarizes all relevant information in the observations.
Any one-to-one transformation of L is also a sufficient statistic. Let g(·) be an invertible function and l(x) := g(L(x)). If g(·) is monotonically increasing, the test becomes
l(x)  ≷_{H_0}^{H_1}  g(η)
E.g., g(x) = ln x (the log-likelihood ratio test).
The LRT determines the decision regions:
R_0 = {x : Ĥ(x) = H_0} = {x : L(x) < η}
R_1 = {x : Ĥ(x) = H_1} = {x : L(x) > η}

Example
Additive White Gaussian Noise (AWGN) communication channel. A bit, 0 or 1, is sent through a noisy channel. The receiver gets the bit plus noise, where the noise is modeled as a realization of a N(0, 1) random variable. Assume that both 0 and 1 are equally likely, i.e., π_0 = π_1 = 1/2. Let c_{i,j} = 1 if i ≠ j and c_{i,j} = 0 if i = j. What is the optimal decision rule to minimize the expected cost?
H_0 : X ~ N(0, 1)
H_1 : X ~ N(1, 1)
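A minimal sketch (not from the slides) of the answer: with equal priors and 0/1 costs the LRT is the ML rule, ln L(x) = x − 1/2, so we decide H_1 when x > 1/2, giving P_e = Q(1/2). The simulation below checks this numerically.

```python
# Minimal sketch (not in the original slides): for equal priors and 0/1 costs the
# LRT reduces to ln L(x) = x - 1/2, i.e., decide H1 when x > 1/2.
# The simulation checks that the empirical error rate matches Pe = Q(1/2).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n_trials = 100_000
bits = rng.integers(0, 2, size=n_trials)          # true hypothesis: 0 -> H0, 1 -> H1
x = bits + rng.standard_normal(n_trials)          # X ~ N(0,1) under H0, N(1,1) under H1
decisions = (x > 0.5).astype(int)                 # threshold test at 1/2

pe_empirical = np.mean(decisions != bits)
pe_theory = norm.sf(0.5)                          # Q(1/2) ≈ 0.3085
print(pe_empirical, pe_theory)
```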

Example
Additive White Gaussian Noise (AWGN) communication channel. Two messages m ∈ {0, 1} with equal probabilities π_0 = π_1. Given m, the received signal is an N × 1 random vector
X = s_m + W,   where W ~ N(0, σ² I_{N×N}).
The signals {s_0, s_1} are known to the receiver. Design an optimal detector w.r.t. the P_e criterion.
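A sketch of the standard solution for this model (stated here as an assumed result, not worked out on the slides): with equal priors the MAP rule is ML, and for Gaussian noise ML picks the signal closest to x in Euclidean distance, equivalently the m maximizing s_m^T x − ||s_m||²/2. The signal vectors below are placeholders.

```python
# Sketch: with equal priors the MAP rule is ML; for Gaussian noise ML picks the
# signal s_m closest to x in Euclidean distance (a minimum-distance detector).
import numpy as np

def detect(x, s0, s1):
    """Return 0 or 1: the minimum-distance (ML) decision for X = s_m + W."""
    d0 = np.sum((x - s0) ** 2)
    d1 = np.sum((x - s1) ** 2)
    return 0 if d0 <= d1 else 1

# Example usage with placeholder signal vectors (not from the slides).
rng = np.random.default_rng(1)
N, sigma = 8, 1.0
s0 = np.ones(N)
s1 = -np.ones(N)
m = 1
x = (s1 if m else s0) + sigma * rng.standard_normal(N)
print(detect(x, s0, s1))
```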

Neyman-Pearson Hypothesis Testing


Binary Bayesian test: compute the LRT, then compare it with a threshold η; η is a function of π_0, π_1, c_{i,j}.
What if the prior probabilities are not known precisely?
An alternative design specification: minimize one type of error subject to a constraint on the other type of error.

            H_0 true        H_1 true
Ĥ = H_0     correct         miss detection
Ĥ = H_1     false alarm     correct

Type I error:  P_FA = P[Ĥ = H_1; H_0] = ∫_{R_1} p(x; H_0) dx
Type II error: P_M  = P[Ĥ = H_0; H_1] = ∫_{R_0} p(x; H_1) dx
Note: P_M = 1 − P_D, where P_D = ∫_{R_1} p(x; H_1) dx.

Such a design requires neither prior probabilities nor cost assignments.

Neyman-Pearson Theorem
Theorem
To maximize P_D subject to a given P_FA = α, decide H_1 if
L(x) := p(x; H_1) / p(x; H_0) > γ
where γ is found from
P_FA = ∫_{x : L(x) > γ} p(x; H_0) dx = α.

Example
Assume the scalar observation X is generated according to
H_0 : X ~ N(0, σ_0²)
H_1 : X ~ N(0, σ_1²),   σ_1² > σ_0²
What is the NP test if we want to have P_FA < 10⁻³?
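A sketch of an assumed worked solution (not spelled out on the slides): the LRT here is monotone in x², so the NP test is |x| > τ, with τ set by P_FA = P(|X| > τ | H_0) = 2Q(τ/σ_0) = 10⁻³. The variances below are placeholders.

```python
# Sketch: the LRT is monotone in x^2, so the NP test is |x| > tau.
# Setting P_FA = 2 Q(tau / sigma0) = 1e-3 gives tau = sigma0 * Q^{-1}(5e-4);
# P_D then follows from the H1 distribution.
from scipy.stats import norm

sigma0, sigma1 = 1.0, 2.0                 # placeholder variances, not from the slides
alpha = 1e-3

tau = sigma0 * norm.isf(alpha / 2)        # Q^{-1}(alpha/2)
p_fa = 2 * norm.sf(tau / sigma0)          # should equal alpha
p_d = 2 * norm.sf(tau / sigma1)           # detection probability under H1
print(tau, p_fa, p_d)
```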

Example
Given N observations X_i, i = 1, 2, . . . , N, which are i.i.d. and, depending on the hypothesis, generated according to
H_0 : X_i ~ N(0, σ_0²)
H_1 : X_i ~ N(0, σ_1²),   σ_1² > σ_0²
What is the NP test?
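A sketch of the corresponding N-sample solution (an assumed result, with placeholder numbers): the log-LRT is increasing in T(x) = Σ_i x_i², so the NP test compares T(x) to a threshold; under H_0, T/σ_0² is chi-square with N degrees of freedom, which fixes the threshold for a target P_FA = α.

```python
# Sketch: the log-LRT is increasing in T(x) = sum_i x_i^2, so the NP test is
# T(x) > gamma.  Under H0, T / sigma0^2 ~ chi-square with N degrees of freedom,
# so gamma = sigma0^2 * chi2.isf(alpha, N) meets P_FA = alpha.
from scipy.stats import chi2

sigma0, sigma1, N = 1.0, 2.0, 10          # placeholder values, not from the slides
alpha = 1e-3

gamma = sigma0**2 * chi2.isf(alpha, N)
p_d = chi2.sf(gamma / sigma1**2, N)       # P(T > gamma | H1)
print(gamma, p_d)
```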

Example
Given N observations X_i, i = 1, 2, . . . , N, which are i.i.d. and, depending on the hypothesis, generated according to
H_0 : X_i = W_i
H_1 : X_i = A + W_i
where W_i ~ N(0, σ²). What is the NP test if we want to have P_FA < 10⁻³?
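A sketch of the assumed solution (taking the mean shift A > 0; the numbers below are placeholders): the log-LRT is increasing in the sample mean, so the NP test is x̄ > τ with τ = (σ/√N) Q⁻¹(α), and P_D = Q(Q⁻¹(α) − A√N/σ).

```python
# Sketch (assumes A > 0): the log-LRT is increasing in the sample mean, so the NP
# test is mean(x) > tau with tau = (sigma/sqrt(N)) * Q^{-1}(alpha).
# The detection probability is P_D = Q( Q^{-1}(alpha) - A*sqrt(N)/sigma ).
import numpy as np
from scipy.stats import norm

A, sigma, N = 1.0, 1.0, 10               # placeholder values, not from the slides
alpha = 1e-3

tau = sigma / np.sqrt(N) * norm.isf(alpha)
p_d = norm.sf(norm.isf(alpha) - A * np.sqrt(N) / sigma)
print(tau, p_d)
```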

Receiver Operating Characteristics (ROC) of the LRT


ROC: plot of P_D versus P_FA; it summarizes the detection performance of the NP test.
LRT: L(x) := p(x; H_1) / p(x; H_0)
P_D(γ)  = ∫_{x : L(x) > γ} p(x; H_1) dx
P_FA(γ) = ∫_{x : L(x) > γ} p(x; H_0) dx
γ = 0:  P_D = P_FA = 1
γ → +∞: P_D = P_FA = 0
As γ increases, both P_D and P_FA decrease.
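As an illustration (not from the slides), for the Gaussian mean-shift example the ROC has the closed form P_D = Q(Q⁻¹(P_FA) − d), where d = A√N/σ; the sketch below plots a family of such curves for placeholder values of d.

```python
# Sketch: ROC curves for the Gaussian mean-shift example, P_D = Q( Q^{-1}(P_FA) - d ).
# Larger d pushes the curve toward the upper-left corner; d = 0 gives P_D = P_FA.
import numpy as np
from scipy.stats import norm
import matplotlib.pyplot as plt

p_fa = np.linspace(1e-6, 1 - 1e-6, 500)
for d in [0.0, 0.5, 1.0, 2.0]:                    # placeholder deflection values
    p_d = norm.sf(norm.isf(p_fa) - d)
    plt.plot(p_fa, p_d, label=f"d = {d}")
plt.xlabel("P_FA"); plt.ylabel("P_D"); plt.legend(); plt.show()
```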

Properties of ROC
The ROC is a nondecreasing function.
The ROC lies above the line P_D = P_FA.
Proof sketch: flip a biased coin with P[head] = p, P[tail] = 1 − p. Let Ĥ = H_0 if the outcome is tail and Ĥ = H_1 otherwise. Then P_D = P_FA = p. Varying p achieves every point on the line P_D = P_FA, and the optimal (NP) test can only do better.
The ROC is concave. How to prove it? Hint: construct a randomized rule that randomly picks between two LRTs with different thresholds γ.

M -ary Hypothesis Testing


Now we wish to decide among M hypotheses {H_0, H_1, . . . , H_{M−1}}. Focus on the Bayesian test.
Priors: π_0, π_1, . . . , π_{M−1}.   Likelihood: p_{X|H_i}(x|H_i) := p_i(x)
The Bayes cost:
C = Σ_{i=0}^{M−1} Σ_{j=0}^{M−1} c_{i,j} π_j P(decide H_i | H_j is true)
For an observed x, the expected cost if we assign it to R_i is
C_i(x) := E_j[c_{i,j} | x] = Σ_{j=0}^{M−1} c_{i,j} P[H = H_j | x]
To minimize the Bayes cost, we let
Ĥ(x) = arg min_{i ∈ {0,1,...,M−1}} C_i(x)

Special Cases: MAP detector


If c_{i,j} = 1[i ≠ j], then the Bayes cost equals the probability of error P_e.
C_i(x) = Σ_{j=0}^{M−1} c_{i,j} P[H = H_j | x] = Σ_{j ≠ i} P[H = H_j | x]
       = Σ_{j=0}^{M−1} P[H = H_j | x] − P[H = H_i | x] = 1 − P[H = H_i | x]
Ĥ(x) = arg min_{H_i ∈ {H_0, H_1, ..., H_{M−1}}} C_i(x)
     = arg max_{H_i ∈ {H_0, H_1, ..., H_{M−1}}} P[H = H_i | x]
This is the Maximum A Posteriori (MAP) detector for the M-ary case.

Special Cases: ML detector


If π_0 = . . . = π_{M−1} = 1/M and c_{i,j} = 1[i ≠ j], the MAP detector reduces to
Ĥ(x) = arg max_{H_i ∈ {H_0, H_1, ..., H_{M−1}}} P[H = H_i | x]
     = arg max_{H_i ∈ {H_0, H_1, ..., H_{M−1}}} p(x | H = H_i) P[H = H_i] / p(x)
     = arg max_{H_i ∈ {H_0, H_1, ..., H_{M−1}}} p(x | H = H_i)
This is the Maximum Likelihood (ML) detector for the M-ary case.

Example
Assume that we have three hypotheses:
H_0 : X_i = −s + W_i,   i = 1, . . . , N
H_1 : X_i = W_i,        i = 1, . . . , N
H_2 : X_i = s + W_i,    i = 1, . . . , N
where the W_i are i.i.d. N(0, 1). What is the optimal decision rule to minimize P_e if π_0 = π_1 = π_2 = 1/3?
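A sketch of the assumed answer (taking the three means as −s, 0, s as reconstructed above): with equal priors the P_e-optimal detector is ML, and for i.i.d. unit-variance Gaussian noise ML picks the hypothesis whose mean is closest to the sample mean of the observations. The usage values below are placeholders.

```python
# Sketch (assumes the H0/H1/H2 means are -s, 0, +s): with equal priors the P_e-optimal
# detector is ML; for i.i.d. N(0,1) noise ML picks the hypothesis whose mean is
# closest to the sample mean.
import numpy as np

def detect_3ary(x, s):
    means = np.array([-s, 0.0, s])                 # means under H0, H1, H2
    xbar = np.mean(x)
    return int(np.argmin((xbar - means) ** 2))     # index of the closest mean

# Placeholder usage, values not from the slides.
rng = np.random.default_rng(2)
s, N = 1.0, 20
x = s + rng.standard_normal(N)                     # data generated under H2
print(detect_3ary(x, s))                           # expect 2 with high probability
```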
