e = E\{[Y - c(X)]^2\} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} [y - c(x)]^2\, f_{X,Y}(x,y)\, dx\, dy    (8-4)
Note that the integrand above is nonnegative, so e is minimized if the inner integral is minimized for every value of x.
Note that for a fixed value X = x, c(x) is a number, not a function.
Differentiating the inner integral with respect to c(x) and setting the result to zero,
\int_{-\infty}^{\infty} [y - c(x)]\, f_{Y|X}(y \mid x)\, dy = 0,    (8-5)
gives
\hat{Y} = c(X) = \int_{-\infty}^{\infty} y\, f_{Y|X}(y \mid x)\, dy = E[Y \mid X].    (8-6)
MMSE Example
Let the random point (X, Y) be uniformly distributed on a semicircle, so that Y = (1 - X^2)^{1/2} and E\{Y \mid X\} = (1 - X^2)^{1/2}, a nonlinear function of X. Likewise, if Y = X^3, then
\hat{Y} = E\{Y \mid X\} = E\{X^3 \mid X\} = X^3.    (8-7)
Clearly, in this case \hat{Y} = X^3 is the best estimator for Y. Thus the best estimator can be nonlinear.
Example: Let
f_{X,Y}(x,y) = kxy for 0 < x < y < 1, and 0 otherwise.
Then
f_X(x) = \int_x^1 f_{X,Y}(x,y)\, dy = \int_x^1 kxy\, dy = \frac{kxy^2}{2}\Big|_{y=x}^{1} = \frac{kx(1 - x^2)}{2}, \quad 0 < x < 1.
Thus
f_{Y|X}(y \mid x) = \frac{f_{X,Y}(x,y)}{f_X(x)} = \frac{kxy}{kx(1 - x^2)/2} = \frac{2y}{1 - x^2}; \quad 0 < x < y < 1,
and
\hat{Y} = \varphi(X) = E\{Y \mid X\} = \int_x^1 y\, f_{Y|X}(y \mid x)\, dy = \int_x^1 \frac{2y^2}{1 - x^2}\, dy = \frac{2}{1 - x^2}\,\frac{y^3}{3}\Big|_x^1 = \frac{2}{3}\,\frac{1 - x^3}{1 - x^2} = \frac{2}{3}\,\frac{1 + x + x^2}{1 + x}.    (8-8)
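As a numerical sanity check of (8-8), the conditional mean can also be estimated by simulation. The MATLAB sketch below (the sample size, test point, and tolerance are illustrative choices, not part of the example) draws points from f_{X,Y} by rejection sampling and compares the empirical mean of Y near a chosen x with the closed-form answer.

% Monte Carlo check of (8-8). Sample size, test point, and tolerance
% are illustrative assumptions.
M = 2e6;
x = rand(1,M); y = rand(1,M);          % candidates uniform on the unit square
keep = (x < y) & (rand(1,M) < x.*y);   % accept with probability proportional to xy on 0 < x < y < 1
x = x(keep); y = y(keep);
x0 = 0.5; tol = 0.01;                  % estimate E[Y | X near x0]
sel = abs(x - x0) < tol;
empirical = mean(y(sel))               % approximately 0.778 for x0 = 0.5
closed_form = (2/3)*(1 + x0 + x0^2)/(1 + x0)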
The best linear estimator \hat{Y} = aX + b has coefficients
a = \rho_{XY}\,\frac{\sigma_Y}{\sigma_X}, \qquad b = \eta_Y - a\,\eta_X,    (8-9)
and the resulting minimum mean square error can be written as
e_{\min} = \sigma_Y^2 - \frac{\sigma_{XY}^2}{\sigma_X^2}    (8-10)
or, equivalently,
e_{\min} = \sigma_Y^2 (1 - \rho_{XY}^2).    (8-14)
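A minimal sketch of computing (8-9) from data (the linear-plus-noise model below is an illustrative assumption): the coefficients follow directly from sample moments.

% Estimate the linear MMSE coefficients (8-9) from sample moments.
% The data model Y = 2X + 1 + noise is an illustrative assumption.
M = 1e5;
X = randn(1,M);
Y = 2*X + 1 + 0.5*randn(1,M);
C = cov(X,Y);                   % 2x2 sample covariance matrix
a = C(1,2)/C(1,1);              % a = sigma_XY/sigma_X^2 = rho_XY*sigma_Y/sigma_X
b = mean(Y) - a*mean(X);        % b = eta_Y - a*eta_X
mse = mean((Y - (a*X + b)).^2)  % approximately sigma_Y^2*(1 - rho_XY^2) = 0.25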
Setting the derivative of the mean square error with respect to a equal to zero, with Z = Y - \hat{Y} = Y - aX,
\frac{\partial e}{\partial a} = E[2Z(-X)] = 0 \;\Rightarrow\; E[(Y - \hat{Y})X] = 0,    (8-16)
so the estimation error is orthogonal to the data.
\vec{v} \cdot \vec{u} = |\vec{v}|\,|\vec{u}|\cos\theta, \qquad \vec{v} \cdot \vec{v} = |\vec{v}|^2, \qquad \vec{u} \cdot \vec{u} = |\vec{u}|^2    (8-16b)
We can use the dot product to express the orthogonal projection of one vector
onto another, as in the figure below.
The length of \vec{v}_{proj} is |\vec{v}|\cos\theta; its direction is that of the unit vector \vec{u}/|\vec{u}|; thus
\vec{v}_{proj} = |\vec{v}|\cos\theta\,\frac{\vec{u}}{|\vec{u}|} = \frac{|\vec{v}|\cos\theta\,|\vec{u}|}{\vec{u} \cdot \vec{u}}\,\vec{u} = \frac{\vec{v} \cdot \vec{u}}{\vec{u} \cdot \vec{u}}\,\vec{u}.    (8-16c)
Now compare these identities with the expressions for the second moments of zero-mean random variables:
E[XY] = \rho\,\sigma_X \sigma_Y, \qquad E[X^2] = \sigma_X^2, \qquad E[Y^2] = \sigma_Y^2    (8-16d)
The correlation coefficient \rho plays the role of \cos\theta, and the standard deviations play the role of the vector lengths.
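A small numeric illustration of the projection formula (8-16c); the vectors u and v below are arbitrary example choices.

% Numeric illustration of (8-16c): project v onto u.
u = [3; 1];                          % arbitrary example vectors
v = [2; 2];
v_proj = (dot(v,u)/dot(u,u)) * u;    % projection of v onto u
check = dot(v - v_proj, u)           % residual is orthogonal to u (zero up to round-off)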
In general, the linear MMSE estimate has a higher MSE than the (usually nonlinear) MMSE estimate E[Y|X].
If X and Y are jointly Gaussian RVs, it can be shown that the conditional PDF of Y given X = \xi is a Gaussian PDF with mean
\eta_Y + \rho\,(\sigma_Y/\sigma_X)(\xi - \eta_X)    (8-17)
and variance
\sigma_Y^2(1 - \rho^2).    (8-18)
Hence,
E[Y \mid X = \xi] = \eta_Y + \rho\,(\sigma_Y/\sigma_X)(\xi - \eta_X),    (8-19)
which is the same as the linear MMSE estimate.
For jointly Gaussian RVs, the MMSE estimate equals the linear MMSE estimate. This is another special property of the Gaussian RV.
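The equality in (8-19) is easy to verify numerically. The sketch below (the correlation value and sample size are illustrative assumptions) compares binned conditional means of simulated jointly Gaussian data with the linear predictor.

% Verify that E[Y|X] is linear for jointly Gaussian X and Y.
% rho and the sample size are illustrative assumptions.
M = 1e6; rho = 0.7;
X = randn(1,M);
Y = rho*X + sqrt(1 - rho^2)*randn(1,M);   % zero mean, unit variance, correlation rho
edges = -2:0.25:2;
centers = edges(1:end-1) + 0.125;
condmean = zeros(1,length(centers));
for i = 1:length(centers)
    sel = X >= edges(i) & X < edges(i+1);
    condmean(i) = mean(Y(sel));           % empirical E[Y | X in bin]
end
max(abs(condmean - rho*centers))          % small, consistent with (8-19)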
We next estimate the values of a stationary process S(t) from a set of observations. Given the data {S(\xi), a \le \xi \le b}, the MS estimate of S(t + \lambda) is
\hat{S}(t + \lambda) = E[S(t + \lambda) \mid S(\xi),\; a \le \xi \le b],    (8-21)
and the corresponding linear estimator has the form
\hat{S}(t + \lambda) = \int_a^b h(\xi)\, S(\xi)\, d\xi.    (8-23)
For prediction based on the single sample S(t),
\hat{S}(t + \lambda) = E[S(t + \lambda) \mid S(t)] = a\,S(t).    (8-25a)
The orthogonality principle gives
E\{[S(t + \lambda) - a\,S(t)]\, S(t)\} = 0,
and we can solve for a as
a = R_S(\lambda)/R_S(0).    (8-25b)
The resulting mean square error is
e = R_S(0) - R_S^2(\lambda)/R_S(0).    (8-26)
For filtering a single noisy observation X(t),
\hat{S}(t) = E[S(t) \mid X(t)] = a\,X(t).    (8-27)
The orthogonality principle gives
E\{[S(t) - a\,X(t)]\, X(t)\} = 0,    (8-28)
so that
a = R_{SX}(0)/R_{XX}(0).    (8-29b)
" a S(t + kT )
k
0#! #T
(8-30)
k=! N
|n| # N
0 # ! # T (8-31)
k=! N
22
" a R (kT ! nT ) = R (! ! nT ),
k
! N # n # N, 0 # ! # T (8-32)
k=! N
23
Consider now smoothing of a signal observed in additive noise,
X(t) = S(t) + \nu(t),    (8-34)
using the entire observed process (the noncausal Wiener filter):
\hat{S}(t) = \int_{-\infty}^{\infty} h(\alpha)\, X(t - \alpha)\, d\alpha, \qquad -\infty < t < \infty.    (8-35)
Note that the estimate \hat{S}(t) is the output of a linear filter with impulse response h(\alpha) and with input X(t).
The orthogonality condition gives
E\{[S(t) - \hat{S}(t)]\, X(t - \lambda)\} = 0 \quad \text{for all } \lambda.    (8-36)
!" # t # "
(8-37)
!"
Which becomes
"
RSX (! ) =
# h(" )R
XX
(8-38)
!"
SXX (! )
25
Since S and \nu are uncorrelated,
S_{SX}(\omega) = S_{SS}(\omega), \qquad S_{XX}(\omega) = S_{SS}(\omega) + S_{\nu\nu}(\omega),    (8-40)
so that
H(\omega) = \frac{S_{SS}(\omega)}{S_{SS}(\omega) + S_{\nu\nu}(\omega)}.    (8-41)
If the spectra S_{SS}(\omega) and S_{\nu\nu}(\omega) do not overlap, then H(\omega) = 1 in the band of the signal and H(\omega) = 0 in the band of the noise, and the MMSE is zero.
[Figure: non-overlapping signal spectrum S_{SS}(\omega) and noise spectrum S_{\nu\nu}(\omega).]
The minimum mean square error is
e_{\min} = \sigma_S^2(1 - \rho_{SX}^2) = \frac{1}{2\pi}\int_{-\infty}^{\infty} [S_{SS}(\omega) - H(\omega)\, S_{SX}^{*}(\omega)]\, d\omega = \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{S_{SS}(\omega)\, S_{\nu\nu}(\omega)}{S_{SS}(\omega) + S_{\nu\nu}(\omega)}\, d\omega.    (8-42)
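A small numerical sketch of (8-41) and (8-42); the Lorentzian signal spectrum and flat noise level below are illustrative assumptions.

% Evaluate the noncausal Wiener gain (8-41) and the MMSE integral (8-42)
% on a frequency grid. The spectra are illustrative assumptions.
w = linspace(-50,50,10001);              % frequency grid (rad/s)
Sss = 2./(1 + w.^2);                     % assumed signal power spectrum
Snn = 0.1*ones(size(w));                 % assumed flat noise spectrum
H = Sss./(Sss + Snn);                    % Wiener smoothing filter (8-41)
emin = trapz(w, Sss.*Snn./(Sss + Snn))/(2*pi)   % MMSE from (8-42)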
E\{e\, h(X)\} = 0,    (8-44)
implying that the error e = Y - E\{Y \mid X\} is orthogonal to any function h(X) of the data. This follows since E\{e\, h(X)\} = E\{h(X)\, E\{e \mid X\}\} and E\{e \mid X\} = 0.
For discrete-time processes the (noncausal) estimate is
\hat{S}[n] = \sum_{k=-\infty}^{\infty} h[k]\, X[n - k],
which is the output of a linear time-invariant, non-causal system with input X[n] and impulse response h[n]. By the orthogonality principle we have
E\{[S[n] - \hat{S}[n]]\, X[n - m]\} = 0 \quad \text{for all } m,    (8-47)
so that
R_{SX}[m] = \sum_{k=-\infty}^{\infty} h[k]\, R_{XX}[m - k] \quad \text{for all } m,    (8-48)
and in the z-domain
H[z] = \frac{S_{SX}[z]}{S_{XX}[z]}.    (8-49)
For a predictor that uses the m - 1 most recent past samples,
\hat{S}[m] = \sum_{k=1}^{m-1} h[k]\, S[m - k],    (8-51a)
with the corresponding orthogonality conditions
\sum_{k=1}^{m-1} h[k]\, R[n - k] = R[n].    (8-51b)
Estimate S[n] from its L most recent past samples S[n - k], k \ge 1:
\hat{S}[n] = E[S[n] \mid S[n - k],\, 1 \le k \le L] = \sum_{k=1}^{L} h[k]\, S[n - k].    (8-52)
The orthogonality principle gives
E\{[S[n] - \hat{S}[n]]\, S[n - m]\} = 0, \qquad 1 \le m \le L,    (8-53)
so that
\sum_{k=1}^{L} h[k]\, R[m - k] = R[m], \qquad 1 \le m \le L.    (8-54)
By rewriting (8-54) as
\sum_{k=1}^{L} h[k]\, R[m - k] = R[m], \qquad 0 \le m \le L,    (8-55)
the normal equations can be put in matrix form,
R\,h = r,    (8-56)
with the Toeplitz autocorrelation matrix R, the tap vector h, and the correlation vector r:
\begin{pmatrix}
R_0 & R_1 & R_2 & \cdots & R_{L-1} \\
R_1 & R_0 & R_1 & \cdots & R_{L-2} \\
R_2 & R_1 & R_0 & \cdots & R_{L-3} \\
\vdots & & & \ddots & \vdots \\
R_{L-2} & \cdots & R_1 & R_0 & R_1 \\
R_{L-1} & R_{L-2} & \cdots & R_1 & R_0
\end{pmatrix}
\begin{pmatrix} h_1 \\ h_2 \\ h_3 \\ \vdots \\ h_{L-1} \\ h_L \end{pmatrix}
=
\begin{pmatrix} R_1 \\ R_2 \\ R_3 \\ \vdots \\ R_{L-1} \\ R_L \end{pmatrix}    (8-57)
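Since R is Toeplitz, (8-57) can be solved directly in MATLAB; the autocorrelation sequence below is an illustrative assumption.

% Solve the normal equations (8-57) for the predictor taps.
% The autocorrelation sequence is an illustrative assumption.
L = 8;
Rlags = 0.9.^(0:L);                         % assumed lags R_0 ... R_L
h = toeplitz(Rlags(1:L)) \ Rlags(2:L+1)';   % h = R^{-1} r
% For large L, the Levinson-Durbin recursion (levinson, in the Signal
% Processing Toolbox) exploits the Toeplitz structure to solve the same
% system in O(L^2) operations.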
[Figure: Wiener filter configuration. The filter output y(n) is subtracted from the desired response d(n) to form the estimation error e(n).]
The basic concept behind Wiener filter theory is to minimize the difference between the filter output, y(n), and some desired output, d(n). Noise could be present in the filter output. This minimization either performs a matrix inversion, as in (8-58), to find the Wiener filter when the model is known, or, when the model has some unknown parameters, uses the least mean square (LMS) approach, which adaptively adjusts the filter coefficients to reduce the square of the difference between the desired and actual waveform after filtering. As before, we will assume that H(z) is a feedforward finite impulse response (FIR) filter with coefficients h(k) = h_k, k = 1, 2, ..., L.
The system is described by the following equation:
y(n) = \sum_{k=1}^{L} h(k)\, x(n - k + 1).    (8-59)
[Figure: system identification configuration. The input x(n) drives both an unknown system and the linear filter H(z); the estimated output y(n) is subtracted from the unknown system's output to form the estimation error e(n).]
(In the example script, the sampling frequency, the number of points, and the optimal filter order are set first.)

function b = wiener_hopf(x,y,maxlags)
% Function to compute the LMS solution using the Wiener-Hopf equations
% Inputs:  x       = input
%          y       = desired signal
%          maxlags = filter length
% Outputs: b       = FIR filter coefficients
%
rxx = xcorr(x,maxlags,'coeff');   % Compute the autocorrelation vector
rxx = rxx(maxlags+1:end)';        % Use only positive half of symm. vector
rxy = xcorr(x,y,maxlags);         % Compute the crosscorrelation vector
rxy = rxy(maxlags+1:end)';        % Use only positive half
%
rxx_matrix = toeplitz(rxx);       % Construct correlation matrix
b = rxx_matrix\rxy;               % Calculate FIR coefficients using matrix inversion
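A usage sketch for wiener_hopf (the signal, noise level, and filter length below are illustrative assumptions, not the values used in the example that follows):

% Design and apply an optimal FIR filter to a noisy sinusoid.
% All parameter values are illustrative assumptions.
fs = 1000; t = (0:fs-1)/fs;          % 1 s of data at 1 kHz
d = sin(2*pi*10*t);                  % desired signal: 10 Hz sinusoid
x = d + randn(size(t));              % input: desired signal plus white noise
b = wiener_hopf(x,d,256);            % optimal FIR coefficients
y = filter(b,1,x);                   % apply the optimal filter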
Example: Results
[Figure: top, the noisy input waveform vs. time (sec); middle, the waveform after optimal filtering vs. time (sec); bottom, the optimal filter frequency plot, magnitude vs. frequency (Hz).]
[Figure: adaptive filter configuration. The input x(n) drives H(z), whose response y(n) is compared with the desired response d(n) to form the error e(n).]
[Figure: MSE error surface plotted over the weight values h1 and h2; steepest descent follows the estimated gradient toward the minimum.]
\frac{\partial E[(e_n)^2]}{\partial h(k)} = E\!\left[\frac{\partial (e_n)^2}{\partial h(k)}\right],    (8-60)
where we have used the property that the differentiation and expectation operations are interchangeable.
So, since we don't have access to the average MSE, we will drop the E operation and use the fact that
\frac{\partial (e_n)^2}{\partial h(k)}    (8-61)
is an unbiased estimate of the gradient (8-60), and use it to approximate the gradient.
\frac{\partial e_n^2}{\partial h_n(k)} = 2e(n)\,\frac{\partial [d(n) - y(n)]}{\partial h_n(k)} = -2e(n)\, x(n - k).    (8-62)
The resulting LMS weight update is
h_{n+1}(k) = h_n(k) + \Delta\, e(n)\, x(n - k), \qquad n = 1, 2, \ldots,    (8-64)
where \Delta is the convergence factor.
Example Results
[Figure: matching process. Two panels of |H(z)| vs. frequency (Hz), 50 to 250 Hz, showing the magnitude response being matched during adaptation.]
Adaptive Noise Cancellation
[Figure: the signal channel carries x(n) + N(n); the reference channel carries noise correlated with N(n); the adaptive filter produces the noise estimate N*(n); the error signal e(n) = x(n) + N(n) - N*(n) is the desired output.]
Adaptive Line Enhancement
[Figure: the input B(n) + Nb(n) passes through a decorrelation delay D to an adaptive FIR filter, which produces Nb*(n); the error signal is e(n) = B(n) + Nb(n) - Nb*(n), and the desired output is the narrowband signal.]
LMS Algorithm
The LMS algorithm is implemented in the function lms. The input is x, the desired signal is d, delta is the convergence factor, and L is the filter length.

function [b,y,e] = lms(x,d,delta,L)
% Simple function to adjust filter coefficients using the LMS algorithm
% Adjusts filter coefficients, b, to provide the best match between
%   the input, x(n), and a desired waveform, d(n)
% Both waveforms must be the same length
% Uses a standard FIR filter
%
M = length(x);
b = zeros(1,L);                  % Initialize coefficients
y = zeros(1,M); e = zeros(1,M);  % Initialize outputs
for n = L:M
   x1 = x(n:-1:n-L+1);           % Select input segment for convolution
   y(n) = b * x1';               % Convolve (multiply) weights with input
   e(n) = d(n) - y(n);           % Calculate error
   b = b + delta*e(n)*x1;        % Adjust weights
end
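A usage sketch of lms in the line-enhancement configuration shown earlier (the signal, SNR, delay, and parameter values are illustrative assumptions):

% Adaptive line enhancement: predict the current sample from a delayed
% copy of the input. All parameter values are illustrative assumptions.
fs = 1000; t = (0:fs-1)/fs;
x = sin(2*pi*10*t) + 2*randn(size(t));   % 10 Hz sinusoid in broadband noise
D = 5;                                   % decorrelation delay (samples)
xd = [zeros(1,D) x(1:end-D)];            % delayed copy as the filter input
[b,y,e] = lms(xd,x,5e-4,64);             % y approximates the narrowband component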
Example: Results
[Figure: SNR -8 dB, 10 Hz sine. Top, the input x(t) vs. time (sec); middle, the filter output y(t) vs. time (sec); bottom, |H(f)| vs. frequency (Hz).]
Application of an adaptive filter using the LMS recursive algorithm to data containing a single sinusoid (10 Hz) in noise (SNR = -8 dB). The filter requires the first 0.4 to 0.5 seconds (400 to 500 points) to adapt; after adaptation, the frequency characteristics are those of a bandpass filter centered at 10 Hz.
Example 4: Results
Unlike a fixed Wiener filter, an adaptive filter can track changes in a waveform, as shown in this example where two sequential sinusoids having different frequencies (10 and 20 Hz) are adaptively filtered.
[Figure: 10 and 20 Hz, SNR -6 dB. Top, the input x(t) vs. time (sec); bottom, y(t) after adaptive filtering vs. time (sec).]
[Figure: top, x(t) + n(t) (signal plus interference); middle, the filter output y(t); bottom, x(t); all vs. time (sec).]
In this application, approximately 1000 samples (2.0 sec) are required for the filter to adapt correctly.
[Figure: phase sensitive detector. The modulated input V(t) is multiplied by the phase-shifted carrier Vc*(t), obtained by passing Vc(t) through a phase shifter, and the product is lowpass filtered to give Vout(t).]
[Figure: detector frequency response, a passband of bandwidth BW centered at the carrier frequency fc.]
wn = .02;                            % Lowpass cutoff (normalized frequency)
[b,a] = butter(2,wn);                % Second-order Butterworth lowpass filter
%
% Phase sensitive detection
% (fs, fc, N, the carrier vc, and the modulated signal vm are defined
%  earlier in the example script)
ishift = fix(.125 * fs/fc);          % Shift carrier by 1/8 period (45 deg)
vc = [vc(ishift:N) vc(1:ishift-1)];  % Circular shift: phase-shifted carrier
v1 = vc .* vm;                       % Multiply (synchronous demodulation)
vout = filter(b,a,v1);               % Lowpass filter to recover the signal
Example: Results
[Figure: phase sensitive detection. Three panels vs. time (sec): the modulated signal Vm(t), a second view of Vm(t), and the demodulated signal.]
Kalman Filter
The Kalman filter is a recursive (iterative) time-domain data processing algorithm that solves the same problems as the Wiener filter. The Kalman filter can also be made adaptive, but we will not cover this topic (I do in my Digital Communications course).
It generates optimal estimates of desired quantities given a set of measurements (estimation, prediction, interpolation, smoothing, ...).
Optimal filtering for linear systems with white Gaussian errors: the Kalman filter gives the best estimate based on all previous measurements.
Recursive/iterative: it does not need to store all previous measurements and reprocess all data at each time step.
The Kalman algorithmic approach can be viewed as two steps: (1) prediction and then (2) correction.
[Figure: black-box system model. The input and the system-model noise drive the system dynamics to produce the system state; an output device plus measurement noise yields the observed output, which the Kalman filter/estimator turns into an optimal estimate of the system state.]
The system model is
y_k = A\, y_{k-1} + B\, u_k + w_k, \qquad z_k = H\, y_k + v_k,    (8-66a)
where the system state process is denoted by y_k, with filter parameters A and B and output filter H that are known. The model noise is w_k. The process z_k is the observable system output (filtered signal + noise) and the process u_k is the system input. The model noise w_k has covariance Q, the measurement noise v_k has covariance R, and P denotes the prediction error covariance matrix.
The Kalman filter algorithm is a two-step process: prediction and correction.
1. Prediction: \hat{y}^-_k is an estimate based on measurements at previous time steps that follows the system dynamics above:
\hat{y}^-_k = A\, \hat{y}_{k-1} + B\, u_k    (8-67a)
P^-_k = A\, P_{k-1} A^T + Q    (8-67b)
2. Correction: \hat{y}_k incorporates the additional information in the measurement at time k:
\hat{y}_k = \hat{y}^-_k + K_k (z_k - H\, \hat{y}^-_k)    (8-68a)
P_k = (I - K_k H)\, P^-_k, \quad \text{where } K_k = P^-_k H^T (H\, P^-_k H^T + R)^{-1}.    (8-68b)
Blending Factor
If we are sure about the measurements, the measurement error covariance R decreases toward zero; the Kalman gain K_k then increases and weights the measurement residual more heavily than the prediction.
The update at each step k:
(1) Compute the Kalman gain: K_k = P^-_k H^T (H\, P^-_k H^T + R)^{-1}
(2) Update the estimate with measurement z_k: \hat{y}_k = \hat{y}^-_k + K_k (z_k - H\, \hat{y}^-_k)
(3) Update the error covariance: P_k = (I - K_k H)\, P^-_k
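A minimal scalar sketch of the prediction-correction recursion (8-67) and (8-68). The model values (A, B, H, Q, R) and the constant-level signal are illustrative assumptions, not the lecture's example.

% Scalar Kalman filter: estimate a random constant from noisy measurements.
% All parameter values are illustrative assumptions.
A = 1; B = 0; H = 1;              % random-constant model
Q = 1e-5;                         % model (process) noise covariance
R = 0.01;                         % measurement noise covariance
N = 100;
truth = -0.4;                     % constant state to be estimated
z = truth + sqrt(R)*randn(1,N);   % noisy measurements
yhat = zeros(1,N); P = zeros(1,N);
yhat(1) = 0; P(1) = 1;            % initial guesses
for k = 2:N
    y_minus = A*yhat(k-1);                      % prediction (8-67a); no input term
    P_minus = A*P(k-1)*A + Q;                   % prediction (8-67b)
    K = P_minus*H/(H*P_minus*H + R);            % Kalman gain
    yhat(k) = y_minus + K*(z(k) - H*y_minus);   % correction (8-68a)
    P(k) = (1 - K*H)*P_minus;                   % correction (8-68b)
end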
[Figure: Kalman filter example results. The state estimate vs. iteration k (10 to 100) converges to a constant level; vertical axis from -0.7 to -0.1.]
Comparing the Least-Squares (Kalman) and the Least Mean Square Error (Wiener) Approaches
The least-squares criterion weights the past errors exponentially:
\sum_{n=0}^{N} w^{N-n}\, e_n^2.    (8-69b)
LSE_N = \sum_{n=0}^{N} w^{N-n}\, e_n^2 = \sum_{n=0}^{N} w^{N-n}\, (z_n - d_n)^2 = \sum_{n=0}^{N} w^{N-n}\, (r_n' c_n - d_n)^2    (8-70a)
Minimizing with respect to the tap vector c_n gives the normal equations
A_n c_n = b_n,    (8-70b)
where
A_n = \sum_{k=0}^{n} w^{n-k}\, r_k r_k'    (8-70c)
    = w A_{n-1} + r_n r_n',    (8-70d)
and
b_n = \sum_{k=0}^{n} w^{n-k}\, r_k d_k = w\, b_{n-1} + r_n d_n.    (8-70e)
The key result that we use to derive the Kalman algorithm is the Matrix Inversion Lemma, which determines A_n^{-1} from A_{n-1}^{-1}:
A_n^{-1} = w^{-1}\left\{ A_{n-1}^{-1} - \frac{A_{n-1}^{-1}\, r_n r_n'\, A_{n-1}^{-1}}{w + r_n'\, A_{n-1}^{-1}\, r_n} \right\}.    (8-71a)
Letting D_n = A_n^{-1},    (8-71b)
k_n = \frac{1}{w + \mu_n}\, D_{n-1} r_n \quad \text{[the Kalman gain]},    (8-71c)
\mu_n = r_n'\, D_{n-1}\, r_n,    (8-71d)
the recursion becomes
D_n = w^{-1}\,(D_{n-1} - k_n\, r_n'\, D_{n-1}),    (8-71e)
c_n = c_{n-1} + k_n\,(d_n - r_n'\, c_{n-1}).    (8-71f)
This algorithm takes "big" steps in the direction of the Kalman gain to iteratively realize the optimum tap setting at each time instant [based upon the received samples up to n]. The algorithm is effectively using the Gram-Schmidt orthogonalization technique to realize c_opt from the successive input vectors {r_n}.
The Kalman algorithm converges in about N iterations!
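A minimal MATLAB sketch of the recursion (8-70d) through (8-71f), applied to identifying an FIR tap vector from noisy data; the data model and all parameter values are illustrative assumptions.

% Exponentially weighted recursive least squares (Kalman/RLS) sketch.
% The FIR data model and parameter values are illustrative assumptions.
N = 8;                           % number of taps
w = 0.98;                        % exponential weighting factor
M = 500;                         % number of samples
c_true = randn(N,1);             % unknown tap vector to identify
x = randn(M,1);                  % input sequence
D = 1e3*eye(N);                  % D_0 = A_0^{-1}, large initial value
c = zeros(N,1);                  % tap estimate
for n = N:M
    r = x(n:-1:n-N+1);           % current input vector r_n
    d = c_true'*r + 0.01*randn;  % desired output d_n (assumed model)
    mu = r'*D*r;                 % (8-71d)
    k = (D*r)/(w + mu);          % Kalman gain (8-71c)
    e = d - c'*r;                % a priori error
    c = c + k*e;                 % tap update (8-71f)
    D = (D - k*(r'*D))/w;        % update of A_n^{-1} via (8-71e)
end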