
Computers and Chemical Engineering 30 (2006) 1529–1541

Particle filtering and moving horizon estimation


James B. Rawlings a,*, Bhavik R. Bakshi b
a Department of Chemical and Biological Engineering, University of Wisconsin, Madison, United States
b Department of Chemical and Biomolecular Engineering, Ohio State University, United States
Received 6 March 2006; received in revised form 18 May 2006; accepted 22 May 2006
Available online 21 July 2006

Abstract
This paper provides an overview of currently available methods for state estimation of linear, constrained and nonlinear systems. The following
methods are discussed: Kalman filtering, extended Kalman filtering, unscented Kalman filtering, particle filtering, and moving horizon estimation.
The current research literature on particle filtering and moving horizon estimation is reviewed, and the advantages and disadvantages of these
methods are presented. Topics for new research are suggested that address combining the best features of moving horizon estimation and particle
filters.
© 2006 Elsevier Ltd. All rights reserved.
Keywords: State estimation; Particle filtering; Moving horizon estimation

1. Introduction

The fundamental question of state estimation arises in many fields of science and engineering. How does one best combine knowledge from two sources, an a priori model and online measurements from a dynamic system, in real time to estimate the state of the dynamic system? Because of its wide application, many different scientific and engineering disciplines have contributed to our understanding of state estimation. The fields of systems theory, statistics and applied probability, as well as many different application areas have contributed underlying theory, methods, and computational algorithms for state estimation. The purpose of this review is to assess some of the more recent and more active areas of this diverse research literature. Our goal is to summarize the state of the art and point out some of the current research directions that show promise for application in the area of process systems and control. Because of the very wide scope of the state estimation problem, we are forced to limit the review to only those parts of the field with which the authors have some direct knowledge or experience. Even with this restricted scope, we found it necessary to provide only a brief overview of the many different methods, and focus attention mainly on recent developments.

For the purposes of this paper we consider the following discrete time dynamic system

x(k + 1) = F(x(k), u(k)) + G(x(k), u(k))w(k)    (1a)

y(k) = h(x(k)) + v(k)    (1b)

* Corresponding author.
E-mail address: rawlings@engr.wisc.edu (J.B. Rawlings).
0098-1354/$ - see front matter © 2006 Elsevier Ltd. All rights reserved.
doi:10.1016/j.compchemeng.2006.05.031

in which

x(k) is the state of the system at time t(k). The initial value, x(0), is a random variable with a given density;
u(k) is the system input at time t(k) (assumed to follow a zero-order hold over the time interval [t(k), t(k + 1)));
w(k) and v(k) are sequences of independent random variables, called process and measurement noises, respectively, with time-invariant densities;
F(x(k), u(k)) is a (possibly) nonlinear system model. F may be the solution to a first principles, differential equation model;
G(x(k), u(k)) is a full column rank matrix (this condition is required for uniqueness of the conditional density to be defined later);
y(k) is the system measurement or observation at time t(k);
h is a (possibly) nonlinear function of x(k).

The state estimation problem is to determine an estimate of the state x(T) given the chosen model structure and a sequence of noisy observations (measurements) of the system, Y(T) := {y(0), . . ., y(T)}. As might be expected from such a fundamental

problem statement, state estimation has found diverse application in science and engineering over many years. In the stochastic
setting chosen here, the conditional density of the state given the
measurements, px|Y (x(T)|Y(T)) is the natural statistical distribution of interest, and the state estimation problem is essentially
solved if we can find this distribution. The complete conditional density is difficult to calculate exactly, however, except
for well-known simple systems, such as when F and G are
linear, and w and v are normally distributed. In this case the
conditional density is also Gaussian with mean and covariance
provided by the well-known Kalman filter. When F and G are
nonlinear, however, the conditional density is not Gaussian, and
obtaining a complete solution is generally impractical. Moreover, when state estimation is used as part of a feedback control
system, the state estimator must meet other requirements. The
estimate must be found during the available sample time of the
system as each measurement becomes available. The on-line
requirements provide further limitations on what is achievable
in state estimation. In this review, we consider many of the methods for solving this problem including Kalman filtering (KF),
extended Kalman filtering (EKF), unscented Kalman filtering
(UKF), particle filtering (PF), and moving horizon estimation
(MHE).
Although the Kalman filter is the optimal state estimator for
unconstrained, linear systems subject to normally distributed
state and measurement noise, many physical systems exhibit
nonlinear dynamics and have states subject to hard constraints,
such as nonnegative concentrations or pressures. Hence Kalman
filtering is no longer directly applicable. As a result, many different types of nonlinear state estimators have been proposed;
Daum (2005) provides a highly readable and tutorial summary of
many of these methods, and Soroush (1998) provides a review
with a focus on applications in process control. We focus our
attention on techniques that formulate state estimation in a probabilistic setting, that is, both the model and the measurement
are potentially subject to random disturbances. Such techniques
include the extended Kalman filter, moving horizon estimation,
Bayesian estimation, and Gaussian sum approximations. In this
probabilistic setting, state estimators attempt to reconstruct the
conditional density px|Y (x(T)|Y(T)). In many applications, the
entire density is not of interest, but a single point estimate of
the state is of most interest. One question that arises, then, is
which point estimate is most appropriate for this use. Two obvious choices for the point estimate are the mean and the mode of
the conditional density. For asymmetric distributions, Fig. 1(a)
demonstrates that these estimates are generally different. Additionally, if this distribution is multi-modal as is Fig. 1(b), then
the mean may place the state estimate in a region of low probability. Clearly, the mode is a more desirable estimate in such
cases.
For nonlinear systems, the conditional density is generally
asymmetric and potentially multi-modal. Such systems are not
pathological cases. On the contrary, in this paper we include a
multi-modal example in Section 4 that requires only a single,
isothermal chemical reaction with second-order kinetics. This
example is based on the work in (Haseltine & Rawlings, 2005),
which derives some simple conditions that lead to the formation

Fig. 1. Comparison of mean and mode as candidate point estimates for (a)
asymmetric and (b) multi-modal densities.

of multiple modes in the conditional density for systems tending to a steady state. Bakshi and coworkers show examples with
simple continuous stirred-tank reactor (CSTR) models of chemical reactions that produce multi-modal conditional densities
(Chen, Bakshi, Goel, & Ungarala, 2004). Alspach and Sorenson
(1972) and references contained within, Gordon, Salmond, and
Smith (1993), and Chaves and Sontag (2002) have proposed
other examples in which multiple modes arise in the conditional
density. Gaussian sum approximations (Alspach & Sorenson,
1972) offer one method for addressing the formation of multiple modes in the conditional density for unconstrained systems. Current Bayesian estimation methods (Blviken, Acklam,
Christopherson, & Strdal, 2001; Chen, Ungarala, Bakshi, &
Goel, 2001; Gordon et al., 1993; Spall, 2003) offer another
means for addressing multiple modes, but these methods propose estimation of the mean rather than the mode. Gordon et
al. (1993) suggest using continuous density estimation techniques to estimate the mode of the conditional density. Silverman
(1986) demonstrates via a numerical example that the number
of samples required to reconstruct a point estimate within a
given relative error increases exponentially with the dimensionality of the state, so we expect continuous density estimation
may be applicable only to systems with low-dimensional state
vectors.
The basic formulation of a Bayesian solution to estimation in
nonlinear dynamic systems has existed for at least four decades
(Ho & Lee, 1964). The use of sequential Monte Carlo (SMC)
or particle filtering methods for solving this task can be traced
back to the late sixties (Handschin & Mayne, 1969). Until
recently, however, this formulation was not practical due to the
computational challenges posed by multi-dimensional Bayesian
integration and the need for on-line or sequential processing.
The challenge of solving Bayesian integration problems is

adequately addressed by Markov chain Monte Carlo (MCMC)
methods (Gelfand & Smith, 1990; Robert & Casella, 1998).
This approach is restricted to those problems where all the
data are available, however. Thus, MCMC is best for solving
Bayesian problems in batch mode. It is not convenient for
problems where measurements are obtained sequentially and
the prior needs to be updated to obtain the posterior as each
measurement is obtained. Furthermore, MCMC requires the
prior specification in an analytic form, which is not readily
available in sequential problems, since the prior at any time
step is more conveniently represented by samples. Usually an
analytic form of the prior is available as the initial guess at the
first time instant, but solving the problem by MCMC in batch
(non-recursive) mode as more data are obtained is not practical
due to the increasing problem size. Particle filtering provides
a recursive approach that overcomes these shortcomings
of MCMC.
The resurgence of research in particle filtering can be traced
back to the work of Gordon et al. (1993) and is fueled by theoretical developments combined with increasing computational
power. Since then Bayesian estimation by particle filtering has
been proposed in many areas including signal and image processing and target recognition (Azimi-Sadjadi & Krishnaprasad,
2005; Doucet, Godsill, & Andrieu, 2000; de Freitas, Niranjan,
Gee, & Doucet, 2000; Gordon et al., 1993), estimation in nonlinear dynamic chemical processes (Chen et al., 2004), constrained
estimation (Chen, Bakshi, Goel, & Ungarala, 2005), and fault
detection (Azimi-Sadjadi & Krishnaprasad, 2004). Despite this
flurry of activity, many challenges still need to be addressed for
applying PF to practical systems, and many opportunities remain
for exploiting the benefits of this approach. One such challenge
is posed by the phenomena of degeneracy and impoverishment
of the particles representing the posterior distribution. Degeneracy arises when a few particles dominate due to their large
weights and fail to capture the underlying distribution, causing
inferior estimates. A popular approach for avoiding degeneracy is to resample the particles to replace those with small
weights. Such an approach, if not applied carefully, may lead
to impoverishment of particles due to their reduced diversity.
Criteria for avoiding indiscriminate resampling have been suggested based on calculation of an effective sample size (Kong,
Liu, & Wong, 1994). Other methods for avoiding degeneracy and
impoverishment include the use of optimal importance sampling
(Doucet et al., 2000) and a resample-move strategy (Berzuini
& Gilks, 2001, 2003), with the latter being restricted to static
problems.
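To make the effective sample size criterion concrete, the following sketch computes the criterion of Kong, Liu, and Wong (1994) and resamples only when it drops below a threshold. The function names, the 0.5 threshold, and the use of multinomial resampling are illustrative choices, not prescriptions from the cited work.

```python
import numpy as np

def effective_sample_size(weights):
    """ESS = 1 / sum(q_i^2) for normalized weights q_i.

    Ranges from 1 (a single particle carries all the weight, i.e.
    degeneracy) up to len(weights) (perfectly uniform weights)."""
    q = np.asarray(weights, dtype=float)
    q = q / q.sum()
    return 1.0 / np.sum(q ** 2)

def maybe_resample(particles, weights, rng, threshold=0.5):
    """Resample (multinomially) only when the ESS falls below a fraction
    of the particle count, to avoid the impoverishment caused by
    indiscriminate resampling."""
    n = len(weights)
    if effective_sample_size(weights) < threshold * n:
        idx = rng.choice(n, size=n, p=weights / np.sum(weights))
        return particles[idx], np.full(n, 1.0 / n)
    return particles, weights
```

Uniform weights give an ESS equal to the particle count, so no resampling occurs; a one-hot weight vector gives an ESS of one and triggers a resample.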
Like most recursive methods, particle filtering (PF) also relies
on an initial guess of the prior distribution. However, unlike
existing methods, such as extended Kalman filtering (EKF) and
moving horizon estimation (MHE), PF is more sensitive to a poor
initial guess. That is, it can require many more measurements for
PF to recover from the effects of a poor initial guess than EKF or
MHE. This sensitivity arises because a poor initial guess means
that there is little overlap between the particles representing the
initial prior and the likelihood obtained from the measurement
and measurement model. Due to the limited number of particles,
the posterior distribution is often less accurate than that obtained

by methods that rely on an approximate but continuous prior
distribution, such as the Gaussian assumption made by EKF
and MHE (Chen et al., 2004). Methods based on combining
EKF and PF have been suggested (de Freitas et al., 2000), but
may not work well (Chen et al., 2004). More recent methods
based on empirical Bayes techniques perform better (Chen et
al., 2004), but still leave room for improvement (Goel, Lang,
& Bakshi, 2005). The combination of MHE smoothing (Tenny,
2002) with SMC may be a promising approach for overcoming
this challenge.
Even though PF avoids assumptions of Gaussian or fixed
shape distributions and can approximate arbitrary shapes via the
particles, existing methods report point estimates as the mean
of the particles. Many problems where PF is most attractive
have multi-modal distributions, but the particle mean is unable
to capture this feature. The mode or modes may be estimated
from the available particles, but this approach is not likely to be
very accurate due to the discretization and limited number of
particles. Again, careful combination of PF with methods based
on continuous distributions such as MHE may be appropriate
for tracking multiple modes.
2. Linear systems
2.1. Unconstrained
Consider the linear, time-invariant model with Gaussian noise,

F(x, u) = Ax + Bu,    G(x, u) = G,    h(x) = Cx

w ~ N(0, Q),    v ~ N(0, R),    x(0) ~ N(x̄0, Q0)

in which Q, Q0, R > 0. The conditional density can be evaluated exactly for this case. It is convenient to express the conditional density before and after measurement y(k) in a recursion as follows

px|Y(x(k)|Y(k − 1)) = N(x⁻(k), P⁻(k))

px|Y(x(k)|Y(k)) = N(x̂(k), P(k))

The mean and covariance of the conditional density before measurement are given by

x⁻(k + 1) = Ax̂(k) + Bu(k)    (2)

P⁻(k + 1) = AP(k)Aᵀ + GQGᵀ    (3)

x⁻(0) = x̄0,    P⁻(0) = Q0    (4)

The mean and covariance after measurement are given by

x̂(k) = x⁻(k) + L(k)(y(k) − Cx⁻(k))    (5)

L(k) = P⁻(k)Cᵀ(R + CP⁻(k)Cᵀ)⁻¹    (6)

P(k) = P⁻(k) − L(k)CP⁻(k)    (7)

in which L(k) is the filter gain. For the linear case, every density in sight is normal, the mean is equal to the mode for every density, and the issue of which statistical property to use for the point estimate does not arise. Example IV-A illustrates these results

and examines the performance of a particle filter on the same problem.
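The recursion in Eqs. (2)-(7) maps directly onto a few lines of linear algebra. A minimal NumPy sketch of one measurement-update/forecast cycle follows; the function and variable names are ours, and the matrices used to exercise it below are arbitrary illustrative values.

```python
import numpy as np

def kalman_step(xm, Pm, y, u, A, B, C, G, Q, R):
    """One Kalman filter cycle for x(k+1) = Ax + Bu + Gw, y = Cx + v.

    xm, Pm are the prior mean and covariance x^-(k), P^-(k);
    returns the filtered pair and the next prior pair."""
    L = Pm @ C.T @ np.linalg.inv(R + C @ Pm @ C.T)   # filter gain, Eq. (6)
    xhat = xm + L @ (y - C @ xm)                     # update, Eq. (5)
    P = Pm - L @ C @ Pm                              # update, Eq. (7)
    xm_next = A @ xhat + B @ u                       # forecast, Eq. (2)
    Pm_next = A @ P @ A.T + G @ Q @ G.T              # forecast, Eq. (3)
    return xhat, P, xm_next, Pm_next
```

Eq. (7) preserves symmetry and positive semidefiniteness of P only in exact arithmetic, so practical implementations often re-symmetrize the covariance; the sketch omits this.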
2.2. Constrained
Constraints on w(k), v(k), and x(k) can be added to the problem formulation to refine the statistical description of the model. The statistical distributions of the noises are then truncated normals.
One may also use constraints to generate asymmetric distributions by piecing together truncated probability density functions
as a jigsaw using variable decompositions (Robertson & Lee,
1995, 2002). The conditional densities are also truncated normals (Robertson & Lee, 2002). The modes of the conditional
density can be found by solving a convex quadratic program.
The mean and mode of the conditional density are again equal
and the state estimate is uniquely defined. Because the densities
are not multi-modal and a convex quadratic program (QP) can be
solved to find the state estimate, the solution for the constrained,
linear model is reasonably well in hand. Moreover, Bakshi and
coworkers have recently proposed a PF method for treating constrained systems (Chen et al., 2005).
As discussed in (Rao, 2000; Rao & Rawlings, 2002; Rao,
Rawlings, & Mayne, 2003), care should be exercised when
adding constraints to models used for state estimation. Constraints on w(k) are not problematic. However, we advise against
constraining v(k) due to the possibility of measurement outliers.
These constraints may amplify the effect of spurious measurements. Constraints on x(k) are nonstandard as well. One usually
chooses a model of the plant and, separately, the characteristics
of the disturbances, such as boundedness, or that the disturbances are independent and identically distributed with known
(zero) mean and variance. The properties of the model and disturbances are distinct. State constraints, on the other hand, correlate
the disturbances with the state and may lead to acausality. If one
is trying to enforce physical constraints, such as positivity of
concentrations, for example, an alternative is to use a physically
based nonlinear model that enforces the constraints automatically for all allowable x(0) and disturbance sequences. It remains
unclear if a simplified linear model with added state constraints is
a better choice than an appropriate nonlinear model that implicitly enforces the state constraints. This choice depends also on
how well current state estimation methods can handle the nonlinear model of interest.
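To illustrate the convex QP for the constrained point estimate discussed in Section 2.2: in the special case of a diagonal covariance with nonnegativity constraints, the QP separates by coordinate and its solution is elementwise clipping. The sketch below is our simplification of that special case; general covariances require an actual QP solver.

```python
import numpy as np

def truncated_normal_mode(mu, lower=0.0):
    """Mode of N(mu, diag(sigma^2)) restricted to x >= lower.

    This is the solution of the convex QP
        min (x - mu)' diag(sigma^2)^{-1} (x - mu)  s.t.  x >= lower.
    With a diagonal covariance the QP separates by coordinate, so each
    coordinate is minimized independently and the variances drop out:
    the constrained mode is simply the clipped unconstrained mode."""
    return np.maximum(np.asarray(mu, dtype=float), lower)
```

A brute-force grid search on any single coordinate confirms the clipped value minimizes the one-dimensional objective.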
3. Nonlinear systems

3.1. Extended Kalman filtering

The EKF linearizes the nonlinear system, then applies the Kalman filter to obtain the state estimates. The method can be summarized in a recursion similar in structure to the Kalman filter (Stengel, 1994, pp. 387–388)

x⁻(k + 1) = F(x̂(k), u(k))

P⁻(k + 1) = A(k)P(k)Aᵀ(k) + G(k)QGᵀ(k)

x⁻(0) = x̄0,    P⁻(0) = Q0

The mean and covariance after measurement are given by

x̂(k) = x⁻(k) + L(k)(y(k) − h(x⁻(k)))

L(k) = P⁻(k)Cᵀ(k)(R + C(k)P⁻(k)Cᵀ(k))⁻¹

P(k) = P⁻(k) − L(k)C(k)P⁻(k)

in which the following linearizations are made

A(k) = ∂F(x, u)/∂x,    C(k) = ∂h(x)/∂x

and all partial derivatives are evaluated at x̂(k) and u(k), and G(k) = G(x̂(k), u(k)). The densities of w, v and x0 are assumed
to be normal. Many variations on the same theme have been
proposed such as the iterated EKF and the second-order EKF
(Gelb, 1974, pp. 190–192). Of the nonlinear filtering methods, the EKF method has received the most attention due to
its relative simplicity and demonstrated effectiveness in handling some nonlinear systems. Examples of implementations
include estimation for the production of silicon/germanium alloy
films (Middlebrooks, 2001), polymerization reactions (Prasad,
Schley, Russo, & Bequette, 2002), and fermentation processes
(Gudi, Shah, & Gray, 1994). However, the EKF is at best an
ad hoc solution to a difficult problem, and hence there exist
many pitfalls to the practical implementation of EKFs (see, for
example, Wilson, Agarwal, & Rippin, 1998). These problems
include the inability to accurately incorporate physical state
constraints and the naive use of linearization of the nonlinear
model.
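As an illustration of the EKF recursion of Section 3.1, here is a scalar sketch on a reaction-like model; the model x(k+1) = x − Δt·k·x² (a single second-order reaction step), the step size, and the noise variances are our illustrative choices.

```python
import numpy as np

# Illustrative scalar model: x(k+1) = F(x(k)) + w(k), y(k) = x(k) + v(k),
# with F(x) = x - dt*kr*x**2 (one Euler step of a second-order reaction).
dt, kr = 0.1, 2.0
F = lambda x: x - dt * kr * x ** 2
dFdx = lambda x: 1.0 - 2.0 * dt * kr * x      # linearization A(k)
Q, R = 1e-4, 1e-2                             # noise variances (illustrative)

def ekf_step(xm, Pm, y):
    """One EKF cycle: correct at the prior x^-(k), then forecast the mean
    through the full nonlinear model while propagating the covariance
    through the linearization."""
    L = Pm / (R + Pm)                 # filter gain; C(k) = 1 here
    xhat = xm + L * (y - xm)          # measurement update
    P = Pm - L * Pm
    A = dFdx(xhat)                    # Jacobian at the updated estimate
    return xhat, P, F(xhat), A * P * A + Q
```

A convenient sanity check: with a perfect prior and a noise-free data realization, every innovation is zero and the recursion reproduces the true trajectory exactly.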
Until recently, few properties regarding the stability and convergence of the EKF had been proven. Recent publications
present bounded estimation error and exponential convergence
arguments for the continuous and discrete EKF forms given
detectability, small initial estimation error, small noise terms,
and no model error (Reif, Günther, Yaz, & Unbehauen, 1999,
2000; Reif & Unbehauen, 1999).
However, depending on the system, the bounds on initial estimation error and noise terms may be unrealistic. Also, initial
estimation error may result in bounded estimate error but not
exponential convergence, as illustrated by Chaves and Sontag
(2002).
Julier and Uhlmann (2004) summarize the status of the EKF
as follows:
The extended Kalman filter is probably the most widely used
estimation algorithm for nonlinear systems. However, more
than 35 years of experience in the estimation community has
shown that it is difficult to implement, difficult to tune, and only
reliable for systems that are almost linear on the time scale of
the updates.
We seem to be making a transition from a previous era in
which new approaches to nonlinear filtering were criticized as
overly complex because the EKF works, to a new era in which
researchers are demonstrating ever simpler examples in which
the EKF fails completely. The unscented Kalman filter is one of
the methods developed specifically to overcome the problems
caused by the naive linearization used in the EKF.

J.B. Rawlings, B.R. Bakshi / Computers and Chemical Engineering 30 (2006) 15291541

3.2. Unscented Kalman filtering


The linearization of the nonlinear model at the current state
estimate may not accurately represent the dynamics of the nonlinear system behavior even for one sample time. In the EKF
prediction step, the mean propagates through the full nonlinear
model, but the covariance propagates through the linearization. The resulting error is sufficient to throw off the correction
step and the filter can diverge even with a perfect model. The
unscented Kalman filter avoids this linearization at a single point
by sampling the nonlinear response at several points. The points
are called sigma points, and their locations and weights are chosen to satisfy the given starting mean and covariance (Julier &
Uhlmann, 2004a, 2004b).1 Given x̂ and P, choose sample points, zi, and weights, wi, such that

x̂ = Σi wi zi,    P = Σi wi (zi − x̂)(zi − x̂)ᵀ

Similarly, given w ~ N(0, Qw) and v ~ N(0, Rv), choose sample points ni for w and mi for v. Each of the sigma points is propagated forward at each sample time using the nonlinear system model. The locations and weights of the transformed points then update the mean and covariance.

zi(k + 1) = F(zi(k), u(k)) + G(zi(k), u(k))ni(k),    all i

From these we compute the forecast step

x⁻ = Σi wi zi,    P⁻ = Σi wi (zi − x⁻)(zi − x⁻)ᵀ

After measurement, the EKF correction step is applied after first expressing this step in terms of the covariances of the innovation and state prediction

ηi = h(zi) + mi,    ȳ = Σi wi ηi

The output error is given as Y := y − ȳ. We next rewrite the Kalman filter update as

x̂ = x⁻ + L(y − ȳ)

L = E((x − x⁻)Yᵀ)[E(YYᵀ)]⁻¹    (= P⁻Cᵀ(R + CP⁻Cᵀ)⁻¹ in the linear case)

P = P⁻ − L E((x − x⁻)Yᵀ)ᵀ    (= P⁻ − LCP⁻ in the linear case)

in which we approximate the two expectations with the sigma point samples

E((x − x⁻)Yᵀ) ≈ Σi wi (zi − x⁻)(ηi − ȳ)ᵀ,    E(YYᵀ) ≈ Σi wi (ηi − ȳ)(ηi − ȳ)ᵀ

1 Note that this idea is fundamentally different than the idea of particle filtering,
which is discussed subsequently. The sigma points are chosen deterministically,
for example as points on a selected covariance contour ellipse or a simplex. The
particle filtering points are chosen by random sampling.

See Julier and Uhlmann (2004a), Julier, Uhlmann, and
Durrant-Whyte (2000), van der Merwe, Doucet, de Freitas, and
Wan (2000) for more details on the algorithm. An added benefit
of the UKF approach is that the partial derivatives ∂F(x, u)/∂x,
∂h(x)/∂x are not required. See also Nørgaard, Poulsen, and Ravn
(2000) for other derivative-free nonlinear filters of comparable
accuracy to the UKF. See (Julier & Uhlmann, 2002; Lefebvre,
Bruyninckx, & De Schutter, 2002) for an interpretation of the
UKF as a use of statistical linear regression.
The UKF has been tested in a variety of simulation examples
taken from different application fields including aircraft attitude
estimation, tracking and ballistics, and communication systems.
In the chemical process control field, Romanenko and coworkers have compared the EKF and UKF on a strongly nonlinear
exothermic chemical CSTR (Romanenko & Castro, 2004), and
a pH system (Romanenko, Santos, & Afonso, 2004). The CSTR
has nonlinear dynamics and a linear measurement model, i.e. a
subset of states is measured. In this case, the UKF performs significantly better than the EKF when the process noise is large.
The pH system has linear dynamics but a strongly nonlinear
measurement, i.e. the pH measurement. In this case, the authors
show a modest improvement in the UKF over the EKF.
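The mean- and covariance-matching conditions that define the sigma points admit several valid constructions. The sketch below uses the simplest symmetric, equally weighted set of 2n points; this is an illustrative choice on our part, not the specific set of Julier and Uhlmann.

```python
import numpy as np

def sigma_points(xbar, P):
    """Symmetric sigma-point set: 2n points z_i = xbar +/- the columns of
    chol(n*P), each with weight 1/(2n). By construction the weighted
    sample mean is xbar and the weighted sample covariance is P."""
    n = len(xbar)
    S = np.linalg.cholesky(n * P)          # S @ S.T = n * P
    pts = np.vstack([xbar + S[:, i] for i in range(n)] +
                    [xbar - S[:, i] for i in range(n)])
    w = np.full(2 * n, 1.0 / (2 * n))
    return pts, w

def unscented_transform(f, xbar, P):
    """Propagate the sigma points through the nonlinearity f, then recover
    the transformed mean and covariance from the weighted samples."""
    pts, w = sigma_points(xbar, P)
    fp = np.array([f(z) for z in pts])
    mean = w @ fp
    dev = fp - mean
    return mean, (w[:, None] * dev).T @ dev
```

For a linear map the transform is exact, which gives a direct check on the construction: the transformed mean is A x̄ + b and the transformed covariance is A P Aᵀ.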
3.3. Full information estimation
Because the conditional density px|Y (x(T)|Y(T)) is difficult to
obtain exactly for nonlinear models, we also focus our attention
on the entire trajectory of states X(T) := {x(0), . . ., x(T)}, rather
than just the last state x(T). For simplicity of presentation, we
assume G = I in Eq. (1).
In the full information problem, Y(T) is available and our goal is to find the maximum likelihood estimate by solving

max_{X(T)} pX|Y(X(T)|Y(T))    (8)

From Bayes theorem we have

pX|Y(X(T)|Y(T)) = pY|X(Y(T)|X(T)) pX(X(T)) / pY(Y(T))

Because the sequences w(k) and v(k) are independent, we can express the terms in the numerator as

pY|X(Y(T)|X(T)) = Π_{j=0}^{T} py|x(y(j)|x(j))

pX(X(T)) = Π_{j=0}^{T−1} px|x(x(j + 1)|x(j)) px(0)(x(0))

We also have

py|x(y(j)|x(j)) = pv(y(j) − h(x(j)))

px|x(x(j + 1)|x(j)) = pw(x(j + 1) − F(x(j), u(j)))

Substituting these results into Eq. (8), and noting that pY(Y(T)) does not depend on the decision variables X(T), the maximum likelihood optimization is

max_{X(T)} px(0)(x(0)) Π_{j=0}^{T−1} pw(x(j + 1) − F(x(j), u(j))) Π_{j=0}^{T} pv(y(j) − h(x(j)))

We can write this as an equivalent minimization problem by taking the negative logarithm to yield

min_{X(T)} V0(x(0)) + Σ_{j=0}^{T−1} Lw(w(j)) + Σ_{j=0}^{T} Lv(y(j) − h(x(j)))    (9)

subject to x(j + 1) = F(x(j), u(j)) + w(j), in which

V0(x) := −log(px(0)(x)),    Lw(w) := −log(pw(w)),    Lv(v) := −log(pv(v))

If the three densities above are chosen as normals, then we obtain a nonlinear least-squares problem (nonlinear because of the model constraint), but any given densities are allowed. We often default to using normals in applications solely because of lack of knowledge about the densities of the initial state and the disturbances. This issue is discussed briefly in Section 3.6.
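For a linear model with normal densities, Eq. (9) is an ordinary least-squares problem, and the final state of its solution coincides with the Kalman filter estimate (for jointly Gaussian variables the trajectory mode equals the trajectory mean, whose last block is the filtered mean). The scalar sketch below verifies this equivalence numerically; all system values are illustrative.

```python
import numpy as np

# Scalar linear system: x(k+1) = a x(k) + w(k), y(k) = x(k) + v(k), G = I.
a, Q, R, Q0, x0bar = 0.9, 0.1, 0.5, 1.0, 0.0
rng = np.random.default_rng(1)
T = 20
ys = []
x = x0bar + rng.normal(0.0, np.sqrt(Q0))
for k in range(T + 1):
    ys.append(x + rng.normal(0.0, np.sqrt(R)))
    x = a * x + rng.normal(0.0, np.sqrt(Q))

# Kalman filter, Eqs. (2)-(7), keeping the final filtered estimate xhat(T).
xm, Pm = x0bar, Q0
for y in ys:
    L = Pm / (R + Pm)
    xhat = xm + L * (y - xm)
    P = Pm - L * Pm
    xm, Pm = a * xhat, a * P * a + Q

# Full information, Eq. (9): stack the weighted residuals of the prior,
# the model (w terms), and the measurements (v terms), then solve one
# linear least-squares problem over the whole trajectory x(0), ..., x(T).
n = T + 1
rows, rhs = [], []
r = np.zeros(n); r[0] = 1.0 / np.sqrt(Q0)
rows.append(r); rhs.append(x0bar / np.sqrt(Q0))
for j in range(T):
    r = np.zeros(n); r[j + 1] = 1.0 / np.sqrt(Q); r[j] = -a / np.sqrt(Q)
    rows.append(r); rhs.append(0.0)
for j in range(n):
    r = np.zeros(n); r[j] = 1.0 / np.sqrt(R)
    rows.append(r); rhs.append(ys[j] / np.sqrt(R))
X = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)[0]
```

The last element of the maximum likelihood trajectory, X[-1], matches the Kalman filter estimate of x(T) to numerical precision.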
3.4. Moving horizon estimation
The computational burden of solving the full information estimator, Eq. (9), grows as measurements become available. Moving horizon estimation (MHE) fixes this computational cost by considering a finite horizon of only the last N measurements. Let X(T − N: T) := {x(T − N), . . ., x(T)} denote the most recent N values of the state sequence at current time T. The MHE state estimation problem is

min_{X(T−N:T)} V_{T−N}(x(T − N)) + Σ_{j=T−N}^{T−1} Lw(w(j)) + Σ_{j=T−N}^{T} Lv(y(j) − h(x(j)))    (10)

subject to x(j + 1) = F(x(j), u(j)) + w(j), in which V_{T−N} is called the arrival cost. The arrival cost represents the information in the prior measurement sequence Y(T − N − 1) that is not considered in the horizon at time T. The statistically correct choice for the arrival cost is the conditional density of x(T − N)|Y(T − N − 1)

V_{T−N}(x) = −log px(T−N)|Y(x|Y(T − N − 1))

In the linear, Gaussian case, this conditional density is simply N(x⁻(T − N), P⁻(T − N)), and MHE reduces exactly to the Kalman filter. In the nonlinear case, this density is not available (otherwise we could simply solve the full information problem). We therefore consider various methods for approximating it. One option is to use a linearization approximation as in the EKF. Note that with a reasonably large value of N, MHE based on the EKF arrival cost is not the same as the EKF. Numerous simulation examples show that MHE's fitting of the data in the horizon using the full nonlinear model in the state equation provides robustness to poor priors that is not achievable with the EKF (Haseltine & Rawlings, 2005).
The full information or MHE optimization approaches have
long been used in the process control community as methods of state estimation, data reconciliation, and fault detection for nonlinear models: Joseph, Edgar, Bequette, Biegler,
Marquardt, Doyle and coworkers have proposed variations of
this general approach (Albuquerque & Biegler, 1996; Bequette,
1991; Binder, Blank, Dahmen, & Marquardt, 2002; Gatzke &
Doyle, 2002; Kim, Liebman, & Edgar, 1991; Liebman, Edgar,
& Lasdon, 1992; Mhamdi, Helbig, Abel, & Marquardt, 1996;
Ramamurthi, Sistu, & Bequette, 1993; Tjoa & Biegler, 1991).
These approaches have also been used for designing nonlinear
observers (Michalska & Mayne, 1995; Moraal & Grizzle, 1995;
Zimmer, 1994).
The stability of MHE has been studied for linear and nonlinear models (Meadows, Muske, & Rawlings, 1993; Muske & Rawlings, 1995). Statistical properties of constrained linear and nonlinear MHE have been studied by Robertson and Lee (Robertson, Lee, & Rawlings, 1996; Robertson & Lee, 2002). Tyler and Morari have examined the feasibility issue for constrained MHE for linear models
(Tyler & Morari, 1996). Rao et al. have worked on the stability of linear and nonlinear, constrained MHE (Michalska
& Mayne, 1995; Rao & Rawlings, 2000; Rao, Rawlings,
& Lee, 2001; Rao et al., 2003). Ferrari-Trecate et al. have
recently extended MHE for use with hybrid systems (Ferrari-Trecate, Mignone, & Morari, 2002). Goodwin and coworkers have shown a nice duality between constrained estimation and control (Goodwin, De Doná, Seron, & Zhuo, 2005)
and also designed MHE for distributed, networked systems
(Goodwin, Haimovich, Quevedo, & Welsh, 2005). Advances
in numerical optimization have made it possible to solve the
MHE optimization in real time for small dimensional nonlinear models (Tenny & Rawlings, 2002). But computational
complexity remains a significant research challenge for MHE
researchers.
The best choice of arrival cost remains an open issue in
MHE research. Rao et al. (2001) explore estimating this cost
for constrained linear systems with the corresponding cost for
an unconstrained linear system. More specifically, the following
two schemes are examined:
(1) a filtering scheme that penalizes deviations of the initial
estimate in the horizon from a prior estimate, and
(2) a smoothing scheme that penalizes deviations of the trajectory of states in the estimation horizon from a prior
estimate.
For unconstrained, linear systems, the MHE optimization collapses to the Kalman filter for both of these schemes. Rao (2000)
further considers several optimal and suboptimal approaches for
estimating the arrival cost via a series of optimizations. These


approaches stem from the property that, in a deterministic setting (no state or measurement noise), MHE is an asymptotically
stable observer as long as the arrival cost is underbounded.
One simple way of estimating the arrival cost, therefore, is to
implement a uniform prior. Computationally, a uniform prior
corresponds to not penalizing deviations of the initial state from
the prior estimate.
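With a uniform prior, the arrival cost in Eq. (10) is constant and the problem reduces to a least-squares fit of the model and measurement residuals over the window. A scalar linear sketch of one such window solve follows; the model values and function name are our illustrative choices.

```python
import numpy as np

# Scalar linear model: x(k+1) = a x(k) + w(k), y(k) = x(k) + v(k).
a, Q, R = 0.9, 0.1, 0.5

def mhe_uniform_prior(y_window):
    """MHE over one horizon with a uniform prior: V_{T-N} is constant, so
    Eq. (10) becomes a linear least-squares problem in the window states
    x(T-N), ..., x(T), with no penalty on the initial state."""
    N = len(y_window)
    rows, rhs = [], []
    for j in range(N - 1):                 # process-noise terms L_w
        r = np.zeros(N)
        r[j] = -a / np.sqrt(Q)
        r[j + 1] = 1.0 / np.sqrt(Q)
        rows.append(r); rhs.append(0.0)
    for j in range(N):                     # measurement terms L_v
        r = np.zeros(N)
        r[j] = 1.0 / np.sqrt(R)
        rows.append(r); rhs.append(y_window[j] / np.sqrt(R))
    return np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)[0]
```

On noise-free data the true trajectory makes every residual zero, so the window fit recovers the true states exactly; this makes a simple correctness check.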
For nonlinear systems, Tenny and Rawlings (2002) estimate
the arrival cost by approximating the constrained, nonlinear
system as an unconstrained, linear time-varying system and
applying the corresponding filtering and smoothing schemes.
They conclude that the smoothing scheme is superior to the
filtering scheme because the filtering scheme induces oscillations in the state estimates due to unnecessary propagation of
initial error. The assumption here is that the conditional density is well approximated by a multivariate normal. The problem with this assumption, of course, is that nonlinear systems
may exhibit a multi-modal conditional density. Haseltine and
Rawlings (2002) demonstrate that approximating the arrival cost
with the smoothing scheme in the presence of multiple local
optima may skew all future estimates. If global optimization is
implementable in real time, approximating the arrival cost with
a uniform prior and making the estimation horizon reasonably
long is preferable to an approximate multivariate normal arrival
cost.
3.5. Particle filtering

Unlike most other nonlinear filtering methods, including those described earlier, particle filtering does not assume a fixed shape of any density, but approximates the densities of interest via samples or particles

p(x(t)) ≈ Σ_{i=1}^{np} qi(t) δ(x(t) − xi(t))

in which np is the number of particles or samples in the approximation, xi is the sample location and qi is the sample weight. Thus PF can capture the time-varying nature of distributions commonly encountered in nonlinear dynamic problems, and any moment can be calculated from the samples. Furthermore, this sampling-based approach can solve the estimation problem in a recursive manner without resorting to model approximation. The posterior at time T may be written recursively based on prior knowledge of the system, px|Y(x(T)|Y(T − 1)), and the current information of the process, py|x(y(T)|x(T)), or likelihood

px|Y(x(T)|Y(T)) ∝ py|x(y(T)|x(T)) px|Y(x(T)|Y(T − 1))    (11)

The two terms on the right-hand side of Eq. (11) may be further manipulated as follows.

px|Y(x(T)|Y(T − 1)) = ∫ px|x(x(T)|x(T − 1)) px|Y(x(T − 1)|Y(T − 1)) dx(T − 1)    (12)

in which px|Y(x(T − 1)|Y(T − 1)) is the posterior at time step T − 1. The distribution px|x(x(T)|x(T − 1)) can be found based on the available state equation, Eq. (1), as

px|x(x(T)|x(T − 1)) = ∫ δ(x(T) − f(x(T − 1), w(T − 1))) pw(w(T − 1)) dw(T − 1)    (13)

in which we use the notation f(x, w) := F(x, u) + G(x, u)w. Likewise the likelihood distribution can be expressed based on the measurement equation, Eq. (1), as follows

py|x(y(T)|x(T)) = ∫ δ(y(T) − h(x(T), v(T))) pv(v(T)) dv(T)    (14)

Using Monte Carlo sampling for solving the dynamic estimation problem requires an approach for generating samples from the posterior at each time point, while incorporating the state and measurement equations, Eq. (1), and available measurements, Y(T). This may be accomplished via sequential Monte Carlo sampling with Eqs. (11)–(14), followed by using the following equation for calculating the posterior moments.

E[f(x)] = ∫ f(x) p(x) dx ≈ (1/N) Σ_{i=1}^{N} f(xi)    (15)

Eq. (15) requires samples from the posterior, which are often difficult to obtain since the posterior may have unusual shapes and may lack a convenient closed-form representation. Consequently, it is common to write Eq. (15) as

E[f(x)] = ∫ f(x) [p(x)/π(x)] π(x) dx ≈ (1/N) Σ_{i=1}^{N} f(xi) qi    (16)

in which

qi = p(xi)/π(xi)    (17)

is the weight function and {xi} are samples drawn from the importance function, π(x). This formulation permits convenient sampling from a known distribution, π(x), and relaxes the need to draw samples from the true posterior distribution, p(x). Also, any pair of samples (particles) and weights, {xi, qi}, contains information about the relevant distribution. A basic requirement of the importance function is that its support should include the support of the true distribution (Geweke, 1989). Moreover, having f(xi)qi roughly equal for all particles ensures precise estimates.
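Eqs. (15)–(17) can be sketched directly: draw samples from a convenient importance density π(x) and reweight them by p(x)/π(x). The Gaussian target and proposal below are illustrative assumptions chosen so the answer is easy to check; in the filtering problem, p(x) would be the posterior.

```python
# Importance sampling sketch for Eqs. (15)-(17): estimate E[x] under an
# assumed target p(x) = N(1, 1) using samples from a wider importance
# density pi(x) = N(0, 2^2), whose support covers that of the target.
import math
import random

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

random.seed(0)
n = 20000
target = lambda x: normal_pdf(x, 1.0, 1.0)      # p(x), standing in for the posterior
proposal_sigma = 2.0                            # pi(x) = N(0, 2^2)

xs = [random.gauss(0.0, proposal_sigma) for _ in range(n)]
qs = [target(x) / normal_pdf(x, 0.0, proposal_sigma) for x in xs]   # Eq. (17)
qsum = sum(qs)
mean = sum(q * x for q, x in zip(qs, xs)) / qsum   # self-normalized Eq. (16), f(x) = x
```

Because the weights are self-normalized, the target density need only be known up to a constant, which is exactly the situation in the proportionality of Eq. (11).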
A computationally efficient and recursive solution to Eqs. (12)–(14) is provided by sequential Monte Carlo (SMC) sampling. The recursive approach of SMC is depicted graphically in Fig. 2. Information available at time T − 1 includes the particles and weights, (xi(T − 1), qi(T − 1)), which represent the posterior at time T − 1, px|Y(x(T − 1)|Y(T − 1)). Bayes' rule is applied recursively by passing each sample through the state equation, Eq. (1), to obtain samples corresponding to the prior at time T, px|Y(x(T)|Y(T − 1)). This prediction step utilizes information about process dynamics and model accuracy without making any assumptions about the nature of the dynamics or about the shape or any other characteristic of the distributions.

Fig. 2. General approach of sequential Monte Carlo sampling.

Once the measurement y(T) is available, it can be used to recursively update the previous weights by the following equation (Arulampalam, Maskell, Gordon, & Clapp, 2002).

q̃i(T) ∝ qi(T − 1) p(y(T)|xi(T)) p(xi(T)|xi(T − 1)) / π(xi(T)|xi(T − 1), y(T))    (18)

This correction step utilizes the measurement model and information about the measurement error. Again, no assumptions about the type of model or distributions are required. The result of these prediction and correction steps is the particles and weights at time T, {xi(T), qi(T)}, where the qi(T) are obtained by normalizing the q̃i(T). Any moment may then be calculated via Eq. (16).

The benefit of using particles (samples) to approximate the probability density function does create some new practical challenges that need to be addressed. Application of the SMC steps described in this subsection often results in increasing variance of the weights due to particles with small weights. This phenomenon of degeneracy reduces the accuracy of importance sampling. It may be avoided by using more accurate importance functions or by resampling the particles to equalize their weights and removing those with very small weights. Resampling is the charmingly simple idea depicted in Fig. 3. Given np samples having state values ai and weights qi, i = 1, ..., np, we choose np new samples by uniformly sampling the interval [0, 1].

Fig. 3. Interval [0,1] partitioned by original sample weights, qi. The arrows depict the outcome of drawing three uniformly distributed random numbers. For the case depicted here, the new samples are ã1 = a1, ã2 = a3, ã3 = a3 because the first arrow falls into the first interval and the other two arrows both fall into the third interval. Sample a2 is discarded and sample a3 is repeated twice in the resample. The new samples' weights are simply q̃1 = q̃2 = q̃3 = 1/3.

The properties of the resamples using this procedure are therefore summarized by

pa(ãi) = { qj, ãi = aj;  0, otherwise },    q̃i = 1/np, all i

The probability densities associated with the original and resampled systems are

px(x) = Σ_{i=1}^{np} qi δ(x − ai),    p̃x(x) = Σ_{i=1}^{np} q̃i δ(x − ãi)

The resampled density is clearly not the same as the original sampled density. It is likely that we have moved many of the new samples to places where the original density has large values. But by resampling in the fashion described here, we have not introduced bias into the estimates (Gelfand & Smith, 1990).

Degeneracy may also appear due to little overlap between the prior and likelihood, which may be due to a poor initial guess or large unmodeled changes in the system. Methods for addressing these challenges include the hybrid use of particle filtering with EKF or empirical Bayes methods, as described in more detail and illustrated by Chen et al. (2004) and Lang, Goel, and Bakshi (2006).

The resulting algorithm is fully recursive and computationally efficient since the sampling-based approach avoids integration for obtaining the moments at each time step. The recursive nature implies that solving a nonlinear optimization problem in a moving window, or approximating the prior by the type of methods necessary for MHE, is not required. Furthermore, SMC does not rely on restrictive assumptions about the nature of the error or prior distributions and models, making it broadly applicable.
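The prediction, correction, and resampling steps above can be sketched for a scalar model, using the prior as the importance function so that the weight update of Eq. (18) reduces to the measurement likelihood, and implementing resampling by partitioning [0, 1] with the cumulative weights as in Fig. 3. The model, noise levels, and measurement value below are illustrative assumptions.

```python
# One bootstrap SMC step, sketched for an assumed scalar model
# x+ = a*x + w, y = x + v: predict, correct, resample.
import bisect
import math
import random

random.seed(1)
a, q_std, r_std = 0.9, 0.3, 0.1     # assumed dynamics and noise levels
y_meas = 1.0                        # assumed current measurement
particles = [random.gauss(0.0, 1.0) for _ in range(500)]

# Prediction: pass each sample through the state equation with process noise
particles = [a * x + random.gauss(0.0, q_std) for x in particles]

# Correction: with the prior as importance function, the weight update of
# Eq. (18) reduces to the likelihood p(y(T) | xi(T))
weights = [math.exp(-0.5 * ((y_meas - x) / r_std) ** 2) for x in particles]
total = sum(weights)
weights = [w / total for w in weights]

# Resampling (Fig. 3): partition [0,1] by the cumulative weights, then map
# uniform draws through that partition
cum, s = [], 0.0
for w in weights:
    s += w
    cum.append(s)
resampled = []
for _ in particles:
    idx = min(bisect.bisect_left(cum, random.random()), len(particles) - 1)
    resampled.append(particles[idx])
weights = [1.0 / len(resampled)] * len(resampled)  # equal weights after resampling
```

With the sharp likelihood assumed here, most uniform draws fall in the cumulative-weight intervals of particles near the measurement, so the resample concentrates there, exactly the behavior discussed for Fig. 8 below.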
3.6. Estimating covariances from data
All of the techniques described in this review depend on
knowing densities of the disturbances to the process and
measurement, pw (w) and pv (v). In process control applications, these are never known and must be obtained from
operating data. This need has been addressed by numerous researchers in control and identification starting with
the classic approaches of Mehra and Belanger (Belanger,
1974; Mehra, 1970). Obtaining better disturbance statistics
from data remains a topic of current research (Valappil &
Georgakis, 2000). Odelson and coworkers provide a recent
review of the classical methods and suggest some new
improvements (Odelson, Lutz, & Rawlings, 2006; Odelson,
Rajamani, & Rawlings, 2006; Rajamani, Rawlings, & Qin,
2006).
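The classical approaches referenced above start from the observation that, for correctly specified noise covariances, the filter innovations are white. A minimal sketch of that whiteness check follows; the innovation sequence is synthetic white noise rather than the output of a real filter, so the numbers are purely illustrative.

```python
# Whiteness check on an innovation sequence: for well-tuned noise
# covariances, the sample autocovariance should be near zero at nonzero
# lags. The "innovations" here are simulated white noise (an assumption).
import random

def autocovariance(e, lag):
    n = len(e)
    mean = sum(e) / n
    return sum((e[k] - mean) * (e[k + lag] - mean) for k in range(n - lag)) / (n - lag)

random.seed(2)
innovations = [random.gauss(0.0, 0.5) for _ in range(5000)]
c0 = autocovariance(innovations, 0)   # near the innovation variance, 0.25
c1 = autocovariance(innovations, 1)   # near zero for white innovations
```

A significantly nonzero c1 relative to c0 signals misspecified Q or R, which is the starting point the autocovariance-based identification methods exploit.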
We next present two tutorial examples to illustrate some of
the issues discussed in this review.


Fig. 4. Conditional density of state vs. time, before (P⁻(k)) and after (P(k)) measurement y(k) = x1(k). Analytical solution from Kalman filtering.

Fig. 5. Particle locations vs. time; 500 particles.

4. Examples
4.1. Linear system and estimating conditional density
Consider the linear dynamic system
x(k + 1) = Ax(k) + Bu(k) + Gw(k)
y(k) = Cx(k) + v(k)
in which w and v are zero mean, normally distributed with covariances Q and R, and the initial state is distributed as x(0) ~ N(x̄(0), Q0). We choose the model parameters as follows

A = [0.7  0.3; 0.3  0.7],  B = I2,  C = [1  0],  G = I2
Q0 = [1.75  1.25; 1.25  1.75],  Q = 0.1 I2,  R = 0.01

Notice we are measuring only the first state. The system is observable and we can reconstruct the second state from the measurements. We examine the state evolution and the conditional density until k = 2 with the following input and measurement sequences.

x̄(0) = [1, 1]ᵀ,  u(0) = [7, 2]ᵀ,  u(1) = [7, 1]ᵀ,  u(2) = [7, 1]ᵀ
y(0) = 3,  y(1) = 9,  y(2) = 13

Fig. 4 shows the state conditional probability density before and after measurement for three samples. These densities are computed from the standard Kalman filter formulas given in Eqs. (4)–(7). Ellipses containing 95% probability are drawn in all figures to illustrate the conditional densities. We see that because R is small compared to Q and Q0, the measurement at each sample noticeably tightens the conditional density.

Fig. 5 shows the evolution of the particle locations when running a particle filter with 500 samples. Fig. 6 shows the mean and 95% probability ellipses calculated from these samples with the particles removed for clarity. Notice that 500 particles are not adequate to accurately track the conditional density covariance for even two sample times into the future. Both P⁻(2) and P(2) in Fig. 6 show significant deviation from the correct result shown in Fig. 4.

Fig. 6. Particle filtering approximation to conditional density of state vs. time; 500 particles.

Fig. 7 shows the results of particle filtering with 5000 samples. The particle filter with 5000 samples provides considerably more accurate conditional densities, and the conditional densities depicted in Fig. 7 are much closer to those in Fig. 4. Increasing the number of samples to capture the covariance of the conditional density has obvious limitations as we increase the state dimension. Current industrial applications have on the order of hundreds of states compared to this two-state tutorial example.

Fig. 7. Particle filtering approximation to conditional density of state vs. time; 5000 particles.
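The tightening visible in Fig. 4 can be reproduced from the covariance part of the Kalman filter measurement update, P = P⁻ − P⁻Cᵀ(CP⁻Cᵀ + R)⁻¹CP⁻. Because C = [1 0], the innovation variance is scalar and the update can be written out by hand for this example; the sketch below uses the Q0 and R given above as the prior covariance at the first measurement.

```python
# Covariance measurement update for the two-state example. With C = [1 0]
# the innovation variance S = C P C' + R is the (1,1) element of P plus R,
# and P C' is the first column of P, so no matrix library is needed.
P_prior = [[1.75, 1.25], [1.25, 1.75]]   # Q0, the prior covariance
R = 0.01

S = P_prior[0][0] + R                     # scalar innovation variance
col = [P_prior[0][0], P_prior[1][0]]      # P C', the first column of P
P_post = [[P_prior[i][j] - col[i] * col[j] / S for j in range(2)]
          for i in range(2)]
```

The variance of the measured state collapses from 1.75 to about 0.01, while the variance of the unmeasured state drops only to about 0.86, which is the tightening of the 95% ellipses seen in the figure.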


Fig. 8. Resampled particle locations vs. time; 500 particles.

This example also illustrates the previously discussed need


for resampling. As shown in Fig. 5 the samples have not stayed
in the regions of high conditional density as measurements are
collected. After measurement, the weights become small for the
many particles having x1 (k) far away from measurement y(k).
To increase the efficiency and accuracy of the particle filter, we
need to increase the number of samples in the region of high conditional density. Rather than simply increasing the total number
of samples, we can use resampling. Fig. 8 shows the outcome
of using 500 particles with resampling after each measurement.
Notice the small number of marked samples indicates that many of the samples are replicated after resampling (see also Fig. 3). By focusing the samples in the region of high conditional density after the first measurement, P⁻(1), P(1) and P⁻(2) are computed
more accurately. But we also clearly see the phenomenon of
impoverishment of the samples, which is pronounced because
of the accurate measurement sensor (small R). Note that the
conditional density P(2) after the second measurement has collapsed to zero because all 500 samples have moved to only two
distinct values. As discussed previously, the remedy here is to
modify the resampling process to maintain a larger set of distinct
samples.
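One standard diagnostic for this degeneracy and impoverishment, not used in the text but common in the particle-filtering literature, is the effective sample size, Neff = 1/Σ qi² for normalized weights: it equals np for uniform weights and approaches 1 as a single particle dominates.

```python
# Effective sample size as a degeneracy diagnostic. The weight vectors
# below are illustrative: one healthy (uniform) and one degenerate.
def effective_sample_size(weights):
    """Neff = 1 / sum(q_i^2) for normalized weights q_i."""
    return 1.0 / sum(w * w for w in weights)

uniform = [1.0 / 500] * 500               # every particle contributes
ess_uniform = effective_sample_size(uniform)

skewed = [0.9] + [0.1 / 499] * 499        # one particle carries 90% of the mass
ess_skewed = effective_sample_size(skewed)
```

A common rule of thumb is to trigger resampling when Neff falls below some fraction of np, rather than resampling at every measurement.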

Fig. 9. Stochastic evolution of states and the particle filter mean estimates in the batch reactor.

4.2. Nonlinear system and multi-modal conditional density

Consider the following gas-phase, reversible reaction:

2A ⇌ B,    k = 0.16

Let PA and PB denote the partial pressures of A and B. If the state is x = [PA, PB]ᵀ then the model for a well-mixed ideal gas in an isothermal batch reactor can be written as:

ẋ = f(x) = [−2kPA², kPA²]ᵀ,    y = [1  1]x,    x(0) = [3, 1]ᵀ

The total pressure is measured. The state and the measurements are corrupted by Gaussian noises with covariances Q = diag(0.001², 0.001²) and R = 0.1², respectively. The discretization time is Δt = t(k + 1) − t(k) = 0.1.

The particle filter with SMC sampling was used with 1000 particles for estimating the states evolving as shown in Fig. 9. P̂A and P̂B are the weighted mean estimates of the particles. Resampling is carried out after every measurement. A poor initial guess of the states (for example x̄(0) = [0.1, 4.5]ᵀ with Q0 = Q) leads to divergence of the particle filter. To avoid the divergence a broad initial spread of the particles can be chosen.

Fig. 10 shows the formation of multiple peaks for the probability density p(x(2)|y(0), y(1), y(2)) as tracked by the particles at t = 0.2. However, we can see that the particles are concentrated at a few discrete locations rather than being spread out. This impoverishment is also illustrated in the distribution of particles at time t = 0.8 in Fig. 11. The multiple peaks finally disappear at time t = 1.5 as seen in Fig. 12. The mean estimate using the particle filter does not converge to the actual state, however, as seen by the P̂A and P̂B plots in Fig. 9. More particles, a better initial distribution of particles, or a better importance distribution (we chose the prior here) would give better state estimates using the particle filter.

Fig. 10. Particle locations and frequency at t = 0.2.

Fig. 11. Particle locations and frequency at t = 0.8.

Fig. 12. Particle locations and frequency at t = 1.5.

5. Conclusions and future research

During the 5 years since the CPC 6 meeting, the research


activity in the field of nonlinear and constrained state estimation
has grown tremendously. A simple counting of the references
cited in just this review that appeared after 2000 verifies this
point. So what conclusions can we draw from all of this research activity, and what can we expect to be fruitful avenues for new research over the next decade?
First of all, particle filtering (PF) has clearly emerged as
a powerful tool for solving online state estimation problems
without restrictive assumptions about the dynamics and form
of the conditional density. This emergence has been fueled by
a sound underlying theory, advances in sampling techniques,
straightforward parallelization of the algorithm, and the continued increase in computing power. But the approach is not a
panacea, and its widespread, routine use requires solution of several remaining research challenges. Unless the filter is carefully designed, the curse of dimensionality remains a challenge for particle filters. One cannot obtain accurate results by simply overwhelming
the problem with particle samples. The dimensionality of the
state in industrial applications of interest is too high for this
approach to work well. The nature of the density approximation
as a sum of delta functions makes point evaluation of the density difficult. It remains a research challenge to combine PF with
continuous density methods to find the modes of multi-modal
densities.
Secondly, moving horizon estimation (MHE) has proven to
be the method of choice for constrained, linear systems. Solving a convex QP for even reasonably high-dimensional models
is tractable in real time. Nonlinear models require solution of a
nonconvex optimization in MHE. Improvements in optimization
methods, which have been mainly applied in the model predictive control problem, are obviously applicable to the MHE
problem as well. Again, the problem of multi-modal densities
poses challenges for MHE. The usual trick of using a normal
to approximate the arrival cost does not work well in this case.
It seems local optimization from different starting points near
each mode is required to handle this case. Approximating the
arrival cost as a sum of normals at each mode might work well
if the number of modes is small.
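The multi-start idea can be sketched in one dimension with a two-mode Gaussian mixture; the mixture, the starting points, and the crude derivative-free hill climb below are illustrative assumptions, not a method from the references.

```python
# Local optimization from different starting points on a bimodal density.
# The mixture 0.5*N(0,1) + 0.5*N(4,1) is an assumed stand-in for a
# multi-modal conditional density.
import math

def mixture_density(x):
    def phi(x, mu):  # unit-variance normal kernel
        return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2 * math.pi)
    return 0.5 * phi(x, 0.0) + 0.5 * phi(x, 4.0)

def local_mode(p, x0):
    """Crude derivative-free hill climb; adequate for a smooth 1-D density."""
    x, step = x0, 1.0
    for _ in range(200):
        if p(x + step) > p(x):
            x += step
        elif p(x - step) > p(x):
            x -= step
        else:
            step *= 0.5
    return x

# One start near each suspected mode locates both local maxima
modes = sorted(local_mode(mixture_density, x0) for x0 in (-1.0, 5.0))
```

Each start converges to the nearest local maximum, so with one start per suspected mode the full set of modes is recovered, which is the behavior an arrival cost built from a sum of normals at each mode would need.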
Multi-modal densities are not unusual, especially if we
consider physical models of chemical processes. But if we
restrict attention to these kinds of models, the number of modes
is usually small, and often two. Two modes arise frequently
because, when the prior is poor, the measurements increase
the density near the true state, which is far from the prior. In
linear problems the density remains normal and simply inflates
as measurements are taken, and then shrinks around the correct
state as the prior is discounted. In nonlinear problems, a second
mode appears as the measurements are taken, and then the
mode near the prior simply disappears. New state estimation
methods that can routinely detect and track emergence of a
small number of modes in a high-dimensional state space would
seem ideal for handling this issue in chemical process control
applications. To address this challenge, a combination of PF
and MHE may permit using the power of PF for representing


the general, multi-modal densities, and the power of MHE for


accurately tracking the locations of the modes.
Acknowledgments
The authors would like to thank M. Rajamani, E.L. Haseltine, and D.Q. Mayne for helpful discussion of the ideas in this
paper. The first author acknowledges financial support from NSF through grant #CNS-0540147 and PRF through grant #43321-AC9. The second author acknowledges financial support from NSF through grant #CTS-0321911.
References
Albuquerque, J., & Biegler, L. T. (1996). Data reconciliation and gross-error
detection for dynamic systems. AIChE Journal, 42(10), 28412856.
Alspach, D. L., & Sorenson, H. W. (1972). Nonlinear Bayesian estimation using
Gaussian sum approximations. IEEE Transactions on Automatic Control,
AC-17(4), 439448.
Arulampalam, M. S., Maskell, S., Gordon, N., & Clapp, T. (2002, February). A tutorial on particle filters for online nonlinear/non-Gaussian
Bayesian tracking. IEEE Transactions on Signal Processing, 50(2), 174
188.
Azimi-Sadjadi, B., & Krishnaprasad, P. S. (2004). A particle filtering approach
to change detection for nonlinear systems, Rensselaer Polytechnic Institute,
Tech. Rep. Submitted to EURASIP Journal on Applied Signal Processing
(EURASIP JASP).
Azimi-Sadjadi, B., & Krishnaprasad, P. (2005). Approximate nonlinear filtering
and its application in navigation. Automatica, 41(6), 945956.
Belanger, P. (1974). Estimation of noise covariance matrices for a linear timevarying stochastic process. Automatica, 10, 267275.
Bequette, B. W. (1991, February). Nonlinear predictive control using multirate sampling. Canadian Journal of Chemical Engineering, 69, 136
143.
Berzuini, C., & Gilks, W. (2001). Resample-move filtering with cross-model
jumps. In A. Doucet, N. de Freitas, & N. Gordon (Eds.), Sequential Monte
Carlo methods in practice (pp. 117138). New York: Springer.
Berzuini, C., & Gilks, W. R. (2003). Particle filtering methods for dynamic
and static Bayesian problems. In P. J. Green, N. L. Hjort, & S. Richardson
(Eds.), Highly structured stochastic systems (pp. 207236). Oxford: Oxford
University Press.
Binder, T., Blank, L., Dahmen, W., & Marquardt, W. (2002). On the regularization of dynamic data reconciliation problems. Journal of Process Control,
12(4), 557567.
Bølviken, E., Acklam, P. J., Christopherson, N., & Størdal, J.-M. (2001, February). Monte Carlo filters for non-linear state estimation. Automatica, 37(2), 177–183.
Chaves, M., & Sontag, E. (2002). State-estimators for chemical reaction networks of Feinberg-Horn-Jackson zero deficiency type. European Journal of
Control, 8(4), 343359.
Chen, W. S., Bakshi, B. R., Goel, P. K., & Ungarala, S. (2004). Bayesian estimation of unconstrained nonlinear dynamic systems via sequential Monte
Carlo sampling. Industrial and Engineering Chemistry Research, 43(14),
40124025.
Chen, W. S., Bakshi, B. R., Goel, P. K., & Ungarala, S. (2005). Bayesian
estimation of constrained nonlinear dynamic systems via sequential Monte
Carlo sampling, Submitted to Automatica.
Chen, W. S., Ungarala, S., Bakshi, B., & Goel, P. (2001). Bayesian rectification
of nonlinear dynamic processes by the weighted bootstrap, in AIChE Annual
Meeting, Reno, Nevada.
Daum, F. (2005, August). Nonlinear filters: Beyond the Kalman filter. IEEE
A&E Systems Magazine, 20(8), 5769, Part 2: Tutorials.
de Freitas, J. F. G., Noranjan, M., Gee, A. H., & Doucet, A. (2000). Sequential
Monte Carlo methods to train neural network models. Neural Computation,
12, 955993.

Doucet, A., Godsill, S., & Andrieu, C. (2000). On sequential Monte Carlo sampling methods for Bayesian filtering. Statistics and Computing, 10, 197208.
Ferrari-Trecate, G., Mignone, D., & Morari, M. (2002). Moving horizon estimation for hybrid systems. IEEE Transactions on Automatic Control, 47(10),
16631676.
Gatzke, E., & Doyle, F. J. (2002). Use of multiple models and qualitative knowledge for on-line moving horizon disturbance estimation and fault diagnosis.
Journal of Process Control, 12(2), 339352.
Gelb, A. (Ed.). (1974). Applied optimal estimation. Cambridge, Massachusetts:
The M.I.T. Press.
Gelfand, A., & Smith, A. (1990). Sampling based approaches to calculating
marginal densities. Journal of the American Statistical Association, 85,
398408.
Geweke, J. (1989, November). Bayesian inference in econometric models using
Monte Carlo integration. Econometrica, 57(6), 13171339.
Goel, P., Lang, L., & Bakshi, B. R. (2005, January). Sequential Monte Carlo
in Bayesian inference for dynamic models: An overview. In Proceedings of
International Workshop/Conference on Bayesian Statistics and its Applications, Co-sponsored by International Society for Bayesian Analysis.
Goodwin, G. C., De Dona, J. A., Seron, M. A., & Zhuo, X. W. (2005).
Lagrangian duality between constrained estimation and control. Automatica, 41, 935944.
Goodwin, G. C., Haimovich, H., Quevedo, D. E., & Welsh, J. S. (2005, September). A moving horizon approach to networked control system design. IEEE
Transactions on Automatic Control, 49(9), 14271445.
Gordon, N., Salmond, D., & Smith, A. (1993, April). Novel approach to
nonlinear/non-Gaussian Bayesian state estimation. IEE Proceedings FRadar and Signal Processing, 140(2), 107113.
Gudi, R., Shah, S., & Gray, M. (1994). Multirate state and parameter estimation
in an antibiotic fermentation with delayed measurements. Biotechnology and
Bioengineering, 44, 12711278.
Handschin, J. E., & Mayne, D. Q. (1969). Monte Carlo techniques to estimate
the conditional expectation in multistage nonlinear filtering. International
Journal of Control, 9(5), 547559.
Haseltine, E. L., & Rawlings, J. B. (2002). A critical evaluation of extended
Kalman filtering and moving horizon estimation, TWMCC, Department
of Chemical Engineering, University of Wisconsin-Madison, Tech. Rep.
200203, August 2002.
Haseltine, E. L., & Rawlings, J. B. (2005, April). Critical evaluation of
extended Kalman filtering and moving horizon estimation. Industrial and
Engineering Chemistry Research, 44(8), 24512460 [Online]. Available:
http://pubs.acs.org/journals/iecred/.
Ho, Y. C., & Lee, R. C. K. (1964). A Bayesian approach to problems in stochastic estimation and control. IEEE Transactions on Automatic Control, 9(5),
333339.
Julier, S., & Uhlmann, J. (2002, August). Authors reply. IEEE Transactions on
Automatic Control, 47(8), 14081409.
Julier, S. J., & Uhlmann, J. K. (2004, March). Unscented filtering and nonlinear
estimation. Proceedings of the IEEE, 92(3), 401422.
Julier, S. J., & Uhlmann, J. K. (2004, December). Corrections to unscented
filtering and nonlinear estimation. Proceedings of the IEEE, 92(12), 1958.
Julier, S. J., Uhlmann, J. K., & Durrant-Whyte, H. F. (2000, March). A
new method for the nonlinear transformation of means and covariances
in filters and estimators. IEEE Transactions on Automatic Control, 45(3),
477482.
Kim, I., Liebman, M., & Edgar, T. (1991). A sequential error-in-variables
method for nonlinear dynamic systems. Computers and Chemical Engineering, 15(9), 663670.
Kong, A., Liu, J. S., & Wong, W. H. (1994, March). Sequential imputations
and Bayesian missing data problems. Journal of the American Statistical
Association, 89(425), 278288.
Lang, L., Goel, P. K., & Bakshi, B. R. (2006, January). A smoothing based
method to improve performance of sequential Monte Carlo estimation under
poor initial guess. In Proceedings of Chemical Process Control 7.
Lefebvre, T., Bruyninckx, H., & De Schutter, J. (2002, August). Comment on
A new method for the nonlinear transformation of means and covariances
in filters and estimators. IEEE Transactions on Automatic Control, 47(8),
14061408.

J.B. Rawlings, B.R. Bakshi / Computers and Chemical Engineering 30 (2006) 15291541
Liebman, M., Edgar, T., & Lasdon, L. (1992). Efficient data reconciliation and estimation for dynamic processes using nonlinear programming techniques. Computers and Chemical Engineering, 16(10/11), 963
986.
Meadows, E. S., Muske, K. R., & Rawlings, J. B. (1993, June). Constrained
state estimation and discontinuous feedback in model predictive control. In
Proceedings of the 1993 European Control Conference (pp. 23082312).
Mehra, R. (1970). On the identification of variances and adaptive Kalman filtering. IEEE Transactions on Automatic Control, 15(12), 175184.
Mhamdi, A., Helbig, A., Abel, O., & Marquardt, W. (1996). Newton-type receding horizon control and state estimation. In Proceedings of the 1996 IFAC
World Congress (pp. 121126).
Michalska, H., & Mayne, D. Q. (1995). Moving horizon observers and observerbased control. IEEE Transactions on Automatic Control, 40(6), 9951006.
Middlebrooks, S. A. (2001). Modelling and control of silicon and germanium thin film chemical vapor deposition, Ph.D. dissertation, University
of Wisconsin-Madison.
Moraal, P. E., & Grizzle, J. W. (1995). Observer design for nonlinear systems
with discrete-time measurements. IEEE Transactions on Automatic Control,
40(3), 395404.
Muske, K. R., & Rawlings, J. B. (1995). Nonlinear moving horizon state estimation. In R. Berber (Ed.), Methods of model based process control (pp.
349365). Dordrecht, The Netherlands: Kluwer, Ser. NATO advanced study
institute series: E Applied Sciences 293.
Nørgaard, M., Poulsen, N. K., & Ravn, O. (2000). New developments in state
estimation for nonlinear systems. Automatica, 36, 16271638.
Odelson, B. J., Lutz, A., & Rawlings, J. B. (2006, May). The autocovariance
least-squares methods for estimating covariances: Application to modelbased control of chemical reactors. IEEE Control Systems Technology, 14(3),
532541.
Odelson, B. J., Rajamani, M. R., & Rawlings, J. B. (2006, February). A
new autocovariance least-squares method for estimating noise covariances. Automatica, 42(2), 303308 [Online]. Available: http://www.
elsevier.com/locate/automatica.
Prasad, V., Schley, M., Russo, L. P., & Bequette, B. W. (2002). Product property
and production rate control of styrene polymerization. Journal of Process
Control, 12(3), 353372.
Rajamani, M. R., Rawlings, J. B., & Qin, S. J. (2006). Equivalence of MPC
disturbance models identified from data. In Proceedings of Chemical Process
Control 7.
Ramamurthi, Y., Sistu, P., & Bequette, B. (1993). Control-relevant dynamic data
reconciliation and parameter estimation. Computers and Chemical Engineering, 17(1), 4159.
Rao, C. V. (2000). Moving horizon strategies for the constrained monitoring and
control of nonlinear discrete-time systems, Ph.D. dissertation, University
of Wisconsin-Madison, 2000.
Rao, C. V., & Rawlings, J. B. (2000). Nonlinear moving horizon estimation. In
F. Allgöwer & A. Zheng (Eds.), Nonlinear model predictive control: Vol. 26,
(pp. 45–69). Basel: Birkhäuser, Ser. Progress in systems and control theory.
Rao, C. V., & Rawlings, J. B. (2002, January). Constrained process monitoring:
Moving-horizon approach. AIChE Journal, 48(1), 97109.
Rao, C. V., Rawlings, J. B., & Lee, J. H. (2001). Constrained linear state estimation a moving horizon approach. Automatica, 37(10), 16191628.
Rao, C. V., Rawlings, J. B., & Mayne, D. Q. (2003, February). Constrained
state estimation for nonlinear discrete-time systems: Stability and moving

1541

horizon approximations. IEEE Transactions on Automatic Control, 48(2),


246258.
Reif, K., Gunther, S., Yaz, E., & Unbehauen, R. (1999, April). Stochastic stability
of the discrete-time extended Kalman filter. IEEE Transactions on Automatic
Control, 44(4), 714728.
Reif, K., Gunther, S., Yaz, E., & Unbehauen, R. (2000, January). Stochastic
stability of the continuous-time extended Kalman filter. In IEE ProceedingsControl Theory and Applications, vol. 147, no. 1 (pp. 4552).
Reif, K., & Unbehauen, R. (1999, August). The extended Kalman filter as an
exponential observer for nonlinear systems. IEEE Transactions on Signal
Processing, 47(8), 23242328.
Robert, C., & Casella, G. (1998). Monte Carlo statistical methods. New York:
Springer.
Robertson, D. G., & Lee, J. H. (1995). A least squares formulation for state
estimation. Journal of Process Control, 5(4), 291299.
Robertson, D. G., & Lee, J. H. (2002). On the use of constraints in least squares
estimation and control. Automatica, 38(7), 11131124.
Robertson, D. G., Lee, J. H., & Rawlings, J. B. (1996, August). A moving
horizon-based approach for least-squares state estimation. AIChE Journal,
42(8), 22092224.
Romanenko, A., & Castro, J. A. (2004, March 15). The unscented filter as an
alternative to the EKF for nonlinear state estimation: A simulation case study.
Computers and Chemical Engineering, 28(3), 347355.
Romanenko, A., Santos, L. O., & Afonso, P. A. F. N. A. (2004). Unscented
Kalman filtering of a simulated pH system. Industrial and Engineering
Chemistry Research, 43, 75317538.
Silverman, B. W. (1986). Density estimation for statistics and data analysis.
New York: Chapman and Hall.
Soroush, M. (1998, December). State and parameter estimations and their applications in process control. Computers and Chemical Engineering, 23(2),
229245.
Spall, J. C. (2003, April). Estimation via Markov chain Monte Carlo. IEEE
Control Systems Magazine, 23(2), 3445.
Stengel, R. F. (1994). Optimal control and estimation. Dover Publications, Inc.
Tenny, M. (2002). Computational strategies for nonlinear model predictive
control, Ph.D. dissertation, University of Wisconsin-Madison.
Tenny, M. J., & Rawlings, J. B. (2002, May). Efficient moving horizon estimation
and nonlinear model predictive control. In Proceedings of the American
Control Conference (pp. 44754480).
Tjoa, I. B., & Biegler, L. T. (1991). Simultaneous strategies for data reconciliation and gross error detection of nonlinear systems. Computers and Chemical
Engineering, 15(10), 679690.
Tyler, M. L., & Morari, M. (1996). Stability of constrained moving horizon
estimation schemes, Preprint AUT96-18, Automatic Control Laboratory,
Swiss Federal Institute of Technology.
Valappil, J., & Georgakis, C. (2000). Systematic estimation of state noise statistics for extended Kalman filters. AIChE Journal, 46(2), 292308.
van der Merwe, R., Doucet, A., de Freitas, N., & Wan, E. (2000, August). The
unscented particle filter, Cambridge University Engineering Department,
Tech. Rep. CUED/F-INFENG/TR 380.
Wilson, D. I., Agarwal, M., & Rippin, D. (1998). Experiences implementing
the extended Kalman filter on an industrial batch reactor. Computers and
Chemical Engineering, 22(11), 16531672.
Zimmer, G. (1994). State observation by on-line minimization. International
Journal of Control, 60(4), 595606.
