
CHEE825/436 - Module 4, J. McLellan - Fall 2005


Process and Disturbance Models
Outline
Types of Models
Model Estimation Methods
Identifying Model Structure
Model Diagnostics
The Task of Dynamic Model Building
partitioning process data into a deterministic component (the process) and a stochastic component (the disturbance)

[Diagram: process data is partitioned into a transfer function model for the process and a time series model for the disturbance]
Process Model Types
non-parametric
impulse response
step response
(technically parametric when in finite form, e.g., FIR)
spectrum
parametric
transfer function models
numerator
denominator
difference equation models
equivalent to transfer function models with backshift operator
Impulse and Step Process Models
described as a set of weights:

$$y(t) = \sum_{i=0}^{N} h(i)\,u(t-i) \qquad \text{(impulse model)}$$

$$y(t) = \sum_{i=0}^{N} s(i)\,\Delta u(t-i) \qquad \text{(step model)}$$

Note - typically treat $\Delta u(t-N)$ as a step from 0, i.e., $\Delta u(t-N) = u(t-N)$
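The finite impulse-response sum above can be sketched directly; the weights below come from an assumed first-order process (gain K = 2, pole 0.8) purely for illustration.

```python
import numpy as np

def fir_response(h, u):
    # y(t) = sum_{i=0}^{N-1} h[i] * u[t-i], with u(t) = 0 for t < 0
    y = np.zeros(len(u))
    for t in range(len(u)):
        for i in range(min(len(h), t + 1)):
            y[t] += h[i] * u[t - i]
    return y

# illustrative first-order weights h(i) = K*(1-a)*a^i, truncated at 30 terms
a, K = 0.8, 2.0
h = K * (1 - a) * a ** np.arange(30)
u = np.ones(50)            # unit step input
y = fir_response(h, u)     # step response climbs toward the gain K
```

With a step input the cumulative sum of the impulse weights reproduces the step-response weights s(i), which is exactly the equivalence the two equations above express.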
Process Spectrum Model
represented as a set of frequency response values, or graphically

[Figure: process spectrum plotted against frequency (rad/s)]
Process Transfer Function Models
numerator, denominator dynamics and time delay

$$G_p(q^{-1}) = \frac{B(q^{-1})\,q^{-(f+1)}}{F(q^{-1})}$$

$F(q^{-1})$ contributes the poles, $B(q^{-1})$ the zeros, and $q^{-(f+1)}$ the time delay: $f$ is the pure time delay, with an extra 1-step delay introduced by the zero-order hold and sampling.

$q^{-1}$ is the backward shift operator: $q^{-1} y(t) = y(t-1)$
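The transfer function above is equivalent to a difference equation, which can be simulated by direct recursion; the polynomial values and delay below are illustrative assumptions.

```python
import numpy as np

def simulate_tf(b, f_poly, delay, u):
    # simulate F(q^-1) y(t) = B(q^-1) q^-delay u(t),
    # with f_poly = [1, f1, ...], b = [b0, b1, ...], delay = f + 1
    y = np.zeros(len(u))
    for t in range(len(u)):
        acc = 0.0
        for i, bi in enumerate(b):
            if t - delay - i >= 0:
                acc += bi * u[t - delay - i]
        for j, fj in enumerate(f_poly[1:], start=1):
            if t - j >= 0:
                acc -= fj * y[t - j]
        y[t] = acc
    return y

# illustrative first-order process with pure delay f = 2, so a total shift of 3
u = np.ones(40)
y = simulate_tf(b=[0.3], f_poly=[1.0, -0.7], delay=3, u=u)
# y stays at zero during the delay, then rises toward the gain 0.3/(1 - 0.7)
```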
Model Types for Disturbances
non-parametric
impulse response - infinite moving average
spectrum
parametric
transfer function form
autoregressive (denominator)
moving average (numerator)
ARIMA Models for Disturbances

$$d(t) = \frac{C(q^{-1})}{D(q^{-1})\,(1-q^{-1})^{d}}\,a(t)$$

$C(q^{-1})$ is the moving average component, $D(q^{-1})$ the autoregressive component, and $a(t)$ the random shock.

AutoRegressive Integrated Moving Average Model
Time Series Notation
- ARIMA(p,d,q) model has
pth-order denominator - AR
qth-order numerator - MA
d integrating poles (on the unit circle)
ARMA Models for Disturbances

$$d(t) = \frac{C(q^{-1})}{D(q^{-1})}\,a(t)$$

$C(q^{-1})$ is the moving average component, $D(q^{-1})$ the autoregressive component, and $a(t)$ the random shock.

Simply an ARIMA model with no integrating component
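An ARMA disturbance of this form can be generated by direct recursion on the difference equation; the ARMA(1,1) coefficients below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_arma(c, d, a):
    # x(t) satisfies D(q^-1) x(t) = C(q^-1) a(t), with c = [1, c1, ...], d = [1, d1, ...]
    x = np.zeros(len(a))
    for t in range(len(a)):
        acc = 0.0
        for i, ci in enumerate(c):
            if t - i >= 0:
                acc += ci * a[t - i]
        for j, dj in enumerate(d[1:], start=1):
            if t - j >= 0:
                acc -= dj * x[t - j]
        x[t] = acc
    return x

# illustrative ARMA(1,1) disturbance: (1 - 0.8 q^-1) d(t) = (1 + 0.5 q^-1) a(t)
a = rng.standard_normal(2000)
d = simulate_arma([1.0, 0.5], [1.0, -0.8], a)
# the AR pole at 0.8 inflates the disturbance variance well above the shock variance
```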
Typical Model Combinations
model predictive control
impulse/step process model + ARMA disturbance model
typically a step disturbance model which can be
considered as a pure integrator driven by a single pulse

single-loop control
transfer function process model + ARMA disturbance model
Classification of Models in Identification
AutoRegressive with eXogenous inputs (ARX)
Output Error (OE)
AutoRegressive Moving Average with eXogenous
inputs (ARMAX)
Box-Jenkins (BJ)
per Ljung's terminology
ARX Models


u(t) is the exogenous input
same autoregressive component for process, disturbance
numerator term for process, no moving average in
disturbance
physical interpretation - disturbance passes through entire
process dynamics
e.g., feed disturbance
$$A(q^{-1})\,y(t) = B(q^{-1})\,q^{-(f+1)}\,u(t) + a(t)$$
Output Error Models



no disturbance dynamics
numerator and denominator process dynamics
physical interpretation - process subject to white noise
disturbance (is this ever true?)
$$y(t) = \frac{B(q^{-1})}{A(q^{-1})}\,q^{-(f+1)}\,u(t) + a(t)$$
ARMAX Models


process and disturbance have same denominator dynamics
disturbance has moving average dynamics
physical interpretation - disturbance passing through a process which enters at a point away from the input, except if C(q^{-1}) = B(q^{-1})
$$A(q^{-1})\,y(t) = B(q^{-1})\,q^{-(f+1)}\,u(t) + C(q^{-1})\,a(t)$$
Box-Jenkins Model


autoregressive component plus input, disturbance can have
different dynamics
AR component A(q^{-1}) represents dynamic elements common to both process and disturbance
physical interpretation - disturbance passes through other
dynamic elements before entering process
$$A(q^{-1})\,y(t) = \frac{B(q^{-1})}{F(q^{-1})}\,q^{-(f+1)}\,u(t) + \frac{C(q^{-1})}{D(q^{-1})}\,a(t)$$
Range of Model Types
Output Error --> ARX --> ARMAX --> Box-Jenkins
(from least general to most general)
Outline
Types of Models
Model Estimation Methods
Identifying Model Structure
Model Diagnostics
Model Estimation - General Philosophy
Form a loss function which is to be minimized to
obtain the best parameter estimates

Loss function
loss can be considered as missed trend or information
e.g. - linear regression
loss would represent left-over trends in residuals which
could be explained by a model
if we picked up all trend, only the random noise e(t) would
be left
additional trends drive up the variation of the residuals
loss function is the sum of squares of the residuals (related
to the variance of the residuals)
Linear Regression - Types of Loss Functions
First, consider the linear regression model:

$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p + e, \qquad e \sim N(0, \sigma^2)$$

Least Squares estimation criterion -

$$\min_{\{\beta_0,\ldots,\beta_p\}} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \min_{\{\beta_0,\ldots,\beta_p\}} \sum_{i=1}^{n} \left(y_i - (\beta_0 + \beta_1 x_{1,i} + \beta_2 x_{2,i} + \cdots + \beta_p x_{p,i})\right)^2 = \min_{\{\beta_0,\ldots,\beta_p\}} \sum_{i=1}^{n} e_i^2$$

where each term in the sum is the squared prediction error at point i.
Linear Regression - Types of Loss Functions

The model describes how the mean of Y varies:

$$E\{Y\} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p$$

and the variance of Y is $\sigma^2$, because the random component in Y comes from the additive noise e. The probability density function at point i is

$$f_{Y_i}(y_i) = \frac{1}{\sqrt{2\pi\sigma^2}}\,\exp\left\{-\frac{(y_i - (\beta_0 + \beta_1 x_{1,i} + \cdots + \beta_p x_{p,i}))^2}{2\sigma^2}\right\} = \frac{1}{\sqrt{2\pi\sigma^2}}\,\exp\left\{-\frac{e_i^2}{2\sigma^2}\right\}$$

where $e_i$ is the noise at point i.


Linear Regression - Types of Loss Functions
We can write the joint probability density function for all observations in the data set:

$$f_{Y_1 \cdots Y_n}(y_1,\ldots,y_n) = \frac{1}{(2\pi\sigma^2)^{n/2}}\,\exp\left\{-\frac{\sum_{i=1}^{n}(y_i - (\beta_0 + \beta_1 x_{1,i} + \cdots + \beta_p x_{p,i}))^2}{2\sigma^2}\right\} = \frac{1}{(2\pi\sigma^2)^{n/2}}\,\exp\left\{-\frac{\sum_{i=1}^{n} e_i^2}{2\sigma^2}\right\}$$


Linear Regression - Types of Loss Functions
Given the parameters, we can use the joint density $f_{Y_1 \cdots Y_n}$ to determine the probability that a given range of observations will occur.

What if we have observations but don't know the parameters?
assume that we have the most common, or likely, observations - i.e., observations that have the greatest probability of occurrence
find the parameter values that maximize the probability of the observed values occurring
the joint density function becomes a likelihood function
the parameter estimates are maximum likelihood estimates


Linear Regression - Types of Loss Functions
Maximum Likelihood Parameter Estimation Criterion -

$$\max_{\beta}\, L(\beta) = \max_{\beta}\, \frac{1}{(2\pi\sigma^2)^{n/2}}\,\exp\left\{-\frac{\sum_{i=1}^{n}(y_i - (\beta_0 + \beta_1 x_{1,i} + \cdots + \beta_p x_{p,i}))^2}{2\sigma^2}\right\}$$
Linear Regression - Types of Loss Functions
Given the form of the likelihood function, maximizing is equivalent
to minimizing the argument of the exponential, i.e.,

$$\min_{\{\beta_0,\ldots,\beta_p\}} \sum_{i=1}^{n} \left(y_i - (\beta_0 + \beta_1 x_{1,i} + \beta_2 x_{2,i} + \cdots + \beta_p x_{p,i})\right)^2 = \min_{\{\beta_0,\ldots,\beta_p\}} \sum_{i=1}^{n} e_i^2$$

For the linear regression case, the maximum likelihood parameter estimates are equivalent to the least squares parameter estimates.


Linear Regression - Types of Loss Functions
Least Squares Estimation
loss function is sum of squared residuals = sum of
squared prediction errors
Maximum Likelihood
loss function is likelihood function, which in the linear
regression case is equivalent to the sum of squared
prediction errors

Prediction Error = observation - predicted value:

$$e_i = y_i - \hat{y}_i = y_i - (\beta_0 + \beta_1 x_{1,i} + \beta_2 x_{2,i} + \cdots + \beta_p x_{p,i})$$
Loss Functions for Identification
Least Squares

minimize the sum of squared prediction errors

The loss function is

$$V = \sum_{t=1}^{N} (y(t) - \hat{y}(t))^2$$

where N is the number of points in the data record.
Least Squares Identification Example
Given an ARX(1) process+disturbance model:

$$y(t) = a_1\,y(t-1) + b_1\,u(t-1) + e(t)$$

the loss function can be written as

$$\sum_{t=2}^{N} (y(t) - \hat{y}(t))^2 = \sum_{t=2}^{N} \left(y(t) - \{a_1\,y(t-1) + b_1\,u(t-1)\}\right)^2$$
Least Squares Identification Example
In matrix form,

$$\begin{bmatrix} y(2) \\ y(3) \\ \vdots \\ y(N) \end{bmatrix} = \begin{bmatrix} y(1) & u(1) \\ y(2) & u(2) \\ \vdots & \vdots \\ y(N-1) & u(N-1) \end{bmatrix} \begin{bmatrix} a_1 \\ b_1 \end{bmatrix} + \begin{bmatrix} e(2) \\ e(3) \\ \vdots \\ e(N) \end{bmatrix}$$

and the sum of squared prediction errors is

$$\mathbf{e}^T \mathbf{e} \quad \text{with} \quad \mathbf{e} = \begin{bmatrix} y(2) \\ y(3) \\ \vdots \\ y(N) \end{bmatrix} - \begin{bmatrix} y(1) & u(1) \\ y(2) & u(2) \\ \vdots & \vdots \\ y(N-1) & u(N-1) \end{bmatrix} \begin{bmatrix} a_1 \\ b_1 \end{bmatrix}$$
Least Squares Identification Example
The least squares parameter estimates are:

$$\begin{bmatrix} \hat{a}_1 \\ \hat{b}_1 \end{bmatrix} = (X^T X)^{-1} X^T \begin{bmatrix} y(2) \\ y(3) \\ \vdots \\ y(N) \end{bmatrix}$$

where X is the regressor matrix of lagged outputs and inputs.

Note that the disturbance structure in the ARX model is such that the disturbance contribution appears in the formulation as a white noise additive error --> satisfies the assumptions for this formulation.
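The ARX(1) least squares estimate can be sketched in a few lines of numpy; the true parameter values and noise level below are illustrative assumptions, not part of the debutanizer example.

```python
import numpy as np

rng = np.random.default_rng(1)

# simulate an ARX(1) process y(t) = a1*y(t-1) + b1*u(t-1) + e(t);
# parameter values and noise level are illustrative assumptions
a1_true, b1_true, N = 0.7, 0.5, 500
u = rng.standard_normal(N)
e = 0.05 * rng.standard_normal(N)
y = np.zeros(N)
for t in range(1, N):
    y[t] = a1_true * y[t - 1] + b1_true * u[t - 1] + e[t]

# regressor matrix X = [y(t-1) u(t-1)] and least squares solution (X'X)^-1 X'Y
X = np.column_stack([y[:-1], u[:-1]])
Y = y[1:]
theta, *_ = np.linalg.lstsq(X, Y, rcond=None)
a1_hat, b1_hat = theta
```

Because the ARX disturbance enters as white additive noise, this plain least squares solve is statistically well-posed, which is exactly the point made on the slide above.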
Least Squares Identification
ARX models fit into this framework
Output Error models -

$$y(t) = \frac{B(q^{-1})}{A(q^{-1})}\,q^{-(f+1)}\,u(t) + e(t)$$

or equivalently

$$A(q^{-1})\,y(t) = B(q^{-1})\,q^{-(f+1)}\,u(t) + A(q^{-1})\,e(t)$$

or in difference equation form:

$$y(t) = a_1\,y(t-1) + \cdots + a_p\,y(t-p) + B(q^{-1})\,q^{-(f+1)}\,u(t) + A(q^{-1})\,e(t)$$

The error term $A(q^{-1})\,e(t)$ violates the least squares assumption of independent errors.
Least Squares Identification
Any process+disturbance model other than the ARX
model will not satisfy the structural requirements.

Implications?
estimators are not consistent - don't asymptotically tend to the true values of the parameters
potential for bias
Prediction Error Methods
Choose parameter estimates to minimize some function of the
prediction errors.

For example, for the Output Error model, the prediction error is

$$\varepsilon(t) = y(t) - \frac{B(q^{-1})}{A(q^{-1})}\,q^{-(f+1)}\,u(t)$$

Use a numerical optimization routine to obtain the best estimates.
Prediction Error Methods
ARX(1) Example -

$$y(t) = a_1\,y(t-1) + b_1\,u(t-1) + e(t)$$

Use the model to predict one step ahead given past values:

$$\hat{y}(t) = a_1\,y(t-1) + b_1\,u(t-1)$$

This is an optimal one step ahead predictor when e(t) is normally distributed, and can be obtained by taking the conditional expectation of y(t) given information up to and including time t-1. e(t) disappears because it has zero mean and adds no information on average.
Prediction Error Methods
Prediction Error for the one step ahead predictor:

$$\varepsilon(t) = y(t) - \hat{y}(t) = y(t) - \{a_1\,y(t-1) + b_1\,u(t-1)\}$$

We could obtain parameter estimates to minimize the sum of squared prediction errors:

$$\sum_{t=2}^{N} \varepsilon(t)^2 = \sum_{t=2}^{N} (y(t) - \hat{y}(t))^2$$

These are the same as the Least Squares estimates for this ARX example.
Prediction Error Methods
What happens if we have an ARMAX(1,1) model?

$$y(t) = a_1\,y(t-1) + b_1\,u(t-1) + e(t) + c_1\,e(t-1)$$

The one step ahead predictor is:

$$\hat{y}(t) = a_1\,y(t-1) + b_1\,u(t-1) + c_1\,\hat{e}(t-1)$$

But what is e(t-1)? Estimate it using the measured y(t-1) and the prediction of y(t-1):

$$\hat{e}(t-1) = y(t-1) - \hat{y}(t-1) = y(t-1) - \{a_1\,y(t-2) + b_1\,u(t-2) + c_1\,\hat{e}(t-2)\}$$
Prediction Error Methods
Note that estimate of e(t-1) depends on e(t-2), which depends on
e(t-3), and so forth
eventually end up with dependence on e(0), which is
typically assumed to be zero
conditional estimates - conditional on assumed initial
values
can also formulate in a way to avoid conditional
estimates
impact is typically negligible for large data sets
during computation, it isn't necessary to solve recursively all the way back to the initial condition
use the previous prediction to estimate the previous prediction error:

$$\hat{e}(t-1) = y(t-1) - \hat{y}(t-1)$$
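The ARMAX(1,1) prediction-error recursion above can be sketched directly; the parameter values are illustrative assumptions, and the conditional assumption e(0) = 0 is made explicit in the code.

```python
import numpy as np

def armax11_prediction_errors(y, u, a1, b1, c1):
    # one-step-ahead errors for y(t) = a1*y(t-1) + b1*u(t-1) + e(t) + c1*e(t-1),
    # using the conditional assumption e(0) = 0
    eps = np.zeros(len(y))
    for t in range(1, len(y)):
        y_hat = a1 * y[t - 1] + b1 * u[t - 1] + c1 * eps[t - 1]
        eps[t] = y[t] - y_hat
    return eps

# at the true parameters the recursion recovers the shocks once the effect
# of the assumed e(0) = 0 dies out (illustrative simulation)
rng = np.random.default_rng(2)
a1, b1, c1, N = 0.6, 0.4, 0.3, 300
u = rng.standard_normal(N)
e = rng.standard_normal(N)
y = np.zeros(N)
for t in range(1, N):
    y[t] = a1 * y[t - 1] + b1 * u[t - 1] + e[t] + c1 * e[t - 1]
eps = armax11_prediction_errors(y, u, a1, b1, c1)
```

The error introduced by the assumed initial condition decays as c1^t, which is why the impact is negligible for large data sets, as the slide notes.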
Prediction Error Methods
Formulation for General Case -
given a process plus disturbance model:

$$y(t) = G(q^{-1})\,u(t) + H(q^{-1})\,e(t)$$

we can write

$$y(t) = G(q^{-1})\,u(t) + (H(q^{-1}) - 1)\,e(t) + e(t)$$

so that the prediction is:

$$\hat{y}(t) = G(q^{-1})\,u(t) + (H(q^{-1}) - 1)\,\hat{e}(t)$$

The random shocks are estimated as

$$\hat{e}(t) = H^{-1}(q^{-1})\,\{y(t) - G(q^{-1})\,u(t)\}$$
Prediction Error Methods
Putting these expressions together yields

$$\hat{y}(t) = H^{-1}(q^{-1})\,\{(H(q^{-1}) - 1)\,y(t) + G(q^{-1})\,u(t)\}$$

which is of the form

$$\hat{y}(t) = L_1(q^{-1},\theta)\,y(t) + L_2(q^{-1},\theta)\,u(t)$$

The prediction error for use in the estimation loss function is

$$\varepsilon(t,\theta) = y(t) - \hat{y}(t) = y(t) - \{L_1(q^{-1},\theta)\,y(t) + L_2(q^{-1},\theta)\,u(t)\}$$
Prediction Error Methods
How does this look for a general ARMAX model?

$$A(q^{-1})\,y(t) = B(q^{-1})\,u(t) + C(q^{-1})\,e(t)$$

Getting ready for the prediction,

$$y(t) = (1 - A(q^{-1}))\,y(t) + B(q^{-1})\,u(t) + (C(q^{-1}) - 1)\,e(t) + e(t)$$

we obtain

$$\hat{y}(t) = (1 - A(q^{-1}))\,y(t) + B(q^{-1})\,u(t) + (C(q^{-1}) - 1)\,\hat{e}(t)$$
Prediction Error Methods
Note that the ability to estimate the random shocks depends on the ability to invert C(q^{-1})
invertibility was discussed for moving average disturbances
ability to express shocks in terms of present and past outputs - convert to an infinite autoregressive sum

Note that the moving average parameters appear in the denominator of the prediction
the model is nonlinear in the moving average parameters, and conditionally linear in the others
Likelihood Function Methods
Conditional Likelihood Function
assume initial conditions for outputs, random shocks
e.g., for ARX(1), values for y(0)
e.g., for ARMAX(1,1), values for y(0), e(0)
General argument -

$$y(t) - G(q^{-1})\,u(t) - (H(q^{-1}) - 1)\,e(t) = e(t)$$

where the right-hand side is normally distributed with zero mean and known variance
form the joint distribution for this expression over all times
find the optimal parameter values to maximize the likelihood
Likelihood Function Methods
Exact Likelihood Function

Note that we can also form an exact likelihood function which
includes the initial conditions
maximum likelihood estimation procedure estimates
parameters AND initial conditions
exact likelihood function is more complex

In either case, we use a numerical optimization
procedure to solve for the maximum likelihood
estimates.
Likelihood Function Methods
Final Comment -
derivation of likelihood function requires convergence of
moving average, autoregressive elements
moving average --> invertibility
autoregressive --> stability
Example - Box-Jenkins model:

$$A(q^{-1})\,y(t) = \frac{B(q^{-1})}{F(q^{-1})}\,u(t) + \frac{C(q^{-1})}{D(q^{-1})}\,e(t)$$

can be re-arranged to yield the random shock:

$$e(t) = \frac{D(q^{-1})}{C(q^{-1})}\,\left\{A(q^{-1})\,y(t) - \frac{B(q^{-1})}{F(q^{-1})}\,u(t)\right\}$$

where $1/C(q^{-1})$ is the inverted MA component and $1/F(q^{-1})$ the inverted AR component.
Outline
Types of Models
Model Estimation Methods
Identifying Model Structure
Model Diagnostics
Model-Building Strategy

graphical pre-screening
select initial model structure
estimate parameters
examine model diagnostics
examine structural diagnostics
validate model using additional data set
(modify model and re-estimate as required)
Example - Debutanizer
Objective - fit a transfer function +disturbance model
describing changes in bottoms RVP in response to
changes in internal reflux

Data
step data
slow PRBS (switch down, switch up, switch down)
Graphical Pre-Screening
examine time traces of outputs, inputs, secondary
variables
are there any outliers or major shifts in operation?
could there be a model in this data?
engineering assessment
should there be a model in this data?
Selecting Initial Model Structure
examine auto- and cross-correlations of output, input
look for autoregressive, moving average components
examine spectrum of output
indication of order of process
first-order
second-order underdamped - resonance
second or higher order overdamped
Selecting Initial Model Structure...
examine correlation estimate of impulse or step
response
available if input is not a step
what order is the process ?
1st order, 2nd order over/underdamped
size of the time delay

Selecting Initial Model Structure
Time Delays

For low frequency input signal (e.g., few steps or filtered
PRBS), examine transient response for delay

For pre-filtered data, examine cross-correlation plots -
where is first non-zero cross-correlation?

Debutanizer Example
step response
indicates settling time ~100 min
potentially some time delay
positive gain
1st order or overdamped higher-order
correlation estimate of step response
indicates time delay of ~4-5 min
overdamped higher-order
Debutanizer Example - PRBS Test
[Figure: input and output signals for the PRBS test - Output #1 (top, roughly -0.2 to 0.2) and Input #1 (bottom, roughly -50 to 50) plotted against time, 0-150 min]
Debutanizer Example - Step Response Test

[Figure: input and output signals for the step test - Output #1 (top, roughly 0 to 0.2) and Input #1 (bottom, stepping from about 49 to 51) plotted against time, 0-150 min]
Debutanizer Example - Correlation Step Response
Estimate

[Figure: correlation estimate of the step response (x 10^-3) plotted against time, 0-40 min]
Debutanizer Example
process spectrum
suggests higher-order
disturbance spectrum
cut-off behaviour suggests AR type of disturbance
initial model
ARX with delay of 4 or 5
ARMAX
Box-Jenkins
NOT output error - disturbance isn't white
Debutanizer Example - Process Spectrum Plot
[Figure: process frequency response - amplitude and phase (deg) plotted against frequency (rad/s), 10^-2 to 10^1]
Debutanizer Example - Disturbance Spectrum

[Figure: disturbance power spectrum plotted against frequency (rad/s), 10^-2 to 10^1]
Additional Initial Selection Tests
Singularity Test
Form the data vector

$$\varphi(t) = \begin{bmatrix} y(t-1) & \cdots & y(t-s) & u(t-1) & \cdots & u(t-s) \end{bmatrix}^T$$

The covariance matrix for this vector will be singular if s > model order, non-singular if s ≤ model order:

$$\mathrm{Cov} = \frac{1}{N} \sum_{t=1}^{N} \varphi(t)\,\varphi(t)^T$$

Notes:
1. Test developed for a deterministic model - results are exact for this case
2. Test is approximate when random shocks enter the process - results will depend on the signal-to-noise ratio
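The singularity test can be sketched for a noise-free first-order process; the process parameters below are illustrative assumptions, and in the deterministic case the rank drop at s = 2 is exact.

```python
import numpy as np

rng = np.random.default_rng(3)

def data_covariance(y, u, s):
    # covariance of phi(t) = [y(t-1..t-s), u(t-1..t-s)]
    rows = [np.concatenate([y[t - s:t][::-1], u[t - s:t][::-1]])
            for t in range(s, len(y))]
    Phi = np.array(rows)
    return Phi.T @ Phi / len(Phi)

# noise-free first-order process (illustrative): y(t) = 0.7 y(t-1) + 0.5 u(t-1)
N = 400
u = rng.standard_normal(N)
y = np.zeros(N)
for t in range(1, N):
    y[t] = 0.7 * y[t - 1] + 0.5 * u[t - 1]

c_at_order = data_covariance(y, u, s=1)  # s = model order: non-singular
c_over = data_covariance(y, u, s=2)      # s > model order: singular here
```

With s = 2 the exact relation y(t-1) = 0.7 y(t-2) + 0.5 u(t-2) makes one column a linear combination of the others, so the covariance matrix loses rank, exactly as the test predicts.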
Pre-Filtering
If input is not white noise, cross-correlation does not
show process structure clearly
autocorrelation in u(t) complicates structure
Solution - estimate time series model for input, and pre-
filter using inverse of this model
prefilter input and output to ensure consistency
Now estimate cross-correlations between filtered input,
filtered output
look for sharp cut-off - negligible denominator
gradual decline - denominator dynamics
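The pre-filtering idea can be sketched as follows; here the input's AR(1) model is taken as known for illustration, whereas in practice it would be estimated from the input data, and the delay, gain, and noise level are all assumed values.

```python
import numpy as np

rng = np.random.default_rng(4)

# coloured (AR(1)) input passing through a pure 3-sample delay with gain 0.8;
# all numbers here are illustrative assumptions
N = 2000
w = rng.standard_normal(N)
u = np.zeros(N)
for t in range(1, N):
    u[t] = 0.9 * u[t - 1] + w[t]
y = np.zeros(N)
y[3:] = 0.8 * u[:-3]
y = y + 0.01 * rng.standard_normal(N)

# prefilter BOTH input and output with the inverse input model (1 - 0.9 q^-1)
uf = u[1:] - 0.9 * u[:-1]
yf = y[1:] - 0.9 * y[:-1]

def xcorr(a, b, lag):
    # correlation between b(t) and a(t - lag)
    a0, b0 = a - a.mean(), b - b.mean()
    n = len(a0) - lag
    return np.sum(b0[lag:] * a0[:n]) / (n * a0.std() * b0.std())

r = [xcorr(uf, yf, k) for k in range(10)]
delay_estimate = int(np.argmax(np.abs(r)))  # the spike marks the time delay
```

After prewhitening, the cross-correlation shows a sharp spike at the true delay instead of the smeared pattern the coloured input would otherwise produce.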
Pre-Filtering
can also examine cross-correlation plots for indication
of time delay
first non-zero lag in cross-correlation function

Note that differencing, which is used to treat non-
stationary disturbances, is a form of pre-filtering
more on this later...
Outline
Types of Models
Model Estimation Methods
Identifying Model Structure
Model Diagnostics
Model Diagnostics
Analyze residuals:

look for unmodelled trends
auto-correlation
cross-correlation with inputs
spectrum - should be flat

assess size of residual standard error
Wet towel analogy - wring out all moisture (information)
until there is nothing left
Unmodelled Trends in Residuals
autocorrelations
should be statistically zero
cross-correlations
between residual and inputs should be zero for lags greater
than the numerator order
i.e., at long lags
if cross-correlation between inputs and past residuals is non-
zero, indicates feedback present in data (inputs depend on
past errors)
i.e., at negative lags
Debutanizer Example
Consider ARX(2,2,5) model
2 poles, 1 zero, delay of 5

Autocorrelation plots
no systematic trend in residuals

Cross-correlation plots
no systematic relationship between residuals and input
Debutanizer Example - Residual Correlation Plots

[Figure: autocorrelation of residuals for output 1 (top) and cross-correlation between input 1 and output 1 residuals (bottom), lags -20 to 20]
Debutanizer Example - Predicted vs. Response

[Figure: measured and simulated model output plotted against time, 0-150 min]
Detecting Incorrect Time Delays
If cross-correlation between residual and input is non-
zero for small lags, the time delay is possibly too
large

additional early transients aren't being modelled because the model assumes nothing is happening yet
Debutanizer Example
Let's choose a delay of 7

Cross-correlation plot
indicates significant cross-correlation between input and
output at positive lag
estimate of time delay is too large
Model Diagnostics
Quantitative Tests

significance of parameter estimates
ratio tests - of explained variation

Debutanizer Example
parameters are all significant
Debutanizer Example - Parameter Estimates
This matrix was created by the command ARX on 11/16 1996 at 11:36
Loss fcn: 5.805e-006   Akaike's FPE: 6.123e-006   Sampling interval 1
The polynomial coefficients and their standard deviations are

B = 1.0e-003 * [0  0  0  0  0  0.1428  -0.0605]   (numerator parameters)
    standard errors: 1.0e-003 * [0  0  0  0  0  0.0243  0.0272]

A = [1.0000  -1.3924  0.4303]   (AR parameters)
    standard errors: [0  0.0747  0.0697]
Model Diagnostics
Cross-Validation

Use model to predict behaviour of a new data set
collected under similar circumstances

Reject model if prediction error is large
Debutanizer Example
Use initial step test data as a cross-validation data set.

Prediction errors are small, and trend is predicted quite
well

Conclusion - acceptable model
Debutanizer Example - Prediction for Validation Data

[Figure: measured and simulated model output for the validation data plotted against time, 0-150 min]
Debutanizer Example - Residual Correlation Plots for
Validation Data

[Figure: autocorrelation of residuals for output 1 (top) and cross-correlation between input 1 and output 1 residuals (bottom) for the validation data, lags -20 to 20]
Outline
Types of Models
Model Estimation Methods
Identifying Model Structure
Model Diagnostics
Initially...
Use the structure selection methods described earlier.

Once you have estimated several candidate models...
Model Structure Diagnostics
Akaike's Information Criterion (AIC)

weighted estimation error
unexplained variation with term penalizing excess
parameters
analogous to adjusted R² for regression
find model structure that minimizes the AIC
Akaike's Information Criterion
Definition

$$AIC = N \log(V(\hat{\theta})) + 2p$$

where N is the number of data points in the sample, $V(\hat{\theta})$ is the loss function (related to the prediction error - the residual sum of squares), and p is the number of parameters.
Akaike's Information Criterion

[Figure: AIC plotted against number of parameters - the best model is at the minimum]
Akaike's Final Prediction Error
An attempt to estimate the prediction error when the model is used to predict new outputs:

$$FPE = \left(\frac{1 + p/N}{1 - p/N}\right) \frac{1}{N}\,(\text{residual sum of squares})$$

Goal - choose the model that minimizes the FPE (a balance between the number of parameters and the explained variation)
Minimum Description Length (MDL)
Another approach - find the minimum length description of the data. The measure is based on the loss function plus a penalty for the number of terms:

$$MDL = V_N + \dim(\theta)\,\frac{\log N}{N}$$

find the description that minimizes this criterion
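The three criteria above differ only in how they trade residual error against parameter count; a minimal sketch, with made-up residual sums of squares chosen so that a third parameter barely helps:

```python
import numpy as np

def aic(rss, N, p):
    # AIC = N*log(V) + 2p, with V = RSS/N as the loss function
    return N * np.log(rss / N) + 2 * p

def fpe(rss, N, p):
    # Akaike's FPE = (1 + p/N)/(1 - p/N) * RSS/N
    return (1 + p / N) / (1 - p / N) * rss / N

def mdl(rss, N, p):
    # MDL-style criterion: V + dim(theta)*log(N)/N
    return rss / N + p * np.log(N) / N

# a third parameter that barely reduces the residual sum of squares is penalized
N = 500
crit_2par = (aic(10.0, N, 2), fpe(10.0, N, 2), mdl(10.0, N, 2))
crit_3par = (aic(9.99, N, 3), fpe(9.99, N, 3), mdl(9.99, N, 3))
```

All three criteria prefer the 2-parameter model here because the tiny reduction in residual sum of squares does not pay for the extra parameter.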
Cross-Validation
Collect additional data, or partition your data set, and
predict output(s) for the additional input sequence
poor predictions - modify model accordingly, re-estimate with
old data and re-validate
good predictions - use your model!
Note - cross-validation set should be collected under
similar conditions
operating point, no known disturbances (e.g., feed changes)
Debutanizer Example
Search over a range of ARX model orders and time
delay:
poles: 1-4
zeros: 1-4
time delay: 1-6

Examine mean square error, MDL, AIC and/or FPE
- Matlab generated -> ARX(2,2,5) model is best
Debutanizer Example

[Figure: model fit - % unexplained output variance plotted against number of parameters; the AIC optimum is ARX(3,2,5) and the MDL optimum is ARX(2,2,5)]
Other methods...
Look for Singularity of the Information Matrix

Outline
The Modeling Task
Types of Models
Model Building Strategy
Model Diagnostics
Identifying Model Structure
Modeling Non-Stationary Data
MISO vs. SISO Model Fitting
Closed-Loop Identification
What is Non-Stationary Data?
Non-stationary disturbances
exhibit meandering or wandering behaviour
mean may appear to be non-zero for periods of time
stochastic analogue of integrating disturbance

Non-stationarity is associated with poles on the unit
circle in the disturbance transfer function
AR component has one or more roots at 1
Non-Stationary Data

[Figure: simulated outputs for AR parameters of 0.3, 0.6 and 0.9, and for a non-stationary case, plotted against time (0-300); the output range grows and the meandering becomes more pronounced as the pole approaches the unit circle]
How can you detect non-stationary data?
Visual
meandering behaviour
Quantitative
slowly decaying autocorrelation behaviour
difference the data
examine autocorrelation, partial autocorrelation functions for
differenced data
evidence of MA or AR indicates a non-stationary, or
integrated MA or AR disturbance
Differencing Data
is the procedure of putting the data in "delta form"

Start with y(t) and convert to

$$\Delta y(t) = y(t) - y(t-1)$$

explicitly accounting for the pole on the unit circle
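The detection recipe above (slowly decaying autocorrelation, then difference and re-check) can be sketched with a simulated random walk, which has an AR pole exactly on the unit circle; the series length and lag are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(5)

def acf(x, lag):
    # sample autocorrelation at the given lag
    x0 = x - x.mean()
    return np.sum(x0[lag:] * x0[:len(x0) - lag]) / np.sum(x0 * x0)

# random walk: an AR pole exactly on the unit circle (illustrative)
e = rng.standard_normal(3000)
y = np.cumsum(e)          # y(t) = y(t-1) + e(t), non-stationary
dy = np.diff(y)           # delta form: y(t) - y(t-1) = e(t)

acf_y = acf(y, 10)        # decays very slowly for the random walk
acf_dy = acf(dy, 10)      # near zero once the unit-circle pole is removed
```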
Detecting Non-Stationarity

[Figure: autocorrelation for a non-stationary disturbance (top, decaying very slowly) and for the differenced disturbance (bottom, cutting off quickly), lags 0-12]
Impact of Over-Differencing
Over-differencing can introduce extra meandering and
local trends into data

Differencing - cancels pole on unit circle

Over-differencing - introduces an artificial unit root (a zero on the unit circle) into the data
Recognizing Over-Differencing
Visual
more local trends, meandering in data

Quantitative
autocorrelation behaviour decays more slowly than initial
undifferenced data
Estimating Models for Non-Stationary Data
Approaches

Estimate the model using the differenced data

Explicitly incorporate the pole on the unit circle in the
disturbance transfer function specification
Estimating Models from Differenced Data
Prepare the data by differencing BOTH the input and
the output
Specify initial model structure after using graphical,
quantitative tools
Estimate, diagnose model for differenced data
Convert the model to undifferenced form by multiplying through by (1 - q^{-1})
Assess predictions on undifferenced data for the fitting and validation data sets
Differenced Form of Box-Jenkins Model




Note - in the time series literature,

$$\nabla = \Delta = (1 - q^{-1})$$

is used to denote differencing. The differenced form of the Box-Jenkins model is

$$A(q^{-1})\,\Delta y(t) = \frac{B(q^{-1})}{F(q^{-1})}\,q^{-(f+1)}\,\Delta u(t) + \frac{C(q^{-1})}{D(q^{-1})}\,a(t)$$
Outline
Types of Models
Model Estimation Methods
Identifying Model Structure
Model Diagnostics
Estimating MIMO models
SISO Approach
Estimate models individually

Advantage
simplicity
Disadvantage
need to reconcile disturbance models for each input-output
channel in order to obtain one disturbance model for the
output
can't assess directionality with respect to inputs
MISO Approach
Estimate the transfer function models + disturbance
model for a single output and all inputs
simultaneously
Advantage
consistency - obtain one disturbance model directly
potential to assess directionality
Disadvantage
complexity - recognizing model structures is more difficult
A Hybrid Approach
conduct preliminary analysis using SISO approach
model structures
apparent disturbance structure
estimate final model using MISO approach
must decide on a common disturbance structure
feasible if input sequences are independent
Outline
Types of Models
Model Estimation Methods
Identifying Model Structure
Model Diagnostics
Closed-loop vs. open-loop estimation
The Closed-Loop Identification Problem

[Figure: closed-loop block diagram - the setpoint SP_t and measured output Y_t enter a comparator (+/-); the controller Gc produces U_t; a dither signal W_t is added at the controller output; the process Gp produces Y_t, which is fed back]
Where should the input signal be introduced?
Options:

Dither at the controller output
clearer indication of process dynamics
preferred approach

Perturbations in the setpoint
additional controller dynamics will be included in estimated
model
What do the closed-loop data represent?
dither signal case, without disturbances

open-loop: the input-output data represent

$$Y_t = G_p\,W_t$$

closed-loop: the input-output data represent

$$Y_t = \frac{G_p}{1 + G_p G_c}\,W_t$$
Estimating Models from Closed-Loop Data
Approach #1:

Working with W-Y data, estimate

$$\frac{G_p}{1 + G_p G_c}$$

and back out the controller to obtain the process transfer function - we already know the controller transfer function.
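Backing out the controller in Approach #1 is a pointwise algebraic inversion: if T = Gp/(1 + Gp*Gc) was identified from W-Y data and Gc is known, then Gp = T/(1 - T*Gc). A minimal sketch with an assumed first-order process and proportional controller:

```python
import numpy as np

def back_out_process(T, Gc):
    # recover Gp from the estimated closed-loop transfer T = Gp/(1 + Gp*Gc):
    # Gp = T / (1 - T*Gc), evaluated pointwise in frequency
    return T / (1 - T * Gc)

# illustrative check with Gp(jw) = 1/(jw + 1) and a proportional controller Gc = 2
w = np.array([0.1, 1.0, 10.0])
Gp = 1.0 / (1j * w + 1.0)
Gc = 2.0
T = Gp / (1.0 + Gp * Gc)      # what W -> Y closed-loop data would identify
Gp_recovered = back_out_process(T, Gc)
```

The recovery is exact in this noise-free algebraic setting; with an estimated T the controller inversion will amplify estimation errors where 1 - T*Gc is small.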
Estimating Models from Closed-Loop Data
Approach #2:

Estimate transfer functions for the process
(U ->Y), and for the controller (Y->U) simultaneously.
Estimating Models from Closed-Loop Data
Approach #3:

Fit the model as in the open-loop case (U->Y).

Note that

$$U = \frac{1}{1 + G_c G_p}\,W$$

so that we are effectively using a filtered input signal.
Some Useful References
Identification Case Study - paper by Shirt, Harris and Bacon (1994).

Closed-Loop Identification - issues
- paper by MacGregor and Fogal

System Identification Workshop
- paper edited by Barry Cott
