equation. Therefore, you should not worry about its value no matter what the estimation
yields. Convergence properties, significance of coefficients, correlogram of residuals and
squared residuals, etc. can be used for model diagnostics."
Many researchers turned to using effect sizes because evaluating effects using p-values alone
can be misleading. But effect sizes can be misleading too if you don’t think about what they
mean within the research context.
As much as we’d all love to have straight answers to what’s big enough, that’s not the job of
any statistic. You’ve got to think about it and interpret accordingly.
Suppose X is uniformly distributed between 0 and c. Then the PDF is 1/c for
0 <= x <= c. If you have N observations, all of which lie in the range (0, c),
the likelihood is (1/c)^N for any c at least as large as the largest observation,
and zero otherwise. Since (1/c)^N is decreasing in c, the likelihood is maximized
by the smallest admissible value: the MLE of c is the sample maximum.
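For a uniform(0, c) sample, the likelihood (1/c)^N shrinks as c grows, so it is maximized by the smallest c that still covers all the data: the sample maximum. A quick numpy check (the true bound c = 3 and the sample size are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(5)
c = 3.0                            # true upper bound (illustrative)
x = rng.uniform(0.0, c, size=1000)

# The likelihood (1/c_hat)**N is largest for the smallest c_hat
# that still covers the data, so the MLE is simply the sample max.
c_mle = x.max()
print(round(c_mle, 2))
```

Note the MLE always sits slightly below the true c, since no observation can exceed the bound.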
An ARMA(0,q) process is just MA(q) which is a moving average of past white noise. MA
processes are often used to model shocks to a system (again say prices) which are gradually
weakened over time.
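A minimal simulation of an MA(1) process (the coefficient 0.6 is an arbitrary illustrative value). Each shock affects the series for exactly two periods and then dies out, which shows up in the autocorrelation: the lag-1 autocorrelation should match the theoretical value theta / (1 + theta^2), and higher lags should be near zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
theta = 0.6                        # MA(1) coefficient (illustrative)
eps = rng.standard_normal(n + 1)   # white-noise shocks

# MA(1): x_t = eps_t + theta * eps_{t-1}; each shock influences
# the series for exactly two periods, then is gone.
x = eps[1:] + theta * eps[:-1]

# Compare the sample lag-1 autocorrelation with the theoretical
# value theta / (1 + theta**2) for an MA(1).
lag1 = np.corrcoef(x[:-1], x[1:])[0, 1]
print(round(theta / (1 + theta**2), 3), round(lag1, 3))
```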
An ARIMA(0,1,0) is just a random walk. There are many applications of random walks in
chemistry, biology, finance, etc. An intuitive example is a drunkard walking along an
axis in one dimension, or on a grid in higher dimensions.
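The drunkard's walk is two lines of numpy: cumulatively sum random steps. Differencing the walk recovers the stationary steps, which is exactly what the "I" in ARIMA(0,1,0) does.

```python
import numpy as np

rng = np.random.default_rng(1)
steps = rng.choice([-1, 1], size=1000)  # drunkard steps left or right
walk = np.cumsum(steps)                 # position after each step

# Differencing the walk recovers the stationary step series:
# ARIMA(0,1,0) is "difference once, and what remains is white noise".
recovered = np.diff(walk)
print(walk[-1])
```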
Stationary means that my answer isn’t evolving over time. The right answer stays
mainly in the same spot. I like to say a cat is a cat, and if I have data on it, it will
always predict that we have a cat.
Now, for concreteness, the GDP is not like a cat. It moves all over the place, new
things are invented, old products are discontinued, the status quo of what GDP
measures changes all the time. As a result we see that real GDP has trended upwards.
My answer is evolving!
Okay, so where am I going with this? Linear models require stationarity. Otherwise, I
can't trust the coefficients they spit out. There are lots of technical reasons for
this, and you can read up on the theory behind autocorrelation if you so choose, but
the intuition is that we want a stationary time series before running a linear
regression on it.
We get there by differencing the series against itself. This removes any trends. But
sometimes we over-difference, so we add a little bit back in. That is your MA term.
And sometimes differencing doesn't go far enough, but double differencing would go
too far. That's what AR terms are for.
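Here is differencing removing a trend, in a small numpy sketch (the slope 0.5 and series length are arbitrary): a series with a linear trend is clearly non-stationary, but after one difference its mean is just the slope and no longer grows over time.

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.arange(100, dtype=float)
trended = 0.5 * t + rng.standard_normal(100)  # linear trend + noise

# One full difference removes a linear trend: the differenced
# series hovers around the slope (0.5) instead of growing.
diffed = np.diff(trended)
print(round(diffed.mean(), 2))
```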
So to sum up: the AR term is a partial difference of the series, I is a full difference, and
MA is a partial clawback of a full difference. The extent of the partial difference or
clawback is determined by the coefficients on the AR and MA terms: the AR coefficient
tells you what fraction of a difference to take, and the MA coefficient tells you what
fraction of the error to add back in after differencing. Hence, you should see
coefficients between -1 and 1. If the magnitude is greater than 1, you have problems
with your model.
So ARIMA is just a linear regression with a couple of terms to force your time series
to be stationary.
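To make the "just a linear regression" point concrete, here is an AR(1) fit done as plain least squares: simulate x_t = phi * x_{t-1} + noise (phi = 0.7 is an arbitrary illustrative value), regress x_t on x_{t-1}, and the slope recovers the AR coefficient, safely inside (-1, 1).

```python
import numpy as np

rng = np.random.default_rng(3)
phi = 0.7                        # true AR(1) coefficient (illustrative)
n = 2000
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.standard_normal()

# "ARIMA is just a linear regression": regress x_t on x_{t-1}
# by ordinary least squares; the slope is the AR coefficient.
X, y = x[:-1], x[1:]
phi_hat = (X @ y) / (X @ X)
print(round(phi_hat, 2))
```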
Now the problem with this is that there are a lot of ways we can get to a stationary series by taking full
differences, partial differences, and adding partial differences back in. So we need some way to choose which
terms to use. In practice, I tend to fit many, many ARIMA models to see how stable things are. If my
coefficients seem relatively robust, I then select the model with the highest log-likelihood (or the lowest AIC,
which also penalizes extra terms). It isn't necessarily the best strategy, but in an applied world, I think it
makes a lot of sense.
However, the more principled way to select the terms is to use an augmented Dickey-Fuller test to choose the
order of differencing, and then an autocorrelation plot and a partial autocorrelation plot to choose the number
of MA and AR terms, respectively.
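In practice you would lean on statsmodels for this (`plot_acf`, `plot_pacf`, and `adfuller`), but the autocorrelation function itself is easy to compute by hand. A minimal numpy sketch, applied to a simulated MA(1) series (coefficient 0.6, arbitrary): the ACF should spike at lag 1 and be near zero afterwards, which is the signature you look for when choosing the MA order.

```python
import numpy as np

def sample_acf(x, nlags):
    """Sample autocorrelation function: lagged autocovariance
    normalised by the lag-0 variance, as in a standard ACF plot."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.dot(x, x)
    return np.array([1.0] + [np.dot(x[:-k], x[k:]) / denom
                             for k in range(1, nlags + 1)])

rng = np.random.default_rng(4)
eps = rng.standard_normal(1001)
ma1 = eps[1:] + 0.6 * eps[:-1]   # MA(1) series for illustration

acf = sample_acf(ma1, 5)
# For an MA(1), expect a spike at lag 1 and roughly zero beyond it.
print(np.round(acf, 2))
```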
There are also various generalizations of the GARCH model -- for instance, we could
make the volatility at a given time depend not only on the previous volatilities and a
random term, but also on the current value of the main process. This would accord
with some people's beliefs that unusually high or low stock prices lead to
disproportionately higher volatility, for instance. There's a pretty long list here: