Moving Block and Stationary Block Bootstrap for Time Series Data

Darren Keeley
Abstract
Efron’s Bootstrap is a useful method for estimating the standard errors of estimators when the data are
independent. Observations in a time series, on the other hand, can depend on preceding
observations, and the random resampling done by the Bootstrap destroys this important feature of the
data. This paper explores two Bootstrap methods that preserve the time dependence, namely the
Moving Block Bootstrap (MBB) and the Stationary Block Bootstrap (SBB). The abilities of MBB and
SBB to accurately estimate the standard error of the variance of a stationary time series are compared.
With these methods the choice of block size can be crucially important, and ways of finding the
optimal size are discussed. This paper found that MBB and SBB achieve roughly the same level of
accuracy, and that $n^{1/3}$ is consistently close to the optimal block size even when it is not the best.

Background
MBB and SBB are appropriate for stationary time series, where the mean, variance and autocorrelation
structure are constant across time. Their application to finding the standard error of the variance is useful for
building prediction intervals for time series forecasting models.
Both MBB and SBB divide the series into blocks of consecutive observations of length l, and these blocks
can overlap. The block length l is fixed for MBB, while for SBB l is a random variable that follows a
geometric distribution with parameter p, so that the mean block length is $p^{-1}$. The blocks are resampled
with replacement and then concatenated in the order in which they were drawn to form a bootstrap sample of
similar length to the original series. As with Efron’s bootstrap, the estimator is recalculated on each of these
new samples, and the resulting values form an empirical distribution from which the standard error can be
estimated.
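The two resampling schemes described above can be sketched in a few lines of Python. This is only an illustration, not the tsboot() implementation: the function names are hypothetical, the MBB sample is trimmed to the original length, and the SBB blocks wrap around the end of the series, all of which are assumptions made for the sketch.

```python
import random

def moving_block_sample(series, block_len, rng):
    """One MBB resample: fixed-length overlapping blocks drawn with replacement."""
    n = len(series)
    n_starts = n - block_len + 1           # number of possible block start positions
    out = []
    while len(out) < n:
        start = rng.randrange(n_starts)    # pick a block uniformly at random
        out.extend(series[start:start + block_len])
    return out[:n]                         # assumption: trim to the original length

def stationary_block_sample(series, p, rng):
    """One SBB resample: block lengths are geometric with mean 1/p."""
    n = len(series)
    out = []
    idx = rng.randrange(n)                 # start of the first block
    while len(out) < n:
        out.append(series[idx])
        if rng.random() < p:
            idx = rng.randrange(n)         # end the block, start a new one
        else:
            idx = (idx + 1) % n            # continue the block (assumption: circular wrap)
    return out

# Using the integers 0..19 as the "series" makes the blocks easy to see.
rng = random.Random(0)
series = list(range(20))
print(moving_block_sample(series, block_len=4, rng=rng))
print(stationary_block_sample(series, p=0.25, rng=rng))
```

In the printed output, runs of consecutive integers correspond to blocks; the MBB runs all have length 4 (apart from any trimmed final block), while the SBB runs vary in length.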
The function tsboot() from the boot library in R is used here to conduct MBB and SBB. However, the
function has no default for l, and it is not obvious what the best choice is. Efron and Tibshirani’s
An Introduction to the Bootstrap only mentions that the chosen l should be large enough that observations
more than l time units apart are nearly independent. A more precise guideline can be found in Hall et
al. (1995), where the optimal fixed block length is $l = n^{1/3}$ in very general contexts, with n being the length
of the series. From this point forward this equation will be referred to as Hall’s rule. Radovanov and
Marcikić (2014) state that for variance estimation the optimal fixed block length l is
approximately equal to the optimal $p^{-1}$, where $p^{-1}$ is the average block length of SBB (p itself being
the reciprocal of the average block length). This paper extends Hall’s rule to SBB and explores whether
$l = p^{-1} = n^{1/3}$ is at least a good starting place when searching for the best block size and geometric
probability.
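As a small worked example of Hall’s rule, the sketch below computes the suggested block length and the matching geometric probability for the two series lengths used in this paper. The helper names are hypothetical, and rounding $n^{1/3}$ to the nearest integer is an assumption.

```python
def hall_block_length(n):
    """Hall's rule l = n^(1/3), rounded to the nearest integer (assumption), at least 1."""
    return max(1, round(n ** (1 / 3)))

def geometric_probability(block_len):
    """SBB parameter p chosen so that the mean block length 1/p equals block_len."""
    return 1.0 / block_len

for n in (120, 1000):
    l = hall_block_length(n)
    print(n, l, geometric_probability(l))
# prints:
# 120 5 0.2
# 1000 10 0.1
```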

Methodology
Simulating data to serve as a benchmark
To compare the ability of MBB and SBB to accurately estimate the standard error of the variance of a
stationary series, it is necessary to know the true variance and standard error of each series so that they
can serve as benchmarks. To this end, data were simulated from a stationary autoregressive process
with two lags, denoted AR(2):
$$y_t = 0.5\,y_{t-1} - 0.35\,y_{t-2} + \varepsilon_t$$
The first sample size considered is n = 120. To find a close approximation to the true variance and its
standard error, 100,000 series of this length were simulated and the sample variance of each was calculated.
The standard error was then taken as the standard deviation of these 100,000 variances. The same procedure
was repeated for n = 1000. These approximated parameters are listed in Table 1 below.
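The benchmark procedure can be sketched as follows. This is a scaled-down illustration rather than the original R code: it assumes standard normal innovations, a burn-in of 200 observations, the n - 1 denominator for the sample variance, and only 2,000 simulated series instead of 100,000 so that it runs quickly. None of these details are stated in the text, and all function names are hypothetical.

```python
import random

PHI1, PHI2 = 0.5, -0.35   # AR(2) coefficients from the model above

def sample_variance(xs):
    """Unbiased sample variance (n - 1 denominator, an assumption)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def simulate_ar2(n, rng, burn_in=200):
    """Simulate y_t = 0.5 y_{t-1} - 0.35 y_{t-2} + e_t with e_t ~ N(0, 1) (assumed)."""
    y1 = y2 = 0.0                      # y_{t-1} and y_{t-2}
    out = []
    for t in range(n + burn_in):
        y = PHI1 * y1 + PHI2 * y2 + rng.gauss(0.0, 1.0)
        if t >= burn_in:               # discard the start-up transient
            out.append(y)
        y1, y2 = y, y1
    return out

def monte_carlo_benchmark(n, n_series, seed=0):
    """Mean and standard deviation of the sample variance over many simulated series."""
    rng = random.Random(seed)
    variances = [sample_variance(simulate_ar2(n, rng)) for _ in range(n_series)]
    mean_var = sum(variances) / n_series
    se_var = sample_variance(variances) ** 0.5
    return mean_var, se_var

mean_var, se_var = monte_carlo_benchmark(n=120, n_series=2000)
print(f"n=120: VAR ~ {mean_var:.4f}, SE(VAR) ~ {se_var:.4f}")
```

With more simulated series the two numbers settle down; under the stated assumptions they should land in the neighbourhood of the n = 120 values reported in this paper.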

Fig. 1: The simulated stationary time series of lengths n = 120 and n = 1000.

            n = 120    n = 1000
VAR         1.3199     1.3206
SE(VAR)     0.2046     0.0706

Table 1: The true variances and standard errors, approximated from 100,000 samples of lengths n = 120 and n = 1000.
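As a rough cross-check on the simulated variances, the stationary variance of the AR(2) process can be worked out from the Yule-Walker equations. The innovation variance is not stated in the text, so the calculation below assumes $\sigma_\varepsilon^2 = 1$; $\phi_1 = 0.5$ and $\phi_2 = -0.35$ are the model coefficients, $\rho_k$ is the lag-k autocorrelation and $\gamma_0$ the process variance.

```latex
\begin{align*}
\rho_1   &= \frac{\phi_1}{1-\phi_2} = \frac{0.5}{1.35} \approx 0.3704, \\
\rho_2   &= \phi_1\rho_1 + \phi_2 \approx 0.1852 - 0.35 = -0.1648, \\
\gamma_0 &= \frac{\sigma_\varepsilon^2}{1 - \phi_1\rho_1 - \phi_2\rho_2}
          \approx \frac{1}{1 - 0.1852 - 0.0577} \approx 1.3208 .
\end{align*}
```

Under that assumption the theoretical value is in line with the simulated variances of 1.3199 and 1.3206.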

Conducting Moving Block Bootstrap and Stationary Block Bootstrap


For n = 120, the optimal (average) block length l is expected to be close to $120^{1/3} \approx 5$. A
computationally costly grid search over l from 2 to 11 is conducted, and the results are plotted in Figure 2.
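A grid search of this kind can be sketched in Python as follows. It is an illustration under stated assumptions, not the original tsboot() analysis: the series is freshly simulated with standard normal innovations, SBB uses p = 1/l for each grid value, and 200 bootstrap resamples per setting is a hypothetical choice made to keep the run short. All function names are hypothetical.

```python
import random

def sample_variance(xs):
    """Unbiased sample variance (n - 1 denominator)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def simulate_ar2(n, rng, burn_in=200):
    """AR(2) series with coefficients 0.5 and -0.35; N(0, 1) noise is an assumption."""
    y1 = y2 = 0.0
    out = []
    for t in range(n + burn_in):
        y = 0.5 * y1 - 0.35 * y2 + rng.gauss(0.0, 1.0)
        if t >= burn_in:
            out.append(y)
        y1, y2 = y, y1
    return out

def mbb_sample(series, block_len, rng):
    """Moving block resample with fixed block length, trimmed to len(series)."""
    n = len(series)
    out = []
    while len(out) < n:
        start = rng.randrange(n - block_len + 1)
        out.extend(series[start:start + block_len])
    return out[:n]

def sbb_sample(series, p, rng):
    """Stationary bootstrap resample with geometric block lengths (mean 1/p)."""
    n = len(series)
    out = []
    idx = rng.randrange(n)
    while len(out) < n:
        out.append(series[idx])
        idx = rng.randrange(n) if rng.random() < p else (idx + 1) % n
    return out

def bootstrap_se(series, resample, n_boot, rng):
    """Standard deviation of the variance estimator across bootstrap resamples."""
    stats = [sample_variance(resample(series, rng)) for _ in range(n_boot)]
    return sample_variance(stats) ** 0.5

rng = random.Random(1)
series = simulate_ar2(120, rng)
results = {}
for l in range(2, 12):                      # grid of block lengths 2..11
    se_mbb = bootstrap_se(series, lambda s, r: mbb_sample(s, l, r), 200, rng)
    se_sbb = bootstrap_se(series, lambda s, r: sbb_sample(s, 1.0 / l, r), 200, rng)
    results[l] = (se_mbb, se_sbb)
    print(f"l={l:2d}  MBB SE={se_mbb:.4f}  SBB SE={se_sbb:.4f}")
```

Each printed row gives the bootstrap standard error of the variance for one block length, which can then be compared against a benchmark standard error to pick the best l.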

Fig. 2: A block size of l = 4 yielded the most accurate estimate (0.1893, from MBB), and l = 5 yielded 0.1888.
The orange line marks the true standard error at 0.2046.
It would seem that both procedures reach their optimum at or below the block length given by Hall’s rule,
though they underestimate the standard error no matter which block size is chosen. MBB gave the closest
estimate of 0.1893 with l = 4, while SBB’s best estimate was 0.1789 with an average block length of l = 2.
These estimates could be considered relatively close to the true standard error of 0.2046; at least, they are
the best MBB and SBB can do with these data. An extended grid search over larger values of l reveals that
increasing l only leads to smaller standard error estimates. Graphics of this second grid search are omitted
for brevity.
It is very important for the sample to be representative of the population. Many of the simulated AR(2)
samples with n = 120 had variances that were not close to the population variance, and for those samples the
bootstrap methods produced erroneous standard error estimates. A representative sample was hand-picked so
that MBB and SBB could be properly demonstrated. This emphasizes the data-hungry nature of bootstrapping,
and motivates the introduction of a larger sample size of n = 1000 to observe how MBB and SBB perform
when the between-sample variability is dampened. For this series length it was not necessary to hand-pick
a sample.
The behavior of the standard error estimates changes with the second AR(2) sample of n = 1000. Hall’s
rule designates $l = 1000^{1/3} = 10$ as the optimal (average) block length. The grid search shown in the left
plot of Figure 3 shows that the standard error estimates seem to increase with block size, indicating that the
right block size can capture the true standard error. l = 10 yielded standard errors of 0.0662 and 0.0680 for
MBB and SBB, respectively, compared to the population standard error of 0.0706.
The expanded grid search in the plot on the right shows that MBB and SBB eventually reach the true
standard error around l = 30 and l = 20, respectively. However, with block lengths beyond these values
the MBB and SBB estimates quickly surpass the true standard error.

Fig. 3: The plot on the left shows the performance of block lengths around 10, with the true standard error indicated by the orange line. Since an
upward trend toward the true standard error was observed, larger block sizes were explored in the chart on the right. The best block lengths
were around 30 and 20 for the MBB and SBB, respectively.

Conclusion
In light of these findings, Hall’s rule is a rough guideline that cannot completely capture the shifting
complexities between different time series, though it does allow MBB and SBB to provide reasonable
estimates without too much forethought. Radovanov and Marcikić (2014) conclude that MBB is best
because it “shows the lowest errors in estimation and prediction in and out of the sample.” The
simulations conducted here hint that which method is superior also depends on the
data itself.
References
Efron, Bradley, and Robert Tibshirani. An Introduction to the Bootstrap. Chapman & Hall/CRC, 1998.

Hall, Peter, et al. “On Blocking Rules for the Bootstrap with Dependent Data.” Biometrika, vol. 82, no. 3,
1995, p. 561, doi:10.2307/2337534.

Radovanov, Boris, and Aleksandra Marcikić. “A Comparison of Four Different Block Bootstrap
Methods.” Croatian Operational Research Review, vol. 5, no. 2, 2014, pp. 189–202,
doi:10.17535/crorr.2014.0007.
