Abstract
The paper presents the statistical estimation of extreme wind speed using annually r largest order
statistics (r-LOS) extracted from the time series of wind data. The method is based on a joint
generalized extreme value distribution of r-LOS derived from the theory of Poisson process. The
parameter estimation is based on the method of maximum likelihood. The hourly wind speed data
collected at 30 stations in Ontario, Canada, are analyzed in the paper. The results of r-LOS method
are compared with those obtained from the method of independent storms (MIS) and specifications
of the Canadian National Building Code (CNBC-1995). The CNBC estimates appear to be a
conservative upper bound owing to the large sampling error associated with annual maxima analysis. Using
the r-LOS method, the paper shows that the wind pressure data can be suitably modelled by the
Gumbel distribution.
© 2006 Elsevier Ltd. All rights reserved.
Keywords: Wind speed; Extreme value estimation; Generalized extreme value distribution; Order statistics;
Annual maxima; Maximum likelihood method; Method of independent storms
1. Introduction
The estimation of design wind speed corresponding to a long return period is generally
based on the extreme value theory, which derives the three asymptotic domains of attraction,
namely, the Gumbel, Frechet and Weibull distributions [1]. These three distributions can be
written in a unified form, referred to as the Generalized Extreme Value distribution (GEV).
Corresponding author. Tel.: +1 519 888 4567x5858; fax: +1 519 888 4349.
E-mail address: mdpandey@uwaterloo.ca (M.D. Pandey).
1
Graduate student.
0167-6105/$ - see front matter © 2006 Elsevier Ltd. All rights reserved.
doi:10.1016/j.jweia.2006.05.008
ARTICLE IN PRESS
166 Y. An, M.D. Pandey / J. Wind Eng. Ind. Aerodyn. 95 (2007) 165–182
Traditionally, a sample of annual maximum wind speed is fitted with the Gumbel
distribution using the methods of moments or least squares. However, the statistical
extrapolation to estimate wind speed corresponding to 500–1000 year return period is
seriously contaminated by sampling and model uncertainty, if data are available for a
limited period (20–30 years). This has motivated the development of approaches that enlarge
the sample of extreme values beyond the annual maxima.
The method of independent storms (MIS), proposed by Cook [2] and refined by Harris
[3,4], considers several wind storm maxima, rather than just annual maxima. The extremes
of storm maxima are fitted with the Gumbel distribution. The MIS method is limited to the
Gumbel distribution, and it discounts the possibility of GEV model representing the data.
The Peaks-Over-Threshold (POT) method is another alternative that models the peaks of
wind speed time series exceeding a threshold by the Generalized Pareto Distribution
(GPD), which is shown to be the domain of attraction of the peaks [5,6]. However, the
application of POT is confounded by an erratic variation of a quantile estimate with
respect to the threshold used in creating the sample of peaks [7].
The paper presents an alternate extreme value analysis of the Canadian wind speed data
that is based on estimation of the joint distribution of annually r largest order statistics
(r-LOS) of data. Assuming that r-LOS are generated by an underlying inhomogeneous
Poisson process, they can be modelled by a joint GEV distribution [8,9]. The paper
shows that the r-LOS method provides a systematic approach to (1) ascertain whether data
belong to the Gumbel or the GEV distribution, and (2) estimate the sampling error
associated with quantile estimates. Although the theoretical basis of the r-LOS method is
well established, the paper illustrates its versatility in the estimation of extreme wind speed.
The paper is organized as follows. A brief review of extreme value theory and MIS is
presented in Section 2. The proposed r-LOS method is described in Section 3. Section 4
presents a detailed analysis of wind data collected at 30 sites in Ontario (Canada) using
r-LOS and MIS methods. The wind speed quantile estimates are compared with the design
values specified in the Canadian National Building Code (CNBC) 1995. Section 5
summarizes the findings of this paper.
2.1. Background
where $\mu$, $\sigma$ and $\xi$ are the location, scale and shape parameters, respectively. If $\xi > 0$, the
GEV is known as Type II (Frechet) distribution with an unbounded upper tail
($\mu - \sigma/\xi < x < \infty$). The case of $\xi < 0$ is called the Type III (the reverse Weibull) distribution with a
finite upper tail ($-\infty < x < \mu - \sigma/\xi$). As $\xi \to 0$, the Type I (Gumbel) distribution is obtained:
$$G_X(x) = \exp\{-\exp[-(x-\mu)/\sigma]\}. \qquad (4)$$
Since the GEV in a theoretical sense encompasses all the three types of extreme value
distributions, it has become a popular choice for extreme value analysis without any
presumption about the Gumbel distribution.
It should be noted that the GEV converges to the Gumbel distribution only in an
asymptotic sense, because Eq. (3) is not applicable to the case of $\xi = 0$ due to a singularity.
This has an important practical consequence that a non-zero value of the shape parameter
obtained during the distribution fitting cannot be accepted as it is. In fact, a statistical
significance test is required to test whether or not a non-zero shape parameter is indicative
of the GEV or Gumbel distribution.
asymptotic convergence. The r-LOS method will be applied to examine this issue in the
context of the Canadian wind speed data analysis.
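A minimal sketch of such a significance test is given below. It assumes the usual likelihood ratio (deviance) statistic for nested models, which under the Gumbel hypothesis is approximately chi-square distributed with one degree of freedom (95% critical value 3.841). The function name and the log-likelihood values are hypothetical and serve only as an illustration.

```python
CHI2_1DOF_95 = 3.841  # 95% critical value of chi-square with 1 degree of freedom


def likelihood_ratio_test(loglik_gev, loglik_gumbel, critical=CHI2_1DOF_95):
    """Return (deviance, prefer_gev) for nested GEV versus Gumbel fits.

    loglik_gev and loglik_gumbel are the maximized log-likelihoods of the
    two models fitted to the same data.
    """
    deviance = 2.0 * (loglik_gev - loglik_gumbel)
    return deviance, deviance > critical


# Hypothetical maximized log-likelihoods (not taken from the paper).
deviance, prefer_gev = likelihood_ratio_test(-410.2, -411.0)
print(round(deviance, 3), prefer_gev)
```

A deviance below the critical value means the non-zero shape parameter is not statistically significant, so the Gumbel model is retained.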
The annual maxima method is simple and straightforward, and is adopted by many national design codes
worldwide. In this method the Gumbel distribution is used to fit a sample of annual maximum wind
speed. The annual maxima are plotted on the Gumbel probability paper and parameters
are estimated from the method of least squares.
The design wind speeds specified in the CNBC 1995 are based on the Gumbel analysis of
annual maxima of wind speed data [11]. Given a sample of n year maxima, the method of
moments is used to estimate the Gumbel parameters [12]:
$$\sigma = s_x\sqrt{6}/\pi, \qquad \mu = \bar{x} - 0.5772\,\sigma, \qquad (5)$$
where $\bar{x}$ is the mean and $s_x$ is the standard deviation of the annual maxima data, and $\mu$ and
$\sigma$ are the location and scale parameters, respectively. In CNBC, the 30-year wind speed quantile
is chosen as a reference speed. The mean 30-year wind speed ($\bar{V}_{30}$) is first estimated for
every station in Canada from the Gumbel analysis of annual maxima data, varying from
15 to 25 years in duration [12]. The data uncertainty is added to the mean estimate to
obtain the final design value.
The sampling error associated with the 30-year quantile was estimated as $e_s = 2.96\,s_x/\sqrt{n}$
[12]. This formula is based on sampling error associated with the Gumbel quantile written
as a function of sample moments. The additional sources of error are climate variability,
siting uncertainty, anemometer height uncertainty and uncertainty associated with log law
for height correction, which were assumed to be 25%, 20%, 15% and 15% of the sampling
error ($e_s$), respectively. The final data uncertainty ($e_d$) was obtained by adding the variances of
all these components, i.e., $e_d = e_s\sqrt{1 + 0.25^2 + 0.20^2 + 0.15^2 + 0.15^2} = 1.0712\,e_s$. The 30-year
wind speed and the associated error
were plotted on a map, and the final extreme wind contour map was prepared using expert
judgement. The design speed appears to be specified as $V_{30} = \bar{V}_{30} + e_d$, i.e., it corresponds
to 68% upper bound confidence interval.
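The calculation described above can be sketched as follows. This is an illustration rather than the CNBC procedure itself: the function names are hypothetical, the sample of annual maxima is invented, and the sample standard deviation is assumed to use the n - 1 divisor.

```python
import math
import statistics

EULER_GAMMA = 0.5772  # Euler's constant, rounded as in Eq. (5)


def gumbel_moments(annual_maxima):
    """Method-of-moments Gumbel parameters (location, scale), Eq. (5)."""
    mean = statistics.mean(annual_maxima)
    s_x = statistics.stdev(annual_maxima)  # assumes the n - 1 divisor
    scale = s_x * math.sqrt(6.0) / math.pi
    location = mean - EULER_GAMMA * scale
    return location, scale


def gumbel_quantile(location, scale, T):
    """T-year quantile obtained by inverting the Gumbel CDF, Eq. (4)."""
    return location - scale * math.log(-math.log(1.0 - 1.0 / T))


def design_speed_30(annual_maxima):
    """Return (mean 30-year speed, design speed, uncertainty factor)."""
    n = len(annual_maxima)
    s_x = statistics.stdev(annual_maxima)
    location, scale = gumbel_moments(annual_maxima)
    v30_mean = gumbel_quantile(location, scale, 30)
    e_s = 2.96 * s_x / math.sqrt(n)  # sampling error of the 30-year quantile
    # Additional error sources are 25%, 20%, 15% and 15% of e_s;
    # the variances of all components are added.
    factor = math.sqrt(1.0 + 0.25**2 + 0.20**2 + 0.15**2 + 0.15**2)
    return v30_mean, v30_mean + factor * e_s, factor


# Hypothetical annual maxima in km/h (illustrative values only).
sample = [72.0, 80.0, 65.0, 91.0, 77.0, 84.0, 69.0, 88.0, 75.0, 79.0]
v30_mean, v30_design, factor = design_speed_30(sample)
print(round(factor, 4))
```

The printed factor reproduces the multiplier 1.0712 quoted for the data uncertainty.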
The design speed corresponding to any other T-year return period can be calculated in
terms of V30 as
$$V_T = V_{30} - 0.7797\,s_x\left\{3.3843 + \ln\left[\ln\left(\frac{T}{T-1}\right)\right]\right\}. \qquad (6)$$
The CNBC provides an alternate formula in which a T-year quantile is expressed in
terms of 10 and 30-year quantiles as
$$x_T = x_{30} + \frac{x_{30} - x_{10}}{1.1339}\,\ln\left[\frac{-0.0339}{\ln(1 - 1/T)}\right]. \qquad (7)$$
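The two conversion formulas can be checked against each other numerically. The sketch below assumes the Gumbel-consistent sign convention, in which the scale is $(x_{30} - x_{10})/1.1339$ and the correction to $V_{30}$ is positive for T > 30. The function names and the parameter values in the usage lines are hypothetical.

```python
import math


def reduced_variate(T):
    """Gumbel reduced variate -ln(-ln(1 - 1/T)) for a T-year return period."""
    return -math.log(-math.log(1.0 - 1.0 / T))


def speed_from_v30(v30, s_x, T):
    """Eq. (6): rescale the 30-year speed to a T-year return period."""
    return v30 - 0.7797 * s_x * (3.3843 + math.log(math.log(T / (T - 1.0))))


def quantile_from_x10_x30(x10, x30, T):
    """Eq. (7): T-year quantile from the 10- and 30-year quantiles."""
    scale = (x30 - x10) / 1.1339  # Gumbel scale implied by the two quantiles
    return x30 + scale * math.log(-0.0339 / math.log(1.0 - 1.0 / T))


# Hypothetical Gumbel parameters (km/h), used only to cross-check the formulas.
mu, sigma = 60.0, 8.0
x10 = mu + sigma * reduced_variate(10)
x30 = mu + sigma * reduced_variate(30)
x50_exact = mu + sigma * reduced_variate(50)
x50_eq7 = quantile_from_x10_x30(x10, x30, 50)
x50_eq6 = speed_from_v30(x30, sigma * math.pi / math.sqrt(6.0), 50)
print(round(x50_exact, 2), round(x50_eq7, 2), round(x50_eq6, 2))
```

The three values agree up to the rounding of the constants 0.7797, 3.3843, 1.1339 and 0.0339.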
The MIS enlarges the sample for extreme value analysis by including the wind storm
maxima, typically 100 storms per year. Through the examination of continuous records of
wind speed, independent wind storms are identified between each pair of lulls, and
maximum value within each storm is extracted to form a sample of storm maxima. The
wind speed data are converted into dynamic pressure. The top r order statistics of storm
maxima data are plotted on the Gumbel probability paper and extreme quantiles are
obtained by extrapolation of the straight line fitted to the data. The main concepts of the
improved MIS, as discussed by Harris [3,4], are summarized below:
Suppose N independent storm maxima are extracted from S years of records, and $F_P(x)$
denotes their CDF. The probability distribution of annual maxima can be given as
$F_A(x) = [F_P(x)]^m$, assuming that it is generated by independent storms with annual
occurrence rate of $m = N/S$. Arrange the storm maxima in decreasing order and
denote the kth order statistic as $Y_k$, so that $Y_1$ is the largest and $Y_N$ is the
smallest value. The probability density function (PDF) of $Y_k$ is given as [4]
$$f_{Y_k}(y_k) = \frac{N!}{(k-1)!\,(N-k)!}\,[F_P(y_k)]^{N-k}\,[1 - F_P(y_k)]^{k-1} f_P(y_k). \qquad (8)$$
Using the probability integral transformation, $z_k = F_P(y_k)$, the PDF of the cumulative
frequency or the plotting position, $z_k$, of $Y_k$ can be derived as
$$f_{Z_k}(z_k) = f_{Y_k}\{F_P^{-1}(z_k)\}\,\frac{dy_k}{dz_k} \quad \text{and} \quad dz_k = f_P(y_k)\,dy_k, \qquad (9)$$
$$f_{Z_k}(z_k) = \frac{N!}{(k-1)!\,(N-k)!}\,z_k^{N-k}\,(1 - z_k)^{k-1}. \qquad (10)$$
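The density of the plotting position in Eq. (10) is a Beta density with parameters N - k + 1 and k, so its mean is (N - k + 1)/(N + 1). The sketch below evaluates the density and this mean. The function names are hypothetical, and the example values of N, k and m are invented for illustration.

```python
import math


def plotting_position_pdf(z, k, N):
    """Eq. (10): PDF of the plotting position of the kth largest of N maxima."""
    # N! / ((k-1)! (N-k)!) equals k * C(N, k), which avoids large factorials.
    coeff = k * math.comb(N, k)
    return coeff * z ** (N - k) * (1.0 - z) ** (k - 1)


def mean_plotting_position(k, N):
    """Mean of Eq. (10), i.e. of a Beta(N - k + 1, k) distribution."""
    return (N - k + 1) / (N + 1)


def annual_cdf(storm_cdf_value, m):
    """Annual-maximum CDF from the storm-maximum CDF: F_A = F_P ** m."""
    return storm_cdf_value ** m


# Hypothetical example: second largest of N = 10 storm maxima.
print(mean_plotting_position(2, 10))
# Hypothetical example: F_P(x) = 0.999 with m = 100 storms per year.
print(annual_cdf(0.999, 100))
```

With Y_1 defined as the largest value, the largest storm maximum receives the highest mean plotting position, N/(N + 1).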
3.1. Analysis
The r-LOS method selects the r largest observations in each year of the data collected and derives
their joint distribution based on the theory of the Poisson process [13]. As seen from
Eqs. (2) and (3) the sample extreme value distribution, $[F_X(x)]^n$, asymptotically converges to
the GEV as
$$[F_X(u)]^n = G_X(u) = \exp\left\{-\left[1 + \xi(u-\mu)/\sigma\right]^{-1/\xi}\right\}. \qquad (11)$$
The probability that wind speed exceeds a threshold, u, is $p = 1 - F_X(u)$ and the number
of exceedances in n trials follows the binomial distribution with parameters n and p. As
$n \to \infty$ and $p \to 0$, then $np \to \Lambda$, a constant. The distribution of number of exceedances
converges to the Poisson distribution with $\Lambda$ as the intensity measure. Using a first-order
approximation, $\log[F_X(u)] \approx -[1 - F_X(u)]$, the Poisson intensity measure can be obtained
from Eq. (12) as
$$-n\log[F_X(u)] \approx n[1 - F_X(u)] = np \to \Lambda(u) = \left[1 + \xi(u-\mu)/\sigma\right]^{-1/\xi} \quad (n \to \infty,\ p \to 0). \qquad (13)$$
Denoting the rth LOS in a sample of n as $M_n^{(r)}$, its distribution, $G_r(y)$, can be related to the
Poisson distribution as
$$P[M_n^{(r)} < y] = G_r(y) = \exp\{-\Lambda(y)\}\sum_{k=0}^{r-1}\frac{[\Lambda(y)]^k}{k!}. \qquad (14)$$
For details of the derivation of this distribution, readers are referred to [9, Chapter 7].
The joint distribution (16) is the basis for inference, and is referred to as the r-LOS model.
As $\xi \to 0$, the Poisson rate parameter in limit converges to
$$\Lambda(y) = \exp\left[-(y-\mu)/\sigma\right] \quad (\xi \to 0), \qquad (16)$$
which is the Gumbel analog of the r-LOS model.
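As an illustration of Eqs. (13), (14) and (16), the following sketch evaluates the Poisson intensity and the distribution of the rth largest value. The function names and the parameter values in the usage lines are hypothetical, and the handling of points outside the GEV support is a choice made for this sketch.

```python
import math


def poisson_intensity(y, mu, sigma, xi):
    """Lambda(y) of Eq. (13); uses the Gumbel limit of Eq. (16) when xi == 0."""
    s = (y - mu) / sigma
    if xi == 0.0:
        return math.exp(-s)
    base = 1.0 + xi * s
    if base <= 0.0:
        # Outside the GEV support: below the lower bound (xi > 0) the
        # intensity is unbounded; above the upper bound (xi < 0) it is zero.
        return math.inf if xi > 0.0 else 0.0
    return base ** (-1.0 / xi)


def rth_largest_cdf(y, r, mu, sigma, xi):
    """Eq. (14): exp(-Lambda) times the sum of Lambda**k / k! for k < r."""
    lam = poisson_intensity(y, mu, sigma, xi)
    if math.isinf(lam):
        return 0.0
    return math.exp(-lam) * sum(lam ** k / math.factorial(k) for k in range(r))


# Hypothetical parameters: mu = 60, sigma = 8 (km/h).
gumbel_r1 = rth_largest_cdf(70.0, 1, 60.0, 8.0, 0.0)
gev_r3 = rth_largest_cdf(70.0, 3, 60.0, 8.0, -0.1)
print(round(gumbel_r1, 4), round(gev_r3, 4))
```

For r = 1 the expression reduces to the GEV (or Gumbel) distribution of the annual maximum, since the sum contains only the k = 0 term.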
The r-LOS model has emerged as a versatile method for extreme value analysis in many
areas of science and engineering. It has been effectively applied for the estimation of
extremes of sea levels [5], wave heights [14] and rainfall [15]. A monograph by Coles
illustrates several interesting applications of this method [9]. This paper presents for the
first time a comprehensive application of r-LOS model to wind speed estimation problem.
Here, $x_{k,i}$ denotes the kth largest OS in the ith year of wind data. The distribution parameters
are obtained by maximizing the log-likelihood function. In the special case of $r = 1$, it
reduces to the GEV model for annual maxima. A quantile estimate corresponding to a T-year
return period is obtained as
$$x_T = \mu - \frac{\sigma}{\xi}\left\{1 - \left[-\log(1 - 1/T)\right]^{-\xi}\right\}. \qquad (18)$$
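The quantile formula of Eq. (18) can be sketched as below, with the Gumbel form used when the shape parameter is exactly zero. The function name and the parameter values are hypothetical and are not estimates from the paper.

```python
import math


def gev_quantile(T, mu, sigma, xi):
    """Eq. (18): T-year quantile; falls back to the Gumbel form when xi == 0."""
    y = -math.log(1.0 - 1.0 / T)
    if xi == 0.0:
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))


# Hypothetical parameters: mu = 60, sigma = 8 (km/h).
for T in (50, 500):
    gumbel = gev_quantile(T, 60.0, 8.0, 0.0)
    bounded = gev_quantile(T, 60.0, 8.0, -0.1)
    print(T, round(gumbel, 1), round(bounded, 1))
```

With the same location and scale, a negative shape parameter yields lower quantiles than the Gumbel model, and the gap widens with the return period.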
the bias and violate the assumption of Poisson process generating the extreme values [8].
A practical criterion is that r should be selected such that it minimizes the variance
associated with a required quantile estimate. Tawn [17] also concluded that the results for
r = 3–7 are very stable, showing that provided r is not too large this method leads to
consistent results. The analysis presented in Section 4 shows that r = 5 is sufficient to
provide minimum variance quantile estimates. Thus, r = 5 is considered in the rest of the
analysis.
This section presents the analysis of time series of the aviation wind (HLY 01) recorded
by Environment Canada at 30 stations in the Province of Ontario. The location and
other details of these stations are given in Table 1 and Fig. 1. The data for a site contain
hourly wind speed, which is the 2-min average of the wind speed recorded just before the
hour. The largest daily wind speed is the maximum value of these hourly wind speeds
(24 values) within a day. The design wind estimates given in CNBC were derived from
these data [11].
Table 1
Information about wind stations in Ontario, Canada
No. Name of sites Site ID Begin year End year Total years
Note: The letter ‘A’ in the name of a site means that data were recorded at the local airport field.
The main assumption in the extreme value analysis is that the wind speed maxima are
independent. To reduce mutual dependence in the data, the annual time series of daily
maximum speed are partitioned into blocks that are equal to or larger than the duration of
typical storms in days (4–8 days). Using a procedure described by Simiu and Heckert [5],
new time series of 4-day independent maxima were created for each station. The wind
speed (V) time series were converted into dynamic pressure series using the relation
$0.5\rho V^2$, where $\rho = 1.29$ kg/m³ is the air density in Ontario [12].
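The preprocessing steps can be sketched as follows. This is a simplified illustration, not the Simiu and Heckert procedure itself: it uses non-overlapping 4-day blocks, assumes the speeds are given in km/h and converts them to m/s before computing pressure, and all function names and data values are hypothetical.

```python
AIR_DENSITY = 1.29  # kg/m^3, the value quoted for Ontario


def block_maxima(daily_maxima, block_days=4):
    """Maximum of each consecutive block of block_days daily maxima.

    A shorter final block is kept as it is.
    """
    return [max(daily_maxima[i:i + block_days])
            for i in range(0, len(daily_maxima), block_days)]


def dynamic_pressure(speed_kmh):
    """Dynamic pressure 0.5 * rho * V**2 in Pa, with V converted to m/s."""
    v = speed_kmh / 3.6
    return 0.5 * AIR_DENSITY * v * v


def r_largest(values, r=5):
    """The r largest values of one year, in decreasing order."""
    return sorted(values, reverse=True)[:r]


# Hypothetical daily maximum speeds (km/h) for part of one year.
daily = [31.0, 45.0, 38.0, 22.0, 58.0, 52.0, 40.0, 36.0, 61.0, 47.0, 29.0, 33.0]
maxima = block_maxima(daily)
pressures = [dynamic_pressure(v) for v in maxima]
print(maxima)
print(r_largest(pressures, r=2))
```

Because the pressure transformation is monotonic, the ranking of the block maxima is the same for speed and pressure; only the shape of the fitted distribution changes.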
The processed data sets were analyzed using the following three methods:
[Plot not reproduced. Axes: r (horizontal) versus shape parameter of GEV (vertical).]
Fig. 2. The GEV shape parameter with 95% confidence interval (Kingston, Ont.).
[Plots not reproduced. Two panels; axes: r (horizontal) versus design speed in km/h (vertical).]
Fig. 3. 50 and 500 year wind speed with 95% confidence interval (Kingston, Ont.).
Table 2
Likelihood ratio test applied to r-LOS data (Kingston, Ont.)
Table 3
Parameters of the Gumbel distribution (Kingston, Ont.)
Fig. 4 compares the wind speed quantiles for return periods ranging from 50 to 1000
years. It is interesting that the MIS and r-LOS curves are in close agreement in this case.
The design wind speeds specified in the CNBC are higher than those obtained from the
r-LOS and MIS methods. The reason, as discussed in Section 2.1, is the higher uncertainty
associated with estimates obtained from annual maxima data.
The data from the remaining 29 stations in Ontario were analyzed as described in the
previous section. The quantity of data used by each of the three methods varies, as shown
in Fig. 5. Some of the important trends observed from the results are discussed below.
[Plot not reproduced. Axes: return period in years (horizontal, 10 to 1000) versus quantile in km/h (vertical); series: NBCC 95, MIS, r-LOS.]
[Plot not reproduced. Axes: site no. (horizontal) versus number of samples (vertical); series: r-LOS, MIS, AMG.]
Fig. 5. Comparison of the sample size used by different methods in Ontario data analysis.
[Plot not reproduced. Axes: site no. (horizontal) versus standard error in km/h (vertical); series: r = 1, r = 3, r = 5.]
Fig. 6. The number of order statistics (r) versus the standard error associated with 500-year speed.
[Plot not reproduced. Axes: site no. (horizontal) versus shape parameter (vertical).]
Fig. 7. The GEV shape parameter with 95% confidence interval: wind pressure data (Ont., Canada).
shown in Fig. 8, confirm that the Gumbel model is preferred for 27 stations. The GEV can
be used only for three stations numbered 1 (Wawa), 19 (Sarnia) and 28 (Mount Forest).
The estimated shape parameters for these stations are markedly negative: −0.11, −0.12 and
−0.19, respectively.
In contrast, the shape parameter (mean and 95% confidence intervals) estimated from
the wind speed data is mostly negative (Fig. 9). The likelihood ratio test confirms that 22
stations follow the GEV model (Fig. 10). The negative shape parameter means that the
upper tail is bounded at $\mu - \sigma/\xi$, and it corresponds to the reverse Weibull (Type III)
domain of attraction. These results are in line with conclusions of the POT results
[Plot not reproduced. Axes: site no. (horizontal) versus likelihood ratio (vertical); criterion line at 3.841, with GEV above and Gumbel below.]
Fig. 8. Results of likelihood ratio test: wind pressure data (Ont., Canada).
[Plot not reproduced. Axes: site no. (horizontal) versus shape parameter (vertical).]
Fig. 9. The GEV shape parameter with 95% confidence interval: wind speed data (Ont., Canada).
reported in the literature [5], in which the shape parameter for the generalized Pareto
distribution is predominantly negative.
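The bounded upper tail implied by a negative shape parameter can be illustrated with a short sketch. The function name and the parameter values are hypothetical and do not correspond to any station in the paper.

```python
def gev_upper_bound(mu, sigma, xi):
    """Upper end point mu - sigma/xi of the GEV when xi < 0; infinite otherwise."""
    if xi < 0.0:
        return mu - sigma / xi
    return float("inf")


# Hypothetical parameters: mu = 60, sigma = 8 (km/h), xi = -0.1.
bound = gev_upper_bound(60.0, 8.0, -0.1)
print(bound)
```

Under these assumed values no quantile can exceed the bound, however long the return period, which is the practical consequence of the Type III model for wind speed.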
On the other hand, the comparison of Figs. 8 and 10 appears to support the notion that
a square transformation of wind speed (to wind pressure) accelerates the convergence to
the Gumbel distribution [2,3,18].
[Plot not reproduced. Axes: site no. (horizontal) versus likelihood ratio (vertical); criterion line at 3.841, with GEV above and Gumbel below.]
Fig. 10. Results of likelihood ratio test: Wind speed data (Ont., Canada).
[Plot not reproduced. Axes: site no. (horizontal) versus 50-year speed in km/h (vertical); series: CNBC 95, r-LOS, MIS.]
[Plot not reproduced. Axes: site no. (horizontal) versus 500-year speed in km/h (vertical); series: CNBC 95, r-LOS, MIS.]
[Plot not reproduced. Axes: site no. (horizontal) versus quantile in km/h (vertical); series: NBCC 95, r-LOS + 1.07 × Std. Err., mean of r-LOS.]
Fig. 13. Comparison of 500-year wind speed estimated from r-LOS and NBC 95 methods.
r-LOS estimates of 500-year wind speed are slightly higher than the corresponding MIS
values.
For the sake of a consistent comparison, mean r-LOS estimates are converted into the
design values by adding the data uncertainty ($e_d = 1.0712\,e_s$), as in the CNBC approach.
Fig. 13 shows that r-LOS estimates of 500-year design speed are much lower than CNBC
specifications. It suggests that the conservatism associated with the CNBC can be reduced by
improved extreme value analysis of the wind speed data.
5. Conclusions
The paper presents the statistical estimation of extreme wind speed using annually
r-LOS extracted from the time series of wind data. The method is based on a joint
generalized extreme value distribution of r largest order statistics derived from the theory
of Poisson process. The parameter estimation is based on the method of maximum likelihood (ML). A formal
likelihood ratio test is applied to discern between the GEV and Gumbel distribution. The
data collected at 30 stations in Ontario, Canada, are analyzed in the paper. The results
of r-LOS method are compared with those obtained from the MIS and specifications of the
CNBC.
The r-LOS analysis of the Ontario data shows that the Gumbel distribution is
statistically preferable to the GEV distribution with a bounded tail due to a negative
shape parameter. In this sense, the r-LOS results support the basis of the MIS method
which adopts the Gumbel distribution for wind pressure. However, the r-LOS method is
more versatile than the MIS method, as it also retains the flexibility of adopting the GEV
distribution.
Acknowledgements
The authors are grateful to the Natural Sciences and Engineering Research Council of Canada (NSERC)
and the University Network of Excellence for Nuclear Engineering (UNENE) for
providing the financial support for this study. The authors are also thankful to
Environment Canada for providing the wind speed data. The authors gratefully
acknowledge the use of computer programs provided by Coles for r-LOS model and
Simiu for filtering of the data.
References
[1] E.J. Gumbel, Statistics of Extremes, Columbia University Press, New York, 1958.
[2] N.J. Cook, Towards better estimation of extreme winds, J. Wind Eng. Ind. Aerodyn. 9 (1982)
295–323.
[3] R.I. Harris, Gumbel re-visited—a new look at extreme value statistics applied to wind speeds, J. Wind Eng.
Ind. Aerodyn. 59 (1996) 1–22.
[4] R.I. Harris, Improvements to the method of independent storms, J. Wind Eng. Ind. Aerodyn. 80 (1999)
1–30.
[5] E. Simiu, N.A. Heckert, Extreme wind distribution tails: a peaks over threshold approach, J. Struct. Eng. 122
(1996) 539–547.
[6] J. Pickands III, Statistical inference using extreme order statistics, Ann. Stat. 3 (1975) 119–131.
[7] M.D. Pandey, An adaptive exponential model for extreme wind speed estimation, J. Wind Eng. Ind.
Aerodyn. 90 (2002) 839–866.
[8] R.L. Smith, Extreme value theory based on the r largest annual events, J. Hydrol. 86 (1986) 27–43.
[9] S. Coles, An Introduction to Statistical Modeling of Extreme Values, Springer, Berlin, 2001.
[10] A.F. Jenkinson, The frequency distribution of the annual maximum (or minimum) of meteorological
elements, Quart. J. R. Meteorol. Soc. 81 (1955) 158–171.
[11] National Building Code of Canada, 1995, National Research Council.
[12] T.C. Yip, H. Auld, Updating the 1995 National Building Code of Canada wind pressures, in: Paper
presented to Canadian Electricity Association, 1993, pp. 1–9.
[13] I. Weissman, Estimation of parameters and large quantiles based on the k largest observations, J. Am. Stat.
Assoc. 73 (1978) 812–815.
[14] C.G. Soares, M.G. Scotto, Application of the r largest-order statistics for long-term predictions of significant
wave height, Coast. Eng. 51 (2004) 387–394.
[15] S. Nadarajah, Extremes of daily rainfall in west central Florida, Clim. Change 69 (2005) 325–342.
[16] G.W. Oehlert, A note on the delta method, Am. Statist. 46 (1992) 27–29.
[17] J.A. Tawn, An extreme-value theory model for dependent observations, J. Hydrol. 101 (1988)
227–250.
[18] A. Naess, Estimation of long return period design values for wind speeds, J. Eng. Mech. 124 (1998)
252–259.