
WIND ENERGY

Wind Energ. 0000; 00:1–17


DOI: 10.1002/we

RESEARCH ARTICLE

Evaluation of Bivariate Archimedean and Elliptical Copulas to Model Wind Power Dependency Structures
Henry Louie
Department of Electrical and Computer Engineering, Seattle University, Seattle, WA, USA

ABSTRACT
When modeling wind power from several sources, consideration of the dependency structure of the sources is of critical
importance. Failure to appropriately account for the dependency structure can lead to unrealistic models, which may result
in erroneous conclusions from wind integration studies and other analyses. The dependency structure is fully described by
the multivariate joint distribution function of the wind power. However, few—if any—explicit joint distribution models
of wind power exist. Instead, copulas can be used to create joint distribution functions, provided that the selected
copula family reasonably approximates the dependency structure. Unfortunately, there is little guidance on which copula
family should be used to model wind power. The purpose of this paper is to investigate which copula families are best
suited to model wind power dependency structures. Bivariate copulas are considered in particular. The paper focuses
on power from wind plants—collections of wind turbines with a common interconnection point—but the methodology
can be generally extended to consider power from individual wind turbines, or even aggregate wind power from entire
systems. Twelve Archimedean and elliptical copulas are evaluated using hourly data from 500 wind plant pairs in the
National Renewable Energy Laboratory’s Eastern Dataset. The evaluation is based on χ2 and Cramér-von Mises statistics.
Application guidelines recommending which copula family to use are developed. It is shown that a default assumption of
Gaussian dependence is not justified, and that the use of Gumbel copulas can result in improved models. An illustrative
example shows the application of the guidelines to model dependence of wind power sources in Monte Carlo simulations.
Copyright © 0000 John Wiley & Sons, Ltd.
KEYWORDS
Copula; concordance; correlation; dependence; modeling; Monte Carlo; wind power
Correspondence
Email: louieh@seattleu.edu

Received . . .

1. INTRODUCTION

The proliferation of weather-driven renewable energy sources into the power system has increased the need to better
understand and appropriately model the dependence of stochastic variables. Dependence is quantified using a measure of
association, such as the linear correlation coefficient [1]. Wind power researchers almost universally use this metric as the
sole indicator of dependence. For example, [2, 3, 4, 5, 6, 7] showed that the linear correlation coefficient of power from
wind plants tends to decrease with separation distance, and tends to increase for longer averaging periods. These results
provided a statistical quantity that could generally explain the role of geographic diversity and the so-called smoothing
effect in wind power integration. Integrating wind plants with low correlation into the same system reduces the occurrences
of extremely high or low aggregate power output; similarly, wind power variations on timescales of seconds to minutes
have low correlation and require fewer reserves than the potentially highly-correlated longer term fluctuations.
Though the linear correlation coefficient provides general information about dependence, it does not uniquely describe
the structure of the dependence. It also does not translate well into specific, actionable information that can be used by
system operators or planners. For example, a system planner may wish to know the number of hours per year that the
aggregate wind power in a system will be above or below some threshold value. The linear correlation coefficient, even
coupled with knowledge of the marginal distributions of the wind power sources, is not enough information to determine
this. However, the dependency structure is fully described by the joint distribution function of the wind power. The joint
distribution function can be used to provide more specific and actionable information than the linear correlation coefficient.
Unfortunately, no multivariate joint distribution models of wind power are available, and no common joint distributions
obviously fit wind power data. In the absence of an explicit joint distribution function, there are ways of modeling the
dependency structure. One approach is to decompose the assumed correlation matrix using Cholesky decomposition. The
resulting matrix is used to transform pairs of uncorrelated random variables to achieve the desired covariance [8]. However,
this method can only be used to model linear correlation, and it offers no further control of the dependency structure.
Another approach is to use copulas.
Copulas have been widely used in fields such as finance, and they have several properties that make them attractive for
wind power modeling [9]. One important feature of the copula is that it can be used to model dependency in a way that is
independent of the marginal distribution functions. This is important since modeling the power output of individual wind
plants is often not trivial, and decoupling this problem from that of modeling the dependency structure of several wind
plants is advantageous. The parameters of the copula can also be selected to obtain a specified rank correlation. This is
useful because the correlation between wind plants can be estimated from characteristics such as separation distance and
averaging period [3, 4, 10]. Therefore, if only basic information of the wind plants is available, a reasonable model of
dependency structure can be produced, provided that the user specifies an appropriate copula function. However, there is
scant guidance on copula selection in existing literature, and an inappropriate copula can result in unacceptable errors in
the model.
Researchers have applied various copulas to the different problems related to wind power. A common default copula
selection is the Gaussian, but it has not been rigorously investigated if this is an appropriate choice for wind power.
In [11], wind power was modeled from wind speed copulas as part of a method of studying wind power integration.
The Gaussian copula was selected to model wind speed. However, the decision to use this copula was based only on a
qualitative assessment of Quantile-Quantile plots. In [10], a limited number of copulas were applied to model wind speed.
Only Archimedean copulas were considered. Dependency structures modeled by copulas are identified in [12] as a method
of generating scenarios of wind power production. Wind power production scenarios are increasingly used in stochastic
optimization and stochastic programming, which are common threads of power systems research. In [13], Gaussian copulas
are used to generate these scenarios. Empirical copulas were used in [14] to model the dependency structure between wind
speed and wind turbine power output. The aim is to improve wind turbine condition monitoring by detecting deviations in
the dependency structure from the nominal. In [15], copulas were used in wind power forecasting. In particular, a quantile-
copula conditional kernel density estimator was used to improve probabilistic wind power forecasts. The Gaussian copula
was evaluated among others in this application.
As more researchers employ copulas to model wind power, it is warranted that a thorough evaluation of the fit of copulas
to model wind power dependency structures is performed. This paper evaluates the fit of 12 common Archimedean and
elliptical copulas to model hourly wind power dependency of wind plants. It is emphasized that the copulas are being
evaluated for their ability to model wind power—not wind speed. Non-linearity and non-monotonicity of the power curve
may prohibit the results from being generally applied to wind speed. Data from the National Renewable Energy Laboratory’s
Eastern Dataset [16, 17] are used in the evaluations. From these evaluations, guidelines on selecting particular copulas
to model wind power dependence are developed. An application vignette demonstrates the use of copulas to model
wind power dependency in a Monte Carlo simulation. The vignette presents a methodology that uses copulas to perform
probabilistic studies of power systems for the specific case of wind energy. Knowing the distribution of operating states is
increasingly important as more stochastic generation is added to the system.
The remainder of this paper is arranged as follows. Foundations of copula theory are provided in Section 2. Section 3
analyzes the dependency structures found in the power output by wind plants. The methodology used in the copula
evaluation is described in Section 4. The evaluation of copulas, and development of guidelines for their application are
described in Section 5. A vignette illustrating the application is provided in Section 6. Conclusions and future outlook are
described in Section 7.

2. THEORETICAL BACKGROUND

Copulas are a simple, yet powerful, tool for representing dependency structures, and are the focus of this paper. In this
Section, a concise overview of copula theory is given. More in-depth descriptions of this rich topic are found in texts such
as [1, 18].
Copulas are multivariate distribution functions on the unit hypercube that have uniform marginal distribution
functions [1, 18, 19]. The terms distribution function or distribution are used in this paper in preference to cumulative
distribution function, and density function is used instead of probability density function. Although copulas can be defined
in d-dimensional space, for clarity the following discussion is limited to the two-dimensional (bivariate) case.

Let x and y be two random variables paired as (x, y) with joint distribution H(x, y) and joint density h(x, y). In the
context of this work x and y represent the wind power from two wind plants that have been normalized to their rated
power. If F (x) and G(y) are the marginal distributions of x and y, with u = F (x) and v = G(y), then according to
Sklar’s Theorem there is a copula function C such that:

C(u, v) = C(F (x), G(y)) = H(x, y). (1)

Note that u and v are each bounded on [0, 1] and are uniformly distributed, regardless of the marginal distributions of x and
y. In informal terms, the copula links univariate marginal distributions to their bivariate joint distribution. The relationship
in (1) can be inverted so that:
\[ C(u, v) = H\left(F^{-1}(u), G^{-1}(v)\right) \tag{2} \]
The density function c of any copula is computed from the mixed partial derivative:
\[ c(u, v) = \frac{\partial^2 C(u, v)}{\partial u \, \partial v} \tag{3} \]
The distinction between x and y—variables representing wind power—and u and v—those variables transformed by
F (x) and G(y), respectively—is important. Following the convention in [9], u and v exist in what is referred to as
the uniform/rank domain; whereas x and y exist in the normalized wind power domain. Given a copula function, it is
straightforward to generate dependent random variables in the uniform/rank domain, which can then be transformed by
the appropriate inverse marginal distribution functions to the normalized wind power domain. An example of this procedure
is given in Section 6. However, it is the selection of an appropriate copula function and its parameters that is often the most
challenging step.
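The generate-then-transform procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the example of Section 6: the Clayton family from Table I (with θ > 0) is used because it can be sampled by conditional inversion in closed form, the Beta marginals are hypothetical stand-ins for real wind plant marginals, and all function and variable names are invented for the sketch.

```python
import numpy as np
from scipy import stats


def sample_clayton(theta, n, rng):
    """Draw n pairs (u, v) from a Clayton copula (theta > 0) by conditional inversion."""
    u = rng.uniform(size=n)
    w = rng.uniform(size=n)  # probability level of v conditional on u
    v = ((w ** (-theta / (1.0 + theta)) - 1.0) * u ** (-theta) + 1.0) ** (-1.0 / theta)
    return u, v


rng = np.random.default_rng(0)

# Step 1: dependent samples in the uniform/rank domain.
u, v = sample_clayton(theta=1.0, n=5000, rng=rng)

# Step 2: map to the normalized wind power domain with inverse marginals.
# The Beta shapes below are hypothetical, chosen only for illustration.
x = stats.beta.ppf(u, a=0.6, b=1.2)
y = stats.beta.ppf(v, a=0.8, b=1.5)

# The rank correlation is the same in both domains (up to numerical noise).
tau_uv = stats.kendalltau(u, v)[0]
tau_xy = stats.kendalltau(x, y)[0]
print(f"tau in rank domain:  {tau_uv:.3f}")
print(f"tau in power domain: {tau_xy:.3f}")
```

The marginals only reshape each axis; the dependency structure is set entirely by the copula and its parameter.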
An interesting category of copulas is parametric copulas, which are the focus of this paper. Parametric copulas are
of interest because they are easy to work with, mathematically tractable, widely utilized, supported in software packages [1],
and, as shown in this work, able to model wind power accurately. Parametric copulas may be parameterized by one or more
values arranged in the vector θ. In this paper the copula parameters are selected to preserve the dependency between u and
v, under the reasonable assumption that the dependency is specified by the user. Dependency can be quantified in several
ways. When copulas are used, it is advantageous to use the rank correlation coefficient, which is discussed next.

2.1. Rank Correlation


Kendall’s rank correlation coefficient is a measure of association between two random variables that is based on the concept
of concordance. Informally, random variables x and y are concordant if large values of x are associated with large values
of y, and small values of x are associated with small values of y. They are discordant if the converse is true [1]. The
rank correlation coefficient known as Kendall’s τ , hereafter referred to either as rank correlation or τ , is the probability of
concordance minus the probability of discordance:

\[ \tau = \Pr\{(X_1 - X_2)(Y_1 - Y_2) > 0\} - \Pr\{(X_1 - X_2)(Y_1 - Y_2) < 0\} \tag{4} \]

where the pair (Xi , Yi ) is the ith simultaneous sample of x and y. It follows that the rank correlation of (x, y) is
preserved through any monotonic transformation, including F (x) and G(y). Therefore (x, y) and (u, v) have identical
rank correlations.
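A sample version of (4), and the invariance under monotonic transformation, can be checked with a short Python sketch. The data are synthetic and the function name is illustrative; the pairwise estimator below is the plain concordant-minus-discordant fraction, which coincides with SciPy's `kendalltau` when there are no ties.

```python
import numpy as np
from scipy import stats


def kendall_tau(x, y):
    """Fraction of concordant pairs minus fraction of discordant pairs, as in (4)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    i, j = np.triu_indices(x.size, k=1)  # every unordered pair of samples
    signs = np.sign(x[i] - x[j]) * np.sign(y[i] - y[j])
    concordant = np.count_nonzero(signs > 0)
    discordant = np.count_nonzero(signs < 0)
    return (concordant - discordant) / signs.size


# Synthetic, positively dependent data on [0, 1] (illustration only).
rng = np.random.default_rng(42)
x = rng.uniform(size=400)
y = 0.6 * x + 0.4 * rng.uniform(size=400)

tau_xy = kendall_tau(x, y)
# Strictly increasing transformations leave every pairwise ordering unchanged.
tau_transformed = kendall_tau(np.exp(3.0 * x), y ** 3)
print(tau_xy, tau_transformed)
```

Because only the ordering of the samples enters the estimator, any strictly increasing transformation, including the marginal distribution functions, returns the same value.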
The rank correlation is used to select the parameters of the copulas evaluated in this study. The methods used to
determine θ, as well as the formulation of the copula depend on the class of the copula. This paper evaluates copulas
from the Archimedean and elliptical classes.

2.2. Archimedean Copulas


Archimedean copulas are a common class of copulas that are popular due to their mathematical tractability and because
the copulas can be expressed in terms of a single-argument generator function φ(t) [1, 19]. It will be shown that expressing
the copula as φ(t) rather than C(u, v) is advantageous in some settings. Several examples of Archimedean copulas are
given in Table I. There are numerous other copulas. See [1] for a more comprehensive listing. The function φ(t) can be
any convex decreasing function defined on (0, 1] with the property that φ(1) = 0 [1].
The generator function is related to a bivariate copula by:

\[ C(u, v) = \phi^{[-1]}\left(\phi(u) + \phi(v)\right) \tag{5} \]

where $\phi^{[-1]}(\cdot)$ is the pseudo-inverse of φ(t). That is, $\phi\left(\phi^{[-1]}(t)\right) = \min\{t, \phi(0)\}$. In most cases, the generator functions
are dependent on one or more parameters. Parametric generator functions are denoted as $\phi_\theta(t)$, where the subscript θ
denotes the function’s parametric dependence on θ. In this paper, single parameter Archimedean copulas are considered.

Table I. Selected Archimedean Copulas

| Family | C(u, v) | φ(t) | θ ∈ | τ ∈ |
| Gumbel | $\exp\left(-\left[(-\ln u)^{\theta} + (-\ln v)^{\theta}\right]^{1/\theta}\right)$ | $(-\ln t)^{\theta}$ | $[1, \infty)$ | $[0, 1)$ |
| Joe | $1 - \left[(1-u)^{\theta} + (1-v)^{\theta} - (1-u)^{\theta}(1-v)^{\theta}\right]^{1/\theta}$ | $-\ln\left(1 - (1-t)^{\theta}\right)$ | $[1, \infty)$ | $(0, 1)$ |
| Frank | $-\frac{1}{\theta}\ln\left(1 + \frac{(e^{-\theta u}-1)(e^{-\theta v}-1)}{e^{-\theta}-1}\right)$ | $-\ln\left(\frac{e^{-\theta t}-1}{e^{-\theta}-1}\right)$ | $(-\infty, \infty) \setminus \{0\}$ | $(-1, 1)$ |
| AMH | $\frac{uv}{1-\theta(1-u)(1-v)}$ | $\ln\left(\frac{1-\theta(1-t)}{t}\right)$ | $[-1, 1)$ | $(-0.18, 0.33)$ |
| Clayton | $\left[\max\{u^{-\theta} + v^{-\theta} - 1, 0\}\right]^{-1/\theta}$ | $t^{-\theta} - 1$ | $[-1, \infty) \setminus \{0\}$ | $[0, 1)$ |
| GB | $uv\,e^{-\theta \ln(u)\ln(v)}$ | $\ln(1 - \theta\ln t)$ | $(0, 1]$ | $(-0.36, 0]$ |
| C7 | $\left[\max\{u^{\theta}v^{\theta} - 2(1-u^{\theta})(1-v^{\theta}), 0\}\right]^{1/\theta}$ | $\ln(2 - t^{\theta})$ | $(0, 0.5]$ | $(-0.56, 0]$ |
| C8 | $\left(1 + \left[(u^{-1}-1)^{\theta} + (v^{-1}-1)^{\theta}\right]^{1/\theta}\right)^{-1}$ | $(t^{-1} - 1)^{\theta}$ | $[1, \infty)$ | $[0.33, 1)$ |
| C9 | $\max\{1 - \left[(1-u)^{\theta} + (1-v)^{\theta}\right]^{1/\theta}, 0\}$ | $(1-t)^{\theta}$ | $[1, \infty)$ | $[-1, 1)$ |
| Independ. | $uv$ | $-\ln(t)$ | NA | 0 |

The parameter of the Archimedean copula and the rank correlation τ are related by [1]:
\[ \tau = 1 + 4 \int_0^1 \frac{\phi_\theta(t)}{\phi'_\theta(t)} \, dt \tag{6} \]

The simplicity of (6) shows the advantage of working with generator functions rather than the copula.
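As a worked instance of (6), consider the Gumbel generator from Table I. The steps below are a sketch added for illustration; the result agrees with the range of τ listed for the Gumbel family.

\[ \phi_\theta(t) = (-\ln t)^{\theta}, \qquad \phi'_\theta(t) = -\frac{\theta}{t}(-\ln t)^{\theta-1}, \qquad \frac{\phi_\theta(t)}{\phi'_\theta(t)} = \frac{t \ln t}{\theta} \]

\[ \tau = 1 + \frac{4}{\theta}\int_0^1 t \ln t \, dt = 1 + \frac{4}{\theta}\left(-\frac{1}{4}\right) = 1 - \frac{1}{\theta} \]

Solving for the parameter gives θ = 1/(1 − τ), so θ = 1 corresponds to τ = 0 and τ approaches 1 as θ grows.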
Inversion of (6) allows the parameter θ to be computed from a specified rank correlation. Closed form versions of
the inverse of (6) exist for several Archimedean copula generator functions; for those without a closed form solution,
univariate numerical methods may be readily applied. Also note that, depending on the generator function, the range of τ
might not be [−1, 1], as summarized in the last column of Table I. For example, the Clayton family can only model positive
rank correlation. Finally, it is notable that Archimedean copulas can be extended to d-dimensional space if the following
additional conditions of the generator function are met: $\phi(0) = \infty$ and $\phi^{-1}(t)$ is completely monotonic on $[0, \infty)$.
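The numerical inversion of (6) can be sketched as follows. This Python fragment is a minimal illustration rather than the implementation used in the paper: it evaluates (6) by quadrature for a given generator and solves for θ with a bracketing root finder. The function names, the restriction of the Frank family to θ > 0, and the search bracket are assumptions of the sketch; the closed-form Gumbel inverse θ = 1/(1 − τ) is a standard result.

```python
import numpy as np
from scipy import integrate, optimize


def tau_from_generator(phi, dphi, theta):
    """Evaluate (6): tau = 1 + 4 * integral over (0, 1) of phi(t) / phi'(t)."""
    value, _ = integrate.quad(lambda t: phi(t, theta) / dphi(t, theta), 0.0, 1.0)
    return 1.0 + 4.0 * value


def theta_from_tau(phi, dphi, tau, bracket):
    """Invert (6) numerically for theta inside the given bracket."""
    return optimize.brentq(
        lambda theta: tau_from_generator(phi, dphi, theta) - tau, *bracket
    )


# Gumbel generator from Table I and its derivative.
def gumbel_phi(t, theta):
    return (-np.log(t)) ** theta


def gumbel_dphi(t, theta):
    return -theta * (-np.log(t)) ** (theta - 1.0) / t


def gumbel_theta(tau):
    """Closed-form inverse of (6) for the Gumbel family (tau in [0, 1))."""
    return 1.0 / (1.0 - tau)


# Frank generator from Table I, written with log1p/expm1 for numerical
# stability. This form assumes theta > 0, i.e. positive rank correlation.
def frank_phi(t, theta):
    return np.log1p(-np.exp(-theta)) - np.log1p(-np.exp(-theta * t))


def frank_dphi(t, theta):
    return theta * np.exp(-theta * t) / np.expm1(-theta * t)


tau_target = 0.33
theta_frank = theta_from_tau(frank_phi, frank_dphi, tau_target, bracket=(1e-3, 100.0))
theta_gumbel = gumbel_theta(tau_target)
print(f"Frank theta:  {theta_frank:.3f}")
print(f"Gumbel theta: {theta_gumbel:.3f}")
```

The bracket (1e-3, 100) is an arbitrary choice that covers small to very high positive correlation for the Frank family; a different family needs its own bracket taken from the θ column of Table I.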

2.3. Elliptical Copulas


Elliptical copulas are multivariate functions that are elliptically contoured. Two common elliptical copulas are the Gaussian
and Student’s t. Each of these copulas can be extended to d-dimensional space, but the bivariate case is discussed in the
following for clarity. The bivariate distribution function for the Gaussian copula is parameterized by the linear correlation
coefficient ρ and is expressed as:
\[ C(u, v) = \Phi_\rho\left(\Phi_\rho^{-1}(u), \Phi_\rho^{-1}(v)\right) \tag{7} \]
where $\Phi_\rho(\cdot, \cdot)$ is the bivariate Gaussian distribution function and $\Phi_\rho^{-1}(\cdot)$ is the inverse of the univariate Gaussian
distribution function. From (7) the distinction between the Gaussian copula and the Gaussian distribution is apparent.
The formula for Φρ (·, ·) is provided in the Appendix.
Although closed form representations of $\Phi_\rho(\cdot, \cdot)$ and $\Phi_\rho^{-1}(\cdot)$ do not exist, there are several ways of numerically
approximating them [20]. For Gaussian distributed variables, ρ can be computed from τ as:
\[ \rho = \sin\left(\frac{\pi\tau}{2}\right) \tag{8} \]
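The relationship in (8) can be checked by simulation. The sketch below is an illustration with invented function names: it converts a target τ to ρ, draws correlated standard normal pairs, and maps them through the univariate Gaussian distribution function to obtain samples from the Gaussian copula in the uniform/rank domain.

```python
import numpy as np
from scipy import stats


def sample_gaussian_copula(tau, n, rng):
    """Draw n pairs (u, v) from a Gaussian copula with rank correlation tau."""
    rho = np.sin(np.pi * tau / 2.0)  # equation (8)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    z = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n)
    # The univariate Gaussian distribution function maps each column to [0, 1].
    return stats.norm.cdf(z[:, 0]), stats.norm.cdf(z[:, 1])


rng = np.random.default_rng(7)
tau_target = 0.33
u, v = sample_gaussian_copula(tau_target, n=5000, rng=rng)
tau_sample = stats.kendalltau(u, v)[0]
print(f"target tau = {tau_target:.2f}, sample tau = {tau_sample:.3f}")
```

The sample value differs from the target only by sampling noise, which shrinks as n grows.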
The bivariate Student’s t distribution is expressed in terms of a copula as:
\[ C(u, v) = T_{\gamma,\nu}\left(T_{\gamma,\nu}^{-1}(u), T_{\gamma,\nu}^{-1}(v)\right) \tag{9} \]
where $T_{\gamma,\nu}(\cdot, \cdot)$ is the bivariate Student’s t distribution function and $T_{\gamma,\nu}^{-1}(\cdot)$ is the inverse of the univariate Student’s t
distribution function. The formula for $T_{\gamma,\nu}(\cdot, \cdot)$ is provided in the Appendix. As was the case with the Gaussian copula,
closed form expressions of $T_{\gamma,\nu}(\cdot, \cdot)$ and $T_{\gamma,\nu}^{-1}(\cdot)$ do not exist, but there are several ways of numerically approximating
them [20].

Student’s t copula differs from the other copulas considered in this work in that it is parameterized by two values: γ and
ν. The parameter γ can be approximated by ρ for large values of ν. In these cases, γ can be computed from rank correlation
as in (8). The presence of the degrees of freedom parameter ν provides additional flexibility in controlling the structure of
the copula; however, it cannot be directly computed from τ, which limits its practicality. Nonetheless, it is considered in
this paper for comparison purposes. Also note that Student’s t copula is such that as ν increases, it approaches the Gaussian
copula.
An important property of Archimedean and elliptical copulas is a form of symmetry called exchangeability where
C(u, v) = C(v, u). It is shown in the next Section that wind power copulas in the uniform/rank domain are nearly
exchangeable, which further justifies the consideration of Archimedean and elliptical copulas.

3. DEPENDENCY STRUCTURES OF WIND POWER

The previous Section showed that the dependency structure of random variables (x, y) can be described by a copula in the
uniform/rank domain of (u, v). The challenge is to determine which Archimedean and elliptical copulas are the closest fit
to wind power dependency structures. The first step is to inspect the features of the dependency structures in wind power
data, which is the focus of this Section. The general characteristics of wind power dependency structures are observed
from a subset of the data in the National Renewable Energy Laboratory’s (NREL) Eastern Dataset [17, 16].
The NREL Eastern Dataset was originally created to serve as input data for expansive wind integration studies. The
geographic region covered is approximately the eastern half of the United States. Due to its geographic scope, a wide
variety of terrain and wind regimes are represented in the dataset. Ten-minute wind speed values were generated for 1326
hypothetical wind plants using mesoscale modeling at a resolution of 2 km. The models were validated by comparison with
historical measured meteorological data. The conversion from wind speed to wind plant power output accounted for factors
such as number and type of wind turbines, prevailing wind direction, wake losses, local terrain, wind turbine availability
and electrical losses. Further information on how the NREL Eastern Dataset was created and validated is found in [17].
In total, 500 randomly-selected pairs of wind plants are considered in this paper. The analyses are based on hourly
averages of the ten-minute wind power data for a period of one year. The averages have been normalized to the rated
power of their respective wind plant. The selected wind plants vary in capacity from 100 MW to 1435 MW with an
average of 446 MW. The wind plants are separated by distances ranging from 22 km to 2633 km, with an average of
1018 km. The computed mean rank correlation is 0.16, but the median is just 0.09. Despite many of the pairs having low
rank correlation, the selected wind plants represent a wide range of correlation coefficients, ranging from -0.03 to 0.87.
The dependency structures of wind power are more easily observed and interpreted, at least initially, as densities in the
normalized wind power domain. The scatterplots in Figure 1 show the dependency structures from nine different wind
plant pairs from the NREL Dataset. The clustering of data points indicates an area of increased density of the joint density
function h(x, y). These features are highlighted by the overlaid shaded contour plot, where darker shading indicates
greater density. A kernel density estimator has been used to identify the contours. The wind plant pairs in Figure 1 show
typical—yet specific—dependency structures for increasing levels of rank correlation, ranging from small negative values
to large positive values.
Due to the nature of the wind turbine power curves, wind plants often have power output levels near the extremes of low
or high power output. This is exhibited as higher density toward mutually high and low values of x and y. This is observed
in many plots of Figure 1. Aside from this general characteristic, the data appear to be random, with islands of higher
density surrounded by lower density. The overall dependency structure is difficult to ascertain. Fortunately, transforming
the data to the uniform/rank domain decouples the dependency structure from the power curves of the individual wind
plants, allowing the structure to be more easily recognized. To do this, the empirical marginal distribution functions of x
and y are used, and the results plotted in the uniform/rank domain of (u, v). The empirical marginal distribution function
F̂ (x) is defined as:
\[ \hat{F}(x) = \frac{\text{number of samples} \le x}{\text{total number of samples}} \tag{10} \]
The resulting scatterplots are shown in Figure 2.
Closely inspecting the plots in Figure 2 reveals a narrow band along the axes in which no transformed data points reside.
This feature is an artifact of (10). For most wind plants studied, the wind power was zero in approximately five percent
of the data points. For a given wind plant, the application of (10) transforms all the zero values in the normalized wind
power domain to the same value in the uniform/rank domain. For example, if five percent of data points were zero in the
normalized wind power domain, then they will all be transformed to 0.05 in the uniform/rank domain. Though it may
appear unusual, the data points will transform back to zero in the normalized wind power domain if the inverse empirical
distribution function is applied.
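The empty band produced by (10) is easy to reproduce. The following Python sketch uses a small made-up sample (not taken from the NREL Eastern Dataset) in which four of ten values are zero; the function name is illustrative.

```python
import numpy as np


def empirical_cdf(samples):
    """Evaluate (10) at every sample: (number of samples <= x) / (total number)."""
    samples = np.asarray(samples, dtype=float)
    ordered = np.sort(samples)
    # side="right" counts all samples less than or equal to each value.
    counts = np.searchsorted(ordered, samples, side="right")
    return counts / samples.size


# Hypothetical normalized wind power values; 4 of the 10 are zero.
x = np.array([0.0, 0.0, 0.1, 0.4, 0.0, 0.9, 0.25, 0.0, 0.6, 0.75])
u = empirical_cdf(x)
print(u)
```

Every zero is mapped to 0.4, the fraction of zero-valued samples, so no transformed point falls below 0.4. With five percent zeros the same mechanism leaves the narrow band below 0.05 described above.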
Another feature exhibited by most plots in Figure 2 is near-symmetry along the positive diagonal. The exchangeability
property of Archimedean and elliptical copulas also gives rise to this feature, which suggests that they are suitable for
modeling wind power. At low correlation levels (Figure 2A and 2B), the density is nearly uniform. As correlation
increases, the density increases first toward mutually high values of u and v (Figure 2C and 2D), then toward
mutually low values of u and v (Figure 2E and 2F). At the highest correlation levels (Figure 2G to 2I), there is
increased density concentrated along the positive diagonal, with more density at mutually high and low values of u and v.
Copulas that are capable of mimicking these characteristics are promising candidates for wind power dependency structure
modeling.

[Figure 1: 3 × 3 grid of scatterplots, panels A to I, with τ = -0.03, 0.10, 0.21, 0.30, 0.41, 0.51, 0.61, 0.73 and 0.80. Both axes: Normalized Power, from 0 to 1.]

Figure 1. Scatterplots with shaded contour overlay of wind power for nine different pairs of wind plants with increasing rank correlation
values in the normalized wind power domain. Darker shading indicates greater density.

It must be noted that the dependency structures of wind power are not necessarily stationary. Since yearly data are
considered in this paper, dependency structures that might appear on other than yearly time frames, for example seasonally,
are not apparent or investigated. However, the methodology presented hereafter can be trivially altered to consider data of
other time frames.

4. METHODOLOGY

The remainder of this paper is dedicated to evaluating copula-based models of wind power dependency structures. This
Section describes the selection of candidate copulas, and the evaluation procedure.

4.1. Copula Selection


Copula functions have been devised to exhibit a variety of dependency structures. Of particular interest are those
families of copulas that exhibit density characteristics similar to those in Figure 2. Several examples of density functions
corresponding to different copulas are shown in Figure 3. Even though the copulas’ parameters in Figure 3 have been
selected so that each has a rank correlation of 0.33, their dependency structures are distinct. Some of their structures are
similar to that exhibited by wind power at this correlation level, whereas others are not.

[Figure 2: 3 × 3 grid of scatterplots, panels A to I, with τ = -0.03, 0.10, 0.21, 0.30, 0.41, 0.51, 0.61, 0.73 and 0.80. Both axes: Rank, from 0 to 1.]

Figure 2. Scatterplots with shaded contour overlay of wind power for nine different wind plant pairs with increasing rank correlation
values in the uniform/rank domain. Darker shading indicates greater density.

Figure 3A to Figure 3G exhibit the increased density along the positive diagonal and at either or both mutual extremes.
These features are also present in the wind power data at correlation levels near 0.33 (see Figure 2D and 2E). Contrast this
with Figure 3H and Figure 3I. Their densities are remarkably different, which makes the corresponding copulas poor candidates at this correlation level.
This qualitative inspection was repeated for various correlation levels and copula functions. Based on the inspection,
the Gaussian, Student’s t, Gumbel, Joe, Frank, Ali-Mikhail-Haq (AMH), and Clayton copulas are considered for further
evaluation. For contrast, copulas whose density functions do not generally follow Figure 2 are also considered. These
include the Gumbel-Barnett (GB), and those labeled C7, C8 and C9 in Table I. For benchmarking purposes, the Independent
copula, which assumes u and v are independent random variables, is considered.

4.2. Evaluation Procedure


The evaluation is based on the same 500 randomly-selected pairs of wind plants from the NREL Eastern Dataset that were
considered in Section 3. For each wind plant pair, the rank correlation coefficient was computed and used to determine
the parameters of each copula using (6) or (8) for the Archimedean and elliptical copulas, respectively. The ν parameter
of Student’s t copula is computed using a maximum likelihood method. Recall from the last column of Table I that most
copulas have a limited range of correlation they can achieve. In these cases, the copula’s parameter was set so that the
correlation was at the closest limit. For example, if the data had a correlation coefficient of -0.03, the parameter for the
Gumbel copula was set to one, resulting in a correlation of zero.
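The limiting rule described above can be illustrated with a short Python sketch. The names are invented for the illustration, only three families from Table I are included, and the open and closed endpoints of the ranges are not distinguished.

```python
# Range of tau each family can reach, copied from the last column of Table I
# (a subset, for illustration only).
TAU_RANGE = {
    "Gumbel": (0.0, 1.0),
    "AMH": (-0.18, 0.33),
    "GB": (-0.36, 0.0),
}


def clamp_tau(tau, family):
    """Move tau to the closest value the family can represent."""
    low, high = TAU_RANGE[family]
    return min(max(tau, low), high)


def gumbel_theta(tau):
    """Closed-form Gumbel parameter for a given tau; undefined at tau = 1."""
    return 1.0 / (1.0 - tau)


tau_data = -0.03  # rank correlation measured from the data
tau_used = clamp_tau(tau_data, "Gumbel")
theta = gumbel_theta(tau_used)
print(tau_used, theta)
```

For the example in the text, a measured correlation of -0.03 is moved to zero, and the Gumbel parameter becomes one.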


[Figure 3: 3 × 3 grid of copula density contour plots. Panels: A. Gaussian, B. Student’s t, C. Gumbel, D. Joe, E. Frank, F. Ali-Mikhail-Haq, G. Clayton, H. C8, I. C9. Both axes: Rank, from 0 to 1.]

Figure 3. Contours of densities corresponding to various copulas with rank correlation of τ = 0.33. Darker shading indicates greater
density.

Evaluation of the fit of the copula-based density function to the empirical density function is done using two methods.
The first method compares χ2 statistics [21]. This method is selected because it is well known, though it suffers from the
disadvantage that the bin selection is subjective. The second method compares Cramér-von Mises statistics resulting from
so-called blanket tests of the empirical copula [22, 23]. This statistic is considered to be the most objective of the copula test
statistics, and is among the top performers in terms of statistical power [22, 23]. Note that comparative, not absolute,
evaluations of fit are performed in this paper. Evaluation of the goodness-of-fit by way of null hypothesis testing can further
be done by comparing the p-values of the reported test statistics to the desired levels of significance [21], but this is beyond
the scope of this paper.

4.2.1. χ2 Statistic
The χ2 statistic compares the observed number of samples that fall within a user-defined bin to the expected number of
samples in that bin. Each sample (Xi , Yi ) is sorted into the appropriate bin in a B × B contingency table. Each bin of the
contingency table corresponds to a uniformly sized and spaced subspace Bj,k , which in total partition the unit square.The
observed number of samples in the bin at row j, column k—denoted Oj,k —is simply the cardinality of the samples in
Bj,k . The expected number of samples in each bin is found by first computing the density function of each copula model
using (3). The expected number of samples in Bj,k is computed from:
$$E_{j,k} = n \iint_{B_{j,k}} c(u, v)\, du\, dv. \tag{11}$$

where n is the number of samples. In this analysis, hourly data for the duration of one year is used, so that n = 24 × 365 =
8760. The χ2 test statistic is then computed as:

$$\chi^2 = \sum_{j=1}^{B} \sum_{k=1}^{B} \frac{(O_{j,k} - E_{j,k})^2}{E_{j,k}}. \tag{12}$$

As is always the case in computing χ2 statistics, care must be used in selecting the number and size of the bins. Different
choices can lead to different results and conclusions. In this analysis 100 total bins (B = 10) were used. This balanced the
desire to account for rapidly changing densities against the distortion in the statistic caused by bins with too few expected
values. The choice is also in line with the heuristic guideline that the width of each bin in a given dimension be 0.3 times
the standard deviation of the univariate data [24]. Bins with fewer than six observations were pooled together. Furthermore,
if more than 20 percent of the bins had expected values less than five, the χ2 statistic was deemed to be unreliable and it
was excluded from the results. To verify the robustness of the results presented in this paper, the evaluations were repeated
using 400 bins with no change in conclusions.
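
One way to evaluate (11) without numerical integration is to note that the integral of a copula density over a rectangular bin equals the C-volume of that bin, which needs only the copula distribution function at the four corners. The Python sketch below illustrates (11) and (12) under stated assumptions: the samples are already in the rank domain, the Gumbel copula is used as the example model, and the pooling and 20 percent exclusion rules described above are left out. The function names are illustrative and do not come from the paper.

```python
import math

def gumbel_cdf(u, v, theta):
    """Gumbel copula C(u, v); zero on the lower boundary of the unit square."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-s ** (1.0 / theta))

def chi_square(u, v, copula_cdf, B=10):
    """Binned chi-square statistic of rank-domain samples against a copula."""
    n = len(u)
    observed = [[0] * B for _ in range(B)]
    for ui, vi in zip(u, v):
        j = min(int(ui * B), B - 1)   # row of the bin holding ui
        k = min(int(vi * B), B - 1)   # column of the bin holding vi
        observed[j][k] += 1
    stat = 0.0
    for j in range(B):
        for k in range(B):
            u0, u1 = j / B, (j + 1) / B
            v0, v1 = k / B, (k + 1) / B
            # Integral of c(u, v) over the bin, written as a C-volume.
            mass = (copula_cdf(u1, v1) - copula_cdf(u0, v1)
                    - copula_cdf(u1, v0) + copula_cdf(u0, v0))
            expected = n * mass
            stat += (observed[j][k] - expected) ** 2 / expected
    return stat
```

A model with a fixed parameter can be passed as a closure, for example `chi_square(u, v, lambda a, b: gumbel_cdf(a, b, 1.62))`. A bin whose expected count is zero would cause a division by zero, which is one reason the pooling step matters in practice.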

4.2.2. Cramér-von Mises Statistic


The Cramér-von Mises (CvM) statistic is used to complement the subjective χ2 statistic. Whereas the user must decide
the number of bins and sorting rules in computing the χ2 statistic, the CvM statistic is strictly objective. The statistic is
computed from a blanket test that compares an empirical copula to a pre-defined copula. The pre-defined copulas are the
copulas whose fit to wind power is being evaluated. The following describes the computation of the CvM statistic as it
applies to the bivariate case; the d-dimensional case is described in [22, 23].
The process begins by creating the empirical copula from the n samples of x and y. The samples are transformed to the
rank/uniform domain by their empirical distribution functions Ui = F̂ (Xi ) and Vi = Ĝ(Yi ), where F̂ (·) and Ĝ(·) are the
empirical distribution functions of the samples of x and y, respectively. The transformation is such that Ui and Vi ∈ [0, 1].
Next, the bivariate empirical copula Ĉ(a, b) is computed from:
$$\hat{C}(a, b) = \frac{1}{n + 1} \sum_{j=1}^{n} I\{U_j \le a,\ V_j \le b\} \tag{13}$$

where I{·} is the indicator function [25], and a and b ∈ [0, 1]. In other words, the empirical copula is the observed
frequency of Pr{u ≤ a, v ≤ b}. The Cramér-von Mises statistic S is computed from:
$$S = \sum_{i=1}^{n} \left( \hat{C}(U_i, V_i) - C(U_i, V_i) \right)^2 \tag{14}$$

where C(Ui , Vi ) is the Archimedean or elliptical copula being evaluated [22, 23]. A small value of S indicates less error
in the fit.
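
The computation in (13) and (14) can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the paper: the function names are hypothetical, the empirical distribution function is taken as the fraction of samples less than or equal to each value, and the double loop costs on the order of n² operations, which is slow but workable for n = 8760.

```python
from bisect import bisect_right

def empirical_cdf_values(x):
    """U_i = F_hat(X_i): fraction of samples less than or equal to X_i."""
    xs = sorted(x)
    n = len(x)
    return [bisect_right(xs, xi) / n for xi in x]

def empirical_copula(u, v, a, b):
    """C_hat(a, b) as in (13): count of {U_j <= a, V_j <= b} over n + 1."""
    n = len(u)
    hits = sum(1 for uj, vj in zip(u, v) if uj <= a and vj <= b)
    return hits / (n + 1)

def cramer_von_mises(u, v, copula_cdf):
    """S as in (14): summed squared gap between empirical and model copula."""
    return sum((empirical_copula(u, v, ui, vi) - copula_cdf(ui, vi)) ** 2
               for ui, vi in zip(u, v))
```

Here `copula_cdf` is any function of two arguments returning C(a, b) for the family under evaluation, with its parameter already fixed.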

5. COPULA MODEL EVALUATION

5.1. Results
The procedures described in the previous Section were applied to 500 randomly-selected wind plant pairs, with the results
summarized in Table II. The second through fifth columns show the average χ2 statistics, and the number of times the given
copula resulted in: the lowest χ2 statistic (best fit), the second lowest statistic, and the highest χ2 statistic (worst fit). Note
that the totals in columns three through five do not necessarily equal 500, as some pairings were discarded because more
than 20 percent of the bins had expected values less than five. The last four columns show the same analysis performed
using the Cramér-von Mises statistic S.
The relative results for the χ2 and CvM statistics are in general agreement with each other; copulas with lower/higher
χ2 statistics tend to have lower/higher CvM statistics. The most salient disagreement is that the C8 copula had a higher
χ2 statistic than the C9, whereas it did not in terms of the CvM statistic. Figures 4 and 5 plot the statistics with respect to
correlation coefficient for all 500 pairs.
The first notable observation from Table II and Figures 4 and 5 is that no single copula family was the best fit for all
wind plant pairs. This is expected due to the variety of dependency structures that wind power exhibits, as discussed in
Section 3. Despite that, there are copula families that are clearly better suited for wind power modeling than others. The
parametric Archimedean and elliptical copula families can be sorted into five tiers based on their performance.

Table II. Summary of Copula Evaluation

                   |------------ χ2 statistic ------------|  |------------ CvM statistic -----------|
Family             χ̄2       # Best  # 2nd Best  # Worst     S̄        # Best  # 2nd Best  # Worst
Gaussian           218      45      79          0           0.22     32      91          0
Student’s t        209      75      128         0           0.22     34      80          0
Gumbel             187      173     78          0           0.14     205     98          0
Joe                295      55      73          0           0.34     86      91          0
Frank              225      83      58          0           0.25     68      66          0
Ali-Mikhail-Haq    657      31      55          0           2.27     30      63          0
Clayton            663      29      17          0           1.10     41      7           0
Gumbel-Barnett     1492     3       0           0           11.41    2       2           0
C7                 1492     0       4           0           11.41    2       2           0
C8                 37,282   0       0           200         358.41   0       0           500
C9                 99,215   0       0           296         12.14    0       0           0
Independence       1491     0       0           0           11.41    0       0           0

[Figure 4 panels: A. Gaussian; B. Student’s t; C. Gumbel; D. Joe; E. Frank; F. AMH; G. Clayton; H. Independent. Abscissa: Correlation Coefficient (τ), 0 to 0.8. Ordinate: χ2, 10^2 to 10^4. Legend: Best; Second Best.]

Figure 4. Scatterplots of χ2 statistic versus correlation coefficient. Note the logarithmic ordinate scaling.

In the first tier is the Gumbel family. Gumbel family copulas resulted in the lowest average χ2 and S statistics (187 and
0.14), and had the highest number of best fits (173 and 205). The Gumbel family copulas were also never the worst fitting.

[Figure 5 panels: A. Gaussian; B. Student’s t; C. Gumbel; D. Joe; E. Frank; F. AMH; G. Clayton; H. Independent. Abscissa: Correlation Coefficient (τ), 0 to 0.8. Ordinate: S, 10^−1 to 10^1. Legend: Best; Second Best.]

Figure 5. Scatterplots of CvM statistic versus correlation coefficients. Note the logarithmic ordinate scaling.

In the second tier are the Gaussian, Student’s t, Frank and Joe families. When compared to the Gumbel, their average
χ2 statistics were between 12 and 58 percent higher, and their CvM statistics were between 57 and 142 percent higher.
The elliptical copulas had lower average χ2 and CvM statistics, but the Frank and Joe had a greater number of best fits.
In the third tier are the Clayton and Ali-Mikhail-Haq families. When compared to the Gumbel, their average χ2 statistics
are over three times higher, and their CvM statistics are between seven and seventeen times higher. The Gumbel-Barnett
and C7 are in the fourth tier, with nearly identical fit as measured by both the χ2 and CvM statistics. Their average χ2 and
CvM statistics were nearly eight and eighty times larger than the Gumbel’s, respectively. Though they did have a small
number of best and second best fits, further investigation showed that in these occurrences the rank correlation was near
zero where almost all copulas had similar statistics. In the fifth tier are the C8 and C9 copulas. The fit of these copulas is
considerably worse than the others considered.
It is expected that the fourth and fifth tier copulas performed poorly as they were selected specifically for contrast.
Their densities are dissimilar to those shown in Figure 2 for most rank correlations. The results indicate that an improperly
selected copula can lead to a fit that is several orders of magnitude worse than the best fitting copula as measured by χ2
and CvM statistics. The benchmark Independence copula surprisingly outperformed the copulas in the fifth tier, and was
comparable to the fourth tier in terms of average χ2 and CvM statistics. A conclusion is that in some cases, an assumption
of no correlation between x and y leads to more accurate models than a poorly chosen copula.
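
The tier comparisons quoted above can be checked directly from the averages in Table II. The short Python sketch below simply divides each family's average statistics by the Gumbel's. Because the tabulated averages are rounded, the ratios are approximate; the dictionary and function names are illustrative only.

```python
# Averages copied from Table II: (mean chi-square, mean CvM statistic S).
table_ii = {
    "Gaussian": (218, 0.22),
    "Student's t": (209, 0.22),
    "Gumbel": (187, 0.14),
    "Joe": (295, 0.34),
    "Frank": (225, 0.25),
    "Ali-Mikhail-Haq": (657, 2.27),
    "Clayton": (663, 1.10),
    "Gumbel-Barnett": (1492, 11.41),
}

def ratio_to_gumbel(family):
    """Return (chi-square ratio, CvM ratio) of a family relative to the Gumbel."""
    chi2, s = table_ii[family]
    g_chi2, g_s = table_ii["Gumbel"]
    return chi2 / g_chi2, s / g_s

for family in table_ii:
    r_chi2, r_s = ratio_to_gumbel(family)
    print(f"{family:16s} chi2 x{r_chi2:5.2f}   S x{r_s:6.2f}")
```

A ratio of 1.58 for the Joe family, for example, corresponds to the "58 percent higher" figure for the second tier.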

5.2. Rank-Based Guidelines


It is notable that though the Gumbel family copulas had the greatest number of best-fits, they did so in less than half of the
500 wind plant pairings. A natural question arising from this observation is: Under what circumstances should the Gumbel
copula be used, and under what circumstances should it not be used?

Table III. χ̄2 Values For Rank Correlation Ranges

τ range            Gaussian  Student’s t  Gumbel  Joe    Frank  AMH    Clayton
−0.03 ≤ τ < 0      157       155          162     162    158    158    162
0 ≤ τ < 0.20       184       179          174     202    187    212    300
0.20 ≤ τ < 0.40    295       287          219     437    328    1028   1316
0.40 ≤ τ < 0.60    396       333          252     1012   403    2139   3198
0.60 ≤ τ < 0.87    398       283          210     1490   376    5007   3578

Table IV. S̄ Values For Rank Correlation Ranges

τ range            Gaussian  Student’s t  Gumbel  Joe     Frank   AMH      Clayton
−0.03 ≤ τ < 0      0.1079    0.1066       0.1448  0.1449  0.1097  0.1092   0.1448
0 ≤ τ < 0.20       0.1632    0.1613       0.1222  0.1736  0.1644  0.2312   0.4443
0.20 ≤ τ < 0.40    0.4396    0.4401       0.1609  0.5465  0.4819  2.2202   2.6127
0.40 ≤ τ < 0.60    0.4031    0.4105       0.2211  1.2542  0.5145  9.9924   3.7802
0.60 ≤ τ < 0.87    0.2256    0.2370       0.1612  1.3101  0.4016  29.7180  3.1484

Table V. Copula Application Guidelines

τ range Recommended Copula(s)


−0.03 ≤ τ < 0 Student’s t, Gaussian, AMH, Frank
0 ≤ τ < 0.20 Gumbel, Student’s t, Gaussian, Frank
0.20 ≤ τ < 0.40 Gumbel, Student’s t, Gaussian
0.40 ≤ τ < 0.60 Gumbel
0.60 ≤ τ < 0.87 Gumbel, Student’s t

The range of rank correlation coefficients over which a copula is skillful—that is, it results in either the lowest or second
lowest χ2 or CvM statistic—can be deduced from Figures 4 and 5. Clearly copulas such as the Clayton and Ali-Mikhail-
Haq provide good fits over limited ranges of τ , whereas others such as the Gumbel and Student’s t provide good fits over a
wide range. Tables III and IV show the average χ2 and CvM statistics for the top three tiers of copulas for different ranges
of rank correlation values. Highlighted in bold is the copula family with the lowest average statistic for a given range. The
trends and conclusions are the same for χ2 and CvM statistics.
For small, negative correlations, Student’s t, Gaussian, Ali-Mikhail-Haq and Frank copulas resulted in similar statistics
that are lower than those of the other copulas. It is not surprising that they outperformed the others, given that the other
copulas cannot model negative correlations (see Table I). For positive correlations less than 0.20, the Gumbel was the most
skillful. However, from Figures 4 and 5 and Tables III and IV, Student’s t, Gaussian and Frank copulas are also skillful in
this range, at times outperforming the Gumbel, with similarly low statistics. Any of these are suitable copulas to select in
this range of correlation.
In the range of 0.20 ≤ τ < 0.40, the Gumbel, Student’s t and Gaussian copulas all have χ2 and CvM statistics below
300 and 0.45, respectively, and can be reasonably used to model wind power. Tables III and IV show a notable increase
in the χ2 and CvM statistics of copulas in the range of 0.40 ≤ τ < 0.60. The Gumbel copula’s statistics remain low, and
should be used for rank correlations in this range. At the highest correlation levels, when τ is between 0.60 and 0.87, the
Gumbel and Student’s t are the most skillful and are the best choices. It can be argued that the Gaussian should also be
recommended as it has a lower CvM statistic than the Student’s t in this range, but a higher χ2 statistic. Overall, the top
performing copulas reasonably fit the data and are suitable models. The range of rank correlation in which the models
averaged the highest test statistics (worst fitting) is 0.40 ≤ τ < 0.60.
For practical purposes, the results make a compelling case to use the Gumbel copula to model wind power dependence,
particularly when the rank correlation is 0.20 or greater. For positive values less than 0.20, the top tier copulas all perform
well, and any one could be used. Though Student’s t distribution was competitive with the Gumbel, the fact that it requires
two parameters (one of which cannot be directly computed from τ) makes this selection impractical for most applications.
A summary of these guidelines is provided in Table V. Further details of the Gumbel copula, including the closed form
solution of the inverse of (6) and visualizations of its density function are found in the Appendix.
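
As a minimal sketch of how the guidelines might be applied in software, the Python snippet below encodes Table V as a lookup on τ and computes the Gumbel parameter from the closed form in (A-5). The names `GUIDELINES`, `recommended_copulas` and `gumbel_theta` are hypothetical and chosen only for illustration; the ranges and family lists are copied from Table V.

```python
# (lower bound, upper bound, recommended families), copied from Table V.
GUIDELINES = [
    (-0.03, 0.00, ["Student's t", "Gaussian", "AMH", "Frank"]),
    (0.00, 0.20, ["Gumbel", "Student's t", "Gaussian", "Frank"]),
    (0.20, 0.40, ["Gumbel", "Student's t", "Gaussian"]),
    (0.40, 0.60, ["Gumbel"]),
    (0.60, 0.87, ["Gumbel", "Student's t"]),
]

def recommended_copulas(tau):
    """Return the families recommended in Table V for rank correlation tau."""
    for lower, upper, families in GUIDELINES:
        if lower <= tau < upper:
            return families
    raise ValueError("tau is outside the range covered by Table V")

def gumbel_theta(tau):
    """Gumbel parameter from tau using (A-5): theta = 1 / (1 - tau)."""
    if not 0.0 <= tau < 1.0:
        raise ValueError("the Gumbel copula requires 0 <= tau < 1")
    return 1.0 / (1.0 - tau)
```

For instance, `recommended_copulas(0.5)` returns only the Gumbel family, and `gumbel_theta(0.5)` gives a parameter of 2.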

6. APPLICATION VIGNETTE

This Section illustrates an application of copula-based wind power models for Monte Carlo simulations in a hypothetical
scenario. Suppose a system operator wishes to perform a preliminary study of the operational effects of the connection
of two proposed wind plants to their system. The proposed wind plants are separated by a distance of 305 km and have
rated capacities of 425.5 MW and 1229 MW. Among other considerations, assume that the system operator is concerned
that when the combined power of the wind plants exceeds 1300 MW, corrective measures in the form of uneconomically
re-dispatching generators may be needed in order to maintain system security. The operator wishes to estimate how many
hours per year this situation could occur.
In order to utilize copulas to model the wind power from the wind plants, their rank correlation must be specified. In
the absence of specific details on terrain and wind regimes, the rank correlation coefficient can be estimated from the
separation distance of the wind plants as in [10]. For a separation of 305 km, the rank correlation is estimated to be 0.38.
It is anticipated that the normalized power output from the wind plants can be reasonably modeled by Beta distributions
with parameters (α, β) of (0.49, 1.03) for the first plant and (0.54, 0.97) for the second [26]. Details of the Beta
distribution’s density function are found in the Appendix.

6.1. Data Synthesis


The first step in the study is to synthesize wind power values for each wind plant while maintaining the desired rank
correlation of 0.38, and a realistic dependency structure. Based upon the guidelines developed in Table V, the Gumbel
copula family is selected to model the dependency structure. The copula parameter is then fit to the specified rank
correlation coefficient using (6). (See Appendix for details). The resulting parameter θ is 1.62.
Next, 8760 pairs of random samples for the Gumbel copula are drawn. Generating random variables from copulas is
accomplished through the following procedure [1]. First, two independent random variables u and w that are uniform on
[0, 1] are drawn. The inverse conditional distribution function of a Gumbel copula with u held constant is then computed.
This function is denoted $c_u^{-1}(\cdot)$. It can be found by computing $c_u(v) = \partial C(u, v)/\partial u$, where C(u, v) is the Gumbel copula
with θ set to 1.62, and then finding the inverse function. After this is found, let $v = c_u^{-1}(w)$. The resulting pair (u, v) are
random variables with uniform marginal distributions, the Gumbel dependency structure, and rank correlation approaching
0.38.
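
The sampling procedure can be sketched in Python as follows. This is an illustration under stated assumptions rather than the paper's implementation: the conditional distribution is inverted numerically by bisection instead of in closed form, the uniform draws are clamped slightly away from 0 and 1 to avoid taking the logarithm of zero, and all function names are hypothetical. A simple O(n²) rank correlation routine is included so that the synthesized data can be checked against the target value.

```python
import math
import random

def gumbel_conditional(u, v, theta):
    """c_u(v) = dC(u, v)/du for the Gumbel copula."""
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    c = math.exp(-s ** (1.0 / theta))
    return c * s ** (1.0 / theta - 1.0) * (-math.log(u)) ** (theta - 1.0) / u

def sample_gumbel(n, theta, seed=0):
    """Draw n pairs (u, v) by inverting c_u(v) = w with bisection."""
    rng = random.Random(seed)
    eps = 1e-12
    pairs = []
    for _ in range(n):
        u = min(max(rng.random(), eps), 1.0 - eps)
        w = min(max(rng.random(), eps), 1.0 - eps)
        lo, hi = 0.0, 1.0
        for _ in range(60):               # c_u(v) increases from 0 to 1 in v
            mid = 0.5 * (lo + hi)
            if gumbel_conditional(u, mid, theta) < w:
                lo = mid
            else:
                hi = mid
        pairs.append((u, 0.5 * (lo + hi)))
    return pairs

def kendall_tau(pairs):
    """Rank correlation from concordant minus discordant pair counts."""
    n = len(pairs)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (pairs[i][0] - pairs[j][0]) * (pairs[i][1] - pairs[j][1])
            total += (prod > 0) - (prod < 0)
    return 2.0 * total / (n * (n - 1))
```

Calling `sample_gumbel(8760, 1.62)` mirrors the sample size used in the vignette; the rank correlation of the result should be close to, but not exactly, the target because of sampling variability.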
A scatter-histogram of the simulated points (u, v) is shown in the upper left plot of Figure 6. Note that the marginal
histograms are nearly uniform as expected since u and w are drawn from uniform distributions and the transformation
from w to v preserves the distribution. The plot shows increased density toward mutually high ranks, which is similar to
that in Figure 2E. The computed rank correlation of the synthesized data is 0.377.
The generated pairs (u, v) are then transformed into the normalized wind power domain by $x = B^{-1}_{0.49,1.03}(u)$ and
$y = B^{-1}_{0.54,0.97}(v)$, where $B^{-1}_{\alpha,\beta}(\cdot)$ is the inverse Beta distribution function with parameters α, β. The inverse Beta distribution
function is computed using numerical methods. After transformation via the respective inverse Beta distribution functions,
the simulated pairs (x, y) appear as in the lower left plot of Figure 6. From these data the number of hours that the wind
plants satisfy the condition 425.5Xi + 1229Yi > 1300 can be computed. The Monte Carlo analysis estimates that the
aggregate wind power from the wind plants under study will exceed 1300 MW for 790 hours per year.
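
The text does not say which numerical method is used to invert the Beta distribution function. One possible approach, sketched below in Python, evaluates the Beta distribution function with the standard continued-fraction expansion of the regularized incomplete beta function and inverts it by bisection. The function names, default arguments and iteration counts are assumptions made for illustration. The final function counts the simulated hours in which 425.5x + 1229y exceeds 1300 MW.

```python
import math

def _beta_cf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (Lentz's method)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        for aa in (m * (b - m) * x / ((qam + m2) * (a + m2)),
                   -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def beta_cdf(x, a, b):
    """Beta distribution function with parameters a (alpha) and b (beta)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log1p(-x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b

def beta_ppf(p, a, b, iters=60):
    """Inverse Beta distribution function, found by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if beta_cdf(mid, a, b) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def hours_above(pairs, limit=1300.0, cap_x=425.5, cap_y=1229.0,
                params_x=(0.49, 1.03), params_y=(0.54, 0.97)):
    """Count rank-domain pairs whose combined power exceeds the limit (MW)."""
    count = 0
    for u, v in pairs:
        x = beta_ppf(u, *params_x)      # normalized power of the first plant
        y = beta_ppf(v, *params_y)      # normalized power of the second plant
        if cap_x * x + cap_y * y > limit:
            count += 1
    return count
```

In pure Python the nested bisection is slow for 8760 pairs; a vectorized library routine for the inverse Beta distribution function would be the practical choice where one is available.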

6.2. Validation
The wind plants in this vignette correspond to two wind plants (IDs: 1027 and 3051) in the NREL Eastern Dataset. The
estimate from the Monte Carlo simulation can thus be compared to the data from these wind plants. The upper right plot
in Figure 6 shows the scatter-histograms of the wind plant power in the uniform/rank domain. As described in Section 3,
the absence of points from the thin band along the axes of this plot is an artifact of the empirical distribution function
transformation. Data points lying along the edge of the band are transformed to zero in the normalized wind power
domain. Comparing the top plots of Figure 6 qualitatively shows the skill of the copula at producing realistic data in
the uniform/rank domain. The plots exhibit very similar features, such as increased density in the upper right corner. The
corresponding χ2 and CvM statistics of the fit of the Gumbel copula to the data are 159 and 0.134, indicative of a good fit.
Comparing the bottom plots of Figure 6 shows how the simulated data compares with NREL wind plants in the
normalized wind power domain. The general features are the same: increased density in the lower left and upper right
corners. There are differences, however. The clustering in the upper right corner is less pronounced, and there are more
occurrences of near full power output by either wind plant in the simulated data. The primary contributor to these
structural differences is the assumption of marginal Beta distributions. For example, the empty bands near nominal
power in the NREL data set are from losses and localized diversity in power from wind turbines in the same wind plant.
These characteristics are not adequately captured by the Beta distribution. Despite this, the simulated data is a reasonable
model, retaining the salient features and preserving the desired rank correlation. A non-parametric or mixture model of
the marginal distribution [27, 14] could be used rather than the Beta distribution if increased accuracy is required, but
further investigation of this is beyond the scope of this work. Nonetheless, the NREL data indicate 854 hours of combined
output above 1300 MW. This is less than 10 percent different from the result of the Monte Carlo simulation
and validates the wind power dependency structure model.

[Figure 6 panels: Simulated Data (left column); NREL Data (right column). Top row axes: Rank versus Rank, 0 to 1. Bottom row axes: Normalized Power versus Normalized Power, 0 to 1.]

Figure 6. Scatter-histograms of simulated data in the uniform/rank domain (upper left), in the normalized wind power domain (lower
left); and data from wind plants 1027 and 3051 in the uniform/rank domain (upper right) and in the normalized wind power domain
(lower right).

7. CONCLUSION AND OUTLOOK

The increased desire to model the dependency of stochastic variables in the power system has naturally led to the use
of copulas. When using copulas, it is important to select a family that is sufficiently capable of modeling the desired
dependency structure. This paper evaluated several Archimedean and elliptical copula families for their fit to model wind
power dependency structures. Several application guidelines were developed from the analysis. In general, it is possible
to achieve accurate models as measured by the bivariate χ2 and Cramér-von Mises statistics. The Gumbel copula is
the most skillful at modeling wind power dependence, particularly at rank correlation coefficient values greater than 0.20.
Student’s t and Gaussian copulas are also skillful, though on average to a lesser extent. For rank correlation values of 0.20
or less, the Gumbel, Student’s t, Gaussian and Frank all had similarly low χ2 statistics. For negative correlations that are
small in magnitude, the Student’s t, Gaussian, Ali-Mikhail-Haq or Frank copulas should be used. The reader is advised that while
the results are gathered from 500 wind plant pairs representing a wide diversity of terrain and wind regimes, the fit of the
copulas to different conditions, such as different averaging periods or sub-yearly time frames, may yield different results.
The most significant conclusion of this paper is a warning against the tempting de facto use of the Gaussian copula to
model wind power dependency. Rather, the Gumbel copula is preferred. The described methodology for evaluating copulas
can be extended from the bivariate case to higher dimensional cases. Future work in this area includes: computational
aspects of general multivariate copula modeling for large systems; investigating terrain- and wind-regime specific copula
selection guidelines; analyzing the influence of sample rate on copula performance for multi-timescale simulation; and
developing a single-parameter Archimedean copula designed specifically for wind power dependence. The latter could
lead to easily implemented models of wind power dependency that can be utilized in integration studies and research on
topics such as stochastic optimization in power systems.

APPENDIX

The standard bivariate Gaussian distribution function is [20]:


$$\Phi_\rho(a, b) = \int_{-\infty}^{a} \int_{-\infty}^{b} \frac{1}{2\pi\sqrt{1 - \rho^2}}\, e^{-(x^2 - 2\rho x y + y^2)/(2(1 - \rho^2))}\, dx\, dy. \tag{A-1}$$

The standard bivariate Student’s t distribution function is [20]:


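$$T_{\gamma,\nu}(a, b) = \int_{-\infty}^{a} \int_{-\infty}^{b} \frac{1}{2\pi\sqrt{1 - \gamma^2}} \left( 1 + \frac{x^2 + y^2 - 2\gamma x y}{(1 - \gamma^2)\nu} \right)^{-\frac{\nu + 2}{2}} dx\, dy. \tag{A-2}$$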

The Beta density function used in the application vignette is:

$$f(x) = \frac{x^{\alpha - 1} (1 - x)^{\beta - 1}}{B_{\alpha,\beta}} \tag{A-3}$$

where

$$B_{\alpha,\beta} = \int_{0}^{1} x^{\alpha - 1} (1 - x)^{\beta - 1}\, dx. \tag{A-4}$$

The Gumbel copula is of particular interest as it is the overall best fitting copula. The computation of θ from τ using (6)
in several cases yields a closed form solution. For the recommended Gumbel copula, the closed form solution is:
$$\theta = \frac{1}{1 - \tau}. \tag{A-5}$$
The density c(u, v) corresponding to the Gumbel copula for varying rank correlations is presented in Figure 7. Figure 7
can be compared to Figure 2 to qualitatively evaluate the fit of Gumbel copulas to model wind power dependence.

[Figure 7: nine panels; both axes of every panel: Rank, 0 to 1. Panel titles as extracted: A. Gaussian; B. Student’s t; C. Gumbel; D. Joe; E. Frank; F. Ali-Mikhail-Haq; G. Clayton; H. C8; I. C9.]

Figure 7. The density of the Gumbel copula for increasing rank correlations. Higher density is indicated by darker shading.

REFERENCES

1. Nelsen R. An Introduction to Copulas. Second edn., Springer: New York, NY, 2006.
2. Hasche B. General statistics of geographically dispersed wind power. Wind Energy 2010; 13:773–784,
doi:10.1002/we.397.

3. Wan Y. Wind power plant behaviors: Analyses of long-term wind power data. Technical Report NREL/TP-500-36551,
NREL 2004.
4. Ernst B, Wan Y, Kirby B. Short-term power fluctuation of wind turbines: Analyzing data from the German 250-MW
measurement program from the ancillary services viewpoint. Technical Report NREL/CP-500-26722, NREL 1999.
5. Wan Y, Milligan M, Parsons B. Output power correlation between adjacent wind power plants. Journal of Solar
Energy Engineering 2003; 125:551–555, doi:10.1115/1.1626127.
6. Holttinen H. Hourly wind power variations in the Nordic countries. Wind Energy 2005; 8:173–195,
doi:10.1002/we.144.
7. Osborn D, Hendersen M, Nickell B, Lasher W, Liebold C, Adams J, Caspary J. Driving forces behind wind. Power
& Energy Magazine 2011; 9(6):60–74, doi:10.1109/MPE.2011.942474.
8. Kroese D, Taimre T, Botev Z. Handbook of Monte Carlo Methods. Second edn., John Wiley & Sons: Hoboken, NJ,
2011.
9. Papaefthymiou G, Kurowicka D. Using copulas for modeling stochastic dependence in power system uncertainty
analysis. IEEE Trans. Power Systems 2009; 24(1):40–49, doi:10.1109/PES.2009.5275265.
10. Louie H. Evaluating Archimedean copula models of wind speed for wind power modeling. Proc. Power Africa,
Johannesburg, South Africa, 2012.
11. Hagspiel S, Papaemannouil A, Schmid M, Andersson G. Copula-based modeling of stochastic wind power in Europe
and implications to the Swiss power grid Aug 2012, doi:10.1016/j.apenergy.2011.10.039.
12. Pinson P, Madsen H, Nielsen H, Papaefthymiou G, Klöckl B. From probabilistic forecasts to statistical scenarios of
short-term wind power production. Wind Energy 2009; 12:51–62, doi:10.1002/we.

13. Pinson P, Girard R. Evaluating the quality of scenarios of short-term wind power generation. Applied Energy Aug
2012; 96:12–20, doi:10.1016/j.apenergy.2011.11.004.
14. Gill S, Stephen B, Galloway S. Wind turbine condition assessment through power curve copula modeling. IEEE
Trans. Sustainable Energy 2012; 3(1):94–101, doi:10.1109/TSTE.2011.2167164.
15. Bessa R, Mendes J, Miranda V, Botterud A, Wang J, Zhou Z. Quantile-copula density forecast for wind power
uncertainty modeling. Proc. IEEE PowerTech, Trondheim, Norway, 2011.
16. NREL. Wind integration datasets Aug 2010. URL http://www.nrel.gov/wind/integrationdatasets.
17. Brower M. Development of eastern regional wind resource and wind plant output datasets. Technical Report
NREL/SR-550-46764, NREL, Golden, CO Dec 2009.
18. Joe H. Multivariate Models and Dependence Concepts. Chapman and Hall/CRC: London, UK, 1997.
19. Genest C, Rivest L. Statistical inference procedures for bivariate Archimedean copulas. Journal of American
Statistical Association Sep 1993; 88(423):1034–1043.
20. Genz A. Numerical computation of rectangular bivariate and trivariate Normal and t probabilities. Statistics and
Computing 2004; 3(14):251–260.
21. Dytham C. Choosing and using statistics: A biologist’s guide. Fourth edn., Wiley–Blackwell: Chichester, UK, 2011.
22. Genest C, Rémillard B, Beaudoin D. Goodness-of-fit tests for copulas: A review and power study. Insurance:
Mathematics and economics 2009; 44:199–213.
23. Berg D. Copula goodness-of-fit testing: an overview and power comparison. European Journal of Finance 2009;
7-8(15):675–701.
24. National Institute of Standards and Technology. Chi squared goodness of fit test Apr 2003. URL
http://www.itl.nist.gov/div898/software/dataplot/refman1/auxillar/chsqgood.htm.
25. Cormen T, Leiserson C, Rivest R, Stein C. Introduction to Algorithms. Third edn., MIT Press: Cambridge, MA, 2001.
26. Louie H. Evaluation of probabilistic models of wind plant power output characteristics. Proc. Probabilistic Methods
Applied to Power Systems, Singapore, 2010.
27. Stephen B, Galloway S, Hill D, McMillan. A copula model of wind turbine performance. IEEE Trans. Power Systems
May 2011; 26:965–966, doi:10.1109/TPWRS.2010.2073550.
