

Estimation of plotting position for flood frequency analysis

Conference Paper · November 2016


2 authors:

Rob J. Connell Magdy Mohssen


Scion, Christchurch Otago Regional Council

All content following this page was uploaded by Rob J. Connell on 12 January 2017.



Estimation of Plotting Position for Flood Frequency Analysis
R.J. Connell (1), M. Mohssen (2)
(1) Independent Researcher, Christchurch, New Zealand
E-mail: robertconnell@clear.net.nz
(2) Senior Lecturer, Lincoln University, Canterbury, New Zealand

Abstract

The probability distribution of flood peaks for a given river is unknown, and many models have been
developed to simulate this hydrological process. Plotting analyses in New Zealand presently use the
Gringorten plotting position, which sets a 178 year return period for the largest flood in 100 years.
Gumbel, in his 1958 book “Statistics of Extremes”, used the Weibull plotting position, which gives a
101 year return period for 100 years of data. Plotting the two positions on the probability density
function shows that it is not clear which to adopt. The difference between the two plotting positions
makes a large change to the calculated design flood and to the economic analysis. A 100 year flood has
one chance in 100 of being exceeded in any one year (1 % annual exceedance probability (AEP)), from
which it can be calculated that over a 100 year period there is a 37 % chance it will not be exceeded
(for a 200 year flood in 100 years it is 61 %). Integrating these probabilities over the full range of
floods gives the mean probability of the largest flood as 1/(n+1). However, generating a simulated
series from the EV1 distribution showed that the average value of the largest flood was at the
Gringorten plotting position. For the GEV distribution this position varied gradually from the
Gringorten plotting position to either a higher or lower return period using the EV2 distribution or
EV3 distribution respectively. It is proposed that plotting positions be examined in more detail, in
collaboration with other researchers.

1. INTRODUCTION

The present practice in New Zealand for estimating the return period of floods uses the annual maximum
series of past floods with the Gringorten plotting position, (i-0.44)/(n+0.12), where i is the rank (in
descending order) and n is the number of years of record (Connell & Pearson, 2001; Connell, 1991). For
the largest flood in an annual series of 100 years, i.e. i = 1 and n = 100, this gives a 178 year
return period as its plotting position (i.e. frequency) for the analysis. Conversely, Gumbel
recommended the Weibull plotting position, i/(n+1), which gives a 101 year return period for the
largest flood in the 100 years of data.
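
The two formulae quoted above can be evaluated directly. The following is a minimal sketch in Python (the function names are illustrative and are not taken from the paper) that computes the return period each formula assigns to the largest flood in a 100 year record.

```python
def gringorten_probability(rank: int, n_years: int) -> float:
    """Exceedance probability (i - 0.44) / (n + 0.12); rank 1 is the largest flood."""
    return (rank - 0.44) / (n_years + 0.12)


def weibull_probability(rank: int, n_years: int) -> float:
    """Exceedance probability i / (n + 1); rank 1 is the largest flood."""
    return rank / (n_years + 1)


def return_period(exceedance_probability: float) -> float:
    """Return period in years is the reciprocal of the annual exceedance probability."""
    return 1.0 / exceedance_probability


n_years = 100
t_gringorten = return_period(gringorten_probability(1, n_years))
t_weibull = return_period(weibull_probability(1, n_years))
ratio = t_gringorten / t_weibull

print(f"Gringorten: {t_gringorten:.1f} years")
print(f"Weibull:    {t_weibull:.1f} years")
print(f"Ratio:      {ratio:.2f}")
```

The ratio of the two return periods is the "almost a factor of two" difference for the highest observed flow.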

The difference in exceedance probabilities between the Weibull and Gringorten plotting positions for
the highest observed flow is almost a factor of two. Therefore estimation of the probability of past
floods is extremely important in the analysis to predict future flood probabilities and carry out economic
analysis.

Much of the original research on this subject was presented by Gumbel in his book “Statistics of
Extremes” (Gumbel, 1958), drawing on his actuarial background in risk assessment. Since then,
considerable research has been published. However, since the book was reprinted in 2004, the work of
Gumbel has been more accessible and has been re-examined in detail. This includes the plotting position
problem, for which Gumbel (1958) stated that the Weibull plotting position should be used. This was
after considerable examination of the statistics of extreme values, and after stating that the analysis
is very complicated. Gumbel emphasises using the Weibull plotting position, as otherwise the flood risk
will be underestimated.

Of course, the analysis assumes that conditions remain constant with time, which they do not, as there
is climate change, 10 year southern oscillation indices (ENSO, IPO etc.) and seasonal changes (larger
floods can occur in summer when more moisture is in the air). However, these non-stationarity factors
have been omitted from this paper, as it looks at the fundamental statistical analysis of floods.

In recent research, Makkonen et al. (2013) and Hsian (2012) presented analyses that gave the
Weibull plotting position of 1/(n+1), e.g. a 101 year return period for the largest observed flood in
100 years. Makkonen et al. (2013) go further and refute the arguments of Cunnane (1978).

Many other plotting positions have been developed, as outlined in Yahaya et al. (2012), which presented
ten different formulae. The formulae gave plotting positions for the largest flow ranging from the
length of the record to twice the length of the record, a factor of two. Of course, statistical
analysis can be done without plotting positions. The initial part of the work by Ware & Ladd (2003)
presented 12 different methods to estimate the discharges of large return periods on the Waimakariri
River (near Christchurch, New Zealand), which gave 12 different results.

This paper presents a theoretical analysis to determine the average plotting position, determined from
a fundamental or Bayesian Statistics viewpoint and examines how to compare it with the Gumbel
distribution and the Generalised Extreme Value distribution. It looks more specifically at the plotting
position of the largest value of an annual series of flood events.

1.1. Plotting Position Problem

Gumbel analysis plots the probability of past floods on Gumbel paper. This is in terms of the reduced
variate and its relationship to the probability density function (PDF) and cumulative distribution
function (CDF). The formula -ln(-ln(CDF)) = (x-μ)/β linearises this relationship, where (x-μ)/β is the
reduced variate, x is a flood measurement, the Gumbel location parameter μ is the mode, and the scale
parameter β is related to the standard deviation. Figure 1 shows the results from a Gumbel distribution
with a mean of 1225 m³/s and a standard deviation of 500 m³/s.
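
As an illustration of the reduced variate, the sketch below converts the mean and standard deviation quoted for Figure 1 into the Gumbel parameters μ and β and then evaluates the flow for a chosen return period. It assumes the standard moment relations for the Gumbel distribution (mean = μ + γβ and standard deviation = πβ/√6), which are not spelled out in the text; the function names are illustrative only.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant


def gumbel_parameters(mean: float, std_dev: float) -> tuple[float, float]:
    """Location (mode) and scale from the mean and standard deviation.

    Assumes the standard Gumbel moment relations:
    std_dev = pi * beta / sqrt(6) and mean = mu + gamma * beta.
    """
    beta = std_dev * math.sqrt(6.0) / math.pi
    mu = mean - EULER_GAMMA * beta
    return mu, beta


def reduced_variate(cdf: float) -> float:
    """Reduced variate (x - mu) / beta = -ln(-ln(CDF))."""
    return -math.log(-math.log(cdf))


def flow_for_return_period(t_years: float, mu: float, beta: float) -> float:
    """Flow with annual non-exceedance probability 1 - 1/T."""
    return mu + beta * reduced_variate(1.0 - 1.0 / t_years)


mu, beta = gumbel_parameters(1225.0, 500.0)
q100 = flow_for_return_period(100.0, mu, beta)
print(f"mu = {mu:.1f} m3/s, beta = {beta:.1f} m3/s, Q100 = {q100:.0f} m3/s")
```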

Figure 1 Relationship between PDF, CDF and a Gumbel plot for the Gumbel distribution

The analysis for large floods hinges on the probability used for the larger measured discharges in
each year. The large difference in the calculated results between the Gringorten and Weibull plotting
positions is shown in Figure 2 for the Waimakariri River (from Connell & Pearson, 2001).

Figure 2 Difference in flood prediction between the Weibull and Gringorten plotting positions
for the Waimakariri River data which increases for very large events over 4000 cumecs

HWRS 2015 Connell, Mohssen 2 of 8



2.0 PLOTTING POSITION THEORETICAL ANALYSIS


The following is a Bayesian type analysis to calculate the mean value of what the largest flood would
be in ‘n’ years. By definition, the probability of a flood of a given size, with return period T, being
exceeded in one year is 1/T (= p), and the probability of it not being exceeded is (1-1/T), so in ‘n’
years the probability of non-exceedance is

$$(1 - 1/T)^n \quad \text{or} \quad (1 - p)^n \qquad (1)$$

This is for a flood of one size; the probability changes with the size of the flood.

However for the plotting position, what we need to know is the mean value of T or return period in this
formula that could occur in a period of ‘n’ years over the complete range of possible largest floods that
could occur. For example, if ‘n’ is 100 years then a 100 year flood has a 37 % probability of not being
exceeded and for a 200 year flood it is 61 %.
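
The two percentages quoted above follow directly from equation (1). A minimal sketch (the function name is illustrative):

```python
def non_exceedance_probability(return_period: float, n_years: int) -> float:
    """Probability that a flood of the given return period is not exceeded in n years."""
    return (1.0 - 1.0 / return_period) ** n_years


p100 = non_exceedance_probability(100.0, 100)
p200 = non_exceedance_probability(200.0, 100)
print(f"100 year flood, 100 years: {p100:.1%}")
print(f"200 year flood, 100 years: {p200:.1%}")
```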

To apply this reasoning to the full range of flood probabilities, equation (1) needs to be applied to
the complete range of the largest floods that could occur in ‘n’ years. Over the full range of floods,
the probability ‘p’ (or 1/T) lies between 0 and 1. Therefore equation (1) is integrated from 0 to 1,
giving

$$\int_0^1 (1 - p)^n \, dp = \frac{1}{n+1} \qquad (2)$$

The result, 1/(n+1), is interpreted as the mean exceedance probability of the largest flood in ‘n’
years.

This is the Weibull plotting position. The integration is shown graphically in Figure 3.

Figure 3 Estimation of the mean probability of the largest flood in ‘n’ years

The area under the curve is the same as the integration shown by equation (2).
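
The area under the curve can also be checked numerically. The sketch below integrates (1-p)^n from 0 to 1 with a simple midpoint rule (a choice made here for illustration, not a method from the paper) and compares the result with 1/(n+1).

```python
def area_under_curve(n_years: int, steps: int = 100_000) -> float:
    """Midpoint-rule integral of (1 - p)**n for p between 0 and 1."""
    width = 1.0 / steps
    total = 0.0
    for k in range(steps):
        p = (k + 0.5) * width  # midpoint of the k-th strip
        total += (1.0 - p) ** n_years * width
    return total


results = {n: area_under_curve(n) for n in (10, 50, 100)}
for n, area in results.items():
    print(f"n = {n:3d}: integral = {area:.6f}, 1/(n+1) = {1.0 / (n + 1):.6f}")
```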

2.1 Gumbel’s Analysis of the mean largest flood in ‘n’ years


Gumbel, in chapter 4 of his book (Gumbel, 1958), presents an analysis of the statistics of the average
value of the largest flood in ‘n’ years. Gumbel uses a probability function of $(1 - e^{-x})^n$ for the
largest value in ‘n’ years. This is the original analysis which leads to the exponential function and
double exponential distribution, and to the use of the plotting paper that Gumbel developed. This gives
a PDF with the exponential variate (x) on the x axis, as shown in Figure 4.


Figure 4 PDF of the Gumbel distribution of the largest value in ‘n’ years, for the probability function
(the CDF) $(1 - e^{-x})^n$.

To compare Figure 4 with Figure 3, the mathematical transformation between them needs to be
examined.

From Gumbel (1958, p. 21), $F(x) = 1 - p$, but $F(x) = 1 - e^{-x}$ and $T(x)$ = return period = $e^{x}$.
If x is examined and the analysis followed through, x is the reduced variate, e.g. the reduced variate
for a 100 year flood is 4.6 and $e^{4.6} = 100$ (p. 21). Therefore $(1 - e^{-x})^n$ is the same as
$(1 - p)^n$, with the range of 0 to 1 from equation (2) above being transformed to be used with the
Gumbel reduced variate range of 0 to ∞. Therefore Figure 4 (PDF) is a mathematical transformation of
the same information in Figure 3 (CDF).

Gumbel calculates statistical parameters for this distribution (pp. 114-116). The mode of this function
is ln(n), the median is ln(n) - ln(ln 2), and the mean is ln(n) + γ, where γ is Euler’s constant, for a
population or for a finite series of measured values (Gumbel, 1958, p. 228).

Gumbel (1958, p. 116) calculated the mean of the return period of the largest flood and found it to be
1.78n. This is essentially the same value as the Gringorten plotting position gives. Even with this
result, Gumbel still advised using the 1/(n+1) plotting position.
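
The factor 1.78 can be reproduced from the statistics quoted above. The sketch below converts the mode, median and mean of the largest value into return periods using T = e^x from this section, and sets the result for the mean beside the Gringorten position for n = 100. It is an arithmetic check only, not a reproduction of Gumbel's derivation.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant

n_years = 100

# Mode, median and mean of the largest value in n years, as quoted from Gumbel (1958)
x_mode = math.log(n_years)
x_median = math.log(n_years) - math.log(math.log(2.0))
x_mean = math.log(n_years) + EULER_GAMMA

# Return period T = e^x for each statistic
t_mode = math.exp(x_mode)
t_median = math.exp(x_median)
t_mean = math.exp(x_mean)

# Gringorten return period for the largest flood (rank i = 1)
t_gringorten = (n_years + 0.12) / (1 - 0.44)

print(f"mode: {t_mode:.1f}, median: {t_median:.1f}, mean: {t_mean:.1f} years")
print(f"Gringorten: {t_gringorten:.1f} years")
```

The mean corresponds to e^γ n, and e^γ is approximately 1.78.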

2.2 Data generation for calculating the return period of maximum values in a number of years

In this section we generate flood values from their probabilities to calculate the mean of the largest
expected flood in ‘n’ years. This has also been verified using an independent generation method not
presented here.

2.2.1 Gumbel Distribution – (EV1 Distribution)

Flood probabilities for the largest flood in each year were generated as random numbers between 0
and 1 with the computer program Octave. This was done for 1 million data sets of 100 years. The
largest value in each 100 years was then taken as the maximum flood probability of that data set. The
average of the 1 million maximum flood probabilities was then calculated, giving an exceedance
probability very close to 1/(n+1).
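
The generation procedure described above can be sketched as follows. This is an illustrative Python version rather than the Octave code used for the paper, and it uses 20,000 data sets instead of 1 million so that it runs quickly; the seed and function name are assumptions made for the example.

```python
import random


def mean_largest_exceedance(n_years: int, n_sets: int, seed: int = 1) -> float:
    """Average exceedance probability of the largest annual value over many data sets."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sets):
        # One data set: n_years annual non-exceedance probabilities between 0 and 1
        largest = max(rng.random() for _ in range(n_years))
        total += 1.0 - largest  # exceedance probability of the largest value
    return total / n_sets


estimate = mean_largest_exceedance(n_years=100, n_sets=20_000)
print(f"simulated mean: {estimate:.5f}, 1/(n+1): {1.0 / 101:.5f}")
```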

However, this is the mean of the flood probabilities, not of the flood measurements (Ladd, pers.
comm., 2016).

To obtain a flood measurement for a given probability, the Gumbel distribution was applied.
Measurements are related to the Gumbel distribution through the reduced variate by the cumulative
distribution function (CDF), CDF = exp(-exp(-RV)), where RV is the reduced variate, (x-μ)/β.


Therefore the maximum probabilities were transformed by the inverse relationship, RV = -ln(-ln(CDF)).

Doing this for the million data sets of maximum flood probabilities with the Gumbel distribution gives
an average maximum reduced variate (RV) that is 0.57 above the RV of the average flood probability, the
same result as Gumbel, i.e. for 100 years of data the average maximum flood has a 178 year return
period.
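
A sketch of this step, under the same assumptions as before (Python rather than Octave, 20,000 data sets rather than 1 million, illustrative names and seed): each maximum probability is transformed to a reduced variate with RV = -ln(-ln(CDF)), the reduced variates are averaged, and the average is converted back to a return period.

```python
import math
import random


def mean_largest_reduced_variate(n_years: int, n_sets: int, seed: int = 1) -> float:
    """Average Gumbel reduced variate of the largest value in each simulated data set."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sets):
        largest_cdf = max(rng.random() for _ in range(n_years))
        total += -math.log(-math.log(largest_cdf))  # RV = -ln(-ln(CDF))
    return total / n_sets


def return_period_from_rv(rv: float) -> float:
    """Return period for a reduced variate, using CDF = exp(-exp(-RV))."""
    return 1.0 / -math.expm1(-math.exp(-rv))


n_years = 100
mean_rv = mean_largest_reduced_variate(n_years, n_sets=20_000)
rv_of_mean_probability = -math.log(-math.log(n_years / (n_years + 1)))
offset = mean_rv - rv_of_mean_probability
t_mean = return_period_from_rv(mean_rv)

print(f"mean RV of largest flood: {mean_rv:.2f}")
print(f"offset above RV of 1/(n+1): {offset:.2f}")
print(f"return period of mean RV: {t_mean:.0f} years")
```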

2.2.2 Average General Extreme Value (GEV) Distribution

Now this procedure is applied to the general extreme value distribution or EV1, EV2 and EV3
distributions. To obtain measurements for the EV2 and EV3 distributions the following concept was
used,

(3)

which is a restatement of the formula of Jenkinson (1955) relating GEV values to RVs, where u, alpha
and k are the location, scale and shape parameters respectively.

If k was very close to zero, the average return period of the largest flood in 100 years was close to
178 years (i.e. the Gringorten position). If k was negative the average return period increased, and if
k was positive it decreased, by an amount depending upon the value of k. The results are shown in
Figures 5a and 5b.
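
The dependence on k can be sketched as follows. Because equation (3) is not reproduced in this text, the example ASSUMES the commonly used Jenkinson form x = u + alpha (1 - exp(-k RV))/k, which may differ from the restatement used by the authors. Under that assumption the mean of the largest floods corresponds to the reduced variate -ln(mean(exp(-k RV)))/k, in which u and alpha cancel. The function names, seed and number of data sets are illustrative.

```python
import math
import random


def largest_reduced_variates(n_years: int, n_sets: int, seed: int = 1) -> list[float]:
    """Gumbel reduced variate of the largest annual probability in each data set."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_sets):
        largest_cdf = max(rng.random() for _ in range(n_years))
        values.append(-math.log(-math.log(largest_cdf)))
    return values


def return_period_of_mean_flood(largest_rvs: list[float], k: float) -> float:
    """Return period of the mean largest flood for shape parameter k.

    ASSUMPTION: x = u + alpha * (1 - exp(-k * RV)) / k, so that u and alpha cancel.
    """
    if abs(k) < 1e-9:
        rv_of_mean = sum(largest_rvs) / len(largest_rvs)  # EV1 limit as k tends to 0
    else:
        mean_term = sum(math.exp(-k * rv) for rv in largest_rvs) / len(largest_rvs)
        rv_of_mean = -math.log(mean_term) / k
    cdf = math.exp(-math.exp(-rv_of_mean))
    return 1.0 / (1.0 - cdf)


rvs = largest_reduced_variates(n_years=100, n_sets=20_000)
t_negative_k = return_period_of_mean_flood(rvs, k=-0.1)
t_zero_k = return_period_of_mean_flood(rvs, k=0.0)
t_positive_k = return_period_of_mean_flood(rvs, k=0.1)

for label, t in (("k = -0.1", t_negative_k), ("k = 0", t_zero_k), ("k = +0.1", t_positive_k)):
    print(f"{label}: {t:.0f} years")
```

With this assumed form, a negative k gives a longer return period than the EV1 case and a positive k gives a shorter one, which is the direction of change described above.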

Figure 5 a) Plotting position for average maximum sized return period in 100 years versus ‘k’
for the GEV distribution, b) Plot of GEV distribution

If the same procedure is applied to the two-component extreme value distribution shown in Figure 2,
and the change in slope from the lower EV1 distribution to the upper EV1 distribution occurs at a
low return period, say 2-5 years, then the average maximum return period is 178 years.

3.0 DISCUSSION
From the above analysis it could be concluded that the plotting position varies with the type of
GEV distribution, which matches the findings of Cunnane (1978) and Gringorten (1963). We also show
that the plotting position changes with the value of the shape parameter k of the GEV
distribution. This was recognised by Beable and McKerchar (1982).

The change in plotting position with the GEV would be expected, with the average plotting position
increasing for the EV2 distribution and the opposite occurring for the EV3 or Weibull distribution.
This is different from the conclusion of Cunnane (1978), who stated that the Weibull distribution
should have a plotting position of 1/(n+1).

However, Gumbel uses the 1/(n+1) plotting position in his analysis (Gumbel, 1958), even after
calculating all three main statistical parameters (mean, median and mode), which shows the bias.

The issue becomes clearer if the plotting positions are shown against the probability distribution of the
largest flood. This is done for n = 100 years and shown in Figure 6.


Figure 6a Plotting positions shown on the distribution of possible probabilities for 100 years of
data, assuming the Gumbel distribution (these positions assume the EV1 distribution).

Figure 6 shows that the 1/(n+1) plotting position is at the mode, or most dense position, of the
value of the 100 year flood. The Gringorten plotting position is not at the most dense position for the
possible probabilities of the 100 year flood. Gumbel (1958) uses the 1/(n+1) plotting position in his
analysis, with the plotting positions fitting his procedure.

This highlights the issue of this paper: should the mode be used for the analysis, or the mean
(or even the median)? Statistical theory states that for numerical data the median or mean should be
used. Manikandan (2011) stated that the mean is the best statistic to use for numerical data and that
the mode should only be used when there is a nominal scale, which is not the case for flood data.

However, if Figure 6 is examined, it is not clear which plotting position should be adopted in
this case: the mean (Gringorten) or the mode (Gumbel, Makkonen). The reason for the different values is
that, if the EV1 distribution is correct, the data’s probability distribution is positively skewed
(relative to the normal distribution), which raises the median and mean above the mode.

It is also noted that analysis for the GEV needs to be done with the PWM to compare the results, as
this method may be consistent with the above analysis, and also with the other quantile methods given
in section 3.1. A comparison with other techniques, such as the log-Pearson III distribution used in
the United States, is also required to analyse the differences.

However, recent papers (e.g. Makkonen et al., 2013) indicate that the plotting position should be
1/(n+1). In section 2 we calculated that the mean value of the largest flood for a Gumbel distribution
is at the Gringorten plotting position. Further research is needed to resolve this issue.

3.1 Waimakariri Data results for three distributions and quantile estimates.

In 2003, Ware and Ladd analysed the measured flood data on the Waimakariri River. They used 3
different distributions (the Generalised Extreme Value (GEV), Generalised Logistic (GLO) and log-normal
distributions) with 4 different techniques for estimating quantiles or probabilities: ‘at site’ data,
‘scaled’ data from many similar catchments, and two other methods using higher order statistics.

The results for the 3 different distributions using the ‘at site’ quantile estimation method on the
Waimakariri River, together with the Weibull and Gringorten plotting positions, are shown in Figure 7,
which shows that the three distributions give different results. The GLO distribution at reduced
variate values between 4 and 5 gives a larger return period for the same sized flood, effectively
assigning a much higher plotting position to the data compared to the other distributions.


Figure 7 Waimakariri River data - Gringorten and Gumbel plotting positions and results using
three distributions with ‘at site’ data.

By examining Figure 7 it can be seen that the data does not extend past the reduced variate of 4.8 (or
4.2 with Weibull). As there are no data at this frequency, the distribution could be any of the curves of
figure 5 or even be different completely.

As an example, there may be a limit to the probable maximum flood (PMF) that occurs for a
geographical region. This limit could be at about, say, the 500 year flood or the 10,000 year flood or
even higher, flattening the line (like the TCEV, but with three instead of two components, i.e. 3CEV)
for return periods above a certain value, as shown in the theoretical plots on Gumbel paper in Figure 8.

Figure 8 Possible plots of a three component extreme value distribution on Gumbel paper

Therefore there are many possible options for the flood distribution, and the error margins for the
very large values are very wide, as there are no data for this end of the flood frequency spectrum
other than palaeoflood evidence and historical Maori records of flooding of old Pa sites, both of which
need investigation in New Zealand.


In all cases discussed it is the value of plotting position, whether directly calculated or indirectly, (using
quantile estimation) which makes a large difference to the probabilities of very large floods. This
means further work is required on this point.

3.2 Further Research


Further research is required into which plotting position to use and to compare the results with other
widely used distributions including the Log-Pearson III and the Mixture Mass Function method
described in Ware and Ladd (2003). Also collaboration with others in the field is necessary to take part
in the present discussion of this important topic.

4.0 CONCLUSION
The analysis of a generated series of the Gumbel distribution produced plotting positions that are the
same as Gringorten’s. When the same procedure is applied to the general extreme value (GEV)
distribution, the plotting position of the largest value gradually varies, either larger or smaller from the
Gringorten plotting position depending upon the deviation of the GEV shape parameter k from 0.

However, it is unclear which plotting position to use when the density (or distribution) of possible
values of probability is compared with the Gringorten and Weibull plotting positions. Further
investigation is needed to understand why researchers such as Makkonen et al. (2013) and Gumbel
advocate the use of the 1/(n+1) plotting position.

The analysis of Ware & Ladd shows that there are many possibilities for the high values of the
distribution of extreme floods. This means that more data are necessary to obtain more reliable
probabilities for very large floods, i.e. the very low frequency end of the spectrum.

Therefore further work on the plotting position needs to be done in collaboration with overseas
researchers. Floods are not the only risk as the method is used in the analysis of risk of low flows in
rivers, building design, e.g. wind and earthquakes and other forms of risks. The discussion continues.

REFERENCES
Beable, M. E. & McKerchar, A. I. (1982), Regional Flood Estimation in New Zealand, Water &
Soil Tech. Publ. no. 20, Ministry of Works and Development, Wellington, New Zealand.

Chang, H.S. (2012), A discussion on the plotting position formula for Gumbel distribution, Discussion
paper on the internet at:
http://ocean.cv.nctu.edu.tw/NRCEST/teaching/statics/A%20discussion%20on%20the%20plotting%20position%20formula%20for.pdf
on the 7th June 2016.

Connell, R.J. (1991), Hydrology of Rivers Previously in South Canterbury Catchment Board District.
Canterbury Regional Council Report R91/31, Christchurch.

Connell, R.J. and Pearson, C.P., (2001), Two-component extreme value distribution applied to
Canterbury annual maximum flood peaks, Journal of Hydrology (NZ), 40(2), 105-127.

Cunnane, C. (1978), Unbiased plotting positions – A review. Journal of Hydrology, 37: 205–222.

Gringorten I. I. (1963), A plotting rule for extreme probability paper, J. Geophys. Res. 68(3), 813-814.

Gumbel, E.J. (1958), Statistics of Extremes, Columbia University Press, New York.

Jenkinson, A. F. (1955) The frequency distribution of the annual maximum (or minimum) of
meteorological elements, Quart. J. R. Met. Soc., 81, 158-171.

Makkonen, L., Pajari, M. and Tikanmaki, M (2013), Discussion on “Plotting positions for fitting
distributions and extreme value analysis”, Canadian. J. Civ. Eng. 40: 927–929.


Manikandan, S. (2011), Measures of central tendency: Median and mode, J Pharmacol Pharmacother.
2011 Jul-Sep; 2(3): 214–215.

Ware R. and Ladd, F. (2003), Flood Frequency Analysis of the Waimakariri River, Report Canterbury
University

Yahaya, A.S., Yee, C.S., Ramli, N.A., and Ahmad, F. (2012), Determination of the best probability
plotting position for predicting parameters of the Weibull distribution, International Journal of Applied
Science and Technology, 2:106–111.
