Alamgir, Amjad Ali, Sajjad Ahmad Khan, Umair Khalil and Dost Muhammad Khan
Abstract
Outliers have been a constant concern for statisticians and must be handled carefully when analyzing data with statistical tools. In the present study, we introduce some additional tests for detecting one or more outliers, which make use of more robust statistics, namely the median, the quartile deviation, and the variance based on the "sample free of suspected outlier(s)". We simulate critical values for these tests, and for the tests already available in the literature, under sampling from the Cauchy distribution, and we compare their efficiency using power criteria.
Introduction
Statistical analyses are greatly affected by outliers present in the data. In practice, when several measurements of a chemical or physical quantity are observed, one or more of the observed values may differ markedly from the majority of the remaining observations. Such observations are termed outliers and may be removed from the data set before performing statistical analysis. Van der Loo (2010) argued that the detection of outliers is of immense importance for statistical analysis. This task can be accomplished by using formal test procedures for the detection of outliers.
In the single-sample case, researchers have studied many tests and procedures relating to different probability distributions to detect and handle such outliers. Dixon (1950) introduced several tests for the detection of one or more outliers. Several researchers have simulated critical values for these tests under sampling from the normal distribution. For example, Grubbs and Beck (1972) simulated critical values for the tests for k = 2, where k is the number of outliers. Barnett and Lewis (1994) simulated critical values for the tests for k = 2, 3 and 4. Fernando et al. (2000) compared the efficiency of several tests for the detection of outliers in sampling from a normal population. Verma and Quiroz-Ruiz (2006) computed comparatively more precise and accurate simulated critical values (to four decimal places) for six Dixon tests, at various levels of significance, for samples of sizes up to 100 from a normal population. A thorough review of earlier work in this area is given by Beckman and Cook (1983). For non-normal populations, the published work has concentrated on the uniform, exponential, and gamma distributions because of their simplicity and wide applicability. Fung and Paul (1985) conducted a simulation study comparing eight different tests for the detection of outliers in sampling from the Weibull or extreme-value distribution. In the current study, we simulate critical values for several tests and compare their performance using power criteria (Beckman and Cook, 1983) under sampling from the Cauchy distribution.
The Test Statistics for the Study
Several tests are considered in this study. These tests are described in brief below.
Dixon-Type Tests
$$T_1 = \frac{x_{(n)} - x_{(n-1)}}{x_{(n)} - x_{(1)}}, \qquad T_2 = \frac{x_{(n)} - x_{(n-1)}}{x_{(n)} - x_{(2)}}, \qquad T_3 = \frac{x_{(n)} - x_{(n-1)}}{x_{(n)} - x_{(3)}},$$

where $x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}$ denote the ordered observations.
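These ratios use only the ordered observations, so they are simple to compute; a minimal Python sketch (the function name is ours):

```python
def dixon_stats(data):
    """One-sided Dixon statistics T1-T3 for testing the largest value."""
    x = sorted(data)
    gap = x[-1] - x[-2]        # x(n) - x(n-1): gap isolating the suspect
    t1 = gap / (x[-1] - x[0])  # denominator x(n) - x(1)
    t2 = gap / (x[-1] - x[1])  # denominator x(n) - x(2)
    t3 = gap / (x[-1] - x[2])  # denominator x(n) - x(3)
    return t1, t2, t3
```

A ratio exceeding its simulated critical value flags the largest observation as an outlier.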
We also consider some further tests involving the mean and the median. The first is

$$T_4 = \frac{x_{(n)} - \bar{x}}{s},$$

where $\bar{x}$ and $s$ are the sample mean and the sample standard deviation.
An analogue of the above test is obtained if the sample standard deviation is replaced by the mean absolute deviation from the median, giving

$$T_5 = \frac{x_{(n)} - \mathrm{median}}{\hat{\sigma}},$$

where

$$\hat{\sigma} = \frac{1}{n}\sum_{i=1}^{n}\left|x_i - \mathrm{median}\right|.$$
$$T_6 = \frac{x_{(n)} - \mathrm{median}}{Q.D},$$

where

$$Q.D = \text{quartile deviation} = \frac{Q_3 - Q_1}{2}.$$
$$T_7 = \frac{x_{(n)} - \mathrm{median}}{MAD},$$

where

$$MAD = \mathrm{median}\left(\left|x_i - \mathrm{median}\right|\right)/0.6745.$$
The tests T6 and T7 make use of Q.D and MAD, which are more robust scale estimates than the sample standard deviation.
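The three median-based statistics can be computed with the standard library alone. One caveat: the paper does not state which quartile convention it uses; the sketch below assumes linear interpolation between order statistics (the R/numpy type-7 rule), which reproduces the T5 and T6 values in the worked example later in the paper. The constant in T7 only rescales that statistic together with its simulated critical values.

```python
import statistics

def robust_stats(data):
    """Median-based outlier statistics T5-T7 for the largest value."""
    x = sorted(data)
    med = statistics.median(x)
    # T5: scale = mean absolute deviation from the median
    sigma_hat = sum(abs(v - med) for v in x) / len(x)
    t5 = (x[-1] - med) / sigma_hat
    # T6: scale = quartile deviation (Q3 - Q1) / 2, interpolated quartiles
    q1, _, q3 = statistics.quantiles(x, n=4, method="inclusive")
    t6 = (x[-1] - med) / ((q3 - q1) / 2)
    # T7: scale = median absolute deviation with the normal consistency constant
    mad = statistics.median([abs(v - med) for v in x]) / 0.6745
    t7 = (x[-1] - med) / mad
    return t5, t6, t7
```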
We propose a test that uses the robust location estimate, the median, together with a standard deviation based on the first n - 1 order statistics (when the largest observation is tested as a possible outlier). It is defined as

$$T_8 = \frac{x_{(n)} - \mathrm{median}}{s_{n-1}},$$

where $s_{n-1}$ is the sample standard deviation of the sample with the suspected outlier removed,

$$s_{n-1}^2 = \frac{1}{n-2}\sum_{i=1}^{n-1}\left(x_{(i)} - \bar{x}_{(n-1)}\right)^2,$$

and $\bar{x}_{(n-1)}$ is the mean of the first n - 1 order statistics.
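Since the scale estimate in T8 excludes the suspected outlier, it can be evaluated by applying the ordinary sample standard deviation to the reduced, ordered sample; a short sketch:

```python
import statistics

def t8(data):
    """T8: deviation of the largest value from the median, scaled by the
    sample standard deviation of the remaining n - 1 ordered observations."""
    x = sorted(data)
    return (x[-1] - statistics.median(x)) / statistics.stdev(x[:-1])
```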
Two-Sided Tests
We investigate four two-sided tests for an extreme outlier (tests T9-T12), given below:

$$T_9 = \max\left(\frac{x_{(n)} - \bar{x}}{s}, \frac{\bar{x} - x_{(1)}}{s}\right)$$
A two-sided analogue of T5 is given by

$$T_{10} = \max\left(\frac{x_{(n)} - \mathrm{median}}{\hat{\sigma}}, \frac{\mathrm{median} - x_{(1)}}{\hat{\sigma}}\right)$$
We define a two-sided Dixon test given by

$$T_{11} = \max\left(\frac{x_{(n)} - x_{(n-1)}}{x_{(n)} - x_{(1)}}, \frac{x_{(2)} - x_{(1)}}{x_{(n)} - x_{(1)}}\right)$$
The following test is a two-sided analogue of T6:

$$T_{12} = \max\left(\frac{x_{(n)} - \mathrm{median}}{Q.D}, \frac{\mathrm{median} - x_{(1)}}{Q.D}\right)$$
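All four two-sided statistics examine whichever end of the sample is more extreme; a combined sketch (quartiles again assume the interpolated, type-7 convention):

```python
import statistics

def two_sided_stats(data):
    """Two-sided outlier statistics T9-T12."""
    x = sorted(data)
    xbar, s = statistics.mean(x), statistics.stdev(x)
    med = statistics.median(x)
    sigma_hat = sum(abs(v - med) for v in x) / len(x)
    q1, _, q3 = statistics.quantiles(x, n=4, method="inclusive")
    qd = (q3 - q1) / 2
    t9 = max(x[-1] - xbar, xbar - x[0]) / s
    t10 = max(x[-1] - med, med - x[0]) / sigma_hat
    t11 = max(x[-1] - x[-2], x[1] - x[0]) / (x[-1] - x[0])
    t12 = max(x[-1] - med, med - x[0]) / qd
    return t9, t10, t11, t12
```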
Block Tests
The following block tests, available in the literature, are used to detect two or more (d > 1) outliers:

$$T_{13} = \frac{x_{(n)} + x_{(n-1)} + \cdots + x_{(n-d+1)} - d\,\bar{x}}{s}$$

$$T_{14} = \frac{x_{(n)} + x_{(n-1)} + \cdots + x_{(n-d+1)} - d\cdot\mathrm{median}}{\hat{\sigma}}$$
Dixon-type tests for detecting more than one outlier are given by

$$T_{15} = \frac{x_{(n)} - x_{(n-d)}}{x_{(n)} - x_{(1)}}, \qquad T_{16} = \frac{x_{(n)} - x_{(n-d)}}{x_{(n)} - x_{(2)}}, \qquad T_{17} = \frac{x_{(n)} - x_{(n-d)}}{x_{(n)} - x_{(3)}}.$$
The Proposed Block Test
We propose a block test, T18, which is a generalization of our proposed test T8 and is given by

$$T_{18} = \frac{x_{(n)} + x_{(n-1)} + \cdots + x_{(n-d+1)} - d\cdot\mathrm{median}}{s_{n-d}},$$

where $s_{n-d}$ is the sample standard deviation based on the first n - d order statistics (when the largest d observations are tested as outliers).
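The six block statistics can be computed together from the ordered sample. The sketch below (our naming) reproduces the calculated values in Table 15 when applied to the Example 2 data with d = 3:

```python
import statistics

def block_stats(data, d):
    """Block outlier statistics T13-T18 for the d largest observations."""
    x = sorted(data)
    top = sum(x[-d:])                    # x(n-d+1) + ... + x(n)
    med = statistics.median(x)
    sigma_hat = sum(abs(v - med) for v in x) / len(x)
    t13 = (top - d * statistics.mean(x)) / statistics.stdev(x)
    t14 = (top - d * med) / sigma_hat
    gap = x[-1] - x[-(d + 1)]            # x(n) - x(n-d) for T15-T17
    t15 = gap / (x[-1] - x[0])
    t16 = gap / (x[-1] - x[1])
    t17 = gap / (x[-1] - x[2])
    # proposed T18: scale comes from the sample with the d suspects removed
    t18 = (top - d * med) / statistics.stdev(x[:-d])
    return t13, t14, t15, t16, t17, t18
```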
Power Criteria
We follow Beckman and Cook (1983): for each simulated sample, Z1 denotes the number of true outliers correctly declared and Z2 the number of good observations falsely declared as outliers. Performance is summarized by probabilities such as P(Z1 = 1, Z2 = 0) and by the ratio EZ1/(EZ1 + EZ2).
Simulation Studies
An extensive simulation study is conducted to compute empirically the critical values for various
tests discussed above based on sampling from Cauchy distribution. All the critical values
computed for the tests are based on 10000 simulations. The sample sizes considered for all tests in
this study are n = 5, 6, 7, 8, 9, 10, 12, 14, 15, 16, 18, 20, 25, 30. For each test, critical values are
simulated at 1%, 5% and 10% level of significance.
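The simulation procedure itself is straightforward; the sketch below (a simplified illustration, not the authors' exact code) draws standard Cauchy samples via the inverse-CDF tangent transform and takes the empirical upper quantile of the chosen statistic as the critical value:

```python
import math
import random

def simulate_critical_values(stat, n, reps=10_000, alphas=(0.10, 0.05, 0.01)):
    """Empirical upper-tail critical values of an outlier statistic
    under sampling from the standard Cauchy distribution."""
    rng = random.Random(42)  # fixed seed for reproducibility

    def cauchy_sample():
        # if U ~ Uniform(0, 1), then tan(pi * (U - 1/2)) ~ Cauchy(0, 1)
        return [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]

    vals = sorted(stat(cauchy_sample()) for _ in range(reps))
    # the empirical (1 - alpha) quantile is the simulated critical value
    return {a: vals[int((1 - a) * reps) - 1] for a in alphas}
```

Passing the Dixon statistic T1 with n = 10, for example, should give values close to the 0.737, 0.847 and 0.946 reported in Table 1.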
To compare the performance of the above tests (T1-T18), we evaluate them under contaminated models using power criteria. For the tests that detect upper extreme outliers (T1-T8 and T13-T18), we use the following contamination model:
(I) Model based on Shift in Location Parameter
Let d be the number of contaminants. We assume that n - d observations come from a Cauchy distribution with location parameter μ and scale parameter σ, denoted C(μ, σ), whereas the d contaminating observations come from C(μ + a, σ), where a is the amount of shift in the location parameter.
For the two-sided tests considered in the study, we instead shift the scale parameter, using the scale-shift model below:
(II) Model based on Shift in Scale Parameter
Let b be the amount of shift in the scale parameter. We assume that n - 1 observations come from C(μ, σ), whereas the single contaminating observation comes from C(μ, bσ).
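Both contamination models are easy to generate; a sketch, with helper names of our own, again using the inverse-CDF method for Cauchy variates:

```python
import math
import random

def rcauchy(loc, scale, rng):
    """One Cauchy(loc, scale) variate via the inverse-CDF transform."""
    return loc + scale * math.tan(math.pi * (rng.random() - 0.5))

def location_shift_sample(n, d, a, loc=0.0, scale=1.0, seed=1):
    """Model I: n - d clean observations from C(loc, scale) plus
    d contaminants from C(loc + a, scale)."""
    rng = random.Random(seed)
    return ([rcauchy(loc, scale, rng) for _ in range(n - d)]
            + [rcauchy(loc + a, scale, rng) for _ in range(d)])

def scale_shift_sample(n, b, loc=0.0, scale=1.0, seed=1):
    """Model II: n - 1 observations from C(loc, scale) plus
    one contaminant from C(loc, b * scale)."""
    rng = random.Random(seed)
    return ([rcauchy(loc, scale, rng) for _ in range(n - 1)]
            + [rcauchy(loc, b * scale, rng)])
```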
Simulated critical values have been obtained for all tests considered in this study, for the various levels of significance and sample sizes n. The critical values, reported to three decimal places, are presented in Table 1 to Table 8.
Table 1. Critical values for T1 and T2
n T1 T2
(10%) (5%) (1%) (10%) (5%) (1%)
5 0.787 0.878 0.957 0.894 0.943 0.981
6 0.766 0.865 0.953 0.862 0.923 0.973
7 0.754 0.857 0.950 0.844 0.913 0.970
8 0.746 0.852 0.948 0.833 0.906 0.967
9 0.741 0.850 0.946 0.826 0.902 0.966
10 0.737 0.847 0.946 0.821 0.899 0.965
12 0.733 0.844 0.945 0.815 0.895 0.963
14 0.730 0.842 0.944 0.811 0.893 0.962
15 0.729 0.840 0.943 0.810 0.891 0.960
16 0.728 0.839 0.943 0.809 0.890 0.960
18 0.727 0.837 0.941 0.807 0.888 0.958
20 0.725 0.834 0.940 0.805 0.885 0.957
25 0.724 0.831 0.937 0.804 0.882 0.955
30 0.723 0.827 0.935 0.803 0.875 0.952
[Tables 2 to 8 (the remaining critical values) are missing from the extracted text. Tables 9 to 11 (simulated powers) are garbled and not reproduced here; for each test and each sample size n = 8, ..., 30 they report P(Z1 = 0, Z2 = 1), P(Z1 = 1, Z2 = 0) and the ratio EZ1/(EZ1 + EZ2).]
As far as the two-sided tests (T9-T12) are concerned, the results presented in Table 11 reveal that T10 performs better than the other tests for b = 10 and n = 5, 8, 10, 12, 15, 20, 25 and 30, since it correctly identifies outliers more often than the other three tests. However, it also declares more false positives than the other three tests for b = 10 and all values of n. The performance of T12, for b = 10 and all values of n, is very poor, as it correctly detects fewer outliers than the other three tests. Unlike T10, however, T12 declares the fewest false positives among the four tests for almost all values of n at b = 10. The ratio EZ1/(EZ1+EZ2) remains nearly the same for all four tests when b = 10.
For n = 5, 8 and 10 at b = 15, T10 performs better than the remaining tests, as it correctly identifies more outliers than the other three tests and also declares fewer false positives. As the sample size increases (n ≥ 12), however, T11 becomes more promising: it correctly identifies more outliers while declaring fewer false positives, and it gives a greater ratio EZ1/(EZ1+EZ2) than the other tests. Hence T11 appears best for b = 15 and large sample sizes.
Table 12 and Table 13 present simulated powers for six tests, of which T18 is our proposed test. The results in these two tables show that T13 performs better than all other block tests: it not only correctly identifies outliers more often than the other tests but also declares the fewest false positives among all competitors, for all n and for both a = 10 and a = 15. T18 is a close competitor of T13 in all cases. T13 also gives the highest EZ1 and the highest ratio EZ1/(EZ1+EZ2) of all the tests.
In order of performance, there are two situations (for both a = 10 and a = 15):
1. T16 and T17 are the poorest among all block tests. T13 is the best, T14 the second best and T18 the third best for d = 2 and 3 and for all n. T15 is better only than the other two Dixon-type tests.
2. When d = 4 and 5, the proposed test T18 performs better than T14 for all n. Consequently, T13 is the best, T18 the second best and T14 the third best, with T15 at number four.
Table 12: Powers for tests T13-T18 when a = 10
[Table body garbled in extraction; only the rows for n = 5, d = 2 survive, with one column lost.]
Illustrative Examples
In this section, we present two examples to illustrate how the above tests work for detecting one or more outliers.
Example 1
Consider the following data containing n = 10 observations drawn from Cauchy distribution:
4.859363, 5.116336, 5.120637, 5.257150, 6.055268, 6.207880, 6.777558, 6.912309, 9.847006,
25.000201.
A plot of the data indicates that the largest observation is a possible outlier, and the QQ-plot likewise suggests that the extreme upper observation is a possible outlier. We therefore test the largest observation by applying the tests for a single upper (lower) outlier.
Figure 1: QQ-plot for data in Example 1
Figure 2: Data plot for data in Example 1
Table 14 presents the values of the tests (T1-T8) calculated from the above data. According to Table 14, the largest observation is declared an outlier by T1, T4, T5 and T8 at the 10% level of significance, but not at the 5% and 1% levels. The remaining tests (T2, T3, T6 and T7) fail to declare the largest observation an outlier at any significance level.
Table 14: Calculated values and critical values of Tests (T1- T8)
Test   Calculated value   Critical values: 10%   5%   1%
T1 0.752 0.737 0.847 0.946
T2 0.762 0.821 0.899 0.965
T3 0.763 0.856 0.919 0.972
T4 2.763 2.730 2.807 2.840
T5 6.658 6.428 7.688 9.088
T6 21.890 30.457 62.085 310.258
T7 20.018 30.110 61.442 307.197
T8 12.187 10.266 20.545 100.643
Let us examine some facts about the above sample of 10 observations. Nine of the observations were generated from a Cauchy distribution with location μ = 10 and scale σ = 1, whereas one observation was simulated from a Cauchy distribution with μ = 20 and σ = 1; thus a = 10 (the amount of shift) and d = 1 (the number of outliers). The largest observation is therefore a genuine outlier from a different Cauchy distribution. It is not only declared an outlier by T1, T4, T5 and T8; the graphs also flag it. These results agree closely with the power comparisons based on Table 9.
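The Table 14 entries for T1, T4, T5 and T8 can be reproduced directly from the listed data, using the formula readings given earlier (variable names are ours):

```python
import statistics

data = [4.859363, 5.116336, 5.120637, 5.257150, 6.055268,
        6.207880, 6.777558, 6.912309, 9.847006, 25.000201]
x = sorted(data)
med = statistics.median(x)

t1 = (x[-1] - x[-2]) / (x[-1] - x[0])                    # Dixon T1
t4 = (x[-1] - statistics.mean(x)) / statistics.stdev(x)  # Grubbs-type T4
t5 = (x[-1] - med) / (sum(abs(v - med) for v in x) / len(x))
t8 = (x[-1] - med) / statistics.stdev(x[:-1])            # proposed T8

for name, val in [("T1", t1), ("T4", t4), ("T5", t5), ("T8", t8)]:
    print(name, round(val, 3))
```

Each value agrees with the "Calculated value" column of Table 14 to within rounding.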
Example 2
Consider the following data containing n = 25 observations drawn from Cauchy distribution:
13.19576, 13.96621, 14.28533, 14.52214, 14.52261, 14.56306, 14.59187, 14.60249, 14.68260,
14.70289, 14.70781, 14.93240, 15.04186, 15.14828, 15.31356, 15.45453, 15.46260, 15.64192,
16.92745, 17.20134, 17.41059, 20.23015, 30.50170, 31.15637, 35.24563.
When the data in Example 2 are plotted, the plot indicates that the largest three observations are possible outliers, and the QQ-plot of the data likewise suggests that the last three ordered observations are possible outliers. Based on the results presented in Table 15, the largest three observations are clearly declared outliers by test T13 at all levels of significance (10%, 5% and 1%).
Tests T14 and T18 declare these three observations outliers at the 10% level only, not at the 5% and 1% levels. The remaining three tests (T15, T16 and T17) fail to declare the largest three observations outliers at any significance level. Among the block tests, T13 thus appears the most likely to detect the three outliers.
Figure 3: QQ-plot for data in Example 2
Figure 4: Data plot for data in Example 2
Table 15: Calculated values and critical values of tests (T13-T18)
Test   Calculated value   Critical values: 10%   5%   1%
T13 7.665 6.226 6.533 6.952
T14 17.874 17.562 20.116 23.006
T15 0.680 0.833 0.906 0.968
T16 0.706 0.896 0.943 0.980
T17 0.716 0.919 0.956 0.985
T18 34.807 33.802 62.979 278.831
Let us examine some facts about the above sample of 25 observations. The first 22 ordered observations were generated from a Cauchy distribution with location μ = 20 and scale σ = 1, whereas the last three ordered observations were generated from a Cauchy distribution with μ = 30 and σ = 1; thus a = 10 and d = 3. The largest three observations are therefore genuine outliers from a different Cauchy distribution. They are not only declared outliers by T13, T14 and T18; the graphs also flag them. These results accord well with the power comparisons based on Table 12 and Table 13, which indicate that T13 is the most powerful test, giving the maximum probability of detecting the three outliers.
Conclusion
In the present study, we considered several tests used for detecting one or more outliers and introduced some new tests for this purpose.
A simulation study was carried out to compare the performance of all the tests considered. Among the one-sided tests, T1 is the winner for a = 10, while T8 is the winner for a = 15. Among the two-sided tests, T10 and T12 performed well for b = 10, the former identifying outliers correctly most often and the latter declaring the fewest false positives. For b = 15, T11 is found to be the best when the sample size becomes large.
We also examined the performance of six block tests, whose relative performance differed across scenarios. T13 is found to be the champion, T14 the runner-up and T18 the third best for d = 2 and 3 and for all n. For d = 4 and 5, our proposed test T18 performed better than T14 for all n.
Finally, we presented two examples to illustrate how these tests work. In each example, we generated data artificially from a Cauchy distribution and contaminated them with one and three outliers, respectively, drawn from different Cauchy distributions. The tests were applied and their calculated values computed from the sample data in each case. The results found in the examples matched the simulation results well.
References
Barnett, V. and Lewis, T. (1994). Outliers in Statistical Data, Third edition, John Wiley
& Sons, Colchester, England.
Beckman, R. J. and Cook, R. D. (1983). Outliers, Technometrics, 25, 119-149.
Childs, A. M. (1996). Advances in statistical inference and outlier related issues. PhD
Thesis. Open Access Dissertations and Theses. Paper 3693.
Dixon, W. J. (1950). Analysis of extreme values, Annals of Mathematical Statistics, 21,
488-506.
Fernando, V., Surendra, P. V. and Mirna, G. (2000). Comparison of the Performance of
Fourteen Statistical Tests for Detection of Outlying Values in Geochemical Reference
Material Databases, Mathematical Geology, 32(4), 439-464.
Fung, K. Y. and Paul, S. R. (1985). Comparisons of outlier detection procedures in Weibull
or extreme-value distribution, Communications in Statistics-Simulation and Computation,
14, 895-917.
Grubbs, F. E. and Beck, G. (1972). Extension of sample sizes and percentage points for
significance tests of outlying observations, Technometrics, 14, 847-854.
Van der Loo, M. P. J. (2010). Distribution based outlier detection in univariate data,
Statistics Netherlands, The Hague.
Verma, S. P. and Quiroz-Ruiz, A. (2006). Critical values for six Dixon tests for outliers in
normal samples up to sizes 100, and applications in science and engineering, Revista
Mexicana de Ciencias Geologicas, 23(2), 133-161.