www.elsevier.com/locate/envsoft
Abstract
Multivariate statistical techniques, such as cluster analysis (CA), principal component analysis (PCA), factor analysis (FA) and discriminant
analysis (DA), were applied for the evaluation of temporal/spatial variations and the interpretation of a large complex water quality data set of
the Fuji river basin, generated during 8 years (1995–2002) of monitoring of 12 parameters at 13 different sites (14 976 observations). Hierarchical
cluster analysis grouped 13 sampling sites into three clusters, i.e., relatively less polluted (LP), medium polluted (MP) and highly polluted (HP)
sites, based on the similarity of water quality characteristics. Factor analysis/principal component analysis, applied to the data sets of the three
different groups obtained from cluster analysis, resulted in five, five and three latent factors explaining 73.18, 77.61 and 65.39% of the total
variance in water quality data sets of LP, MP and HP areas, respectively. The varifactors obtained from factor analysis indicate that the parameters responsible for water quality variations are mainly related to discharge and temperature (natural), organic pollution (point source: domestic
wastewater) in relatively less polluted areas; organic pollution (point source: domestic wastewater) and nutrients (non-point sources: agriculture
and orchard plantations) in medium polluted areas; and organic pollution and nutrients (point sources: domestic wastewater, wastewater treatment plants and industries) in highly polluted areas in the basin. Discriminant analysis gave the best results for both spatial and temporal analysis. It provided an important data reduction as it uses only six parameters (discharge, temperature, dissolved oxygen, biochemical oxygen
demand, electrical conductivity and nitrate nitrogen), affording more than 85% correct assignations in temporal analysis, and seven parameters
(discharge, temperature, biochemical oxygen demand, pH, electrical conductivity, nitrate nitrogen and ammoniacal nitrogen), affording more than
81% correct assignations in spatial analysis of the three sampling regions of the basin. Therefore, DA allowed a reduction in the dimensionality of the large data set, delineating a few indicator parameters responsible for large variations in water quality. Thus, this study illustrates the
usefulness of multivariate statistical techniques for analysis and interpretation of complex data sets, and in water quality assessment, identification of pollution sources/factors and understanding temporal/spatial variations in water quality for effective river water quality management.
© 2006 Elsevier Ltd. All rights reserved.
Keywords: Fuji river basin; Water quality; Cluster analysis; Principal component analysis; Factor analysis; Discriminant analysis
1. Introduction
A river is a system comprising both the main course and the
tributaries, carrying the one-way flow of a significant load of
matter in dissolved and particulate phases from both natural
and anthropogenic sources. The quality of a river at any point
different multivariate statistical techniques to extract information about the similarities or dissimilarities between sampling
sites, identification of water quality variables responsible for
spatial and temporal variations in river water quality, the hidden factors explaining the structure of the database, and the influence of possible sources (natural and anthropogenic) on the
water quality parameters of the Fuji river basin.
2. Methods
2.1. Study area
The Fuji river basin, drained by the Fuji River, is located in the central part of Japan (Fig. 1). The basin area is 3570 km² and the main stream length is 128 km. The Fuji River is located to the west of Mount Fuji, drawing a curve along the mountain, and is one of the three prominent rapid watercourses in Japan. The river originates as the Kamanashi River, from Mount Komagatake in the north of the Southern Alps, and as the Fuefuki River, from the north of Yamanashi Prefecture. These two rivers join in the south of the Kofu basin to form the Fuji River, which subsequently flows into the Pacific Ocean at Suruga Bay. The average flow of the Kamanashi River at Funayamabashi is ~10 m³/s, that of the Fuefuki River at Torinkyo is ~20 m³/s and that of the Fuji River at Fujibashi is ~72 m³/s. These rivers drain the major rural, agricultural, urban and industrial areas of Yamanashi Prefecture and discharge into Suruga Bay. Over its course of ~128 km, the river receives a pollution load from both point and non-point sources. The Fuji River is a major water source for agricultural and industrial activities located in downstream areas. The geological features of the basin are very complex and fragile, because a massive dislocation, called the Itoi River–Shizuoka Tectonic Line, runs under it in a north–south direction, and many other dislocations run through and across the area. As a result, there are many collapsed areas, and collapsed rocks and sands, transported by river water, accumulate in gentle-flow areas.
More than 75% of the basin area is covered by forest. Forest land is located mostly in mountainous areas, whereas agriculture and grassland areas are sparsely distributed throughout the basin. Orchard plantations and urban areas are mostly situated along the water bodies. The basin lies in an inland region and, therefore, has extreme variations in temperature between summer and winter. Summers are hot and humid, and winters are cold, with average temperatures of 26 and 3 °C, respectively. Annual rainfall in the Kofu basin is as little as 1200 mm, but in the middle and lower reaches it may be as high as 3000 mm. The whole basin receives a mean annual precipitation of ~2100 mm.

[Fig. 1: map of the study area with a 20 km scale bar, showing Suruga Bay, the wastewater treatment plant and the sampling stations.]
Sampling stations: 1. Funayamabashi; 2. Singenbashi; 3. Sangunnishibashi; 4. Sakurabashi; 5. Kikkobashi; 6. Omogawabashi; 7. Hikawabashi; 8. Ukaibashi; 9. Futagawabashi; 10. Torinkyo; 11. Sangunhigashibashi; 12. Fujibashi; 13. Nanbubashi.
Fig. 1. Map of study area and surface water quality monitoring stations (listed 1–13) in the Fuji river basin.
The Ministry of Land, Infrastructure and Transport (MLIT) of Japan operates and maintains stream-flow gauging stations, and the Environment Division of Yamanashi Prefecture (EDYP) has been collecting data on various water quality parameters from 50 water quality monitoring stations. However, in the present study, data were selected from only 13 stations of the river water quality monitoring network, which covers a wide range of catchments and surface water types (rivers, streams and tributaries).
significance level which is 0 in this study (less than 0.05) indicates that there
are significant relationships among variables.
Spearman rank-order correlations (Spearman R coefficient) were used to
study the correlation structure between variables to account for non-normal
distribution of water quality parameters (Wunderlin et al., 2001). In this study,
temporal variations of river water quality parameters were first evaluated
through a season-parameter correlation matrix, using Spearman non-parametric correlation coefficients (Spearman's R). The water quality parameters were
grouped into four seasons: spring (March–May), summer (June–August), autumn (September–November) and winter (December–February), and each season was assigned a numerical value in the data file (spring = 1; summer = 2; autumn = 3
and winter = 4), which, as a variable corresponding to the season, was correlated (pair by pair) with all the measured parameters.
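The season coding described above can be illustrated with a minimal sketch that computes Spearman's R between the season code and one measured parameter. The water temperature values and the helper names (`average_ranks`, `pearson`, `spearman_r`) are hypothetical and are not taken from the study, which used STATISTICA; the rank correlation is computed here as the Pearson correlation of tie-averaged ranks.

```python
# Season codes as assigned in the text.
SEASON_CODE = {"spring": 1, "summer": 2, "autumn": 3, "winter": 4}


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_r(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical observations: season label and water temperature (deg C).
seasons = ["spring", "summer", "autumn", "winter",
           "spring", "summer", "autumn", "winter"]
water_temp = [12.0, 25.5, 16.0, 4.0, 13.5, 26.0, 15.0, 3.5]

season_codes = [SEASON_CODE[s] for s in seasons]
r = spearman_r(season_codes, water_temp)
print(f"Spearman R (season vs. WT) = {r:.3f}")
```

Because the season code is an ordinal variable with many ties, tie-averaged ranks matter here; a shortcut formula that assumes no ties would not be appropriate.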
River water quality data sets were subjected to four multivariate techniques:
cluster analysis (CA), principal component analysis (PCA), factor analysis (FA)
and discriminant analysis (DA) (Wunderlin et al., 2001; Simeonov et al., 2003;
Singh et al., 2004, 2005). DA was applied to raw data, whereas PCA, FA and CA
were applied to experimental data, standardized through z-scale transformation
to avoid misclassifications arising from the different orders of magnitude of both
numerical values and variance of the parameters analyzed (Liu et al., 2003; Simeonov et al., 2003). All mathematical and statistical computations were made
using Microsoft Office Excel 2003 and STATISTICA 6.
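As an illustration of the z-scale transformation mentioned above, the following sketch standardizes two parameters that differ by orders of magnitude. The sample values and the function name `z_scale` are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption; the text does not state which denominator was used.

```python
from statistics import mean, stdev


def z_scale(column):
    """Standardize a column to zero mean and unit (sample) standard deviation."""
    m = mean(column)
    s = stdev(column)  # assumption: sample standard deviation (n - 1)
    return [(v - m) / s for v in column]


# Hypothetical measurements on very different scales.
ec = [143.0, 155.0, 204.0, 247.0, 258.0]   # electrical conductivity
bod = [0.8, 1.0, 2.7, 1.3, 1.6]            # biochemical oxygen demand

ec_z = z_scale(ec)
bod_z = z_scale(bod)
print([round(v, 2) for v in ec_z])
print([round(v, 2) for v in bod_z])
```

After the transformation both columns are dimensionless and contribute comparably to distance-based methods such as CA and to PCA/FA.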
Table 1
Water quality parameters, units and analytical methods used during 1995–2002 for surface waters of the Fuji river basin
Parameters                       Abbreviations   Units        Analytical methods
Discharge                        Q               m³/s         Current meter
Temperature                      WT              °C           Mercury thermometer
Dissolved oxygen                 DO              mg l⁻¹       Winkler azide method
Biochemical oxygen demand        BOD             mg l⁻¹       Winkler azide method
Chemical oxygen demand (Mn)      CODMn           mg l⁻¹       Potassium permanganate
pH                               pH              pH unit      pH-meter
Total suspended solids           TSS             mg l⁻¹       Dried at 103–105 °C
Electrical conductivity          EC              µS cm⁻¹      Electrometric
Total coliforms                  TC              MPN/100 ml   Multiple tube method
Nitrate nitrogen                 NO3-N           mg l⁻¹       Ion chromatographic
Ammoniacal nitrogen              NH4-N           mg l⁻¹       Phenate
Inorganic dissolved phosphorus   PO4-P           mg l⁻¹       Ascorbic acid
Table 2
Range, mean and S.D. of water quality parameters at different locations of the Fuji river basin during 1995–2002

Q (m³/s)
Station   Range         Mean    S.D.
1         0.72–22.2     10.21   5.27
2         1.7–32.1      13.35   10.8
3         1.7–32.1      13.28   10.7
4         0.01–7.0      1.26    1.55
5         2.8–27.0      7.49    5.06
6         0.62–11.3     2.65    2.47
7         0.5–10.4      2.72    2.57
8         3.0–116.1     12.09   17.57
9         0–36          2.15    5.04
10        8.6–57.9      20.43   10.62
11        17.9–36.4     24.78   5.93
12        0.95–77.1     41.16   15.91
13        3.2–85.8      23.03   22.27

WT (°C)
Station   Range         Mean    S.D.
1         0.8–27.0      13.35   6.90
2         1.1–29        14.52   7.40
3         1.8–32.8      15.76   7.70
4         0.5–29.4      12.83   6.90
5         3.6–27.8      13.72   6.00
6         3–31.5        15.71   7.30
7         2.0–33.2      14.25   6.90
8         3.2–29.0      15.89   6.70
9         3.2–34.0      17.53   8.20
10        4–31.0        17.47   6.70
11        5.5–28.5      16.72   5.30
12        1–29.5        16.45   6.70
13        0.6–27.5      15.82   6.10

DO (mg l⁻¹)
Station   Range         Mean    S.D.
1         7.8–13.9      10.24   1.50
2         7.8–13.6      10.27   1.40
3         7.4–14.2      10.58   1.50
4         7.4–15.0      10.45   1.70
5         7–14.0        10.45   1.40
6         7–14.0        9.70    1.60
7         7–13.4        10.06   1.60
8         6.8–14.0      10.16   1.50
9         7–14.0        10.45   1.40
10        5.2–10.4      7.98    1.20
11        5.5–10.8      8.09    1.10
12        6–12.6        8.83    1.20
13        7.2–13.9      9.69    1.30

BOD (mg l⁻¹)
Station   Range         Mean    S.D.
1         0–4.4         0.82    0.40
2         0.2–7.6       1.04    0.70
3         0.1–4.3       1.11    0.50
4         0.3–4.4       0.74    0.30
5         0.1–8.1       0.88    0.70
6         0.6–10.5      2.73    1.50
7         0.4–17.7      1.27    1.60
8         0.5–8.8       1.64    1.00
9         0.5–10.0      2.67    1.50
10        0.6–8.7       3.14    1.40
11        0.8–5.6       2.52    0.90
12        0.1–101.0     2.54    7.20
13        0.2–3.2       0.82    0.40

CODMn (mg l⁻¹)
Station   Range         Mean    S.D.
1         1.1–19.0      2.02    1.50
2         1.1–17.0      2.36    1.20
3         0.8–8.2       2.35    0.80
4         0.8–4.8       2.10    0.50
5         0.7–7.6       1.98    0.80
6         2.3–21.0      4.43    2.20
7         0.00–22.0     2.55    2.20
8         1.6–17.0      3.07    1.50
9         1.9–9.1       4.56    1.30
10        2.5–8.7       4.33    1.00
11        2.1–7.2       3.98    0.90
12        2.1–8.8       3.75    1.00
13        0.5–11.0      2.03    1.10

pH
Station   Range         Mean    S.D.
1         6.9–9.0       7.95    0.30
2         7.4–9.3       8.17    0.40
3         7.6–10.0      8.30    0.50
4         7.1–8.3       7.51    0.10
5         7.1–9.3       7.78    0.30
6         7.3–9.2       7.92    0.30
7         7.1–9.2       7.81    0.20
8         7.3–9.5       8.03    0.40
9         7–9.0         7.79    0.40
10        7.1–8.0       7.43    0.10
11        7.3–8.2       7.57    0.10
12        7.3–8.6       7.65    0.10
13        7–9.4         8.06    0.30

TSS (mg l⁻¹)
Station   Range         Mean    S.D.
1         1–153         7.71    15.10
2         1–170.0       9.91    16.50
3         1–190.0       12.74   21.50
4         0–39.0        2.77    4.00
5         1–57.0        4.79    7.00
6         1–348.0       17.81   34.50
7         0.6–77.0      5.01    7.50
8         1–120.0       8.93    12.00
9         1–34.0        7.62    5.50
10        3–74.0        14.30   9.80
11        0.00–49.0     14.59   7.90
12        3–111.0       16.53   14.30
13        1–734.0       17.93   66.50

NO3-N (mg l⁻¹)
Station   Range         Mean    S.D.
1         0.3–0.9       0.69    0.10
2         0.21–1.24     0.76    0.20
3         0.27–1.66     0.94    0.30
4         0.13–1.10     0.36    0.21
5         0.59–1.14     0.84    0.16
6         1.4–2.6       1.95    0.30
7         0.75–1.89     1.30    0.20
8         0.84–2.37     1.52    0.30
9         0.01–1.8      0.90    0.40
10        1.2–2.56      1.95    0.30
11        0.01–0.82     1.22    0.60
12        0.21–2.09     1.59    0.30
13        0.38–2.0      1.25    0.30

NH4-N (mg l⁻¹)
Station   Range         Mean    S.D.
1         0.01–0.2      0.04    0.04
2         0.01–0.20     0.04    0.04
3         0.01–0.20     0.08    0.06
4         0.01–0.60     0.05    0.10
5         0.01–0.10     0.02    0.02
6         0.01–104.0    8.81    28.80
7         0.01–0.30     0.03    0.05
8         0.01–0.60     0.07    0.10
9         0.0–1.30      0.25    0.32
10        0.02–3.0      0.84    0.60
11        0.1–0.82      0.36    0.20
12        0.01–0.80     0.27    0.20
13        0.01–0.20     0.03    0.04

For PO4-P, EC and TC the station columns could not be recovered from the source; the value sets are listed in the order in which they appear.

PO4-P (mg l⁻¹)
Range         Mean    S.D.
0.01–0.08     0.03    0.01
0.02–0.14     0.04    0.02
0.03–0.33     0.09    0.05
0.0–0.10      0.01    0.01
0.01–0.11     0.06    0.02
0.00–0.16     0.03    0.02
0.01–0.11     0.04    0.02
0.01–0.45     0.08    0.05
0.01–0.41     0.13    0.07
0.06–0.22     0.12    0.03
0.02–0.53     0.10    0.05
0.01–0.20     0.05    0.01
0.00–0.04     0.01    0.01

EC (µS cm⁻¹) (10 of 13 value sets present)
Range         Mean     S.D.
101–185       143.28   16.90
18–208.0      154.81   28.40
126–186.0     146.71   12.40
109–216.0     147.86   23.70
95–179.0      138.25   19.40
103–314.0     203.79   49.80
165.0–333.0   246.75   37.70
180–305.0     257.63   25.50
152–278.0     217.17   26.30
14.7–258.0    200.84   59.50

TC (MPN/100 ml)
Range          Mean      S.D.
79–9.2E+04     7.6E+03   1.4E+04
130–1.6E+05    1.1E+04   2.0E+04
23–9.2E+04     8.1E+03   1.6E+04
7600–7.9E+05   1.3E+05   2.4E+05
490–1.4E+06    4.7E+04   1.7E+05
1700–1.4E+06   5.3E+04
54–2.4E+05     3.9E+04   5.7E+04
2300–2.4E+05   4.7E+04   4.6E+04
3300–7.9E+05   3.9E+04   9.8E+04
33–2.4E+05     1.2E+04   2.9E+04
0–3.5E+04      4.5E+03   6.3E+03
200–1.6E+05    1.3E+04   2.3E+04
1700–4.9E+05   3.2E+04   5.9E+04

Stations: 1 Funayamabashi; 2 Singenbashi; 3 Sangunnishibashi; 4 Sakurabashi; 5 Kikkobashi; 6 Omogawabashi; 7 Hikawabashi; 8 Ukaibashi; 9 Futagawabashi; 10 Torinkyo; 11 Sangunhigashibashi; 12 Fujibashi; 13 Nanbubashi.
f(G_i) = k_i + \sum_{j=1}^{n} w_{ij} p_{ij}    (3)

where i is the number of groups (G), k_i is the constant inherent to each group, n
is the number of parameters used to classify a set of data into a given group, and w_j
is the weight coefficient assigned by DA to a given selected parameter (p_j).
The weight coefficient maximizes the distance between the means of the criterion (dependent) variable. The classification table, also called a confusion,
assignment or prediction matrix or table, is used to assess the performance
of DA. This is simply a table in which the rows are the observed categories
of the dependent variable and the columns are the predicted categories. When prediction is perfect, all cases lie on the diagonal. The percentage of cases on the diagonal is the percentage of correct classifications.
In this study, four groups for temporal (four seasons) and three groups for
spatial (three sampling regions) evaluations were selected, and n is the number
of analytical parameters used to assign a measurement from a monitoring site to
a group (season or monitoring area). DA was performed on each raw data matrix using standard, forward stepwise and backward stepwise modes in constructing discriminant functions to evaluate both the spatial and temporal
variations in river water quality of the basin. The site (spatial) and the season
(temporal) were the grouping (dependent) variables, whereas all the measured
parameters constituted the independent variables.
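The way a classification function of the form of Eq. (3) is applied can be sketched as follows. The constants, weights and sample values are hypothetical and are not the coefficients of this study; the rule of assigning an observation to the group with the highest score is the usual convention for classification functions and is assumed here, since the text does not state it explicitly.

```python
def classification_score(constant, weights, params):
    """Evaluate k_i + sum_j w_ij * p_j for one group."""
    return constant + sum(weights[name] * params[name] for name in weights)


def assign_group(functions, params):
    """Score every group and return the group with the highest score."""
    scores = {
        group: classification_score(constant, weights, params)
        for group, (constant, weights) in functions.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical classification functions: group -> (k_i, {parameter: w_ij}).
functions = {
    "winter": (-5.0, {"WT": 0.2, "DO": 0.6}),
    "summer": (-12.0, {"WT": 0.8, "DO": 0.3}),
}

cold_sample = {"WT": 4.0, "DO": 12.0}   # hypothetical measurement
warm_sample = {"WT": 26.0, "DO": 8.0}   # hypothetical measurement

print(assign_group(functions, cold_sample)[0])
print(assign_group(functions, warm_sample)[0])
```

In stepwise modes only the parameters retained by DA appear in `weights`, which is how the reduction to a few indicator parameters shows up in practice.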
where z is the measured variable, a is the factor loading, f is the factor score, e
the residual term accounting for errors or other source of variation, i the sample number and m the total number of factors.
Cluster analysis was used to detect the similarity groups between the sampling sites. It yielded a dendrogram (Fig. 2),
grouping all 13 sampling sites of the basin into three statistically significant clusters at (Dlink/Dmax) × 100 < 60. Since
we used hierarchical agglomerative cluster analysis, the number of clusters was also decided by practicality of the results as
there is ample information (e.g. landuse, location of wastewater treatment plants etc.) available on the study sites. The cluster 1 (Funayamabashi (1), Singenbashi (2), Sangunnishibashi
(3), Sakurabashi (4) and Nanbubashi (13)) corresponds to
[Fig. 2: dendrogram of the 13 sampling sites grouped into Cluster 1, Cluster 2 and Cluster 3; horizontal axis (Dlink/Dmax) × 100, from 0 to 120.]
Fig. 2. Dendrogram showing clustering of sampling sites according to surface water quality characteristics of the Fuji river basin.
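The cut-off criterion (Dlink/Dmax) × 100 < 60 can be illustrated with a small, self-contained sketch of agglomerative clustering. Average linkage and Euclidean distance are assumptions made only for this example (this excerpt does not state the linkage rule or distance measure used), and the five two-dimensional points are hypothetical stand-ins for standardized site data.

```python
def euclidean(a, b):
    """Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def agglomerate(points):
    """Average-linkage clustering; returns [(members_a, members_b, distance)]."""
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(euclidean(points[i], points[j])
                            for i in clusters[a] for j in clusters[b])
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return merges


def cut(n_points, merges, threshold_pct):
    """Apply only merges with (Dlink/Dmax)*100 below the threshold."""
    d_max = max(d for _, _, d in merges)
    parent = list(range(n_points))

    def find(i):
        while parent[i] != i:
            i = parent[i]
        return i

    for left, right, d in merges:
        if d / d_max * 100 < threshold_pct:
            parent[find(left[0])] = find(right[0])
    groups = {}
    for i in range(n_points):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical standardized observations for five sites.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
merges = agglomerate(points)
print(cut(len(points), merges, 60))
print(cut(len(points), merges, 20))
```

Rescaling the linkage distance by the maximum distance makes the threshold independent of the units of the data, which is why the same cut-off can be quoted as a percentage.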
Table 3
Classification functions (Eq. (3)) for discriminant analysis of temporal variations in water quality of the Fuji river basin

Standard mode
Parameters   Winter         Spring         Summer         Autumn
             coefficient(a) coefficient(a) coefficient(a) coefficient(a)
Q            0.251          0.272          0.214          0.267
WT           1.321          0.245          0.767          0.066
DO           6.604          4.997          4.051          5.092
BOD          4.978          4.330          4.806          4.892
CODMn        4.105          4.106          4.163          3.852
pH           41.854         41.902         40.679         40.656
TSS          0.029          0.045          0.009          0.011
EC           0.034          0.060          0.062          0.053
TC           0.000          0.000          0.000          0.000
NO3-N        10.436         8.318          6.638          9.057
NH4-N        31.312         30.014         27.968         28.548
PO4-P        6.114          23.103         23.192         22.752
Constant     206.94         198.68         196.14         192.44

Forward stepwise mode: coefficients identical to those of the standard mode.

Backward stepwise mode: the row and column positions of the coefficients could not be recovered from the source; the values appear in the order 87.67, 91.15, 86.64, 94.30, 10.301, 9.705, 11.875, 7.943, 0.020, 0.015, 0.017, 0.010, 0.206, 3.243, 11.189, 0.078, 0.274, 2.291, 12.273, 0.398, 0.252, 1.229, 13.699, 0.123, 0.259, 2.392, 12.202, 0.348.

(a) Discriminant function coefficient for winter, spring, summer and autumn seasons corresponds to wij as defined in Eq. (3).
Table 4
Classification matrix for discriminant analysis of temporal variation in water quality of the Fuji river basin

Monitoring seasons   % Correct   Season assigned by DA
                                 Winter   Spring   Summer   Autumn
Standard DA mode
Winter               99.5        183      1        0        0
Spring               59.4        6        57       4        29
Summer               92.0        0        1        172      14
Autumn               71.4        5        11       12       70
Total                85.3        194      70       188      113
Forward stepwise DA mode
Winter               99.5        183      1        0        0
Spring               59.4        6        57       4        29
Summer               92.0        0        1        172      14
Autumn               71.4        5        11       12       70
Total                85.3        194      70       188      113
Backward stepwise DA mode
Winter                           182      3        0        0
Spring                           8        62       6        21
Summer                           0        5        169      13
Autumn                           5        12       11       70
Total                            195      82       186      104
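How the "% Correct" column follows from a classification matrix can be checked with a short sketch. The counts are those of the standard DA mode in Table 4 (rows are the observed seasons, columns the seasons assigned by DA); the function name `percent_correct` is introduced only for this illustration.

```python
SEASONS = ["Winter", "Spring", "Summer", "Autumn"]

# Standard DA mode of Table 4: rows = observed season, columns = assigned season.
matrix = [
    [183, 1, 0, 0],
    [6, 57, 4, 29],
    [0, 1, 172, 14],
    [5, 11, 12, 70],
]


def percent_correct(matrix):
    """Return per-group and overall percentages of cases on the diagonal."""
    per_group = [100 * row[i] / sum(row) for i, row in enumerate(matrix)]
    diagonal = sum(matrix[i][i] for i in range(len(matrix)))
    total = 100 * diagonal / sum(sum(row) for row in matrix)
    return per_group, total


per_group, total = percent_correct(matrix)
for season, pct in zip(SEASONS, per_group):
    print(f"{season}: {pct:.1f}%")
print(f"Total: {total:.1f}%")
```

The per-season values reproduce the 99.5, 59.4, 92.0 and 71.4% reported in the table, and the overall value reproduces 85.3%.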
[Fig. 3: six panels (a)–(f) showing mean, SE and SD by season; vertical axes include Q (m³ s⁻¹), WT (°C), DO (mg l⁻¹) and EC (µS cm⁻¹); horizontal axis: Season.]
Fig. 3. Temporal variations: (a) discharge, (b) temperature, (c) DO, (d) BOD, (e) EC and (f) NO3-N in surface water quality of the Fuji river basin.
respective water quality data sets. An Eigenvalue gives a measure of the significance of the factor: the factors with the highest
Eigenvalues are the most significant. Eigenvalues of 1.0 or
greater are considered significant (Kim and Mueller, 1987).
Equal numbers of VFs were obtained for three sites through
FA performed on the PCs. Corresponding VFs, variable loadings and explained variance are presented in Table 7. Liu
et al. (2003) classified the factor loadings as strong, moderate and weak, corresponding to absolute loading values of
>0.75, 0.75–0.50 and 0.50–0.30, respectively.
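The two screening rules just described (eigenvalues of 1.0 or greater, and the strong/moderate/weak loading classes) can be expressed as a short sketch. The function names and the example values are illustrative only, and the label "unclassified" for absolute loadings below 0.30 is an assumption, since the classification quoted above does not name that range.

```python
def significant_factors(eigenvalues, threshold=1.0):
    """Keep the factors whose eigenvalue is at least the threshold."""
    return [ev for ev in eigenvalues if ev >= threshold]


def loading_strength(loading):
    """Classify a factor loading by its absolute value."""
    a = abs(loading)
    if a > 0.75:
        return "strong"
    if a >= 0.50:
        return "moderate"
    if a >= 0.30:
        return "weak"
    return "unclassified"  # assumption: no class is defined below 0.30


# Illustrative values only.
eigenvalues = [2.67, 2.05, 1.67, 1.36, 1.03, 0.85, 0.60]
loadings = {"Q": 0.831, "EC": -0.734, "PO4-P": 0.325, "WT": 0.033}

print(significant_factors(eigenvalues))
for name, value in loadings.items():
    print(name, loading_strength(value))
```

The sign of a loading is ignored when assigning a class; it only indicates whether the variable increases or decreases with the factor.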
For the data set pertaining to LP sites, among five VFs,
VF1, explaining 22.23% of total variance, has strong positive
loading on discharge. VF2, explaining 17.09% of the total
variance, has strong positive loadings on temperature and
strong negative loadings on dissolved oxygen. VF1 and VF2
represent the seasonal impact of discharge and temperature.
Table 5
Classification functions (Eq. (3)) for discriminant analysis of spatial variations of water quality in the Fuji river basin

Standard mode
Parameters   LP(a)     MP(a)     HP(a)
Q            0.120     0.057     0.146
WT           0.143     0.022     0.033
DO           3.594     3.521     3.991
BOD          5.145     4.292     4.807
CODMn        3.183     3.569     3.577
pH           50.579    50.436    46.929
TSS          0.050     0.022     0.026
EC           0.094     0.114     0.086
TC           0.000     0.000     0.000
NO3-N        10.339    16.997    11.667
NH4-N        23.916    20.225    24.336
PO4-P        50.648    64.076    44.699
Constant     222.46    229.15    205.47

Forward stepwise mode: coefficients identical to those of the standard mode.

Backward stepwise mode: the row and column positions of the coefficients could not be recovered from the source; the values appear in the order 0.102, 0.513, 0.030, 0.374, 0.111, 0.406, 0.710, 0.308, 0.215, 51.748, 51.393, 48.712, 0.106, 0.127, 0.094, 8.431, 15.155, 14.821, 10.881, 9.663, 15.913, 203.16, 208.44, 185.31.

(a) Coefficients for different monitoring regions correspond to wij as defined in Eq. (3).

Table 6
Classification matrix for discriminant analysis of spatial variations of water quality in the Fuji river basin

Monitoring regions   % Correct   Regions assigned by DA
                                 LP       MP       HP
Standard DA mode
LP                   90.6        202      5        16
MP                   84.9        13       107      6
HP                   75.9        47       5        164
Total                83.7        262      117      186
Forward stepwise DA mode
LP                   90.6        202      5        16
MP                   84.9        13       107      6
HP                   75.9        47       5        164
Total                83.7        262      117      186
Backward stepwise DA mode
LP                               200      5        18
MP                               14       105      7
HP                               54       6        158
Total                            268      116      183
[Fig. 4: seven panels (a)–(g) showing mean, SE and SD by monitoring region (LP, MP, HP); vertical axes include Q (m³ s⁻¹), WT (°C), pH and EC (µS cm⁻¹); horizontal axis: Monitoring Region.]
Fig. 4. Spatial variations: (a) discharge, (b) temperature, (c) BOD, (d) pH, (e) EC, (f) NO3-N and (g) NH4-N in surface water quality of the Fuji river basin.
Table 7
Loadings of experimental variables (12) on significant principal components for (a) LP sites, (b) MP sites and (c) HP sites data sets
(only the VF3–VF5 columns for the LP and MP data sets are present in the source)

Variables               LP sites                    MP sites
                        VF3      VF4      VF5       VF3      VF4      VF5
Q                       0.429    0.144    0.109     0.060    0.035    0.831
WT                      0.034    0.073    0.344     0.167    0.022    0.033
DO                      0.075    0.124    0.057     0.027    0.175    0.008
BOD                     0.417    0.519    0.405     0.138    0.290    0.069
CODMn                   0.728    0.092    0.103     0.243    0.739    0.098
pH                      0.005    0.056    0.850     0.791    0.209    0.162
TSS                     0.810    0.040    0.148     0.134    0.882    0.030
EC                      0.164    0.048    0.658     0.268    0.009    −0.734
TC                      0.038    0.173    0.405     0.232    0.150    0.010
NO3-N                   0.007    0.053    0.247     0.263    0.449    0.264
NH4-N                   0.010    0.854    0.096     0.180    0.106    0.003
PO4-P                   0.078    0.555    0.326     0.799    0.226    0.325
Eigenvalue              1.67     1.36     1.03      1.62     1.15     1.12
% Total variance        13.92    11.32    8.62      13.51    9.60     9.31
Cumulative % variance   53.24    64.56    73.18     58.69    68.30    77.61

4. Conclusions
In this case study, different multivariate statistical techniques were used to evaluate spatial and temporal variations [...] sampling sites. Varifactors obtained from factor analysis indicate that the parameters responsible for water quality variations are mainly related to discharge and temperature (natural) and organic pollution (point source: domestic wastewater) in relatively less polluted areas; organic pollution (point source: domestic wastewater) and nutrients (non-point sources: agriculture and orchard plantations) in medium polluted areas; and organic pollution and nutrients (point sources: domestic wastewater, wastewater treatment plants and industries) in highly polluted areas in the basin. Discriminant analysis gave the best results both spatially and temporally. For the three sampling regions of the basin, it yielded an important data reduction, as it used only six parameters (discharge, temperature, dissolved oxygen, biochemical oxygen demand, electrical conductivity and nitrate nitrogen), affording more than 85% correct assignations in temporal analysis, and seven parameters (discharge, temperature, biochemical oxygen demand, pH, electrical conductivity, nitrate nitrogen and ammoniacal nitrogen), affording more than 81% correct assignations in spatial analysis. Therefore, DA allowed a reduction in the dimensionality of the large data set, delineating a few indicator parameters responsible for large variations in water quality. Thus, this study illustrates the usefulness of multivariate statistical techniques for analysis and interpretation of complex data sets, and in water quality assessment, identification of pollution sources/factors and understanding temporal/spatial variations in water quality for effective river water quality management.
Acknowledgements
The authors sincerely thank Yuki Hiraga for her help in the
database development and the Fuji Xerox Setsutaro Kobayashi
Memorial Fund for providing funding support. We would also
like to acknowledge the help and support provided by the 21st
Century Center of Excellence (COE), Integrated River Basin
Management in Asian Monsoon Region, University of
Yamanashi.
References
Abdul-Wahab, S.A., Bakheit, C.S., Al-Alawi, S.M., 2005. Principal component and multiple regression analysis in modelling of ground-level ozone and factors affecting its concentrations. Environmental Modelling & Software 20 (10), 1263–1271.
Adams, S., Titus, R., Pietesen, K., Tredoux, G., Harris, C., 2001. Hydrochemical characteristic of aquifers near Sutherland in the Western Karoo, South Africa. Journal of Hydrology 241, 91–103.
Bricker, O.P., Jones, B.F., 1995. Main factors affecting the composition of natural waters. In: Salbu, B., Steinnes, E. (Eds.), Trace Elements in Natural Waters. CRC Press, Boca Raton, FL, pp. 1–5.
Brumelis, G., Lapina, L., Nikodemus, O., Tabors, G., 2000. Use of an artificial model of monitoring data to aid interpretation of principal component analysis. Environmental Modelling & Software 15 (8), 755–763.
Dixon, W., Chiswell, B., 1996. Review of aquatic monitoring program design. Water Research 30, 1935–1948.
EDYP, 2004. Result of Water Quality Measurement: Public and Ground Water. Atmospheric Water Quality Control Section, Forest and Environment Division of Yamanashi Prefecture.
Otto, M., 1998. Multivariate methods. In: Kellner, R., Mermet, J.M., Otto, M., Widmer, H.M. (Eds.), Analytical Chemistry. Wiley-VCH, Weinheim.
Reghunath, R., Murthy, T.R.S., Raghavan, B.R., 2002. The utility of multivariate statistical techniques in hydrogeochemical studies: an example from Karnataka, India. Water Research 36, 2437–2442.
Sarbu, C., Pop, H.F., 2005. Principal component analysis versus fuzzy principal component analysis. A case study: the quality of Danube water (1985–1996). Talanta 65, 1215–1220.
Simeonov, V., Stratis, J.A., Samara, C., Zachariadis, G., Voutsa, D., Anthemidis, A., Sofoniou, M., Kouimtzis, T., 2003. Assessment of the surface water quality in Northern Greece. Water Research 37, 4119–4124.
Simeonova, P., Simeonov, V., Andreev, G., 2003. Environmetric analysis of the Struma River water quality. Central European Journal of Chemistry 2, 121–126.
Simeonov, V., Simeonova, P., Tsitouridou, R., 2004. Chemometric quality assessment of surface waters: two case studies. Chemical and Engineering Ecology 11 (6), 449–469.
Singh, K.P., Malik, A., Mohan, D., Sinha, S., 2004. Multivariate statistical techniques for the evaluation of spatial and temporal variations in water quality of Gomti River (India): a case study. Water Research 38, 3980–3992.
Singh, K.P., Malik, A., Sinha, S., 2005. Water quality assessment and apportionment of pollution sources of Gomti river (India) using multivariate statistical techniques: a case study. Analytica Chimica Acta 538, 355–374.
Vega, M., Pardo, R., Barrado, E., Deban, L., 1998. Assessment of seasonal and polluting effects on the quality of river water by exploratory data analysis. Water Research 32, 3581–3592.
Wunderlin, D.A., Diaz, M.P., Ame, M.V., Pesce, S.F., Hued, A.C., Bistoni, M.A., 2001. Pattern recognition techniques for the evaluation of spatial and temporal variations in water quality. A case study: Suquia river basin (Cordoba, Argentina). Water Research 35, 2881–2894.