Defining Homogenous Climate Zones of Bangladesh Using Cluster Analysis

International Journal of Statistics and Mathematics
Vol. 6(1), pp. 119-129, February, 2019. © www.premierpublishers.org. ISSN: 2375-0499
Research Article
*Md. Siraj-Ud-Doulah¹, Md. Nazmul Islam²
¹,²Department of Statistics, Begum Rokeya University, Rangpur, Bangladesh.
Climate zones of Bangladesh are identified using cluster analysis. Monthly rainfall data from 34 climate stations for the period 1991 to 2013 are used. Five agglomerative hierarchical clustering methods, each combined with six commonly used proximity measures, are chosen to perform the regionalization. In addition, three popular techniques (K-means, fuzzy and density-based clustering) are applied to decide the most suitable method for identifying homogeneous regions. The stability of the clusters is also tested using nine validity indices. The Ward method based on Euclidean distance, K-means and fuzzy clustering are found most likely to yield acceptable results in this particular case, as is often the case in climatological research. The analysis identifies seven distinct climate zones in Bangladesh.
Keywords: Clustering Techniques, Validity Indices, Rainfalls, Climate Zones, Bangladesh.
INTRODUCTION
Rainfall plays an important role in the agro-economy of Bangladesh, which is located in the tropical zone. Its climate is characterized by large variations in seasonal rainfall, with moderately warm temperatures and high humidity. Due to its geographic location and dense population, Bangladesh has been identified as one of the countries most vulnerable to climate change (Islam, 2009). This investigation uses monthly records of one important climatic variable, rainfall, observed at 34 ground-based stations of the Bangladesh Meteorological Department (BMD) distributed over the country during the period 1991-2013 (http://www.data.gov.bd/). Based on the combined trend of rainfall and maximum temperature intensity (determined by GIS mapping), Bangladesh is divided geographically into four regions: the North-Eastern, South-Eastern, South-Western and North-Western Regions. Other research has studied and analyzed the information from each station by grouping the stations into one of the eight hydrological (planning) regions of Bangladesh: North East (NE), North Central (NC), North West (NW), South East (SE), South Central (SC), South West (SW), Eastern Hill (EH) and River and Estuary (RE). These regions are defined in qualitative terms, not quantitatively. This zone classification has been used not only for differences in climate but also for social and economic variables.

Many climatic studies have used a variety of data to define climatic types and delineate zones of similar climate. Several methods have also been applied to detect homogeneous climate zones. In this study, cluster analysis is used. Cluster analysis applied to meteorological variables is a suitable approach for identifying climate zones, and its use is becoming increasingly common in atmospheric research (Erin, 1984; Kalkstein et al. 1987; Tayan et al. 1998). Choosing appropriate data to cluster is an initial consideration in cluster analysis. In climate classification, the variability of long-term rainfall is the most readily available variable (Linacre, 1992). In this study we intend to define spatially homogeneous climate regions of Bangladesh using cluster analysis.

*Corresponding Author: Md. Siraj-Ud-Doulah, Department of Statistics, Begum Rokeya University, Rangpur, Bangladesh. Email: sdoulah_brur@yahoo.com
Climate Data
The investigation has been carried out using daily records of one important climatic variable, rainfall, observed at 34 ground-based stations of the Bangladesh Meteorological Department (BMD) distributed over the country during the period 1991-2013 (http://www.data.gov.bd/). Although the BMD has thirty-six (36) ground-based stations, only data from thirty-four (34) stations are used in this research. At the initial stage, the quality of the rainfall data is checked against the following criteria (Erin, 1984; Masoodian, 2005):

(i) Non-existence of dates
(ii) Negative monthly rainfall
(iii) Monthly winter rainfall > 100 mm
(iv) Weather stations with > 35% missing data
(v) Stations with gaps of three or more years within the series

If any of the above conditions holds for a dataset, it is identified as erroneous. Two BMD stations are therefore discarded after applying these conditions to the data period 1991 to 2013. An R-based program is used to detect the homogeneous climate zones.

METHODOLOGY
For clustering purposes there are two widely used families of methods: hierarchical and non-hierarchical (partitional). Hierarchical clustering is called divisive when a large data set is divided into several smaller groups, and agglomerative when small groups are merged to create larger clusters (Dyeret, 1975; Gan et al. 2007; Sarah et al. 2011). Many descriptive statistics are available in the literature (Doulah, 2018) for evaluating the data; we first applied the most frequently used measures in our analysis and then used the clustering techniques.

Agglomerative Algorithms
Some of the agglomerative algorithms are single linkage, complete linkage, average linkage, centroid and Ward's method. Several proximity measures are used: Euclidean distance, Minkowski distance, Manhattan distance, maximum distance, correlation-based distance and Canberra distance. Partitional clustering aims to recover the natural grouping present in the data through a single partition. The partitional algorithms considered are K-means, fuzzy and model-based clustering techniques (Hossen et al. 2015; Han & Kamber, 2006; Johnson & Wichern, 1998).
Table 1: Some of the agglomerative algorithms

| Method | Statistic | Explanation |
|---|---|---|
| Single linkage | $D_{12} = \min_{i,j} d(x_i, y_j)$ | The distance between the closest members of the two clusters. |
| Complete linkage | $D_{12} = \max_{i,j} d(x_i, y_j)$ | The distance between the members that are farthest apart (most dissimilar). |
| Average linkage | $D_{12} = \frac{1}{kl} \sum_{i=1}^{k} \sum_{j=1}^{l} d(x_i, y_j)$ | Looks at the distances between all pairs and averages them. Also called UPGMA (Unweighted Pair Group Method with Arithmetic Mean). |
| Centroid method | $D_{12} = d(\bar{x}, \bar{y})$ | Finds the mean vector location of each cluster and takes the distance between the two centroids. |
| Ward method | $D_{12} = \frac{2kl}{k+l} \, \lVert \bar{x} - \bar{y} \rVert$ | Minimizes the total within-cluster variance. Those clusters are combined whose merger results in the minimum information loss (ESS criterion). |

Here $x_i$ ($i = 1, \dots, k$) and $y_j$ ($j = 1, \dots, l$) are the members of the two clusters, and $\bar{x}$, $\bar{y}$ are their mean vectors.
Distance Measures
Distances are normally used to measure the similarity or dissimilarity between two data objects. Although various distance measures are available in the literature (Hossen & Doulah, 2016; Meila, 2007; Yashwant & Sananse, 2015), six commonly used distance measures are considered here. A simple description of these measures is given in Table 2.

Non-hierarchical Algorithms
K-means clustering
K-means clustering partitions n objects into k clusters in which each object belongs to the cluster with the nearest mean. K-means is a relatively efficient method (Gong & Richman, 1995; Nathan & McMahon, 1990). However, the number of clusters must be specified in advance, the final result is sensitive to initialization, and the algorithm often terminates at a local optimum.
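The K-means procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the program used in the study: the function `kmeans`, its `seed` and `max_iter` parameters, and the toy points are all hypothetical. Initial centers are drawn at random from the data, objects are assigned to the nearest center by Euclidean distance, and centers are recomputed until the assignment stops changing.

```python
import math
import random


def euclidean(x, y):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def mean_vector(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def kmeans(points, k, seed=0, max_iter=100):
    """Plain K-means: returns (labels, centers)."""
    rng = random.Random(seed)
    # Pick k distinct data points at random as the initial centers.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest center by Euclidean distance.
        new_labels = [
            min(range(k), key=lambda j: euclidean(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # same assignment in consecutive rounds: converged
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = mean_vector(members)
    return labels, centers


# Hypothetical toy data with two well-separated groups.
toy_points = [[1.0, 1.0], [1.5, 2.0], [1.0, 0.5],
              [8.0, 8.0], [9.0, 9.0], [8.5, 9.5]]
labels, centers = kmeans(toy_points, k=2)
print(labels)
print(centers)
```

Because the starting centers are random, different seeds can give different partitions on less clearly separated data, which is the sensitivity to initialization noted above.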
Table 2: Some of the distance measures

| Distance | Statistic |
|---|---|
| Euclidean | $d(x, y) = \sqrt{\sum_{i=1}^{p} (x_i - y_i)^2}$ |
| Manhattan | $d(x, y) = \sum_{i=1}^{p} \lvert x_i - y_i \rvert$ |
| Minkowski | $d(x, y) = \left( \sum_{i=1}^{p} \lvert x_i - y_i \rvert^{m} \right)^{1/m}$ |
| Maximum | $d(x, y) = \max_{i} \lvert x_i - y_i \rvert$ |
| Correlation | $d_{cor}(x, y) = 1 - \dfrac{\sum_{i=1}^{p} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{p} (x_i - \bar{x})^2 \sum_{i=1}^{p} (y_i - \bar{y})^2}}$ |
| Canberra | $d(x, y) = \sum_{i=1}^{p} \dfrac{\lvert x_i - y_i \rvert}{\lvert x_i + y_i \rvert}$ |

Algorithm (K-means)
1. Fix the number of groups k in advance.
2. Select k points at random as cluster centers.
3. Assign objects to their closest cluster center according to the Euclidean distance function.
4. Calculate the centroid or mean of all objects in each cluster.
5. Repeat steps 3 and 4 until the same points are assigned to each cluster in consecutive rounds.

Fuzzy clustering
Fuzzy clustering (fuzzy c-means, FCM) is a clustering algorithm developed by Dunn and later improved by Bezdek (Luxburg, 2010). It is useful when the required number of clusters is pre-determined; the algorithm then tries to assign each of the data points to the clusters. What makes FCM different is that it does not decide the absolute membership of a data point in a given cluster; instead, it calculates the likelihood (the degree of membership) that a data point belongs to that cluster. Hence, depending on the accuracy of the clustering required in practice, appropriate tolerance measures can be put in place. Since absolute membership is not calculated, FCM can be extremely fast, because the number of iterations required to complete a specific clustering exercise corresponds to the required accuracy.

Model-Based clustering
The model-based clustering framework consists of three major steps (Baldwin & Lakshmivarahan, 2002; Everitt, 1993):
(a) Initialize the EM (expectation-maximization) algorithm using the partitions from model-based agglomerative hierarchical clustering;
(b) Estimate the parameters using the EM algorithm;
(c) Choose the model and the number of clusters according to the BIC (Bayesian information criterion).

In this method, a model is hypothesized for each cluster to find the best fit of the data to the given model. The method also locates the clusters by clustering the density function, so it reflects the spatial distribution of the data points. It also provides a way to determine the number of clusters, based on standard statistics that take outliers or noise into account. It therefore yields robust clustering methods.

Validity Indices
In the literature on data clustering, many clustering algorithms have been proposed for different applications and different sizes of data. But clustering a dataset is an unsupervised process; there are no predefined classes and no examples that can show that the clusters found by the clustering algorithms are valid (Hardy, 1996; Luxburg, 2010). To compare the results of different clustering algorithms, it is necessary to develop some validity criteria. Also, if the number of clusters is not given to the clustering algorithm, finding the optimal number of clusters in the data set is a highly nontrivial task. To do this, we need some cluster validity methods. The notation used for the validity indices is:

n = number of observations, p = number of variables, q = number of clusters;
$X = \{x_{ij}\}$, $i = 1, 2, \dots, n$; $j = 1, 2, \dots, p$, is the $n \times p$ data matrix of p variables measured on n independent observations;
$\bar{x}$ = centroid of the data matrix X;
$n_k$ = number of objects in cluster $C_k$, and $c_k$ = centroid of cluster $C_k$;
$x_i$ = p-dimensional vector of observations of the i-th object in cluster $C_k$;
$W_q = \sum_{k=1}^{q} \sum_{i \in C_k} (x_i - c_k)(x_i - c_k)^T$ is the within-group dispersion matrix for data clustered into q clusters;
$B_q = \sum_{k=1}^{q} n_k (c_k - \bar{x})(c_k - \bar{x})^T$ is the between-group dispersion matrix for data clustered into q clusters;
$T$ = total sum of squares matrix;
$\bar{S}^2 = \frac{1}{p} \sum_{j=1}^{p} \frac{BGSS_j}{TSS_j}$, where $BGSS_j = \sum_{k=1}^{q} n_k (c_{kj} - \bar{x}_j)^2$ and $TSS_j = \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2$.

The cluster validity indices are given in Table 3.

Table 3: Some of the validity indices

| Validity index | Statistic | Criterion for selection |
|---|---|---|
| Krzanowski and Lai | $KL(q) = \left\lvert \dfrac{DIFF_q}{DIFF_{q+1}} \right\rvert$ | Maximum value of the index |
| Calinski and Harabasz | $CH(q) = \dfrac{\mathrm{trace}(B_q) / (q - 1)}{\mathrm{trace}(W_q) / (n - q)}$ | Maximum value of the index |
| Scott and Symons | $Scott = n \log \dfrac{\det(T)}{\det(W_q)}$ | Maximum difference between hierarchy levels of the index |
| Marriot | $Marriot = q^2 \det(W_q)$ | Maximum value of second differences between levels of the index |
| TrCovW | $TrCovW = \mathrm{trace}(\mathrm{cov}(W_q))$ | Maximum difference between hierarchy levels of the index |
| TraceW | $TraceW = \mathrm{trace}(W_q)$ | Maximum value of absolute second differences between levels of the index |
| Friedman and Rubin | $Friedman = \mathrm{trace}(W_q^{-1} B_q)$ | Maximum difference between hierarchy levels of the index |
| Rubin | $Rubin = \dfrac{\det(T)}{\det(W_q)}$ | Maximum value of second differences between levels of the index |
| Ratkowsky | $Ratkowsky = \dfrac{\bar{S}}{q^{1/2}}$ | Maximum value of the index |
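To make two of the simpler indices in Table 3 concrete, the sketch below computes the Calinski and Harabasz index from trace(B_q) and trace(W_q), and the Ratkowsky index from the per-variable ratios BGSS_j / TSS_j. It is a minimal illustration: the function names, the helper `group_by_label` and the four toy points are assumptions made for this example, and the values are not results from the rainfall data.

```python
import math


def column_means(rows):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def group_by_label(points, labels):
    """Collect the points of each cluster in a dict keyed by label."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    return clusters


def calinski_harabasz(points, labels):
    """CH(q) = [trace(B_q)/(q-1)] / [trace(W_q)/(n-q)]."""
    n = len(points)
    clusters = group_by_label(points, labels)
    q = len(clusters)
    grand = column_means(points)
    trace_w = 0.0  # within-group sum of squares
    trace_b = 0.0  # between-group sum of squares
    for members in clusters.values():
        c = column_means(members)
        trace_w += sum((a - b) ** 2 for m in members for a, b in zip(m, c))
        trace_b += len(members) * sum((a - b) ** 2 for a, b in zip(c, grand))
    return (trace_b / (q - 1)) / (trace_w / (n - q))


def ratkowsky(points, labels):
    """Ratkowsky = S_bar / sqrt(q), with S_bar^2 the mean of BGSS_j/TSS_j."""
    clusters = group_by_label(points, labels)
    q = len(clusters)
    p = len(points[0])
    grand = column_means(points)
    centers = {lab: column_means(m) for lab, m in clusters.items()}
    ratio_sum = 0.0
    for j in range(p):
        bgss = sum(len(clusters[lab]) * (centers[lab][j] - grand[j]) ** 2
                   for lab in clusters)
        tss = sum((pt[j] - grand[j]) ** 2 for pt in points)
        ratio_sum += bgss / tss
    return math.sqrt(ratio_sum / p) / math.sqrt(q)


# Hypothetical toy example: four points in two clusters.
points = [[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]]
labels = [0, 0, 1, 1]
print(calinski_harabasz(points, labels))  # 8.0
print(ratkowsky(points, labels))          # 0.5
```

In practice such an index is evaluated for each candidate number of clusters and the selection criterion in the last column of Table 3 is then applied across those values.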
Settling the number of clusters is a difficult task, since there is no specific method for this purpose; the number results from repeatedly assigning trial clusters until the optimal value is found. Some of the indices used for establishing the number of clusters can also be employed to validate cluster quality.

RESULTS AND DISCUSSIONS
The descriptive statistics for the monthly rainfall data of the 34 meteorological stations are summarized in Table 4, which gives the mean, standard deviation (SD), coefficient of variation (CV), skewness (S) and kurtosis (K). The Chuadanga, Rajshahi and Ishurdi stations recorded the lowest mean monthly rainfall.
Table 4: Descriptive statistics of selected meteorological stations

| S/No | Station | Mean (mm) | Standard Deviation (SD, mm) | Coefficient of Variation (CV, %) | Skewness (S) | Kurtosis (K) |
|---|---|---|---|---|---|---|
| 1 | Barisal | 170.14130 | 182.63744 | 107.3445632 | 1.231543 | 1.694753 |
| 2 | Bhola | 180.69565 | 196.6325455 | 108.8197437 | 1.174785 | 1.173446 |
| 3 | Bogra | 141.34782 | 160.1997005 | 113.3372227 | 1.254129 | 1.168203 |
| 4 | Chandpur | 165.47463 | 175.7854562 | 106.2310567 | 1.238886 | 1.718038 |
| 5 | Chittagong | 245.56521 | 289.6960779 | 117.9711365 | 1.42098 | 1.660857 |
| 6 | Chuadanga | 125.63768 | 144.921468 | 115.3487287 | 1.599033 | 3.050739 |
| 7 | Comilla | 172.57608 | 181.6644199 | 105.2662759 | 1.246822 | 1.446507 |
| 8 | Cox's Bazar | 315.66666 | 366.0148796 | 115.9498035 | 1.103031 | 0.226135 |
| 9 | Dhaka | 167.45652 | 175.4360228 | 104.7651181 | 1.126478 | 0.823015 |
| 10 | Dinajpur | 163.86231 | 196.6004048 | 119.9790203 | 1.25614 | 1.160553 |
| 11 | Faridpur | 143.28623 | 148.4809432 | 103.6254086 | 1.104658 | 0.958157 |
| 12 | Feni | 240.85869 | 258.866027 | 107.4763053 | 0.956138 | -0.03495 |
| 13 | Hatiya | 260.87681 | 300.1593141 | 115.0578744 | 1.076978 | 0.453877 |
| 14 | Ishurdi | 120.08333 | 132.3241074 | 110.1935662 | 1.314394 | 1.715892 |
| 15 | Jessore | 139.82608 | 152.5011076 | 109.0648469 | 1.352668 | 2.405591 |
| 16 | Khepupara | 238.42029 | 258.0383992 | 108.2283724 | 0.921879 | -0.10036 |
| 17 | Khulna | 148.85144 | 156.7208394 | 105.2867407 | 1.140018 | 1.149058 |
| 18 | Kutubdia | 260.01087 | 319.3049666 | 122.8044686 | 1.661191 | 3.375483 |
| 19 | M.court | 248.76449 | 265.0863303 | 106.5611605 | 0.928332 | -0.07512 |
| 20 | Madaripur | 157.97826 | 166.8337209 | 105.6054928 | 1.15376 | 1.174079 |
| 21 | Mongla | 160.46739 | 169.7316889 | 105.773321 | 1.075138 | 1.189396 |
| 22 | Mymensingh | 182.43840 | 195.008102 | 106.8898301 | 1.055405 | 0.518139 |
| 23 | Patuakhali | 214.17391 | 241.5987481 | 112.8049372 | 1.15553 | 0.703598 |
| 24 | Rajshahi | 116.77898 | 134.3791315 | 115.0713297 | 1.416645 | 2.283301 |
| 25 | Rangamati | 213.88043 | 234.8292769 | 109.794651 | 1.337556 | 1.671646 |
| 26 | Rangpur | 183.99275 | 203.9673948 | 110.8562108 | 1.044524 | 0.442857 |
| 27 | Sandwip | 301.01087 | 383.7351879 | 127.4821698 | 2.278171 | 9.364013 |
| 28 | Satkhira | 145.73550 | 147.8694718 | 101.4642722 | 0.877968 | -0.13951 |
| 29 | Sitakunda | 260.95289 | 291.0339461 | 111.5273859 | 1.218129 | 0.953242 |
| 30 | Srimangal | 193.93115 | 192.6867235 | 99.3583105 | 1.050778 | 0.752041 |
| 31 | Sydpur | 178.93478 | 215.9816055 | 120.7040925 | 1.271583 | 0.923776 |
| 32 | Sylhet | 323.86594 | 324.7227743 | 100.2645639 | 0.85881 | -0.05229 |
| 33 | Tangail | 151.64855 | 157.5703353 | 103.9049398 | 1.076241 | 0.772751 |
| 34 | Teknaf | 367.02173 | 454.7473299 | 123.9020149 | 1.213534 | 0.588181 |
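The columns of Table 4 are linked by simple formulas, which the following Python sketch illustrates. The coefficient of variation is 100 × SD / mean, and applying it to the Barisal row reproduces the tabulated CV. The `describe` function shows one common way to obtain all five statistics from a raw series; the use of the sample standard deviation (divisor n − 1) and of moment-based skewness and excess kurtosis are assumptions, since the exact estimators behind Table 4 are not stated. The toy series is hypothetical.

```python
import statistics


def coefficient_of_variation(mean, sd):
    """CV in percent: 100 * SD / mean."""
    return 100.0 * sd / mean


def describe(values):
    """Mean, SD, CV, skewness and excess kurtosis of a numeric series.

    Assumptions: sample SD (divisor n - 1); skewness and kurtosis from
    central moments with divisor n; kurtosis reported as excess (minus 3).
    """
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "sd": sd,
        "cv": coefficient_of_variation(mean, sd),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3.0,
    }


# Check against the Barisal row of Table 4 (mean and SD as printed).
barisal_cv = coefficient_of_variation(170.14130, 182.63744)
print(round(barisal_cv, 4))  # 107.3446, matching the tabulated 107.3445632

# Hypothetical toy series, not station data.
toy = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
summary = describe(toy)
print(summary)
```

A CV above 100%, as at almost every station in Table 4, means the standard deviation of monthly rainfall exceeds its mean, which is consistent with the strongly seasonal rainfall described in the introduction.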
Hierarchical Clustering methods
The dendrograms of the linkage methods under the different distance measures, for the monthly rainfall data of the 34 meteorological stations, are shown in Figures 1-5. Each figure has six panels, one per distance measure: Euclidean, Minkowski, Manhattan, correlation, maximum and Canberra.

Figure 1: Dendrograms of single linkage for the selected rainfall stations under different distance measures
Figure 2: Dendrograms of complete linkage for the selected rainfall stations under different distance measures
Figure 3: Dendrograms of average linkage for the selected rainfall stations under different distance measures
Figure 4: Dendrograms of Ward linkage (Ward.D) for the selected rainfall stations under different distance measures
Figure 5: Dendrograms of centroid linkage for the selected rainfall stations under different distance measures
To sum up the dendrograms in Figures 1-5, all of the agglomerative hierarchical clustering methods (single linkage, complete linkage, average linkage, Ward and centroid), combined with the proximity measures (Euclidean distance, Minkowski distance, Manhattan distance, correlation method, maximum distance and Canberra distance), identify homogeneous climate zones in Bangladesh. The homogeneous climate zones are most clearly evident with the Ward method across the proximity measures. Therefore, we conclude that the Ward method is the best in this respect.
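For readers who want to see how a Ward dendrogram turns into a fixed number of groups, the sketch below performs Ward-type agglomeration directly: it starts with every object in its own cluster and repeatedly merges the pair whose merger increases the error sum of squares the least, stopping when q clusters remain. This is an illustrative assumption-laden sketch, not the R routine used for Figure 4; `ward_clusters`, `merge_cost` and the toy points are hypothetical names and data.

```python
def centroid(cluster):
    """Component-wise mean vector of a cluster (a list of vectors)."""
    n = len(cluster)
    return [sum(col) / n for col in zip(*cluster)]


def merge_cost(c1, c2):
    """Increase in error sum of squares caused by merging c1 and c2:
    k*l/(k+l) times the squared distance between the centroids."""
    k, l = len(c1), len(c2)
    m1, m2 = centroid(c1), centroid(c2)
    return k * l / (k + l) * sum((a - b) ** 2 for a, b in zip(m1, m2))


def ward_clusters(points, q):
    """Agglomerate singleton clusters until only q clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > q:
        best = None  # (cost, i, j) of the cheapest merge found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = merge_cost(clusters[i], clusters[j])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical toy points, not station data.
toy = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0], [20.0, 0.0]]
groups = ward_clusters(toy, q=3)
print(groups)
```

Stopping the merging at q clusters corresponds to cutting the dendrogram at the height where q branches remain; in the study this cut is made at seven clusters.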
The seven homogeneous climate zones in Bangladesh are shown below:
Cluster 1: Rangpur, Sydpur, Dinajpur
Cluster 2: Satkhira, Khulna, Mongla, Ishurdi, Chuadanga, Rajshahi, Jessore
Cluster 3: Barisal, Bhola, Chandpur, Madaripur, Srimangal, Comilla, Dhaka, Faridpur, Mymensingh, Bogra, Tangail
Cluster 4: Sandwip, Cox's Bazar, Teknaf
Cluster 5: Sylhet
Cluster 6: Hatiya, Khepupara, Patuakhali, Feni, M.court
Cluster 7: Kutubdia, Rangamati, Chittagong, Sitakunda

Nonhierarchical Clustering methods
The results of the nonhierarchical methods for the monthly rainfall data of the 34 meteorological stations are shown in Figures 6-8.
Figure 6: K-means clustering
Figure 7: Fuzzy clustering

Figure 8: Model based clustering
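The degree-of-membership idea behind the fuzzy clustering of Figure 7 can be illustrated with the standard fuzzy c-means membership update, $u_{ij} = 1 / \sum_{k} (d_{ij} / d_{ik})^{2/(m-1)}$, where $d_{ij}$ is the distance from point i to center j and m > 1 is the fuzzifier. The sketch below is a minimal illustration; the function name `fcm_memberships`, the choice m = 2 and the toy centers are assumptions and are not taken from the study.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def fcm_memberships(point, centers, m=2.0):
    """Degrees of membership of one point in each cluster.

    Uses the standard fuzzy c-means update with fuzzifier m > 1.
    The returned memberships are non-negative and sum to 1.
    """
    dists = [euclidean(point, c) for c in centers]
    # A point lying exactly on one or more centers belongs fully to them.
    zeros = [i for i, d in enumerate(dists) if d == 0.0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0
                for i in range(len(centers))]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists)
            for d_i in dists]


# Hypothetical cluster centers and points.
centers = [[0.0, 0.0], [4.0, 0.0]]
halfway = fcm_memberships([2.0, 0.0], centers)
near_first = fcm_memberships([1.0, 0.0], centers)
print(halfway)     # equal membership in both clusters
print(near_first)  # mostly the first cluster
```

A station lying between two groups thus receives comparable memberships in both, whereas K-means would assign it wholly to one of them.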
Reviewing the weather stations in the seven clusters, it is apparent from Figures 6-8 that the K-means, fuzzy and model-based clustering methods give results generally consistent with the hierarchical linkage methods. Weather stations with common or compatible geographical locations cluster together. These methods also depict seven homogeneous climate zones in Bangladesh.

Validity Indices
Many clustering algorithms have been designed, so it is important to decide how to choose a good clustering algorithm for a given data set and how to evaluate a clustering method. Validity indices are one technique that can help check the choice of the number of clusters. Here they are used to define the number of clusters for the rainfall data of the 34 meteorological stations. The values of the validity indices are shown in Table 5.
Table 5: Values of the different validity indices for selecting the number of clusters (columns give the number of clusters)

| Index name | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|
| Krzanowski and Lai | 2.53 | 1.40 | 1.5065 | 0.9954 | 0.0126 | 213.7057 | 0.8814 | 0.9856 | 1.1411 |
| Calinski and Harabasz | 23.15 | 12.68 | 8.9764 | 7.1303 | 5.8794 | 77.3498 | 67.5873 | 60.2785 | 54.5556 |
| Scott and Symons | 181.4 | 312.26 | 466.4679 | 726.1218 | 896.2206 | 1367.598 | 1520.601 | 1645.157 | 1755.115 |
| Marriot | 0 | -4.6E+54 | -6.7E+54 | 4.2E+54 | -3.7E+54 | 5.5E+54 | 2.0E+53 | 8.7E+52 | 0.0E+00 |
| TrCovW | 0 | 777.564 | 365.859 | 376.821 | 170.214 | 789.408 | 143.292 | 502.706 | 155.105 |
| TraceW | 0 | 0 | 18.684 | 0.504 | 20.313 | 4973.39 | 4951.4 | 0.6 | 2.543 |
| Friedman and Rubin | 0 | 3.0336 | 2.4694 | 3.1761 | 2.2267 | 218.1186 | 5.7366 | 4.9847 | 4.2741 |
| Rubin | 0 | 0 | -0.0023 | 0.0001 | -0.0025 | 1.6127 | -1.5764 | 0.0008 | -0.001 |
| Ratkowsky | 0.19 | 0.1622 | 0.1447 | 0.1328 | 0.1233 | 0.3005 | 0.2823 | 0.2673 | 0.2545 |
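For the indices whose selection rule is simply the maximum value of the index (Krzanowski and Lai, Calinski and Harabasz, and Ratkowsky), the choice of the number of clusters can be read off Table 5 mechanically. The short Python sketch below does this with the tabulated values; the variable and function names are illustrative assumptions, and the remaining indices, whose rules involve differences between levels, are left out.

```python
from collections import Counter

# Number of clusters for each column of Table 5.
cluster_counts = list(range(2, 11))

# Values copied from Table 5 for the indices selected by their maximum.
table5_max_rule = {
    "Krzanowski and Lai": [2.53, 1.40, 1.5065, 0.9954, 0.0126,
                           213.7057, 0.8814, 0.9856, 1.1411],
    "Calinski and Harabasz": [23.15, 12.68, 8.9764, 7.1303, 5.8794,
                              77.3498, 67.5873, 60.2785, 54.5556],
    "Ratkowsky": [0.19, 0.1622, 0.1447, 0.1328, 0.1233,
                  0.3005, 0.2823, 0.2673, 0.2545],
}


def best_q(values):
    """Number of clusters at which the index attains its maximum."""
    position = max(range(len(values)), key=values.__getitem__)
    return cluster_counts[position]


choices = {name: best_q(values) for name, values in table5_max_rule.items()}
print(choices)
# {'Krzanowski and Lai': 7, 'Calinski and Harabasz': 7, 'Ratkowsky': 7}

# Simple majority vote over the indices considered here.
majority_q, votes = Counter(choices.values()).most_common(1)[0]
print(majority_q, votes)  # 7 3
```

All three of these indices peak at seven clusters, in line with the number of climate zones reported in this study.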

Figure 9: Bar plot of different validity indices values
Here we checked the validity of the clusters of the climate variable, rainfall, using nine well-recognized validity indices. From Table 5 and Figure 9 we found that there are seven clusters in our dataset. Therefore, we conclude that there are seven homogeneous climate zones in Bangladesh.

CONCLUSION
Cluster analysis is an unsupervised machine learning method. It offers a way to partition a dataset into subsets that share common patterns. Notably, there are many cluster analysis algorithms to choose from, each making certain assumptions about the data and about how clusters should be formed. In this study, we applied five agglomerative hierarchical clustering techniques based on six proximity measures, three other popular clustering techniques, and nine cluster validity indices. Many previous studies on the subtyping of weather stations did not use objective, well-justified validation methods, or used no validation methods at all, and all of them employed a single clustering method. Here, formal methods of cluster validation examine how well a clustering fits the dataset. The goal of this study is to identify similar weather stations within a group of weather stations with rainfall data by using cluster analysis. To sum up the whole discussion, we conclude that the Ward method, K-means and fuzzy clustering are the best of the methods considered, and that Bangladesh has seven homogeneous climate zones on the basis of the rainfall data.

ACKNOWLEDGEMENTS
The author would like to thank the anonymous reviewers for their helpful comments to enhance the quality of this paper.

REFERENCES
Baldwin ME, Lakshmivarahan S (2002). Rainfall classification using histogram analysis: an example of data mining in meteorology. Technical Report, 4:342-357.
Dyeret TGJ (1975). Assignment of stations into homogeneous group. Quarterly Journal of Meteorological Society, 101:1005-1012.
Doulah MSU (2018). Alternative Measures of Standard Deviation Coefficient of Variation and Standard Error. International Journal of Statistics and Applications, 8(6):309-315.
Erin S (1984). Climatology and its methods. 3rd edn. Istanbul.
Everitt BS (1993). Cluster analysis. Edward Arnold, London.
Gong X, Richman MB (1995). On the application of cluster analysis to growing season precipitation data in North America east of the Rockies. Journal of Climate, 8:897-931.
Gan G, Ma C, Wu J (2007). Data Clustering: Theory, Algorithms, and Applications. ASA, Alexandria.
Hardy A (1996). On the number of clusters. Computational Statistics and Data Analysis, 23:83-96.
Han J, Kamber M (2006). Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers, San Francisco.
Hossen MB, Doulah MSU (2016). Identification of Robust Clustering Methods in Gene Expression Data Analysis. Current Bioinformatics, 12:558-562.
Hossen MB, Doulah MSU, Hoque A (2015). Methods for Evaluating Agglomerative Hierarchical Clustering for Gene Expression Data: A Comparative Study. Computational Biology and Bioinformatics, 4(6):88-94.
http://www.data.gov.bd/
Islam MN (2009). Rainfall and Temperature Scenario for Bangladesh. The Open Atmospheric Science Journal, 3:93-103.
Johnson R, Wichern D (1998). Applied Multivariate Statistical Analysis. Englewood Cliffs, NJ: Prentice-Hall.
Kalkstein LS, Tan GR, Skindlov JA (1987). An evaluation of three clustering procedures for use in synoptic climatological classification. Journal of Climate and Applied Meteorology, 26:717-730.
Linacre E (1992). Climate Data and Resources: A Reference and Guide. Routledge, London and New York.
Luxburg U (2010). Clustering stability: An overview. Found. Trends Mach. Learn. 2(3):235-274.
Meila M (2007). Comparing clusterings: an information based distance. Journal of Multivariate Analysis, 98(5):873-895.
Masoodian SA (2005). Regionalization of Precipitation Regimes of Iran Using Cluster Analysis. Journal of Research in Geography, 52:47-61.
Nathan RJ, McMahon TA (1990). Identification of homogeneous regions for the purpose of regionalization. J. Hydrol. 121:217-238.
Sarah N, Alaa K, El-Halees M (2011). Implementation of Data Mining Techniques for Meteorological Data Analysis. IJICT Journal, 1(3):25-37.
Yashwant S, Sananse SL (2015). Comparisons of Different Methods of Cluster Analysis with application to Rainfall Data. IJIRSET Journal, 4(11):49-64.
Tayan M, Dalfes N, Karaca M, Yenigun O (1998). A comparative assessment of different methods for detecting inhomogeneities in Turkish temperature data set. International Journal of Climatology, 18:561-578.

Accepted 2 January 2019

Citation: Doulah S, Islam N (2019). Defining Homogenous Climate zones of Bangladesh using Cluster Analysis. International Journal of Statistics and Mathematics, 6(1): 119-129.

Copyright: © 2019 Doulah and Islam. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are cited.
