A Comparative Study of Remotely Sensed Data Classification Using Principal Components Analysis and Divergence
Chih-Cheng Hung*!, Ahmed Fahsi!, Wubishet Tadesse! and Tommy Coleman!
*Department of Mathematics and Computer Science,
!Center for Hydrology, Soil Climatology, and Remote Sensing,
Alabama A&M University, Normal, AL 35762, U.S.A.
ABSTRACT
This paper investigates principal components analysis (PCA) and divergence for transforming and selecting data bands for multispectral image classification. Because the principal components are mutually uncorrelated, a color composite of the first three components can provide maximum visual separability of image features. Principal components analysis is therefore used to generate a new set of data. Divergence, a measure of statistical separability, is employed as a feature selection method to choose the optimal m-band subset from the n-band data for use in the automated classification process. Classification accuracy is assessed using large-scale aerial photographs. Classification results on Landsat Thematic Mapper (TM) data show that PCA is a more effective approach than divergence.
1. INTRODUCTION
Image classification is the process of automatically categorizing each pixel of an image into one of several classes. It is an important tool for analyzing remotely sensed data of the earth and extracting useful thematic information. Several approaches, including per-pixel, textural, and contextual algorithms, have already been developed for remotely sensed multispectral image classification. However, a major challenge for researchers in the field of image processing is to increase the classification accuracy of automated interpretation of remotely sensed multispectral data.

Image classification can be done by either a pixel-based or a region-based approach. In the latter case, the image must be divided into homogeneous regions and a set of meaningful features has to be defined. Once these features are defined, image regions (blocks) can be categorized using pattern recognition techniques [1]. However, image segmentation has proven to be an elusive goal [2]. In pixel-based classification, spectral information (pixel value) is used to classify each pixel in the image. One of the main drawbacks of this method is that each pixel is treated independently, without consideration for its neighbors. In most natural scenes, objects that have similar spectral responses tend to cluster; hence, groups of like pixels should occur together.

Principal components analysis (PCA), which has been widely used in pattern recognition and remote sensing applications, mathematically establishes a new set of variables that describe the variance in the original data set. Because the principal components are mutually uncorrelated, a combination of the first three components is useful in providing maximum visual separability of image features [4]. Principal components analysis can therefore be used in image classification to improve the accuracy. Divergence, a measure of statistical separability, is employed as a feature selection method to choose the optimal subset of the original data set for use in the automated classification process. This study compares the classification accuracy obtained with PCA and with divergence. The site selected for this analysis is a 15 km by 9 km area in the Huntsville region, Northern Alabama, U.S.A.

The paper is organized as follows. An image classification scheme is briefly described in Section 2. Section 3 gives a brief description of principal components analysis based on the contents of [5]. The divergence analysis is sketched in Section 4. Classification accuracy assessment is discussed in Section 5. Results are shown in Section 6. Conclusion and discussion then follow.
0-78034053-1/97/$10.00 © 1997 IEEE
2. IMAGE CLASSIFICATION SYSTEM

Supervised and unsupervised classification are the most common methods used in image classification. In this study we used the supervised classification technique. This method is usually divided into two stages [6]: the training stage and the classification stage. The training stage determines the spectral signature of each of an optimal number of spectral classes. The labeled classes produced by training are then used in the classification stage, in which each unknown pixel is assigned to one of them. A classified image appears as a mosaic of uniform parcels. Each pixel in the classified image is identified by a value internally and a color externally.
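The two stages can be illustrated with a short sketch. It assumes a Gaussian maximum likelihood decision rule with equal class priors (Section 6 states that a maximum likelihood classifier was used but gives no implementation details), pixel arrays of shape (pixels, bands), and the hypothetical function names train_signatures and classify_pixels. It is an illustration only, not the software used in this study.

```python
import numpy as np


def train_signatures(pixels, labels):
    """Training stage: estimate a mean vector and covariance matrix per class.

    pixels: array of shape (n_pixels, n_bands) holding training samples.
    labels: array of shape (n_pixels,) holding the class label of each sample.
    """
    signatures = {}
    for c in np.unique(labels):
        samples = pixels[labels == c]
        signatures[c] = (samples.mean(axis=0), np.cov(samples, rowvar=False))
    return signatures


def classify_pixels(pixels, signatures):
    """Classification stage: assign each pixel to the most likely class.

    Uses the Gaussian log-likelihood (up to an additive constant) and
    assumes equal prior probabilities for all classes.
    """
    classes = sorted(signatures)
    scores = []
    for c in classes:
        mean, cov = signatures[c]
        diff = pixels - mean
        inv = np.linalg.inv(cov)
        _, logdet = np.linalg.slogdet(cov)
        # Squared Mahalanobis distance of every pixel to the class mean.
        maha = np.einsum("ij,jk,ik->i", diff, inv, diff)
        scores.append(-0.5 * (logdet + maha))
    return np.asarray(classes)[np.argmax(scores, axis=0)]
```

Calling train_signatures on labeled training pixels and then classify_pixels on the full image (reshaped to one row per pixel) reproduces the training/classification split described above.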
3. PRINCIPAL COMPONENTS ANALYSIS

Principal components analysis is a multivariate statistical transformation technique based on the statistical properties of vector representations. PCA provides a systematic means of reducing the dimensionality of multispectral data. It has been used in image data compression [7], image enhancement [8], and pattern classification [9]. To perform the PCA, the axes of the spectral space are rotated, changing the coordinates of each pixel in spectral space, and the data values as well. In other words, each principal component is formed as a linear combination of the input bands. The new axes are parallel to the axes of the ellipse (in an n-dimensional histogram, a hyperellipsoid is formed if the distribution of each input band is normal or near normal). If there is significant correlation among the original image bands, most of the image information will be contained in the first few bands (principal components) after the PCA transformation. These principal components are mutually uncorrelated (and, for normally distributed data, independent).

The first principal component shows the direction and length of the widest transect of the ellipse. Therefore, it measures the highest variation within the data. The second principal component is the widest transect of the ellipse that is orthogonal to the first principal component. Hence, the second principal component describes the largest amount of variance in the data that is not described by the first principal component [10]. In n-dimensional space, there are n principal components. Each successive principal component is the widest transect of the ellipse that is orthogonal to the previous components in the n-dimensional space and accounts for a decreasing amount of variation in the data which is not already accounted for by the previous principal components.

Mathematically, let $X^T = [x_1, x_2, \ldots, x_N]$ be an N-dimensional random variable with mean vector $M$ and covariance matrix $C$, and let $A$ be a matrix whose rows are formed from the eigenvectors of $C$, ordered so that the first row of $A$ is the eigenvector corresponding to the largest eigenvalue and the last row is the eigenvector corresponding to the smallest eigenvalue [5]. The PCA transformation is then defined as:

$$Y = A\,(X - M)$$

where $Y = [y_1, y_2, \ldots, y_N]^T$, the superscript $T$ denotes the transpose, and each $y_i$ is the $i$-th principal component.
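As a minimal sketch of the transformation Y = A (X - M) described in this section, the following NumPy code estimates M and C from the pixels, builds A from the eigenvectors of C sorted by decreasing eigenvalue, and projects every pixel. The function name pca_transform and the (pixels, bands) array layout are assumptions made for illustration; the study itself relied on the procedure described in [5].

```python
import numpy as np


def pca_transform(pixels):
    """Apply Y = A (X - M) to every pixel.

    pixels: array of shape (n_pixels, n_bands), one spectral vector per row.
    Returns (Y, eigenvalues, A), where the rows of A are the eigenvectors of
    the band covariance matrix sorted by decreasing eigenvalue.
    """
    M = pixels.mean(axis=0)                    # mean vector
    C = np.cov(pixels, rowvar=False)           # band covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)       # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    eigvals = eigvals[order]
    A = eigvecs[:, order].T                    # rows of A are eigenvectors
    Y = (pixels - M) @ A.T                     # row-wise form of A (X - M)
    return Y, eigvals, A
```

The first three columns of Y are the components that would be used for a color composite or passed to the classifier, and eigvals / eigvals.sum() gives the fraction of the total variance carried by each component.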
4. DIVERGENCE ANALYSIS

Statistical methods of feature selection are used to quantitatively select the subset of bands that provides the greatest degree of statistical separability between two classes [12, 13]. Since remotely sensed data consist of several spectral bands, each band represents a feature in N-dimensional feature space; in other words, each point represents a pixel of N bands in N-dimensional feature space. To reduce the computation time while maintaining the same classification accuracy, how many bands (features) should be selected for the classification process? This is the problem known in pattern recognition as feature selection.

Several measures of statistical separability have been used in the machine processing of remotely sensed data [11]. Signature separability is a statistical measure of distance between two signatures. The greater the statistical separability of the classes, the smaller the probability of error. Separability can be calculated for any combination of bands that will be used in the classification, thereby ruling out any bands that do not contribute to the classification accuracy.

Divergence is one of the popular measures of statistical separability. It is a covariance-weighted distance measure between class means collected in the training phase of the supervised classification. The degree of divergence between class $c_i$ and class $c_j$ is computed as [11]:

$$\mathrm{Diverg}(c_i, c_j) = 0.5\,\mathrm{Tr}\left[(C_i - C_j)(C_j^{-1} - C_i^{-1})\right] + 0.5\,\mathrm{Tr}\left[(C_i^{-1} + C_j^{-1})(M_i - M_j)(M_i - M_j)^T\right]$$

where Tr is the trace of a matrix, $C_i$ and $C_j$ are the covariance matrices for classes i and j, $M_i$ and $M_j$ are the mean vectors for classes i and j, and the superscript $T$ denotes the transpose. Since more than two classes are defined in the training stage in practical applications, the average divergence is usually computed. The computation involves taking the average over all possible pairs of classes. Assuming that m classes are already defined in the training phase, the average divergence can be expressed as:

$$\mathrm{Diverg}_{avg} = \sum_{i=1}^{m-1} \sum_{j=i+1}^{m} \mathrm{Diverg}(c_i, c_j) \,/\, m$$

Using the average divergence, the subset of bands having the maximum average divergence would be selected as the most appropriate set for the classification. However, to bound the range of the divergence, a transformed divergence is normally used [12]. The transformed divergence scales the degree of divergence to lie between 0 and 2000. The saturated value of 2000 indicates excellent separability, while a low value suggests poor separability. In this study, we used the transformed divergence defined as:

$$\mathrm{Diverg}_T(c_i, c_j) = 2000\left[1 - \exp\left(-\mathrm{Diverg}(c_i, c_j)/8\right)\right]$$

5. CLASSIFICATION ACCURACY ASSESSMENT

Various methods have been developed to evaluate classification accuracy. In our analysis, we used the technique developed by [13], which is based on pixel-by-pixel comparison. This technique compares a number of sampled pixels to their corresponding ground truth data to determine the accuracy of the classification. The number of pixels to sample is determined from a predicted accuracy, which is itself estimated using a preliminary sampling scheme. Thus, one hundred points were randomly sampled by overlaying a regular grid onto the classified image. Comparison of the 100 sampled points to their ground truth points (i.e., large-scale photographs) yielded an 81% accuracy. This predicted accuracy was then used to determine the number, N, of pixels to sample to evaluate the classification accuracy. This number is determined by binomial probability theory as [14]:

$$N = \frac{Z^2\,p\,q}{E^2}$$

where N is the number of points (i.e., pixels) to be sampled, p is the predicted accuracy in percent (81), q = 100 - p, Z = 2 is the standard normal deviate (rounded from 1.96) for the two-sided 95% confidence interval, and E = 5 is the allowed error in percent.

Once the number of points to be sampled was determined, it was necessary to select the most efficient and objective sampling design. For this, we used stratified systematic sampling with three elements per stratum, which is considered adequate for land use/cover classification accuracy assessment [15]. The sampling procedure was carried out using statistical random number tables. The N points were then identified on both classified images, from PCA and from divergence. The ground truth for both images was extracted from the 1:10,000 aerial photographs. Error matrices were generated for both images to quantify and assess their classification accuracy. The Kappa statistic has traditionally been used to evaluate classification accuracy. It is determined as [12]:

$$\hat{K} = \frac{N \sum_{i=1}^{r} x_{ii} - \sum_{i=1}^{r} x_{i+}\,x_{+i}}{N^2 - \sum_{i=1}^{r} x_{i+}\,x_{+i}}$$

where r is the number of rows in the matrix, $x_{ii}$ is the number of observations in row i and column i (i.e., along the matrix diagonal), $x_{i+}$ and $x_{+i}$ are the marginal totals for row i and column i, respectively, and N is the total number of observations.
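The formulas of Sections 4 and 5 can be checked numerically with the sketch below. It makes several assumptions that are not stated in the paper: class signatures are supplied as (mean vector, covariance matrix) pairs of NumPy arrays, band subsets are ranked by the mean transformed divergence over all class pairs (a constant divisor does not change the ranking), and all function names are hypothetical. It illustrates the computations and is not the authors' implementation.

```python
from itertools import combinations

import numpy as np


def divergence(Mi, Ci, Mj, Cj):
    """Divergence between two classes given their means and covariances."""
    Ci_inv, Cj_inv = np.linalg.inv(Ci), np.linalg.inv(Cj)
    d = (Mi - Mj).reshape(-1, 1)
    cov_term = 0.5 * np.trace((Ci - Cj) @ (Cj_inv - Ci_inv))
    mean_term = 0.5 * np.trace((Ci_inv + Cj_inv) @ d @ d.T)
    return cov_term + mean_term


def transformed_divergence(div):
    """Scale a divergence value into the range 0 to 2000."""
    return 2000.0 * (1.0 - np.exp(-div / 8.0))


def mean_transformed_divergence(signatures, bands):
    """Average transformed divergence over all class pairs for a band subset.

    signatures: list of (mean, covariance) pairs, one per class.
    bands: tuple of band indices to evaluate.
    """
    idx = np.asarray(bands)
    sub = np.ix_(idx, idx)
    values = [
        transformed_divergence(divergence(Mi[idx], Ci[sub], Mj[idx], Cj[sub]))
        for (Mi, Ci), (Mj, Cj) in combinations(signatures, 2)
    ]
    return float(np.mean(values))


def best_band_subset(signatures, n_bands, m):
    """Exhaustively pick the m-band subset with the highest separability."""
    return max(
        combinations(range(n_bands), m),
        key=lambda bands: mean_transformed_divergence(signatures, bands),
    )


def sample_size(p, Z=2.0, E=5.0):
    """Number of sample points N = Z^2 p q / E^2, with p, q and E in percent."""
    return Z**2 * p * (100.0 - p) / E**2


def kappa(error_matrix):
    """Kappa statistic of a square error (confusion) matrix."""
    x = np.asarray(error_matrix, dtype=float)
    N = x.sum()
    chance = (x.sum(axis=1) * x.sum(axis=0)).sum()  # sum of x_i+ * x_+i
    return (N * np.trace(x) - chance) / (N**2 - chance)
```

With the values reported above (p = 81, Z = 2, E = 5), sample_size(81) evaluates to 246.24, that is, at least 247 points after rounding up.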
6. RESULTS

To quantitatively compare the classified results derived from the PCA and divergence methods, 7 bands of TM multispectral images were used. The supervised classification approach was employed to create 6 spectral classes: cotton (C), forest (F), water (W), pasture (P), grass (G), and residential (R). The maximum likelihood classifier was then applied to the first three principal components for the PCA and to bands 1, 3 and 5 for the divergence analysis. Classification results are shown in Figure 1. Visual inspection of Figure 1 shows that the classified image from the divergence method exhibits a large amount of noise, expressed by scattered pixels of cotton inside forested areas. This anomaly does not appear on the classified image generated by the PCA.

Statistical analysis of the classification accuracy is presented in the error matrices below (Table 1). These error matrices were established by comparing the classified images to the 1:10,000 aerial photographs. Classes listed in the column denote the computer classification results, while classes in the row represent reference (actual) results. The overall accuracy for divergence is 78%, while the accuracy for PCA is 81%. The Kappa statistic is 70% and 74% for divergence and PCA, respectively. Most confusion is shown between cotton and grass and between cotton and residential for both images. The classified image using divergence presents more confusion between cotton and forest, also shown visually in Figure 1 (scattered cotton pixels inside forested areas), while the classified image using PCA presents slightly higher confusion between pasture and grass.

Table 1. Error matrices of the classified images from Divergence (a) and PCA (b).

7. CONCLUSION AND DISCUSSION

A major challenge for researchers in the field of image processing and remote sensing is to increase classification accuracy in the automated interpretation of remotely sensed multispectral data. Principal components analysis and divergence for multispectral image classification were compared in this study. Classification results on the Landsat Thematic Mapper (TM) data showed that PCA is an effective approach for automated image classification. Although the overall classification accuracy shows only a slight difference between these two approaches, we believe that if random groups of pixels instead of single pixels were evaluated, a higher classification accuracy would have resulted for the PCA. This was visually detected when examining the classified images derived from the PCA and divergence methods; the divergence method resulted in an image containing a large number of scattered, wrongly classified pixels (e.g., a large number of cotton pixels scattered in forested areas).

8. ACKNOWLEDGMENTS

This work was supported by Grant No. NCCW0084 from the National Aeronautics and Space Administration (NASA), Washington, DC. Any use of trade, product or firm names is for descriptive purposes only and does not imply endorsement by the U.S. Government. The authors wish to thank Mr. Donvilla Williams for helping produce the output product and Dr. John Adams for proofreading it.

9. REFERENCES

IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. PAMI-4, No. 3, 304-306, 1982.
[3] Z. Zhang, "A new spatial classification algorithm for high ground resolution images," Proceedings of IGARSS '88 Symposium, Edinburgh, Scotland, Sep. 13-16, 1988.
[4] A. A. D. Canas and M. E. Barnett, "The Generation and Interpretation of False-Colour Composite Principal Component Images," Int. J. Remote Sensing, Vol. 6, No. 6, 867-881, 1985.
[5] ERDAS, IMAGINE: Field Guide (Third Edition), ERDAS, Inc., Atlanta, Georgia, 1995.
[6] T. M. Lillesand and R. W. Kiefer, Remote Sensing and Image Interpretation (2nd Edition), John Wiley & Sons, 1987.
[7] R. C. Gonzalez and R. E. Woods, Digital Image Processing, Addison-Wesley, 1993.
[8] A. R. Gillespie, Digital Techniques of Image Enhancement, in Remote Sensing in Geology, edited by B. S. Siegal and A. R. Gillespie, New York, Wiley, pp. 139-226, 1980.
[9] M. Shimura and T. Imai, Nonsupervised Classification Using the Principal Component, Pattern Recognition, Vol. 5, pp. 353-363, 1973.
[10] P. J. Taylor, Quantitative Methods in Geography: An Introduction to Spatial Analysis, Boston, Massachusetts: Houghton Mifflin Company, 1977.
[11] P. H. Swain and S. M. Davis (eds.), Remote Sensing: The Quantitative Approach, McGraw-Hill, 1978.
[12] J. R. Jensen, Introductory Digital Image Processing: A Remote Sensing Perspective (2nd Edition), Prentice-Hall, 1996.
[13] K. Fitzpatrick-Lins, The Accuracy of Selected Land Use and Land Cover Maps at Scales of 1:250,000 and 1:100,000, Journal of Research, U.S. Geological Survey, 6: 169-173, 1980.
[14] G. W. Snedecor and W. G. Cochran, Statistical Methods, Ames: Iowa State University Press, pp. 202-211 and 516-517, 1967.
[15] M. Assafi, A. Fahsi, and M. Azerzak, Utilisation des images HRV de SPOT pour la classification en mode d'occupation du sol de la ville de Casablanca (Maroc), Société Française de Photogrammétrie et de Télédétection, 145 (1): 8-17, 1997.
