ABSTRACT
This paper investigates principal components analysis (PCA) and divergence for transforming and selecting data bands for multispectral image classification. As the principal components are uncorrelated with one another, a color combination of the first three components can be useful in providing maximum visual separability of image features. Therefore, principal components analysis is used to generate a new set of data. Divergence, a measurement of statistical separability, is employed as a method of feature selection to choose the optimal m-band subset from the n-band data for use in the automated classification process. Classification accuracy assessment is carried out using large-scale aerial photographs. Classification results on the Landsat Thematic Mapper (TM) data show that PCA is a more effective approach than divergence.
1. INTRODUCTION
Image classification is the process of automatically categorizing each pixel of an image into one of several classes. It is an important tool used to analyze remotely sensed data of the earth and to extract useful thematic information. Several different approaches, including per-pixel, textural, and contextual algorithms, have already been developed for remotely sensed multispectral image classification. However, a major challenge for researchers in the field of image processing is to increase the classification accuracy of automated interpretation of remotely sensed multispectral data.
Image classification can be done by either a pixel-based or a region-based approach. In the latter case, the image must be divided into homogeneous regions and a set of meaningful features has to be defined. Once these features are defined, image regions (blocks) can be categorized using pattern recognition techniques [1]. However, image segmentation has proven to be an elusive goal [2]. In pixel-based classification, spectral information (pixel value) is used to classify each pixel in the image. One of the main drawbacks of this method is that each pixel is treated independently, without consideration for its neighbors. In most natural scenes, objects that have similar spectral responses tend to cluster. Hence, groups of like pixels should occur together.
Principal components analysis (PCA), which has been widely used in pattern recognition and remote sensing applications, mathematically establishes a new set of variables which describe the variance in the original data set. As the principal components are uncorrelated with one another, a combination of the first three components is useful in providing maximum visual separability of image features [4]. Therefore, principal components analysis can be used in image classification to improve the accuracy. Divergence, a measurement of statistical separability, is employed as a method of feature selection to choose the optimal subset of the original data set for use in the automated classification process. The classification accuracy of PCA and divergence is compared in this study. The site selected to conduct this analysis is a 15 km by 9 km area in the Huntsville region, Northern Alabama, U.S.A.
The organization of this paper is as follows. An image classification scheme is briefly described in section 2. Section 3 gives a brief description of principal components analysis based on the contents of [5]. The divergence analysis is sketched in section 4. Classification accuracy assessment is discussed in section 5. Results are shown in section 6. Conclusion and discussion then follow.
0-7803-4053-1/97/$10.00 © 1997 IEEE
the data which is not already accounted for by the previous principal components. Mathematically, let $X^T = [x_1, x_2, \ldots, x_N]$ be an N-dimensional random variable with mean vector M and covariance matrix C, and let A be a matrix whose rows are formed from the eigenvectors of C, ordered so that the first row of A is the eigenvector corresponding to the largest eigenvalue and the last row is the eigenvector corresponding to the smallest eigenvalue [5]. The PCA transformation is then defined as:
$$Y = A (X - M)$$
where $Y^T = [y_1, y_2, \ldots, y_N]$, the superscript T denotes the transpose, and each $y_i$ is the $i$-th principal component.
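As a small worked illustration of this transformation, consider a two-band case. The covariance matrix below is an assumed toy example, not data from this study. Its eigenvalues are 3 and 1, so the rows of A are the corresponding unit eigenvectors, ordered by decreasing eigenvalue:

```latex
\[
C = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}, \qquad
\lambda_1 = 3,\ \lambda_2 = 1, \qquad
A = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}
\]
\[
C_Y = A\,C\,A^T = \begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix}, \qquad
\frac{\lambda_1}{\lambda_1 + \lambda_2} = \frac{3}{4}
\]
```

The covariance of the transformed data is diagonal, so the two components are uncorrelated. In this toy case the first component alone carries 75% of the total variance.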
4. DIVERGENCE ANALYSIS
Statistical methods of feature selection are used to quantitatively select the subset of bands that provides the greatest degree of statistical separability between two classes [12, 13]. Since remotely sensed data consist of several spectral bands, each band represents a feature in N-dimensional feature space. In other words, each point represents a pixel of N bands in N-dimensional feature space. To reduce the computation time while maintaining the same classification accuracy, how many bands (features) should be selected for the classification process? This is the problem known in pattern recognition as feature selection.
Several measures of statistical separability have been used in the machine processing of remotely sensed data [11]. Signature separability is a statistical measure of distance between two signatures. The greater the statistical separability of the classes, the smaller the probability of error. Separability can be calculated for any combination of bands that will be used in the classification, thereby ruling out any bands that do not contribute to the classification accuracy. Divergence is one of the popular measures of statistical separability. It is a covariance-weighted distance measure between class means collected in the training phase of the supervised classification. The degree of divergence between class $c_i$ and class $c_j$ is computed as [11]:
$$\mathrm{Diverg}(c_i, c_j) = 0.5\,\mathrm{Tr}\left[(C_i - C_j)(C_j^{-1} - C_i^{-1})\right] + 0.5\,\mathrm{Tr}\left[(C_i^{-1} + C_j^{-1})(M_i - M_j)(M_i - M_j)^T\right]$$
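The pairwise divergence can be sketched in a few lines of code. The following is a minimal, standard-library-only illustration, not the implementation used in this study. It takes each class's mean vector and covariance matrix as plain Python lists, and also applies the transformed divergence 2000(1 - exp(-D/8)) adopted in this section. All function names are hypothetical, and the small Gauss-Jordan inverse is included only to keep the example self-contained.

```python
import math


def inverse(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular covariance matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def trace_of_product(a, b):
    """Return Tr(AB) = sum over i, j of a[i][j] * b[j][i]."""
    n = len(a)
    return sum(a[i][j] * b[j][i] for i in range(n) for j in range(n))


def divergence(mean_i, cov_i, mean_j, cov_j):
    """Divergence between two classes from their mean vectors and covariance matrices."""
    n = len(mean_i)
    inv_i, inv_j = inverse(cov_i), inverse(cov_j)
    cov_diff = [[cov_i[r][c] - cov_j[r][c] for c in range(n)] for r in range(n)]
    inv_diff = [[inv_j[r][c] - inv_i[r][c] for c in range(n)] for r in range(n)]
    inv_sum = [[inv_i[r][c] + inv_j[r][c] for c in range(n)] for r in range(n)]
    d = [mean_i[k] - mean_j[k] for k in range(n)]
    outer = [[d[r] * d[c] for c in range(n)] for r in range(n)]  # (Mi - Mj)(Mi - Mj)^T
    return (0.5 * trace_of_product(cov_diff, inv_diff)
            + 0.5 * trace_of_product(inv_sum, outer))


def transformed_divergence(div):
    """Scale a divergence value into the range 0 to 2000."""
    return 2000.0 * (1.0 - math.exp(-div / 8.0))


if __name__ == "__main__":
    # Two hypothetical two-band classes with identity covariance matrices.
    identity = [[1.0, 0.0], [0.0, 1.0]]
    d = divergence([0.0, 0.0], identity, [1.0, 1.0], identity)
    print(d, transformed_divergence(d))
```

To rank band subsets, the same functions would be applied to the means and covariances restricted to each candidate subset of bands.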
In the expression for $\mathrm{Diverg}(c_i, c_j)$, Tr is the trace of a matrix, $C_i$ and $C_j$ are the covariance matrices for classes i and j, $M_i$ and $M_j$ are the mean vectors for classes i and j, and the superscript T denotes the transpose. Since more than two classes are defined in the training stage in practical applications, the average divergence is usually computed. The computation involves taking the average over all possible pairs of classes. Assuming that m classes are already defined in the training phase, the average divergence can be expressed as:
$$\mathrm{Diverg}_{avg} = \sum_{i=1}^{m-1} \sum_{j=i+1}^{m} \mathrm{Diverg}(c_i, c_j) / m$$
Using the average divergence, the subset of bands having the maximum average divergence would be selected as the most appropriate set for the classification. However, to bound the range of the divergence, a transformed divergence is normally used [12]. The transformed divergence scales the degree of divergence to lie between 0 and 2000. The saturated value of 2000 indicates excellent separability, while a low value suggests poor separability. In this study, we used the transformed divergence defined as:
$$\mathrm{Diverg}_T(c_i, c_j) = 2000\left(1 - \exp\left(-\mathrm{Diverg}(c_i, c_j)/8\right)\right)$$
5. CLASSIFICATION ACCURACY ASSESSMENT
The number of points to be sampled was computed as:
$$N = \frac{Z^2\, p\, q}{E^2}$$
where N is the number of points (i.e., pixels) to be sampled, p is the predicted accuracy (%), q = 100 - p, Z = 2 is taken from the standard normal deviate of 1.96 for the two-sided confidence interval of 95%, and E = 5 is the allowed error (confidence interval = 95%). Once the number of points to be sampled was determined, it was necessary to select the most efficient and objective sampling design. For this, we used stratified systematic sampling with three elements per stratum, which is considered adequate for land use/cover classification accuracy assessment [15]. The sampling procedure was carried out using statistical random tables. The N points were then identified on both classified images (PCA and divergence). The ground truth for both images was extracted from the 1:10,000 aerial photographs. Error matrices were generated for both images to quantify and assess their classification accuracy. The Kappa statistic has traditionally been used to evaluate classification accuracy. It is determined as [12]:
$$K = \frac{N \sum_{i=1}^{r} x_{ii} - \sum_{i=1}^{r} (x_{i+}\, x_{+i})}{N^2 - \sum_{i=1}^{r} (x_{i+}\, x_{+i})}$$
where r is the number of rows in the matrix, $x_{ii}$ is the number of observations in row i and column i (i.e., along the matrix diagonal), $x_{i+}$ and $x_{+i}$ are the marginal totals for row i and column i, respectively, and N is the total number of observations.
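The two accuracy-assessment quantities of this section, the sample size and the Kappa statistic, can be illustrated with a short sketch. It is not the authors' code: the function names are hypothetical, the Kappa formula is the standard error-matrix form consistent with the symbols r, x_ii, x_i+, x_+i and N defined above, and both the predicted accuracy p = 85 and the 2 x 2 error matrix are assumed values chosen for illustration only.

```python
def sample_size(p, z=2.0, e=5.0):
    """Number of sample points N = Z^2 * p * q / E^2, with p and e in percent."""
    q = 100.0 - p
    return z * z * p * q / (e * e)


def overall_accuracy(matrix):
    """Fraction of observations on the diagonal of a square error matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def kappa(matrix):
    """Kappa statistic of a square error matrix given as a list of rows."""
    r = len(matrix)
    n = sum(sum(row) for row in matrix)                    # total observations N
    diagonal = sum(matrix[i][i] for i in range(r))         # sum of x_ii
    row_totals = [sum(matrix[i]) for i in range(r)]        # x_i+
    col_totals = [sum(matrix[i][j] for i in range(r)) for j in range(r)]  # x_+i
    chance = sum(row_totals[i] * col_totals[i] for i in range(r))
    return (n * diagonal - chance) / (n * n - chance)


if __name__ == "__main__":
    # Assumed predicted accuracy of 85%, with the Z = 2 and E = 5 used in the text.
    print(sample_size(85))

    # Hypothetical two-class error matrix (not the paper's Table 1).
    errors = [[45, 5],
              [10, 40]]
    print(overall_accuracy(errors), kappa(errors))
```

Kappa is lower than the overall accuracy because it discounts the agreement expected by chance from the row and column totals, which matches the pattern of the accuracy and Kappa values reported in section 6.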
6. RESULTS
To quantitatively compare the classified results derived from the PCA and divergence methods, 7 bands of TM multispectral images were used. The supervised classification approach was employed to create 6 spectral classes: cotton (C), forest (F), water (W), pasture (P), grass (G), and residential (R). The maximum likelihood classifier was then applied to the first three principal components for the PCA and to bands 1, 3 and 5 for the divergence analysis. Classification results are shown in Figure 1. Visual inspection of Figure 1 shows that the classified image from the divergence method exhibits a large amount of noise, expressed by scattered pixels of cotton inside forested areas. This anomaly does not appear on the classified image generated by the PCA.
Statistical analysis of the classification accuracy is presented in the error matrices (Table 1). These error matrices were established by comparing the classified images to the 1:10,000 aerial photographs. Classes listed in the column denote the computer classification results while classes in the row represent reference (actual) results. The overall accuracy for divergence is 78% while the accuracy for PCA is 81%. The Kappa statistic is 70% and 74% for divergence and PCA, respectively. Most confusion is shown between cotton and grass and between cotton and residential for both images. The classified image using divergence presents more confusion between cotton and forest, also shown visually in Figure 1 (scattered cotton pixels inside forested areas), while the classified image using PCA presents slightly higher confusion between pasture and grass.
Table 1. Error matrices of the classified images from Divergence (a) and PCA (b).
7. CONCLUSION AND DISCUSSION
A major challenge for researchers in the field of image processing and remote sensing is to increase classification accuracy in the automated interpretation of remotely sensed multispectral data. Principal components analysis and divergence for multispectral image classification were compared in this study. Classification results on the Landsat Thematic Mapper (TM) data showed that PCA is an effective approach for automated image classification. Although the overall classification accuracy shows only a slight difference between these two approaches, we believe that if random groups of pixels instead of single pixels were evaluated, a higher classification accuracy would have resulted for the PCA. This was visually detected when examining the classified images derived from the PCA and divergence methods; the divergence method resulted in an image with a large number of wrongly classified scattered pixels (e.g., a large number of cotton pixels scattered in forested areas).
7. ACKNOWLEDGMENTS
This work was supported by Grant No. NCCW0084 from the National Aeronautics and Space Administration (NASA), Washington, DC. Any use of trade, product or firm names is for descriptive purposes only and does not imply endorsement by the U.S. Government. The authors wish to thank Mr. Donvilla Williams for helping produce the output product and Dr. John Adams for proofreading it.
8. REFERENCES
IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. PAMI-4, No. 3, 304-306, 1982.
[3] Z. Zhang, "A new spatial classification algorithm for high ground resolution images," Proceedings of IGARSS '88 Symposium, Edinburgh, Scotland, Sep. 13-16, 1988.
[4] A. A. D. Canas and M. E. Barnett, "The Generation and Interpretation of False-Colour Composite Principal Component Images," Int. J. Remote Sensing, Vol. 6, No. 6, 867-881, 1985.
[5] ERDAS, IMAGINE: Field Guide (Third Edition), ERDAS, Inc., Atlanta, Georgia, 1995.
[6] T. M. Lillesand and R. W. Kiefer, Remote Sensing and Image Interpretation (2nd Edition), John Wiley & Sons, 1987.
[7] R. C. Gonzalez and R. E. Woods, Digital Image Processing, Addison-Wesley, 1993.
[8] A. R. Gillespie, Digital Techniques of Image Enhancement, in Remote Sensing in Geology, edited by B. S. Siegal and A. R. Gillespie, New York, Wiley, pp. 139-226, 1980.
[9] M. Shimura and T. Imai, Nonsupervised Classification Using the Principal Component, Pattern Recognition, Vol. 5, pp. 353-363, 1973.
[10] P. J. Taylor, Quantitative Methods in Geography: An Introduction to Spatial Analysis, Boston, Massachusetts: Houghton Mifflin Company, 1977.
[11] P. H. Swain and S. M. Davis (eds.), Remote Sensing: The Quantitative Approach, McGraw-Hill, 1978.
[12] J. R. Jensen, Introductory Digital Image Processing: A Remote Sensing Perspective (2nd Edition), Prentice-Hall, 1996.
[13] K. Fitzpatrick-Lins, The Accuracy of Selected Land Use and Land Cover Maps at Scales of 1:250000 and 1:100000, Journal of Research, U.S. Geological Survey, 6: 169-173, 1980.
[14] G. W. Snedecor and W. G. Cochran, Statistical Methods, Ames: Iowa State University Press, 202-211 and 516-517, 1967.
[15] M. Assafi, A. Fahsi, and M. Azerzak, Utilisation des images HRV de SPOT pour la classification en mode d'occupation du sol de la ville de Casablanca (Maroc), Société Française de Photogrammétrie et de Télédétection, 145 (1): 8-17, 1997.