
2014

MOHAMED MOHSEN NASR SABER BAKR


UP201301705
FACULTY OF ARTS UNIVERSITY OF PORTO
MASTERS IN GEOGRAPHICAL INFORMATION SYSTEMS AND SPATIAL PLANNING

Statistical Analysis for the Portuguese Living
Conditions Indicators
By SPSS



Index:


1. Introduction
2. Interpreting Output from SPSS
2.1. Descriptive Statistics
2.2. The Correlation Matrix
2.3. Kaiser-Meyer-Olkin (KMO) and Bartlett's Test
2.4. Communalities
2.5. Total Variance Explained
2.6. Scree Plot
2.7. Factor Matrix
2.8. The Goodness-of-Fit Table
2.9. Rotated Factor Matrix
3. Conclusion


1. Introduction:
In this report, I review a method for analyzing the
concept of living conditions and the way to measure it.
Specifically, I investigate whether all the indicators can
be combined into a single scale of living conditions or
whether, on the contrary, the concept is multidimensional
and unfolds into different aspects that require more than
one scale.
To do so, I analyze the data, calculate composite indices
and assess their reliability. I also present some
descriptive analyses of the variables that I created.
To analyze the given data I use factor analysis to find
the factors underlying the observed variables. Because
the data contain many variables, factor analysis is used
to reduce their number by grouping together the variables
that have similar characteristics. Factor analysis
produces a small number of factors that is capable of
explaining the observed variance in a larger number of
variables. The reduced factors can also be used for
further analysis.
There are three stages in factor analysis:
First, a correlation matrix is generated for all the
variables. A correlation matrix is a square array of the
correlation coefficients of the variables with each
other.
Second, factors are extracted from the correlation matrix
based on the correlation coefficients of the variables.
Third, the factors are rotated in order to maximize the
relationship between the variables and some of the
factors.
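
The second stage can be illustrated with a minimal sketch. It assumes a small hypothetical correlation matrix (not the survey data) and uses power iteration to find the first principal component, which is only a stand-in for whichever extraction method SPSS was configured to use. The function name `first_principal_factor` is illustrative.

```python
from math import sqrt

def first_principal_factor(R, iterations=500):
    """Return (eigenvalue, loadings) of the first principal component of R.

    Power iteration is used as a simple stand-in for an SPSS extraction
    method. It assumes the starting vector of ones is not orthogonal to
    the leading eigenvector.
    """
    n = len(R)
    v = [1.0] * n
    for _ in range(iterations):
        w = [sum(R[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v'Rv gives the eigenvalue for the unit vector v.
    eigenvalue = sum(v[i] * sum(R[i][j] * v[j] for j in range(n)) for i in range(n))
    # Loadings are the eigenvector scaled by the square root of the eigenvalue.
    return eigenvalue, [x * sqrt(eigenvalue) for x in v]

# Hypothetical correlation matrix for three equally correlated variables.
R_example = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]
eigenvalue, loadings = first_principal_factor(R_example)
print(round(eigenvalue, 3), [round(x, 3) for x in loadings])
```

Further factors would be extracted from what remains of the matrix after the first factor is removed; that step is omitted here to keep the sketch short.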

2. Interpreting Output from SPSS:
2.1. Descriptive Statistics:
The first output from the analysis is a table of
descriptive statistics for all the variables under
investigation. Typically, the mean, the standard deviation
and the number of respondents (N) who participated in the
investigation are given. Looking at the means, the age
group in 1998 stands out with the highest mean, 3.27,
although the mean alone reflects how a variable is coded
and does not by itself establish its influence on the
living conditions of the householder.
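
As a small illustration of what this table contains, the sketch below computes the three statistics for one variable. The coded responses are hypothetical, not the survey data, and `describe` is an illustrative helper name.

```python
from statistics import mean, stdev

def describe(values):
    """Return (mean, sample standard deviation, N) for one variable."""
    return mean(values), stdev(values), len(values)

# Hypothetical coded responses for a single variable.
age_group = [3, 4, 2, 5, 3, 3]
m, sd, n = describe(age_group)
print(f"mean={m:.2f} sd={sd:.2f} N={n}")
```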


2.2. The Correlation matrix:
The next output from the analysis is the correlation matrix. A correlation matrix is simply a square array of numbers that gives the
correlation coefficients between each variable and every other variable in the investigation. The correlation coefficient between a variable
and itself is always 1, hence the principal diagonal of the correlation matrix contains 1s. The matrix is symmetric, so the correlation
coefficients above and below the principal diagonal are the same.
The correlation coefficients express the degree of linear relationship between the row and column variables of the matrix. The closer the
coefficient is to zero, the weaker the relationship; the closer it is to one in absolute value, the stronger the relationship. A negative sign
indicates that the variables are inversely related.
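
The properties described above (1s on the diagonal, symmetry, negative signs for inverse relationships) can be checked with a minimal sketch. The three variables are hypothetical, and `pearson` and `correlation_matrix` are illustrative helper names.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlation_matrix(columns):
    """Square matrix of correlations between every pair of variables."""
    return [[pearson(a, b) for b in columns] for a in columns]

# Hypothetical data: one list of observations per variable.
data = [
    [1, 2, 3, 4, 5],
    [2, 4, 5, 4, 5],
    [5, 3, 4, 2, 1],
]
R = correlation_matrix(data)
for row in R:
    print(" ".join(f"{r:6.3f}" for r in row))
```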



2.3. Kaiser-Meyer-Olkin (KMO) and Bartlett's Test:
These two tests measure the strength of the relationship among the variables.

The KMO statistic measures sampling adequacy, which should be greater than 0.5 for a satisfactory factor
analysis to proceed. The same threshold applies to the sampling adequacy of individual variables, shown on the
diagonal of the anti-image correlation matrix: if any variable has a value less than 0.5, consider dropping it
from the analysis. The off-diagonal elements of that matrix should all be very small (close to zero) in a good
model. In the SPSS output table, the KMO measure is 0.866.
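
As a sketch of how the overall KMO statistic is formed, the code below compares the squared correlations with the squared partial correlations, which are obtained from the inverse of the correlation matrix. The correlation matrix is hypothetical, and `invert` and `kmo` are illustrative helper names, not SPSS functions.

```python
from math import sqrt

def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure for correlation matrix R."""
    n = len(R)
    inv = invert(R)
    r2 = p2 = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                # Partial correlation of i and j given all other variables.
                partial = -inv[i][j] / sqrt(inv[i][i] * inv[j][j])
                r2 += R[i][j] ** 2
                p2 += partial ** 2
    return r2 / (r2 + p2)

# Hypothetical correlation matrix.
R_kmo = [
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
]
kmo_value = kmo(R_kmo)
print(f"KMO = {kmo_value:.3f}")
```

The statistic approaches 1 when the partial correlations are small relative to the raw correlations, which is the situation in which factor analysis is expected to work well.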

There is no definitive answer to the question "How many cases do I need for factor analysis?", and
methodologies differ. A common rule is to suggest that a researcher has at least 10-15 participants per
variable. Fiedel (2005) says that, in general, a sample of over 300 cases is probably adequate. It is
widely agreed that factor analysis is inappropriate when the sample size is below 50. For the KMO value,
Kaiser (1974) recommends 0.5 as a minimum (barely acceptable); values between 0.7 and 0.8 are acceptable,
and values above 0.9 are superb.

Bartlett's test is another indication of the strength of the relationship among variables. It tests the null
hypothesis that the correlation matrix is an identity matrix. An identity matrix is a matrix in which all of the
diagonal elements are 1 and all off-diagonal elements are 0. We want to reject this null hypothesis. From the
same table, we can see that Bartlett's test of sphericity is significant; that is, its associated probability is
less than 0.001. The significance level is small enough to reject the null hypothesis. This means that the
correlation matrix is not an identity matrix.
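
A minimal sketch of the test statistic follows, assuming the usual chi-square approximation for Bartlett's test of sphericity: the statistic is -(n - 1 - (2p + 5)/6) ln|R| with p(p - 1)/2 degrees of freedom, where n is the number of cases and p the number of variables. The matrix and sample size are hypothetical, and `determinant` and `bartlett_sphericity` are illustrative helper names.

```python
from math import log

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= a[col][col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return det

def bartlett_sphericity(R, n_cases):
    """Return (chi-square statistic, degrees of freedom) for correlation matrix R."""
    p = len(R)
    chi2 = -(n_cases - 1 - (2 * p + 5) / 6) * log(determinant(R))
    df = p * (p - 1) // 2
    return chi2, df

# Hypothetical two-variable correlation matrix and sample size.
chi2_example, df_example = bartlett_sphericity([[1.0, 0.5], [0.5, 1.0]], n_cases=50)
print(f"chi-square = {chi2_example:.3f}, df = {df_example}")
```

For an identity matrix the determinant is 1, its logarithm is 0, and the statistic is 0, so there is nothing to reject. The p-value is obtained by comparing the statistic with a chi-square distribution with the given degrees of freedom, for example with a statistics library such as SciPy.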







2.4. Communalities:
The next item from the output is a table of communalities, which
shows how much of the variance in each variable has been
accounted for by the extracted factors. For instance, over 99.9% of
the variance in age group in 1998 is accounted for, while only 2.3%
of the variance in householder sex is accounted for.
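
For orthogonal factors, the communality of a variable is the sum of its squared loadings across the extracted factors. The sketch below shows this with a hypothetical two-factor loading matrix; `communalities` is an illustrative helper name.

```python
def communalities(loadings):
    """Sum of squared loadings per variable (row) across the extracted factors."""
    return [sum(value ** 2 for value in row) for row in loadings]

# Hypothetical loadings: one row per variable, one column per factor.
example_loadings = [
    [0.8, 0.1],
    [0.6, 0.3],
    [0.1, 0.1],
]
h2 = communalities(example_loadings)
print([round(h, 2) for h in h2])
```

A variable such as the third one, with small loadings on every factor, ends up with a communality close to zero, which is the pattern reported above for householder sex.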


2.5. Total Variance Explained:
The next item shows all the factors extractable from the analysis
along with their eigenvalues, the percentage of variance attributable
to each factor, and the cumulative variance of the factor and the
previous factors. Here the first factor accounts for 6.631% of the
variance, and the cumulative percentage reaches 26.839% with the
second factor, 34.511% with the third, 39.398% with the fourth and
42.086% with the fifth. None of the remaining factors is significant.
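
The relationship between eigenvalues, percentage of variance and cumulative percentage can be sketched as follows. The eigenvalues are hypothetical and the function assumes that the full list of eigenvalues of a correlation matrix is supplied, so that the number of eigenvalues equals the number of variables. `variance_table` is an illustrative helper name.

```python
from itertools import accumulate

def variance_table(eigenvalues):
    """Return rows of (eigenvalue, % of variance, cumulative %)."""
    # Each standardized variable contributes a variance of 1, so the total
    # variance equals the number of variables.
    total = len(eigenvalues)
    pct = [100.0 * ev / total for ev in eigenvalues]
    return list(zip(eigenvalues, pct, accumulate(pct)))

# Hypothetical eigenvalues for four variables.
rows = variance_table([2.0, 1.0, 0.6, 0.4])
for ev, pct, cum in rows:
    print(f"{ev:5.2f} {pct:7.3f} {cum:8.3f}")
```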


2.6. Scree Plot:
The scree plot is a graph of the eigenvalues against all the factors. The graph is useful for determining how
many factors to retain. The point of interest is where the curve starts to flatten. It can be seen that the curve
begins to flatten after factor 7. We also note that, from factor 8 onward, the eigenvalues are less than 1, so
only seven factors have been retained.

Eigenvalue: the standardized variance associated with a particular factor. The sum of the eigenvalues cannot
exceed the number of items in the analysis, since each item contributes one to the sum of variances.
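
The eigenvalue-greater-than-one rule and a rough text version of a scree plot can be sketched as below. The eigenvalues are hypothetical, and `kaiser_retained` and `text_scree` are illustrative helper names.

```python
def kaiser_retained(eigenvalues):
    """Number of factors whose eigenvalue exceeds 1."""
    return sum(1 for ev in eigenvalues if ev > 1.0)

def text_scree(eigenvalues, width=40):
    """One line per factor, with a bar proportional to the eigenvalue."""
    top = max(eigenvalues)
    return [f"{i:>2} {'#' * round(width * ev / top)} {ev:.2f}"
            for i, ev in enumerate(eigenvalues, 1)]

# Hypothetical eigenvalues for four items; they sum to the number of items.
scree_eigenvalues = [2.0, 1.5, 0.3, 0.2]
retained = kaiser_retained(scree_eigenvalues)
scree_lines = text_scree(scree_eigenvalues)
print("\n".join(scree_lines))
print("retained:", retained)
```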



2.7. Factor Matrix:
The table shows the loadings of the 28
variables on the five factors extracted. The
higher the absolute value of the loading, the
more the factor contributes to the variable.
The gaps in the table represent loadings that
are less than 0.5; we suppressed all loadings
below this value to make the table easier to
read.
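
The suppression of small loadings can be mimicked with a short sketch. It assumes the cutoff is applied to the absolute value of the loading; the variable names and loadings are hypothetical, and `format_loadings` is an illustrative helper name.

```python
def format_loadings(names, loadings, cutoff=0.5):
    """Format a loading matrix, leaving blanks where |loading| < cutoff."""
    lines = []
    for name, row in zip(names, loadings):
        cells = [f"{v:6.3f}" if abs(v) >= cutoff else " " * 6 for v in row]
        lines.append(f"{name:<12}" + " ".join(cells))
    return lines

# Hypothetical variables and loadings on two factors.
table = format_loadings(
    ["item_a", "item_b"],
    [[0.72, 0.12], [-0.31, -0.61]],
)
print("\n".join(table))
```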

As we can see, the table has the format of
an unrotated factor matrix. The columns
define the factors; the rows refer to variables.
The intersection of a row and a column gives
the loading of the row variable on the column
factor.
The third factor is associated with the
"Household can fund to keep warm habitation"
and "Household can fund worn or damaged
furniture replacement" indicators.
The second factor corresponds most strongly
to 19 of the 28 living conditions indicators,
more than any of the other four factors.

2.8. The goodness-of-fit table:
This table gives an indication of how well our
five factors reproduce the variance-covariance
matrix of the variables (items). Here, the test
shows that the reproduced matrix is
significantly different from the observed matrix.




2.9. Rotated Factor Matrix:
The idea of rotation is to reduce the number
of factors on which the variables under
investigation have high loadings. Rotation
does not change how well the factors fit the
data; it only makes the interpretation of the
analysis easier.
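
The effect of rotation can be sketched for the two-factor case. The example assumes an orthogonal rotation, a raw varimax criterion without Kaiser normalization, and a simple grid search over the rotation angle rather than the iterative algorithm a statistics package would use. The loadings are hypothetical, and all function names are illustrative.

```python
from math import cos, sin, pi

def rotate(loadings, theta):
    """Rotate a two-factor loading matrix by angle theta (orthogonal rotation)."""
    c, s = cos(theta), sin(theta)
    return [[a * c + b * s, -a * s + b * c] for a, b in loadings]

def varimax_criterion(loadings):
    """Sum over the two factors of the variance of the squared loadings."""
    n = len(loadings)
    total = 0.0
    for k in range(2):
        sq = [row[k] ** 2 for row in loadings]
        m = sum(sq) / n
        total += sum((v - m) ** 2 for v in sq) / n
    return total

def best_rotation(loadings, steps=900):
    """Grid search over angles in [0, pi/2) for the largest criterion value."""
    angles = [i * (pi / 2) / steps for i in range(steps)]
    return max(angles, key=lambda t: varimax_criterion(rotate(loadings, t)))

def row_communalities(loadings):
    """Sum of squared loadings per variable."""
    return [a * a + b * b for a, b in loadings]

# Hypothetical simple structure, deliberately rotated away by 30 degrees
# to play the role of an unrotated solution.
simple = [[0.8, 0.0], [0.7, 0.0], [0.0, 0.6], [0.0, 0.9]]
unrotated = rotate(simple, -pi / 6)
theta = best_rotation(unrotated)
rotated = rotate(unrotated, theta)
print(f"rotation angle = {theta:.3f} rad")
for row in rotated:
    print(" ".join(f"{v:6.3f}" for v in row))
```

In this sketch the communalities of the rotated and unrotated loadings are identical, which illustrates that the rotation redistributes the loadings among the factors without changing how much of each variable's variance is explained.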

As we can see, each factor has several
indicators that load more highly on it than on
any other factor.

The Factor Transformation Matrix displays
the correlations among the factors prior to and
after rotation.











3. Conclusion:
As a general conclusion, we have five factors
accounting for 42.086% of the variance in our
28 items. The Rotated Factor Matrix table
displays a clear factor structure; that is, each
item loads predominantly on one factor. For
instance, the first four items load virtually
exclusively on Factor 1. Furthermore, the
communalities show that most items have a
communality of 0.40 or greater. A few items
fall a little lower than we would like; given
that each factor has other items that load
substantially on it, we may choose to remove
these items from further analysis or
measurement.