
Principal Components Method

A Bose

BIMTECH

November 2009

A Bose (BIMTECH) Factor Analysis 1/ 44 November 2009 1 / 44



Table of Contents
1 What is Factor Analysis
2 Description of an Illustrative Problem
3 Inputs for SPSS
    Descriptives inputs
    Extraction input
    Rotation input
    Request Analysis
4 Export output
5 PCA Analysis
    Check Appropriateness of Factor Analysis
        Sample Size Requirement validation
        Presence of substantial correlations
        Sampling adequacy of individual variables
        Sampling adequacy for set of variables
        Bartlett test of sphericity
    Number of factors to extract
        Latent root criterion
        Percentage of variance criterion
    Evaluating Communalities
        Communality requiring variable removal
    Identifying complex structure

What is Factor Analysis


Factor Analysis is a method of data reduction. It seeks underlying unobservable
(latent) variables, or factors, that are reflected in the observed (manifest)
variables. The latent factors explain most of the variance observed in the much
larger number of manifest variables.

Principal Component Analysis, a technique used for Factor Analysis, transforms a
set of correlated variables into a smaller set of uncorrelated variables called
principal components.

Factor analysis works best with a large sample; a common rule of thumb is at least
300 cases.
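
Outside SPSS, the transformation described above can be sketched in a few lines of Python. This is only an illustration, assuming NumPy is available; the function name `principal_components` and the simulated data are hypothetical and are not part of the faculty data set used in these slides. It extracts components from the correlation matrix and shows that the resulting component scores are uncorrelated.

```python
import numpy as np

def principal_components(data):
    """Eigen-decompose the correlation matrix of `data`
    (rows = cases, columns = variables) and return the eigenvalues,
    eigenvectors and component scores, largest component first."""
    # Standardise each variable so the analysis is based on correlations.
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    corr = np.corrcoef(data, rowvar=False)
    # eigh handles symmetric matrices and returns ascending eigenvalues.
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
    scores = z @ eigenvectors
    return eigenvalues, eigenvectors, scores

# Hypothetical data: 200 cases, 4 observed variables driven by 2 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]])
data = latent @ mixing.T + 0.3 * rng.normal(size=(200, 4))

eigenvalues, eigenvectors, scores = principal_components(data)
score_covariance = np.cov(scores, rowvar=False)
print("eigenvalues:", np.round(eigenvalues, 3))
print("share of variance in first two components:",
      round(float(eigenvalues[:2].sum() / eigenvalues.sum()), 3))
```

The covariance matrix of the scores is diagonal, with the eigenvalues on the diagonal, which is what "uncorrelated principal components" means in practice.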

Description of an Illustrative Problem

Principal Component Analysis using an example

A college has collected data on a number of faculty measures. The question of
interest is how many distinct kinds of information these measures provide.

For the sake of expediency, we shall focus our attention on 12 of the 67 available
measures, and we shall examine what these measures, collected using a
questionnaire, tell us.


The 12 variables we shall look at are:

Variable No.  Description
1.            Instructor well prepared
2.            Instructor's scholarly grasp
3.            Instructor's confidence
4.            Instructor focus lectures
5.            Instructor uses clear relevant examples
6.            Instructor sensitive to students
7.            Instructor allows me to ask questions
8.            Instructor is accessible to students outside class
9.            Instructor aware of students' understanding
10.           I am satisfied with student performance evaluation
11.           Compared to other instructors this instructor is
12.           Compared to other courses this course was


Steps for FA in SPSS

1. Input preparation for SPSS
    1.1 Get your data into SPSS
    1.2 Choose Analysis technique & add variables for analysis
    1.3 Descriptives dialog entry
    1.4 Extraction dialog entry
    1.5 Rotation dialog entry
    1.6 Scores dialog entry
    1.7 Options dialog entry

2. Running SPSS PCA & Exporting Analysis Results
    2.1 Request Analysis
    2.2 Export Analysis Results

3. Statistical Analysis & Necessary Iterations
    3.1 Validate Sample Size requirement
    3.2 Validate Appropriateness of FA
    3.3 Find No. of Factors to extract
    3.4 Evaluate Communalities
    3.5 Repeat Factor Analysis as indicated and appropriate
    3.6 Replicate the Factor Analysis
    3.7 Check if Communality is satisfied for all variables
    3.8 Identify Complex Structure and Remove complex variables
    3.9 Examine Variable Loading on Factors
    3.10 Split Sample Validation
    3.11 Detect and Remove Outliers
    3.12 Cronbach's Alpha

Inputs for SPSS Descriptives inputs

Choose Analysis Method, Descriptive Inputs

To compute a principal component analysis in SPSS:

1 select the Dimension Reduction → Factor command from the Analyze menu
    1.1 move the variables listed in the problem to the Variables list box
    1.2 click on the Descriptives button to specify statistics to include in the output
    1.3 complete the Descriptives dialog box
        1.3.1 mark the Univariate descriptives checkbox to get a tally of valid cases
        1.3.2 keep the Initial solution checkbox to get the statistics needed to
              determine the number of factors to extract
        1.3.3 mark the Coefficients checkbox to get a correlation matrix, one of the
              outputs needed to assess the appropriateness of factor analysis for the
              variables
        1.3.4 mark the KMO and Bartlett's test of sphericity checkbox to get more
              outputs used to assess the appropriateness of factor analysis for the
              variables
        1.3.5 mark the Anti-image checkbox to get more outputs used to assess the
              appropriateness of factor analysis for the variables
        1.3.6 click on the Continue button

[Figure: First steps in a Principal Component Analysis]


Inputs for SPSS Extraction input

Select the Extraction Method

The extraction method refers to the mathematical method that SPSS uses to
compute the factors or components.
1 click on the Extraction button to specify the extraction method
2 complete the Extraction dialog box
    2.1 retain the default method Principal components
    2.2 click on the Continue button

[Figure: Extraction input]


Inputs for SPSS Rotation input

Select the Rotation Method

The rotation method refers to the mathematical method that SPSS uses to rotate the
axes in geometric space. This makes it easier to determine which variables load on
which components (factors).
1 click on the Rotation button to specify the rotation method
2 complete the Rotation dialog box
    2.1 mark the Varimax method as the type of rotation to be used in the analysis
    2.2 click on the Continue button
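
To make the idea of rotating the axes concrete, here is a minimal sketch of a varimax rotation in Python, assuming NumPy. It uses a widely circulated SVD-based iteration and is not a description of SPSS internals; in particular it omits the Kaiser normalization that SPSS applies by default, so its numbers need not match SPSS output. The function name `varimax` and the unrotated loadings are hypothetical.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Rotate a (variables x components) loading matrix with an orthogonal
    rotation that increases the varimax criterion.
    Returns the rotated loadings and the rotation matrix."""
    n_vars, n_comp = loadings.shape
    rotation = np.eye(n_comp)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient-like target of the varimax criterion.
        target = rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / n_vars
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt  # closest orthogonal matrix to the target direction
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break  # no meaningful improvement
        criterion = new_criterion
    return loadings @ rotation, rotation

# Hypothetical unrotated loadings for 4 variables on 2 components.
unrotated = np.array([[0.7, 0.5], [0.6, 0.6], [0.6, -0.5], [0.7, -0.4]])
rotated, rotation = varimax(unrotated)
print(np.round(rotated, 3))
```

Because the rotation is orthogonal, each variable's communality (the row sum of squared loadings) is unchanged; only the way the variance is shared between the components changes, which is what makes the loading pattern easier to read.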

[Figure: Rotation input]


Inputs for SPSS Request Analysis

Request Analysis

1 Complete the Request for Analysis


2 Click on the OK button to request the output

Export output

Exporting the SPSS Output

PCA Analysis Check Appropriateness of Factor Analysis

FA Appropriateness: Sample Size Requirement


The number of valid cases for this set of variables is 1365. SPSS has excluded 63 of
the total 1428 cases because of the option chosen, Options → Exclude cases listwise.
Principal Component Analysis can be conducted on a sample that has at least 50 cases.
Our sample has a much larger number, 1365 cases.
Sample size requirement: ratio of cases to variables
The ratio of cases to variables in PCA should be greater than 5 : 1. With 1365 cases
and 12 variables, the ratio of cases to variables is about 114 : 1, far exceeding the
minimum requirement.
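
The two sample size checks above are simple arithmetic and can be sketched as follows. The function names and the toy rows are hypothetical; only the counts 1365 and 12 and the thresholds 50 and 5 : 1 come from the slides. Missing values are assumed to be coded as `None`.

```python
def listwise_valid_cases(rows):
    """Keep only the cases with no missing value (None) on any variable,
    mirroring the 'Exclude cases listwise' option."""
    return [row for row in rows if all(value is not None for value in row)]

def sample_size_report(n_valid, n_variables, min_cases=50, min_ratio=5.0):
    """Apply the minimum-cases and cases-to-variables rules of thumb."""
    ratio = n_valid / n_variables
    return {
        "valid_cases": n_valid,
        "ratio": ratio,
        "enough_cases": n_valid >= min_cases,
        "enough_ratio": ratio > min_ratio,
    }

# Hypothetical toy data: 4 cases, 3 variables, two cases with a missing value.
rows = [[4, 5, 3], [5, None, 4], [3, 3, 3], [None, 2, 1]]
valid = listwise_valid_cases(rows)

# Counts reported on the slide: 1365 valid cases, 12 variables.
report = sample_size_report(1365, 12)
print(len(valid), report)
```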


FA Appropriateness: Presence of substantial correlations


Principal components analysis requires that there be some correlations greater
than 0.30 between the variables included in the analysis. For this set of variables,
only 2 of the 12 x 12 = 144 correlations in the matrix are less than 0.30, so the
requirement is well satisfied. (The matrix is symmetric with 1s on the diagonal, so
its 144 entries correspond to 66 distinct pairs of variables.) The correlations less
than 0.30 are highlighted in yellow in the correlation matrix.
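
As an illustration, the count of substantial correlations can be taken over the distinct off-diagonal pairs of the correlation matrix. The sketch below uses a hypothetical 4 x 4 matrix, not the 12 x 12 matrix from the SPSS output, and the function name `count_correlations` is an assumption.

```python
def count_correlations(corr, threshold=0.30):
    """Count distinct variable pairs whose absolute correlation
    exceeds the threshold, ignoring the diagonal."""
    n = len(corr)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    above = sum(1 for i, j in pairs if abs(corr[i][j]) > threshold)
    return {"pairs": len(pairs), "above": above, "not_above": len(pairs) - above}

# Hypothetical correlation matrix for 4 variables.
corr = [
    [1.00, 0.62, 0.55, 0.21],
    [0.62, 1.00, 0.48, 0.35],
    [0.55, 0.48, 1.00, 0.28],
    [0.21, 0.35, 0.28, 1.00],
]
summary = count_correlations(corr)
print(summary)
```

For 12 variables the same function would examine 12 x 11 / 2 = 66 pairs.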


Sampling adequacy of individual variables


There are two anti-image matrices: the anti-image covariance matrix and the anti-image
correlation matrix. We are interested in the anti-image correlation matrix. Principal
component analysis requires that the Kaiser-Meyer-Olkin Measure of Sampling Adequacy
(MSA) be greater than 0.50 for each individual variable as well as for the set of
variables.
On iteration 1, the MSA for all of the individual variables included in the analysis was
greater than 0.5, in fact greater than 0.9, supporting their retention in the analysis.


FA Appropriateness: Sampling adequacy for set of variables

In addition, the overall MSA for the set of variables included in the analysis was
0.934, which exceeds the minimum requirement of 0.50 for overall MSA.
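
For readers who want to see where the MSA values come from, the following is a minimal sketch of the usual Kaiser-Meyer-Olkin computation, assuming NumPy. It compares squared correlations with squared partial correlations obtained from the inverse of the correlation matrix. The function name `kmo` and the 3 x 3 example matrix are hypothetical and are unrelated to the faculty data.

```python
import numpy as np

def kmo(corr):
    """Return (MSA per variable, overall MSA) for a correlation matrix."""
    inverse = np.linalg.inv(corr)
    scale = np.sqrt(np.outer(np.diag(inverse), np.diag(inverse)))
    partial = -inverse / scale  # partial correlations off the diagonal
    off_diagonal = ~np.eye(corr.shape[0], dtype=bool)
    r2 = np.where(off_diagonal, corr**2, 0.0)
    p2 = np.where(off_diagonal, partial**2, 0.0)
    per_variable = r2.sum(axis=0) / (r2.sum(axis=0) + p2.sum(axis=0))
    overall = r2.sum() / (r2.sum() + p2.sum())
    return per_variable, overall

# Hypothetical example: three variables, every pair correlated at 0.5.
corr = np.array([[1.0, 0.5, 0.5],
                 [0.5, 1.0, 0.5],
                 [0.5, 0.5, 1.0]])
per_variable, overall = kmo(corr)
print(np.round(per_variable, 4), round(float(overall), 4))
```

In this toy case every partial correlation is 1/3, so each MSA equals 0.5 / (0.5 + 2/9) = 9/13, just under 0.70.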


FA Appropriateness: Bartlett test of sphericity

Principal component analysis requires that the probability associated with
Bartlett's Test of Sphericity be less than the level of significance.
The probability associated with the Bartlett test is < 0.001, so our data satisfy
this requirement also.
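
The statistic behind this test can be sketched as below, assuming NumPy. It uses the standard form: chi-square = -(n - 1 - (2p + 5)/6) ln|R| with p(p - 1)/2 degrees of freedom, where R is the correlation matrix, n the number of cases and p the number of variables. The function name and the example matrix are hypothetical. The p-value is left out of the sketch; it is the upper tail of the chi-square distribution and could be obtained, for example, with `scipy.stats.chi2.sf(statistic, dof)` if SciPy is installed.

```python
import numpy as np

def bartlett_sphericity(corr, n_cases):
    """Return (chi-square statistic, degrees of freedom) for Bartlett's
    test that the correlation matrix is an identity matrix."""
    p = corr.shape[0]
    _, log_det = np.linalg.slogdet(corr)  # log of |R|, numerically stable
    statistic = -(n_cases - 1 - (2 * p + 5) / 6) * log_det
    dof = p * (p - 1) // 2
    return statistic, dof

# Hypothetical example: three variables correlated at 0.5, 100 cases.
corr = np.array([[1.0, 0.5, 0.5],
                 [0.5, 1.0, 0.5],
                 [0.5, 0.5, 1.0]])
statistic, dof = bartlett_sphericity(corr, n_cases=100)
print(round(float(statistic), 2), dof)
```

A large statistic relative to its degrees of freedom means the correlations are jointly far from zero, which is the situation in which factor analysis is appropriate.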



PCA Analysis Number of factors to extract

Number of factors to extract: Latent root criterion

Using the output from iteration 1, there are 2 eigenvalues greater than 1.0.
The latent root criterion (extract the components whose eigenvalue is greater
than 1) therefore indicates that 2 components should be extracted for these
variables.


Number of factors to extract: Percentage of variance criterion
In addition, the cumulative proportion of variance criterion, which requires the
solution to explain 60% or more of the total variance, is met by the top 2
components. A 2-component solution would explain 62.322% of the total variance.
Since we have instructed SPSS to extract components with eigenvalue > 1
(latent root criterion), our initial factor solution is based on the extraction of
these 2 components only.
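
Both criteria can be read off a list of eigenvalues, as in the sketch below. The eigenvalues here are hypothetical: they were chosen to sum to 12 (one unit of variance per standardized variable) and are not the values from the SPSS output, so the percentages differ slightly from the 62.322% reported above. The function name `factors_to_extract` is also an assumption.

```python
def factors_to_extract(eigenvalues, variance_target=60.0):
    """Return (count by latent root criterion, count needed to reach the
    cumulative variance target, cumulative percentages)."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    latent_root = sum(1 for value in ordered if value > 1.0)
    cumulative, running = [], 0.0
    for value in ordered:
        running += value
        cumulative.append(100.0 * running / total)
    by_variance = next(
        index + 1 for index, pct in enumerate(cumulative) if pct >= variance_target
    )
    return latent_root, by_variance, cumulative

# Hypothetical eigenvalues for 12 variables (they sum to 12).
eigenvalues = [6.2, 1.3, 0.8, 0.7, 0.6, 0.5, 0.45, 0.4, 0.35, 0.3, 0.25, 0.15]
latent_root, by_variance, cumulative = factors_to_extract(eigenvalues)
print(latent_root, by_variance, [round(pct, 1) for pct in cumulative[:3]])
```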

The Scree Plot also shows immediately that only the top two components have
eigenvalues greater than 1.



PCA Analysis Evaluating Communalities

Evaluating communalities
Communalities represent the proportion of the variance in the original variables
that is accounted for by the factor solution.

The factor solution should explain at least half of each original variable’s variance,
so the communality value for each variable should be 0.50 or higher.
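
For an orthogonal solution, a variable's communality is the sum of its squared loadings on the retained components, so the 0.50 rule can be checked as in this sketch. The variable names and loadings are hypothetical and only illustrate the arithmetic; they are not the SPSS results.

```python
def communalities(loadings):
    """Sum of squared loadings per variable (one row per variable)."""
    return [sum(value * value for value in row) for row in loadings]

def low_communality(names, loadings, minimum=0.50):
    """Names of variables whose communality falls below the minimum."""
    return [
        name
        for name, h2 in zip(names, communalities(loadings))
        if h2 < minimum
    ]

# Hypothetical loadings on 2 components.
names = ["well prepared", "accessible outside class", "allows questions"]
loadings = [(0.80, 0.20), (0.55, 0.44), (0.25, 0.78)]

h2 = communalities(loadings)
flagged = low_communality(names, loadings)
print([round(value, 4) for value in h2], flagged)
```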


Communality requiring variable removal

On iteration 1, the communality for the variable
"INSTRUCTOR IS ACCESSIBLE TO STUDENTS OUTSIDE CLASS"
is 0.494. Since this is less than 0.50, this variable should be removed from the
next iteration of the principal component analysis.

The variable is removed and the principal component analysis is redone.


Repeating the factor analysis

In the Dialog Recall drop-down menu, select Factor Analysis to reopen the factor
analysis dialog box. The dialog recall command opens the dialog box with all of the
settings that we selected the last time we used factor analysis.

To remove the variable
INSTRUCTOR IS ACCESSIBLE TO STUDENTS OUTSIDE CLASS
(whose communality is 0.494 < 0.5) from the list of variables:

1 highlight the variable
2 click on the left arrow button to remove the variable from the Variables list
  box
3 click on the OK button to repeat the analysis without the variable that we
  just removed


Communality satisfactory for all variables

We now have one variable fewer, i.e. 11 variables. We see that the communality
requirement is met by all 11 variables.



PCA Analysis Identifying complex structure

Identifying complex structure


Once any variables with communalities less than 0.50 have been removed from the
analysis, the pattern of factor loadings should be examined to identify variables that
have complex structure.

Complex structure occurs when one variable has high loadings or correlations (0.40
or greater) on more than one component. If a variable has complex structure, it
should be removed from the analysis.

Variables are only checked for complex structure if there is more than one component
in the solution. Variables that load on only one component are described as having
simple structure.
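
The rule above can be applied mechanically to a rotated loading matrix, as in the following sketch. The variable names and loadings are hypothetical, and the function name `complex_variables` is an assumption; only the 0.40 cutoff comes from the slides.

```python
def complex_variables(loadings, cutoff=0.40):
    """Return the variables that load at or above the cutoff (in absolute
    value) on more than one component."""
    flagged = []
    for name, row in loadings.items():
        high = sum(1 for value in row if abs(value) >= cutoff)
        if high > 1:
            flagged.append(name)
    return flagged

# Hypothetical rotated loadings on 2 components.
loadings = {
    "well prepared": (0.82, 0.21),
    "clear relevant examples": (0.61, 0.52),
    "allows questions": (0.18, 0.79),
}
to_remove = complex_variables(loadings)
print(to_remove)
```

Variables not returned by the function load highly on at most one component and would be kept for the next iteration.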



The highlighted variables show complex structure: they have high loadings on both
components. We remove these 4 variables and repeat the Factor Analysis with the
remaining 11 - 4 = 7 variables.


Removing the Complex Structure Variables


We redo the Factor Analysis after removing the 4 variables that exhibit complex
structure.



PCA Analysis Final Checks

Communality checks at end of Iteration

Once we have resolved any problems with complex structure, we check the
communalities one last time to make certain that we are explaining a sufficient
portion of the variance of all of the original variables.
The communalities for all of the variables included on the components were
greater than 0.50.


Checks for Complex Structure

The Rotated Component Matrix does not reveal any further variables with
complex structure, so our iterations are complete.



PCA Analysis Interpreting the principal components

Interpreting the principal components

The information in 7 of the variables can be represented by 2 components.


Component 1 includes variables
1 INSTRUC WELL PREPARED
2 INSTRUC SCHOLARLY GRASP
3 INSTRUCTOR CONFIDENCE
4 INSTRUCTOR FOCUS LECTURES

Component 2 includes variables
1 INSTRUCTOR SENSITIVE TO STUDENTS
2 INSTRUCTOR ALLOWS ME TO ASK QUESTIONS
3 I AM SATISFIED WITH STUDENT PERFORMANCE EVALUATION

Component 1 (capability) variables
1 INSTRUC WELL PREPARED
2 INSTRUC SCHOLARLY GRASP
3 INSTRUCTOR CONFIDENCE
4 INSTRUCTOR FOCUS LECTURES

Component 2 (interactiveness) variables
1 INSTRUCTOR SENSITIVE TO STUDENTS
2 INSTRUCTOR ALLOWS ME TO ASK QUESTIONS
3 I AM SATISFIED WITH STUDENT PERFORMANCE EVALUATION

We have derived two components, each a linear combination of the 7 remaining
variables. These two components explain a large share of the variability in the
original set of data, and the included variables have simple structure. This means
that each of the seven variables that we are left with loads on one or the other
component, but not on both.

Also, observe that the variables aligned with component 1 indicate the
instructor's teaching capabilities, whereas those aligned with component 2 indicate
how interactive and open the instructor is with students.



PCA Analysis Total Variance Explained

Total Variance Explained

The two components explain 69.61% of the total variance in the seven variables
which are included on the components.

PCA Analysis Computing Cronbach's Alpha

Computing Cronbach's Alpha

To compute Cronbach's alpha for each component in our analysis, we select
Scale → Reliability Analysis from the Analyze menu.

We do an iteration for each of the two selected components.

For the 1st component
1 move the four variables that loaded on the first component to the Items list
  box
2 click on the Statistics button to select the statistics we will need
3 mark the checkboxes for Item, Scale, and Scale if item deleted
4 click on the Continue button
5 if Alpha is not selected as the Model in the drop down menu, select it now
6 click on the OK button to produce the output

[Figure: Computing Cronbach's Alpha for Component 1]

For the 2nd component


1 remove the four variables that loaded on the first component from the Items
list box
2 move the three variables that loaded on the second component to the Items
list box
3 click on the Statistics button to select the statistics we will need
4 mark the checkboxes for Item, Scale, and Scale if item deleted
5 click on the Continue button
6 if Alpha is not selected as the Model in the drop down menu, select it now
7 click on the OK button to produce the output

[Figure: Computing Cronbach's Alpha for Component 2]


PCA Analysis Cronbach Alpha result for Component 1

Cronbach Alpha result for Component 1

Cronbach's Alpha is 0.844 for our 1st component. An alpha of 0.60 or higher is
the minimum acceptable level. Preferably, alpha will be 0.70 or higher, as it is in
this case.
In the Item-Total Statistics table, look at the column Alpha if Item Deleted. It
tells us what alpha we would get if the variable listed were removed from the
scale. If alpha is too small, this column may suggest which variable should be
removed to improve the internal consistency of the scale.
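
Both quantities can be reproduced by hand with the usual formula alpha = k/(k - 1) x (1 - sum of item variances / variance of the total score), where k is the number of items. The sketch below uses only the Python standard library; the function names and the small score table are hypothetical and are not the faculty data.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per item, all of the same length."""
    k = len(items)
    totals = [sum(case) for case in zip(*items)]  # total score per case
    item_variance = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))

def alpha_if_item_deleted(items):
    """Alpha recomputed with each item left out in turn (needs 3+ items)."""
    return [
        cronbach_alpha(items[:index] + items[index + 1:])
        for index in range(len(items))
    ]

# Hypothetical scores of 5 respondents on 3 items.
items = [
    [4, 5, 3, 2, 4],
    [4, 4, 3, 2, 5],
    [5, 4, 2, 3, 4],
]
alpha = cronbach_alpha(items)
deleted = alpha_if_item_deleted(items)
print(round(alpha, 3), [round(value, 3) for value in deleted])
```

If leaving out one item gives a noticeably higher alpha than the full scale, that item is a candidate for removal.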

PCA Analysis Cronbach Alpha result for Component 2

Cronbach Alpha result for Component 2

Cronbach's Alpha is 0.783 for our 2nd component. An alpha of 0.60 or higher is
the minimum acceptable level. Preferably, alpha will be 0.70 or higher, as it is in
this case.
