
AD 601

EXPLORATORY FACTOR ANALYSIS
Tutku Sekin elik
Yusuf

Research Question
HBAT company customers: newsprint industry and magazine industry
Are there any differences in customer perceptions of HBAT between the magazine industry and the newsprint industry?
We want to use exploratory factor analysis to reduce the dimensions of these perceptions

What is EFA?
An interdependence technique
Primary purpose is to define the underlying structure among the variables in the analysis.
A tool for analyzing the structure of interrelationships (correlations) among a large number of variables by defining sets of variables that are highly interrelated, known as factors.
Also used for data reduction for further use.

HBAT Data
13 attributes about perceptions of HBAT were developed through focus groups, a pretest, and use in previous studies
The sample consisted of 200 purchasing managers of companies buying from HBAT
Respondents were asked to rate HBAT on the 13 attributes using a 0-10 graphic rating scale
0 indicates poor and 10 indicates excellent

Evaluation of Data
No missing data
There are some outliers, but no variable has a standard deviation greater than 2.5, so we decided to keep them

Evaluation of Data
Multivariate detection of outliers: Mahalanobis D²
We ran a regression to obtain the Mahalanobis values, then calculated their probabilities with the Compute Variable command, choosing CDF.CHISQ.
We checked whether any observation had a Mahalanobis probability of less than 0.001. Since there was none, we decided that there are no outliers to delete in our dataset.
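As a rough illustration of this multivariate check, the sketch below computes Mahalanobis D² for every observation and its upper-tail chi-square probability with NumPy and SciPy instead of SPSS. The function name, the simulated data, and the choice of degrees of freedom (the number of variables) are assumptions made for the example, not the exact procedure used in the slides.

```python
import numpy as np
from scipy.stats import chi2

def mahalanobis_outliers(X, alpha=0.001):
    """Return D^2, its chi-square upper-tail probability, and an outlier flag per row."""
    X = np.asarray(X, dtype=float)
    diff = X - X.mean(axis=0)                           # center each variable
    inv_cov = np.linalg.inv(np.cov(X, rowvar=False))    # inverse covariance matrix
    d2 = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)  # squared distance per row
    p = chi2.sf(d2, df=X.shape[1])                      # 1 - CDF; assumes df = number of variables
    return d2, p, p < alpha

# Hypothetical data: 200 simulated cases on 3 variables, with one planted extreme case.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
X_demo[0] = [10.0, 10.0, 10.0]
d2_demo, p_demo, flagged_demo = mahalanobis_outliers(X_demo)
print(np.flatnonzero(flagged_demo))  # row indices whose probability is below 0.001
```

With real data, rows flagged at the 0.001 level would be the candidates for deletion.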

Assumptions
Normality, homoscedasticity, and linearity
Normality is necessary for statistical tests of significance
Table 2: Tests of Normality
                              Kolmogorov-Smirnov         Shapiro-Wilk
Variable                      Statistic  df    Sig.      Statistic  df    Sig.
X6 - Product Quality          .095       200   .000      .950       200   .000
X7 - E-Commerce               .122       200   .000      .962       200   .000
X8 - Technical Support        .046       200   .200      .989       200   .114
X9 - Complaint Resolution     .045       200   .200*     .996       200   .844
X10 - Advertising             .078       200   .005      .984       200   .021
X11 - Product Line            .063       200   .049      .984       200   .025
X12 - Salesforce Image        .107       200   .000      .981       200   .007
X13 - Competitive Pricing     .091       200   .000      .971       200   .000
X14 - Warranty & Claims       .058       200   .093      .996       200   .824
X15 - New Products            .036       200   .200*     .996       200   .912
X16 - Order & Billing         .105       200   .000      .984       200   .022
X17 - Price Flexibility       .095       200   .000      .968       200   .000
X18 - Delivery Speed          .086       200   .001      .984       200   .026

In these tests, attributes with significance values greater than .05 can be treated as normally distributed.
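The decision rule can be sketched outside SPSS with SciPy's Shapiro-Wilk test. The column names and simulated values below are hypothetical and only show the call pattern. The Kolmogorov-Smirnov column is left out because SPSS is understood to apply a Lilliefors correction that plain `scipy.stats.kstest` does not, so its significance values would not be comparable.

```python
import numpy as np
from scipy import stats

def normality_table(columns, alpha=0.05):
    """Run Shapiro-Wilk on each named column; 'normal' is True when p > alpha."""
    table = {}
    for name, values in columns.items():
        w, p = stats.shapiro(values)
        table[name] = {"W": float(w), "p": float(p), "normal": bool(p > alpha)}
    return table

# Hypothetical columns, simulated only to show how the rule is applied.
rng = np.random.default_rng(1)
demo_columns = {
    "symmetric_item": rng.normal(loc=5, scale=1, size=200),
    "skewed_item": rng.exponential(scale=2, size=200),
}
demo_table = normality_table(demo_columns)
for name, row in demo_table.items():
    print(name, round(row["W"], 3), round(row["p"], 3), row["normal"])
```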

Assumptions
Histograms

Assumptions
Log, 1/x, square root, x², and x³ transformations did not help to normalize the other variables, and in some cases made the variables more complicated.
We tried arcsine, but it did not transform the data.
So, we decided to continue our analysis with the original variables, except X11 (used in its transformed form, tr_x11), while keeping in mind that most of the variables are not normally distributed.

Assumptions - Homogeneity
According to the Levene statistics, only X13 Competitive Pricing violates homogeneity of variance (Sig. = .025 < .05); the other variables in the table have homogeneous variances


Table 3: Test of Homogeneity of Variances
Variable                             Levene Statistic   df1   df2   Sig.
X7 - E-Commerce                      .533               1     198   .466
X8 - Technical Support               .018               1     198   .892
X9 - Complaint Resolution            .002               1     198   .963
X10 - Advertising                    .775               1     198   .380
tr_x11 - transformed product line    1.228              1     198   .269
X12 - Salesforce Image               .080               1     198   .777
X13 - Competitive Pricing            5.116              1     198   .025
X14 - Warranty & Claims              2.403              1     198   .123
X15 - New Products                   .917               1     198   .339
X16 - Order & Billing                .451               1     198   .503
X17 - Price Flexibility              2.717              1     198   .101
X18 - Delivery Speed                 .006               1     198   .939
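For readers who want to see what the Levene statistic measures, here is a minimal sketch in plain Python. It assumes the mean-centered form of the test and two groups (for example, newsprint and magazine customers); the ratings are made up for the illustration. The significance value would then come from an F distribution with df1 and df2 degrees of freedom, which is not computed here.

```python
from statistics import mean

def levene_w(*groups):
    """Levene's W from absolute deviations around each group mean.

    Returns (W, df1, df2).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # absolute deviation of every score from its own group mean
    z = [[abs(x - mean(g)) for x in g] for g in groups]
    z_means = [mean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within, k - 1, n_total - k

# Hypothetical ratings for two customer groups.
newsprint = [1.0, 2.0, 3.0, 4.0, 5.0]
magazine = [2.0, 4.0, 6.0, 8.0, 10.0]
w_stat, df1, df2 = levene_w(newsprint, magazine)
print(round(w_stat, 3), df1, df2)
```

A large W relative to the F distribution (a small Sig. value) points to unequal variances between the groups.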

Assumptions - Linearity

Objectives of EFA
Specify the unit of analysis
What is being grouped?
Cases or respondents (Q type)
Variables (R type)
Achieving data summarization vs. data reduction
Data summarization: identifying underlying dimensions
Data reduction: using factor loadings as the basis for subsequent
analysis
Variable selection
Consider the conceptual underpinnings and intuition as to the
appropriateness of variables
Comprehensive & parsimonious

Designing a Factor Analysis


Correlations among variables or respondents
  R type: correlation matrix
  Q type: factor matrix
Variable selection and measurement issues
  Metric variables; dummy code nonmetric ones if necessary
  Reasonable number of variables
Sample size ≥ 100
  More observations than variables
  At least 50 observations
  Number of observations per variable: at least 5:1

EFA
An R type EFA was employed
The aim is data reduction
We looked at the correlation matrix of variables
Sample size is 200, which is more than the required 100.
There are more observations than variables, as suggested.
The number of observations per variable is approximately 15:1, which exceeds the desired minimum of 5:1

Assumptions in Factor Analysis


Conceptual issues
  Conceptually valid and appropriate patterns
  Homogeneous sample with respect to the underlying factor structure
Statistical issues
  Overall measures of intercorrelation
    Correlations between variables > 0.30
    Small partial correlations (unexplained correlation when the effects of other variables are taken into account)
    Anti-image correlation matrix (partial correlations < 0.70)
    Bartlett test of sphericity (significance < 0.05)
    Measure of sampling adequacy (MSA) (MSA values > 0.50)
  Variable-specific measures of intercorrelation
    MSA for each variable (MSA values > 0.50)

Assumptions in Factor Analysis


The Bartlett test of sphericity is significant (Sig. = .000 < .05), indicating that the correlation matrix has significant correlations among at least some of the variables.
Also, the overall MSA value is .648, which is above the desired .50, indicating the appropriateness of factor analysis.


Table 4: KMO and Bartlett's Test
Kaiser-Meyer-Olkin Measure of Sampling Adequacy        .648
Bartlett's Test of Sphericity    Approx. Chi-Square    1875.571
                                 df                    78
                                 Sig.                  .000
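The Bartlett statistic can be reproduced from a correlation matrix with a few lines of NumPy. This is a sketch under the usual textbook formula, chi-square = -(n - 1 - (2p + 5)/6) ln|R| with p(p - 1)/2 degrees of freedom; the function name and the small demonstration matrix are assumptions for the example. With 13 variables the formula gives the 78 degrees of freedom reported by SPSS.

```python
import numpy as np

def bartlett_sphericity(corr, n_obs):
    """Bartlett's test of sphericity: returns (chi-square, degrees of freedom)."""
    corr = np.asarray(corr, dtype=float)
    p = corr.shape[0]
    _, logdet = np.linalg.slogdet(corr)            # ln|R|, computed stably
    chi_square = -(n_obs - 1 - (2 * p + 5) / 6) * logdet
    df = p * (p - 1) // 2
    return chi_square, df

# Hypothetical 3-variable correlation matrix with all correlations equal to 0.5.
demo_corr = np.array([[1.0, 0.5, 0.5],
                      [0.5, 1.0, 0.5],
                      [0.5, 0.5, 1.0]])
chi_square_demo, df_demo = bartlett_sphericity(demo_corr, n_obs=200)
print(round(chi_square_demo, 2), df_demo)

# An identity matrix (no correlations at all) gives a statistic of zero.
chi_square_identity, df_identity = bartlett_sphericity(np.eye(13), n_obs=200)
```

The chi-square value is then compared with a chi-square distribution on `df` degrees of freedom to obtain the significance level.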

Correlation Matrix
Hair et al. suggest that correlations lower than 0.30 may show that factor analysis is inappropriate. Of the 78 correlations between the 13 variables, 20 had absolute values higher than 0.30.

                                   X6       X7       X8       X9       X10      tr_x11   X12      X13      X14      X15      X16      X17
X7 - E-Commerce                    -0.034
X8 - Technical Support             0.087    0.041
X9 - Complaint Resolution          0.09     .192**   .152*
X10 - Advertising                  -0.054   .505**   0.028    .234**
tr_x11 - transformed product line  .491**   0.069    .166*    .576**   .145*
X12 - Salesforce Image             -0.116   .788**   0.086    .256**   .627**   0.056
X13 - Competitive Pricing          -.448**  .177*    -0.092   -0.077   0.099    -.484**  .200**
X14 - Warranty & Claims            0.109    0.103    .838**   .181*    0.035    .232**   .163*    -0.085
X15 - New Products                 0.136    -0.041   -0.038   0.09     0.063    .144*    0.009    -0.121   0.03
X16 - Order & Billing              0.083    .217**   0.121    .741**   .230**   .466**   .284**   -0.06    .204**   0.137
X17 - Price Flexibility            -.487**  .186**   -0.029   .418**   .260**   -.309**  .272**   .470**   -0.041   0.047    .419**
X18 - Delivery Speed               0.067    .241**   0.132    .878**   .323**   .629**   .299**   -0.055   .183**   .147*    .773**   .513**

Partial Correlations
Partial correlations should be small, in contrast to the correlations themselves. In the anti-image correlation matrix, the off-diagonal values show the partial correlations.
Only X8 and X14, and X17 and X18, had high partial correlations with each other.
Variables with MSA values less than 0.50 should be deleted one by one.
X17 Price Flexibility had the lowest individual MSA value (about 0.44), so we deleted this variable, as strictly suggested by Hair et al.
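The overall and variable-specific MSA values can be sketched from a correlation matrix as follows. The code assumes the standard Kaiser-Meyer-Olkin definition (squared correlations divided by squared correlations plus squared partial correlations); the function name and the small demonstration matrix are hypothetical.

```python
import numpy as np

def kmo(corr):
    """Return (overall MSA, MSA per variable, matrix of partial correlations)."""
    corr = np.asarray(corr, dtype=float)
    inv = np.linalg.inv(corr)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)     # partial correlation of each pair given all others
    np.fill_diagonal(partial, 0.0)      # only off-diagonal entries are used
    r = corr.copy()
    np.fill_diagonal(r, 0.0)
    r2, q2 = r ** 2, partial ** 2
    per_variable = r2.sum(axis=0) / (r2.sum(axis=0) + q2.sum(axis=0))
    overall = r2.sum() / (r2.sum() + q2.sum())
    return overall, per_variable, partial

# Hypothetical 3-variable correlation matrix with all correlations equal to 0.5.
demo_corr = np.array([[1.0, 0.5, 0.5],
                      [0.5, 1.0, 0.5],
                      [0.5, 0.5, 1.0]])
overall_msa, variable_msa, partials = kmo(demo_corr)
print(round(overall_msa, 3), np.round(variable_msa, 3))
```

Variables whose entry in `variable_msa` falls below 0.50 would be the candidates for one-by-one deletion, starting with the lowest.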

Partial Correlations

Anti-image correlation matrix (diagonal values marked a are the individual MSA values)
                                   X6       X7       X8       X9       X10      tr_x11   X12      X13      X14      X15      X16      X17      X18
X6 - Product Quality               .859a
X7 - E-Commerce                    -0.096   .692a
X8 - Technical Support             -0.024   0.019    .518a
X9 - Complaint Resolution          -0.017   0.055    -0.092   .908a
X10 - Advertising                  -0.038   -0.022   -0.084   0.099    .746a
tr_x11 - transformed product line  0.026    0.025    0.029    0.013    -0.209   .499a
X12 - Salesforce Image             0.113    -0.676   0.068    -0.071   -0.437   0.141    .655a
X13 - Competitive Pricing          0.118    -0.08    0.071    0.003    0.021    0.072    -0.044   .923a
X14 - Warranty & Claims            0.014    0.012    -0.839   0.078    0.129    -0.049   -0.144   -0.063   .542a
X15 - New Products                 -0.128   0.089    0.114    0.085    -0.025   -0.086   -0.045   0.086    -0.092   .567a
X16 - Order & Billing              -0.089   0.008    0.11     -0.208   0.073    -0.012   -0.084   0.071    -0.137   -0.029   .937a
X17 - Price Flexibility            0.228    0.041    -0.012   0.036    -0.208   0.904    0.121    -0.123   0.017    -0.115   -0.069   .442a
X18 - Delivery Speed               -0.088   -0.057   0.002    -0.352   0.095    -0.871   -0.088   0.011    0.001    0.039    -0.13    -0.86    .586a

Correlations revisited
After deleting X17, the KMO and Bartlett's test results showed that the overall MSA value rose from 0.648 to 0.695, and the Bartlett test again gave a significant result.
The new correlation matrix contained 15 correlations greater than 0.30, compared with 20 when X17 was included, although most of the correlations were significant.

Deriving Factors and Assessing Overall Fit


Selecting the factor extraction method: common factor analysis vs. component analysis
Total variance = common variance + specific variance + error variance
Component analysis considers the total variance; most appropriate when:
  Data reduction is the primary concern (minimum number of factors to account for the maximum portion of total variance)
  Prior knowledge suggests that specific and error variance are small
Common factor analysis considers only common variance; used when:
  Data summarization is the primary objective
  There is little knowledge about specific and error variance

Stage 4: Deriving Factors and Assessing Overall Fit


Criteria for the number of factors to extract
  Latent root criterion (component analysis): eigenvalues > 1, if 20 < number of variables < 50
  A priori criterion: set a predetermined number of factors
  Percentage of variance criterion: extract until achieving a specified cumulative % of total variance (60% in social sciences)
  Scree test criterion: factors before the inflection point
  More factors if the respondents are heterogeneous

Deriving Factors and Assessing Overall Fit


Principal component factor analysis
Factor extraction method: latent root criterion, which only accepts factors with an eigenvalue greater than 1
The table below shows that, before rotation, the first factor accounts for 31%, the second 19%, the third 14%, and the fourth 11% of total variance (24%, 19%, 16%, and 16% after rotation).
A total of 75% of total variance was explained with a four-factor solution

            Initial Eigenvalues                       Rotation Sums of Squared Loadings
Component   Total   % of Variance   Cumulative %      Total   % of Variance   Cumulative %
1           3.723   31.024          31.024            2.888   24.066          24.066
2           2.320   19.331          50.355            2.330   19.416          43.482
3           1.689   14.071          64.427            1.910   15.914          59.396
4           1.267   10.559          74.986            1.871   15.590          74.986
5           .946    7.879           82.865
6           .574    4.787           87.652
7           .489    4.078           91.730
8           .342    2.852           94.582
9           .228    1.902           96.484
10          .187    1.561           98.044
11          .136    1.137           99.182
12          .098    .818            100.000
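The latent root criterion and the percentage-of-variance figures can be checked directly from the initial eigenvalues reported in the table. The sketch below is plain Python; the function name is an assumption, and it relies on the fact that the eigenvalues of a correlation matrix sum to the number of variables (12 here, after dropping X17). In practice the eigenvalues themselves would come from the correlation matrix, for instance via `numpy.linalg.eigvalsh`.

```python
def latent_root_summary(eigenvalues):
    """Percent of variance, cumulative percent, and number of eigenvalues above 1."""
    n_variables = len(eigenvalues)      # each standardized variable contributes variance 1
    percent = [100.0 * ev / n_variables for ev in eigenvalues]
    cumulative, running = [], 0.0
    for pct in percent:
        running += pct
        cumulative.append(running)
    n_retained = sum(1 for ev in eigenvalues if ev > 1.0)
    return percent, cumulative, n_retained

# Initial eigenvalues as reported in the slides.
eigenvalues = [3.723, 2.320, 1.689, 1.267, 0.946, 0.574,
               0.489, 0.342, 0.228, 0.187, 0.136, 0.098]
percent, cumulative, n_retained = latent_root_summary(eigenvalues)
print(n_retained, round(cumulative[n_retained - 1], 1))  # prints: 4 75.0
```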

Scree Plot
If we had employed the scree test criterion, we would have come up with more factors. As the figure shows, the inflection point was after the sixth factor.

Interpreting the Factors


Three processes
  Estimate the factor matrix
  Factor rotation: orthogonal or oblique
    Orthogonal (QUARTIMAX, VARIMAX, EQUIMAX): best suited when the aim is data reduction
    Oblique (OBLIMIN): best suited when the aim is to obtain theoretically meaningful factors or constructs
  Factor interpretation and respecification
    Factor loadings ≥ 0.30, preferably ≥ 0.50
    Avoid cross-loadings
    Communality ≥ 0.50
    Respecify the factor model if needed
    Label the factors

Interpreting the Factors


Orthogonal VARIMAX rotation was used due to its simplicity and wide usage.
Orthogonal rotation is suggested for data reduction purposes.
We treated factor loadings greater than 0.40 as significant.
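To make the idea of a VARIMAX rotation concrete, here is a minimal NumPy sketch of a commonly used SVD-based iteration. It is not the SPSS routine: it skips Kaiser normalization, and the function name, the iteration limits, and the small demonstration matrix are assumptions for the example. The demonstration takes a clean two-factor structure, blurs it with a 20-degree rotation, and lets varimax recover it.

```python
import numpy as np

def varimax(loadings, max_iter=500, tol=1e-8):
    """Orthogonally rotate a loading matrix toward the varimax criterion.

    Returns (rotated loadings, rotation matrix). No Kaiser normalization.
    """
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    R = np.eye(k)                       # accumulated rotation matrix
    d_old = 0.0
    for _ in range(max_iter):
        B = L @ R
        # gradient of the varimax criterion with respect to the rotation
        G = L.T @ (B ** 3 - B @ np.diag((B ** 2).sum(axis=0)) / p)
        u, s, vt = np.linalg.svd(G)
        R = u @ vt                      # closest orthogonal matrix to G
        d_new = s.sum()
        if d_old != 0.0 and d_new < d_old * (1.0 + tol):
            break                       # no meaningful improvement left
        d_old = d_new
    return L @ R, R

# Hypothetical loadings with a clean simple structure, then mixed by a 20-degree rotation.
simple = np.array([[0.9, 0.0], [0.8, 0.0], [0.0, 0.7], [0.0, 0.9]])
theta = np.deg2rad(20)
mix = np.array([[np.cos(theta), np.sin(theta)],
                [-np.sin(theta), np.cos(theta)]])
unrotated = simple @ mix
rotated, rotation = varimax(unrotated)
print(np.round(rotated, 2))
```

Because the rotation is orthogonal, each variable's communality (the row sum of squared loadings) is the same before and after rotation; only the distribution of loadings across factors changes.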

Table 10: Component Matrix
Table 11: Rotated Component Matrix
Component

X6

X7

,468

2
-,571

,836

X10

,493

,535

tr_x11

,699

-,499

X12

,535

,689

X6
X7

,867

X9

4
,567

,646

X8

Component
3
,831

,882

X8
X9

,954
,920

X10
tr_x11
X12

,784
,576

,663
,908

Rotation
As you can see, the unrotated factor solution had so many cross-loadings that rotation was necessary.
X11 Product Line cross-loaded on factors 1 and 3, and the two loadings were within 0.10 of each other.
Thus, we tried other rotation methods to remedy this inconsistency, but they also gave cross-loadings for X11.
As a result, we decided to eliminate X11.
Also, X15 had no loading above 0.40, so we eliminated this variable too.
After the elimination of X11 and X15, the total variance explained by the four-factor solution increased to 81%.

Rotated Component Matrix


Table 12: New Rotated Component Matrix
                              Component
Variable                      1       2       3       4       Communality
X18 - Delivery Speed          .932                            .908
X9 - Complaint Resolution     .929                            .886
X16 - Order & Billing         .88                             .804
X12 - Salesforce Image                .904                    .868
X7 - E-Commerce                       .884                    .793
X10 - Advertising                     .78                     .642
X8 - Technical Support                        .954            .918
X14 - Warranty & Claims                       .948            .921
X6 - Product Quality                                  .861    .746
X13 - Competitive Pricing                             -.827   .714

Factors
Factor 1: Delivery Speed, Complaint Resolution, Order & Billing (Sales Support)
Factor 2: Salesforce Image, E-Commerce, Advertising (Recognition)
Factor 3: Technical Support, Warranty & Claims (After-Sales Services)
Factor 4: Product Quality, Competitive Pricing (Quality & Price)

Validation of Factor Analysis


Assess the generalizability of results
Split sample
Separate sample
CFA
Detect the influential observations - outliers

Validation of Factor Analysis


We used a split sample. When we ran the factor analysis on the split sample, the MSA value was 0.652 and the Bartlett test was significant, so the factor structure can also be examined in the split sample. In addition, the rotated component matrix showed the same factor structure.
Table 13: Rotated Component Matrix of Split Sample
                              Component
Variable                      1       2       3       4
X18 - Delivery Speed          .927
X9 - Complaint Resolution     .912
X16 - Order & Billing         .866
X12 - Salesforce Image                .922
X7 - E-Commerce                       .882
X10 - Advertising                     .793
X14 - Warranty & Claims                       .952
X8 - Technical Support                        .948
X6 - Product Quality                                  .869
X13 - Competitive Pricing                             -.812
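One simple way to put a number on "the same factor structure" is to compare each variable's primary loading in the full sample with its loading in the split sample. The sketch below does that with the loadings reported in the two rotated component matrices; the function name and the idea of summarizing by the largest absolute shift are assumptions for illustration, not a formal validation test.

```python
# Primary loadings copied from the full-sample and split-sample rotated component matrices.
full_sample = {"X18": 0.932, "X9": 0.929, "X16": 0.880, "X12": 0.904, "X7": 0.884,
               "X10": 0.780, "X8": 0.954, "X14": 0.948, "X6": 0.861, "X13": -0.827}
split_sample = {"X18": 0.927, "X9": 0.912, "X16": 0.866, "X12": 0.922, "X7": 0.882,
                "X10": 0.793, "X8": 0.948, "X14": 0.952, "X6": 0.869, "X13": -0.812}

def loading_shifts(reference, comparison):
    """Absolute change in each variable's loading between two solutions."""
    return {name: round(abs(reference[name] - comparison[name]), 3) for name in reference}

shifts = loading_shifts(full_sample, split_sample)
largest = max(shifts, key=shifts.get)
same_sign = all(full_sample[v] * split_sample[v] > 0 for v in full_sample)
print(largest, shifts[largest], same_sign)
```

Every loading keeps its sign and moves by less than 0.02, which is consistent with the conclusion drawn from the split sample.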

Additional Uses of EFA Results


Data reduction options
  Surrogate variable
  Summated scales
    Unidimensionality
    Reliability: item-to-total correlations > 0.50, inter-item correlations > 0.30, Cronbach's alpha > 0.70
    Validity: convergent, discriminant, nomological
  Factor scores: we used factor scores to avoid additional validations.
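Had summated scales been chosen instead of factor scores, the reliability check could look like the sketch below. It uses the standard Cronbach's alpha formula in plain Python; the function names and the ratings for the three items are hypothetical.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha; items is a list of equal-length score lists, one per scale item."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]    # summed score per respondent
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

def summated_scale(items):
    """Average the items of one factor for each respondent."""
    return [sum(scores) / len(scores) for scores in zip(*items)]

# Hypothetical ratings of five respondents on three items of one factor.
demo_items = [[7, 8, 6, 9, 5],
              [6, 8, 7, 9, 5],
              [7, 9, 6, 8, 4]]
alpha_demo = cronbach_alpha(demo_items)
scale_demo = summated_scale(demo_items)
print(round(alpha_demo, 3), scale_demo)
```

Items that move together push alpha toward 1, while unrelated items push it toward 0.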

Thank You for Listening!
