
1. Nominal Measures of Association
2. Ordinal Measures of Association

ASSOCIATION
Association
The strength of relationship between 2 variables
Knowing how much variables are related may enable
you to predict the value of 1 variable when you know
the value of another

As with test statistics, the proper measure of association depends on how variables are measured

Significance vs. Association


Association = strength of relationship
Test statistics = how different findings are
from null
They do capture the strength of a relationship
t = number of standard errors that separate means
Chi-Square = how different our findings are from
what is expected under null
If null is no relationship, then higher Chi-square values
indicate stronger relationships.

HOWEVER, test statistics are also influenced by other factors (e.g., sample size)

MEASURES OF ASSOCIATION FOR NOMINAL-LEVEL VARIABLES

Chi-Square Based Measures


χ² indicates how different our findings are from what is expected under null
χ² also gets larger with higher sample size (more confidence in larger samples)
To get a pure measure of strength, you have to remove influence of N

Phi
Cramer's V
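
The effect of N on chi-square can be checked numerically. The sketch below is illustrative only: the counts are made up (they are not from the slides), and `chi_square` is a hypothetical helper that applies the usual observed-versus-expected formula. Multiplying every cell by 10 multiplies chi-square by 10, while chi-square divided by N does not change.

```python
def chi_square(table):
    """Pearson chi-square for a table given as a list of rows of counts.

    Returns the statistic and the sample size N.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat, n


# Hypothetical 2x2 counts, and the same pattern with 10 times as many cases.
small = [[30, 20], [20, 30]]
large = [[count * 10 for count in row] for row in small]

chi_small, n_small = chi_square(small)
chi_large, n_large = chi_square(large)

print(chi_small, chi_large)                      # the statistic grows with N
print(chi_small / n_small, chi_large / n_large)  # the ratio does not
```

The pattern of the relationship is identical in both tables, so a measure of strength should give the same answer for both. That is the motivation for dividing by N in the measures that follow.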

PHI
Phi (φ) is the square root of χ² divided by N:
$\phi = \sqrt{\chi^2 / N}$
Formula standardizes χ² value by sample size
Measure ranges from 0 (no relationship) to values considerably >1
(Exception: for a 2x2 bivariate table, upper limit of φ = 1)

PHI
Example:
2 x 2 table
χ² = 5.28

FAVOR OR OPPOSE DEATH PENALTY FOR MURDER * RESPONDENTS SEX Crosstabulation
Count

FAVOR OR OPPOSE DEATH        RESPONDENTS SEX
PENALTY FOR MURDER         1 MALE   2 FEMALE   Total
1 FAVOR                        52         43      95
2 OPPOSE                       10         22      32
Total                          62         65     127

LIMITATION OF φ:
Lack of clear upper limit makes φ an undesirable measure of association
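
Applying the phi formula to this example is a one-line calculation. The sketch below simply plugs in the chi-square (5.28) and N (127) reported on the slide; the variable names are illustrative.

```python
import math

chi_sq = 5.28   # chi-square reported for the death penalty x sex table
n = 127         # total number of respondents in the table

# phi = square root of (chi-square / N)
phi = math.sqrt(chi_sq / n)

print(round(phi, 2))
```

The result is roughly 0.20, which would usually be read as a weak association, although the cutoffs for "weak" differ across disciplines.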

CRAMER'S V
$V = \sqrt{\chi^2 / (N \cdot \min(r-1,\ c-1))}$
where r = number of rows and c = number of columns in the table
Unlike φ, Cramer's V will always have an upper limit of 1, regardless of # of cells in table
For 2x2 table, φ & Cramer's V will have the same value
Cramer's V ranges from 0 (no relationship) to +1 (perfect relationship)

χ²-BASED MEASURES OF ASSOCIATION
Sample problem 1:
The chi square for a 5 x 3 bivariate table examining the relationship between area of Duluth one lives in & type of movie preference is 8.42, significant at .05 (N=100). Calculate & interpret Cramer's V.
ANSWER:
(Minimum of r-1, c-1) = 3-1 = 2
Cramer's V = $\sqrt{8.42 / (100 \cdot 2)}$ = .21
Interpretation: There is a relatively weak association between area of the city lived in and movie preference.

χ²-BASED MEASURES OF ASSOCIATION
Sample problem 2:
The chi square for a 4 x 4 bivariate table examining the relationship between type of vehicle driven & political affiliation is 12.32, sig. at .05 (N=300). Calculate & interpret Cramer's V.
ANSWER:
(Minimum of r-1, c-1) = 4-1 = 3
Cramer's V = $\sqrt{12.32 / (300 \cdot 3)}$ = .12
Interpretation: There is a very weak association between type of vehicle driven & political affiliation.
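
Both sample problems follow the same recipe, so they can be checked with a small function. This is a minimal sketch; `cramers_v` and its parameter names are illustrative, not part of any statistics package.

```python
import math


def cramers_v(chi_sq, n, n_rows, n_cols):
    """Cramer's V from a chi-square value, sample size, and table dimensions."""
    smaller_dimension = min(n_rows - 1, n_cols - 1)
    return math.sqrt(chi_sq / (n * smaller_dimension))


v_problem_1 = cramers_v(8.42, 100, n_rows=5, n_cols=3)    # area x movie preference
v_problem_2 = cramers_v(12.32, 300, n_rows=4, n_cols=4)   # vehicle x political affiliation

print(round(v_problem_1, 2), round(v_problem_2, 2))
```

For a 2x2 table the smaller dimension minus one equals 1, so the function returns the same value as phi.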

SUMMARY: χ²-BASED MEASURES OF ASSOCIATION
Limitation of φ & Cramer's V:
No direct or meaningful interpretation for values between 0-1
Both measure relative strength (e.g., .80 is stronger association than .40), but have no substantive meaning; hard to interpret
"Rules of Thumb" for what is a weak, moderate, or strong relationship vary across disciplines

LAMBDA (λ)
PRE (Proportional Reduction in Error) is the logic that underlies the definition & computation of lambda
Tells us the reduction in error we gain by using the IV to predict the DV
Range 0-1 (i.e., proportional reduction)
E1: Attempt to predict the category into which each case will fall on DV or Y while ignoring IV or X
E2: Predict the category of each case on Y while taking X into account
The stronger the association between the variables, the greater the reduction in errors

LAMBDA: EXAMPLE 1
Does risk classification in prison affect the likelihood of being rearrested after release? (χ² = 43.7)

                   Risk Classification
Rearrested     Low   Medium   High   Total
Yes             25       20     75     120
No              50       20     15      85
Total           75       40     90     205

LAMBDA: EXAMPLE
Find E1 (# of errors made when ignoring X)
E1 = N - (largest row total)
= 205 - 120 = 85


LAMBDA: EXAMPLE
Find E2 (# of errors made when accounting for X)
E2 = Σ (each column's total - largest N in column)
= (75-50) + (40-20) + (90-75) = 25+20+15 = 60


CALCULATING LAMBDA: EXAMPLE
Calculate Lambda
$\lambda = (E_1 - E_2) / E_1 = (85 - 60) / 85 = 25 / 85 = 0.294$
Interpretation: when multiplied by 100, λ indicates the % reduction in error achieved by using X to predict Y, rather than predicting Y "blind" (without X)
0.294 x 100 = 29.4%: Knowledge of risk classification in prison improves our ability to predict rearrest by 29%.
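
The E1 and E2 steps can be sketched as a short function. This is a minimal illustration, assuming the table is passed as a list of rows with the dependent variable (Y) on the rows and the independent variable (X) on the columns, as in the rearrest table; `lambda_pre` is a hypothetical name.

```python
def lambda_pre(table):
    """Lambda with the row variable as dependent (Y) and columns as X.

    Returns E1, E2, and lambda = (E1 - E2) / E1.
    """
    row_totals = [sum(row) for row in table]
    columns = list(zip(*table))
    n = sum(row_totals)

    # E1: errors when always guessing the largest row category, ignoring X.
    e1 = n - max(row_totals)
    # E2: errors when guessing the largest cell within each column of X.
    e2 = sum(sum(col) - max(col) for col in columns)
    return e1, e2, (e1 - e2) / e1


# Rows: rearrested Yes / No. Columns: Low, Medium, High risk classification.
rearrest = [
    [25, 20, 75],
    [50, 20, 15],
]

e1, e2, lam = lambda_pre(rearrest)
print(e1, e2, round(lam, 3))
```

The function would divide by zero if every case fell in a single row category (E1 = 0); a real implementation would need to handle that case.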

LAMBDA: EXAMPLE 2
What is the strength of the relationship between citizens' race and attitude toward police?
(obtained chi square is > 5.991, the critical χ²)
Calculate & interpret lambda to answer this question

Attitude                      Race
towards police    Black   White   Other   Totals
Positive             40     150      35      225
Negative             80      95      55      230
Totals              120     245      90      455

LAMBDA: EXAMPLE 2
E1 = N - (largest row total)
= 455 - 230 = 225
E2 = Σ (each column's total - largest N in column)
= (120 - 80) + (245 - 150) + (90 - 55) = 40 + 95 + 35 = 170
$\lambda = (E_1 - E_2) / E_1 = (225 - 170) / 225 = 55 / 225 = 0.244$
INTERPRETATION:
0.244 x 100 = 24.4%: Knowledge of an individual's race improves our ability to predict attitude towards police by 24%

SPSS EXAMPLE
1. Is there a significant relationship between gender & voting behavior?
2. If so, what is the strength of association between these variables?
ANSWER TO Q1: YES

PRES00 VOTE FOR GORE, BUSH, NADER * SEX RESPONDENTS SEX Crosstabulation

PRES00 VOTE FOR                        SEX RESPONDENTS SEX
GORE, BUSH, NADER                      1 MALE   2 FEMALE    Total
1 GORE    Count                           143        252      395
          % within SEX RESPONDENTS SEX  35.8%      49.5%    43.5%
2 BUSH    Count                           234        240      474
          % within SEX RESPONDENTS SEX  58.6%      47.2%    52.2%
3 NADER   Count                            22         17       39
          % within SEX RESPONDENTS SEX   5.5%       3.3%     4.3%
Total     Count                           399        509      908
          % within SEX RESPONDENTS SEX 100.0%     100.0%   100.0%

Chi-Square Tests

                                 Value      df   Asymp. Sig. (2-sided)
Pearson Chi-Square              17.730 a     2   .000
Likelihood Ratio                17.832       2   .000
Linear-by-Linear Association    17.295           .000
N of Valid Cases                   908

a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 17.14.

SPSS EXAMPLE
ANSWER TO QUESTION 2:
By either measure, the association between these variables appears to be weak

Symmetric Measures

                                     Value   Approx. Sig.
Nominal by Nominal   Cramer's V       .140   .000
N of Valid Cases                       908

a. Not assuming the null hypothesis.
b. Using the asymptotic standard error assuming the null hypothesis.

Directional Measures

Nominal by Nominal                                         Value   Asymp. Std. Error (a)   Approx. T   Approx. Sig.
Lambda        Symmetric                                     .020   .027                    .738        .461
              PRES00 VOTE FOR GORE, BUSH, NADER Dependent   .028   .050                    .541        .588
              SEX RESPONDENTS SEX Dependent                 .013   .016                    .801        .423
Goodman and   PRES00 VOTE FOR GORE, BUSH, NADER Dependent   .015   .007                                .000
Kruskal tau   SEX RESPONDENTS SEX Dependent                 .020   .009                                .000

a. Not assuming the null hypothesis.
b. Using the asymptotic standard error assuming the null hypothesis.

2 LIMITATIONS OF LAMBDA
1. Asymmetric
Value of the statistic will vary depending on which variable is taken as independent
2. Misleading when one of the row totals is much larger than the other(s)
For this reason, when row totals are extremely uneven, use a chi square-based measure instead
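
The asymmetry can be seen in the vote-by-sex counts from the SPSS example. The sketch below is illustrative: `lambda_rows_dependent` is a hypothetical helper that treats the row variable as dependent, and transposing the table swaps which variable plays that role.

```python
def lambda_rows_dependent(table):
    """Lambda treating the row variable as the one being predicted."""
    n = sum(sum(row) for row in table)
    e1 = n - max(sum(row) for row in table)
    e2 = sum(sum(col) - max(col) for col in zip(*table))
    return (e1 - e2) / e1


# Rows: Gore, Bush, Nader. Columns: male, female (counts from the crosstab).
vote_by_sex = [
    [143, 252],
    [234, 240],
    [22, 17],
]
# Transposed copy: rows are now male / female, columns are the candidates.
sex_by_vote = [list(col) for col in zip(*vote_by_sex)]

lambda_vote_dependent = lambda_rows_dependent(vote_by_sex)
lambda_sex_dependent = lambda_rows_dependent(sex_by_vote)

print(round(lambda_vote_dependent, 3), round(lambda_sex_dependent, 3))
```

The two values differ (about .028 with vote as dependent, about .013 with sex as dependent), in line with the Lambda rows of the Directional Measures output.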

ORDINAL MEASURE OF ASSOCIATION
GAMMA
For examining STRENGTH & DIRECTION of "collapsed" ordinal variables (<6 categories)
Like Lambda, a PRE-based measure
Range is -1.0 to +1.0

GAMMA
Logic: Applying PRE to PAIRS of individuals

Prejudice   Lower Class   Middle Class   Upper Class
Low         Kenny         Tim            Kim
Middle      Joey          Deb            Ross
High        Randy         Eric           Barb

GAMMA
CONSIDER KENNY-DEB PAIR
In the language of Gamma, this is a "same" pair: direction of difference on 1 variable is the same as direction on the other (Deb ranks higher than Kenny on both class and prejudice)
If you focused on the Kenny-Eric pair, you would come to the same conclusion

GAMMA
NOW LOOK AT THE TIM-JOEY PAIR
In the language of Gamma, this is a "different" pair: direction of difference on one variable is opposite of the difference on the other


GAMMA
Logic: Applying PRE to PAIRS of individuals
Formula:
$\gamma = (\text{same} - \text{different}) / (\text{same} + \text{different})$

GAMMA
If you were to account for all the pairs in this table, you would find that there were 9 "same" & 9 "different" pairs
Applying the Gamma formula, we would get:
$\gamma = (9 - 9) / (9 + 9) = 0 / 18 = 0.0$
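
The pair counting can be sketched by enumerating every pair of individuals. Two assumptions are made here and should be treated as such: each person is given a (prejudice, class) rank from 1 to 3, and the names are placed one per cell in reading order, with Kenny, Deb, and Barb on the diagonal. Pairs that tie on either variable count as neither "same" nor "different", which is why only 18 of the 36 possible pairs enter the formula.

```python
from itertools import combinations

# ASSUMED placement: (prejudice rank, class rank), 1 = lowest, 3 = highest.
people = {
    "Kenny": (1, 1), "Tim": (1, 2), "Kim": (1, 3),
    "Joey": (2, 1), "Deb": (2, 2), "Ross": (2, 3),
    "Randy": (3, 1), "Eric": (3, 2), "Barb": (3, 3),
}


def classify_pair(a, b):
    """Label a pair by comparing the direction of difference on each variable."""
    product = (a[0] - b[0]) * (a[1] - b[1])
    if product > 0:
        return "same"
    if product < 0:
        return "different"
    return "tied"  # equal on at least one variable; ignored by gamma


def gamma_from_people(scores):
    """Count pair types over all pairs and apply the gamma formula."""
    counts = {"same": 0, "different": 0, "tied": 0}
    for a, b in combinations(scores, 2):
        counts[classify_pair(a, b)] += 1
    same, different = counts["same"], counts["different"]
    return counts, (same - different) / (same + different)


counts_all, gamma_all = gamma_from_people(list(people.values()))
diagonal = [people[name] for name in ("Kenny", "Deb", "Barb")]
counts_diag, gamma_diag = gamma_from_people(diagonal)

print(counts_all, gamma_all)
print(counts_diag, gamma_diag)
```

With all nine people the same and different pairs cancel out and gamma is 0. With only the three people on the diagonal, every pair is a "same" pair and gamma is 1.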


GAMMA
3-case example
Applying the Gamma formula, we would get:
$\gamma = (3 - 0) / (3 + 0) = 3 / 3 = 1.00$

Prejudice   Lower Class   Middle Class   Upper Class
Low         Kenny
Middle                    Deb
High                                     Barb

Gamma: Example 1
Examining the relationship between:
FEHELP ("Wife should help husband's career first") &
FEFAM ("Better for man to work, women to tend home")
Both variables are ordinal, coded 1 (strongly agree) to 4 (strongly disagree)

FEHELP WIFE SHOULD HELP HUSBANDS CAREER FIRST * FEFAM BETTER FOR MAN TO WORK, WOMAN TEND HOME Crosstabulation
Columns: FEFAM BETTER FOR MAN TO WORK, WOMAN TEND HOME. Percentages are % within FEFAM.

FEHELP WIFE SHOULD HELP                  1 STRONGLY                           4 STRONGLY
HUSBANDS CAREER FIRST                    AGREE        2 AGREE   3 DISAGREE    DISAGREE     Total
1 STRONGLY AGREE      Count                  14            8            0            0        22
                      % within FEFAM      21.9%         3.8%          .0%          .0%      2.6%
2 AGREE               Count                  26           72           26            3       127
                      % within FEFAM      40.6%        34.3%         6.4%         1.8%     15.0%
3 DISAGREE            Count                  21          111          307           45       484
                      % within FEFAM      32.8%        52.9%        75.2%        27.4%     57.2%
4 STRONGLY DISAGREE   Count                   3           19           75          116       213
                      % within FEFAM       4.7%         9.0%        18.4%        70.7%     25.2%
Total                 Count                  64          210          408          164       846
                      % within FEFAM     100.0%       100.0%       100.0%       100.0%    100.0%

Gamma: Example 1
Based on the info in this table, does there seem to be a
relationship between these factors?
Does there seem to be a positive or negative relationship
between them?
Does this appear to be a strong or weak relationship?

GAMMA
Do we reject the null hypothesis of independence between these 2 variables?
Yes, the Pearson chi square p value (.000) is < alpha (.05)
It's worthwhile to look at gamma.
Interpretation:
There is a strong positive relationship between these factors.
Knowing someone's view on a wife's "first priority" improves our ability to predict whether they agree that women should tend home by 75.5%.

Chi-Square Tests

                                  Value      df   Asymp. Sig. (2-sided)
Pearson Chi-Square              457.679 a     9   .000
Likelihood Ratio                383.933       9   .000
Linear-by-Linear Association    285.926           .000
N of Valid Cases                    846

a. 2 cells (12.5%) have expected count less than 5. The minimum expected count is 1.66.

Symmetric Measures

                             Value   Asymp. Std. Error (a)   Approx. T (b)   Approx. Sig.
Ordinal by Ordinal   Gamma    .755   .029                    18.378          .000
N of Valid Cases               846

a. Not assuming the null hypothesis.
b. Using the asymptotic standard error assuming the null hypothesis.
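
The gamma value in the SPSS output can be approximated directly from the crosstab counts. The sketch below is a hedged illustration, not SPSS's own routine: `gamma_from_table` is a hypothetical helper, and two of the counts (the 3 in row 2, column 4, and the 3 in row 4, column 1) did not survive in the extracted table and are inferred here from the row totals and column percentages. For each cell, cases lying below and to the right form "same" pairs, and cases lying below and to the left form "different" pairs.

```python
def gamma_from_table(table):
    """Gamma for a crosstab whose rows and columns are both in rank order."""
    same = 0
    different = 0
    n_rows, n_cols = len(table), len(table[0])
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):      # only rows ranked higher
                for m in range(n_cols):
                    if m > j:                   # higher on both variables
                        same += table[i][j] * table[k][m]
                    elif m < j:                 # higher on one, lower on the other
                        different += table[i][j] * table[k][m]
    return same, different, (same - different) / (same + different)


# Rows: FEHELP 1-4. Columns: FEFAM 1-4. The two cells equal to 3 are inferred.
fehelp_by_fefam = [
    [14, 8, 0, 0],
    [26, 72, 26, 3],
    [21, 111, 307, 45],
    [3, 19, 75, 116],
]

same_pairs, different_pairs, gamma = gamma_from_table(fehelp_by_fefam)
print(round(gamma, 3))
```

Under those assumptions the result rounds to .755, matching the Symmetric Measures table. Same pairs heavily outnumber different pairs, which is what makes the relationship both positive and strong.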

USING GSS DATA


Construct a contingency table using two
ordinal level variables
Are the two variables significantly related?
How strong is the relationship?
What direction is the relationship?
