Introduction
Outline for Today
1. Types of Variables
2. Categorical Data
e.g. Race
e.g. Income
Categorical Variable: takes a finite number of possible values, including types a)-c), and
possibly d). The possible values of a categorical variable are referred to as its categories or levels.
e.g. Type of blood
Digit 1 2 3 4 5 6 7 8 9 0 Total
Frequency 7 8 8 15 13 11 12 8 5 13 100
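A frequency table like this one lends itself to a goodness-of-fit check. The sketch below assumes, purely for illustration, a null hypothesis that all ten digits are equally likely (the surviving text does not state the hypothesis for this data set) and computes the Pearson chi-square statistic. The function name `pearson_gof` is ours, not the lecture's.

```python
# Observed digit frequencies from the table above (digits 1..9, then 0).
observed = [7, 8, 8, 15, 13, 11, 12, 8, 5, 13]


def pearson_gof(obs, probs):
    """Pearson chi-square statistic: sum over categories of (O - E)^2 / E."""
    n = sum(obs)
    expected = [n * p for p in probs]
    return sum((o - e) ** 2 / e for o, e in zip(obs, expected))


# Assumed null hypothesis (not stated in the source): each digit has probability 1/10.
uniform = [1 / 10] * len(observed)
x2 = pearson_gof(observed, uniform)
df = len(observed) - 1  # number of categories minus one; no parameters estimated

print(f"X^2 = {x2:.2f} on {df} degrees of freedom")
```

The statistic would then be compared with a chi-square distribution on `df` degrees of freedom; a large value suggests the digits are not equally likely.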
2. Suicide Data The following table shows the classification of suicides in France by day of the week.
Based on these data, Durkheim (1897) concludes that suicide diminishes at the end of the week, beginning
on Friday. He also notes that the suicide rate is not lower on Sunday than on Saturday.
3. Homicide Data The following table shows the monthly distribution of homicides in the USA in
1970.
Month Jan. Feb. Mar. Apr. May Jun. Jul. Aug. Sept. Oct. Nov. Dec. Total
# of Homicides 1318 1229 1327 1257 1424 1399 1475 1559 1417 1507 1400 1534 16848
5. Number of Boys Data The following table shows the number of boys among the first 4 children in
3343 Swedish families of size 4 or more.
6. World Cup Data The following table shows the number of goals scored per team per game, for the 32
matches played in the 1996 Football World Cup.
Placebo Vitamin C
Cold 31 17
Not Cold 109 122
Total 140 139
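One way to summarize a 2x2 table like this is to compare the proportion of colds in each group and form the odds ratio. The sketch below is a minimal illustration under our own choice of summaries; the lecture text that survives does not say which measure it uses, and the helper names `cold_rate` and `odds` are hypothetical.

```python
# Counts from the table above: (cold, not cold) for each treatment group.
counts = {"Placebo": (31, 109), "Vitamin C": (17, 122)}


def cold_rate(group):
    """Proportion of subjects in `group` who caught a cold."""
    cold, not_cold = counts[group]
    return cold / (cold + not_cold)


def odds(group):
    """Odds of catching a cold in `group`: cold / not cold."""
    cold, not_cold = counts[group]
    return cold / not_cold


# Odds ratio of a cold, placebo relative to vitamin C (our choice of summary).
odds_ratio = odds("Placebo") / odds("Vitamin C")

for group in counts:
    print(f"{group}: cold rate = {cold_rate(group):.3f}")
print(f"Odds ratio (placebo vs. vitamin C) = {odds_ratio:.2f}")
```

An odds ratio near 1 would indicate little association between treatment and catching a cold; whether the observed departure from 1 is statistically meaningful is a separate testing question.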
8. Seat Belt Data The following table is based on the records of accidents in Florida, USA, in 1988.
Safety Equipment
Injury Seat Belt None Total
Fatal 510 1601 2111
Nonfatal 412368 162527 574835
Total 412818 164128 576946
9. Death Penalty Data The following table is based on a study concerning the effects of racial
characteristics on whether individuals convicted of homicide receive the death penalty. It shows
the defendant’s race (white, black) and the verdict (death penalty, no death penalty) in 326 cases of
homicide in Florida, USA, during 1976-1977.
Defendant’s Race
Death Penalty White Black Total
Yes 19 17 36
No 141 149 290
Total 160 166 326
11/21/2019 SA3202, Lecture 1 7
10. University Admission Data The following table shows admission results for the six largest graduate
departments at the University of California at Berkeley, for the fall 1973 session.
Applicant’s Gender
Whether Admitted Male Female
Yes 1198 557
No 1493 1278
Total 2691 1835
11. Smoking and Lung Cancer Data The following table is based on a retrospective study of lung
cancer and tobacco smoking among patients in hospitals in several English cities. The table compares
male lung cancer patients with control patients having other diseases, according to the average number of
cigarettes smoked daily over a ten-year period preceding the onset of the disease.
12. Smoking Habit Data The following table is from a study concerning smoking habits of high school
students in Arizona, USA.
Student Smokes Student Does Not Smoke
Both parents smoke 400 1380
One parent smokes 416 1823
Neither parent smokes 188 1168
14. British Social Mobility Data The following table relates father’s and son’s occupational status for
a sample of 3500 British father-son pairs.
Son’s Status
Father’s Status 1 2 3 4 5
1 50 45 8 18 8
2 28 174 84 154 55
3 11 78 110 223 96
4 14 150 185 714 447
5 3 42 72 320 411
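A square table like this can be checked and summarized with a few lines of code. The sketch below verifies that the cells add up to the stated 3500 pairs and computes the share of sons who have the same status as their father (the diagonal cells). The diagonal share is our own illustrative summary, not one named in the lecture.

```python
# Rows: father's status 1..5; columns: son's status 1..5 (from the table above).
mobility = [
    [50, 45, 8, 18, 8],
    [28, 174, 84, 154, 55],
    [11, 78, 110, 223, 96],
    [14, 150, 185, 714, 447],
    [3, 42, 72, 320, 411],
]

# Marginal totals: fathers per status (rows) and sons per status (columns).
row_totals = [sum(row) for row in mobility]
col_totals = [sum(col) for col in zip(*mobility)]
n = sum(row_totals)

# Pairs on the diagonal: son has the same status as the father.
same_status = sum(mobility[i][i] for i in range(len(mobility)))
same_share = same_status / n

print(f"n = {n}, same status = {same_status} ({same_share:.1%})")
```

The marginal totals computed here are also the ingredients needed for expected frequencies when a model for the table is fitted.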
This is also known as a hypothesis test for a contingency table: compare the observed
frequencies with their “expected frequencies” (the frequencies expected under the model) to see
how close they are.
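The comparison of observed and expected frequencies can be sketched as follows, using the Death Penalty Data (item 9). The sketch assumes the model is independence of rows and columns, so that each expected frequency is (row total x column total) / n, and it uses the Pearson statistic as the measure of closeness. Both choices are our assumptions, since the surviving text does not name the model or the statistic, and the function names are hypothetical.

```python
# Observed 2x2 table from item 9: rows = death penalty (Yes, No),
# columns = defendant's race (White, Black).
observed = [
    [19, 17],
    [141, 149],
]


def expected_counts(table):
    """Expected frequencies under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def pearson_x2(table):
    """Pearson statistic: sum over cells of (observed - expected)^2 / expected."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


expected = expected_counts(observed)
x2 = pearson_x2(observed)
# Degrees of freedom for independence in an r x c table: (r - 1)(c - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

print("Expected:", [[round(e, 2) for e in row] for row in expected])
print(f"X^2 = {x2:.4f} on {df} degree(s) of freedom")
```

A small value of the statistic means the observed frequencies are close to those expected under the assumed model; a large value is evidence against it.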
Degrees of freedom:
Degrees of freedom:
Method 1 and Method 2 are asymptotically equivalent.