
SPSS Practical 5 – Categorical Data

In this practical we will be analysing a data set from a case control study looking at the effect
of alcohol consumption on spinal epidural haematoma. First we will conduct an unadjusted
analysis looking at the relationship between the exposure of interest and case/control status
using the chi-squared test.

The Chi-squared test can be used within the context of the case control study to formally test
the null hypothesis, H0: The odds of having a particular condition in the exposed group =
the odds of having a particular condition in the unexposed group (i.e. odds ratio = 1)
against the alternative hypothesis, HA: The odds of having the condition are not equal in
the two exposure groups.

We will then perform an adjusted analysis using a binary logistic regression model which is
the appropriate regression model to use when you wish to examine the relationship
between a binary outcome variable and a set of factors which we believe may be related to
the binary outcome.

EXERCISE 1

Spinal epidural haematoma (SEH) is a rare, yet potentially devastating complication of spinal
surgery. There is limited evidence available regarding the risk factors for development of
symptomatic SEH following spinal surgery. Case studies suggest that alcohol consumption of
10 units a week or more may be a potential risk factor.

20 SEH cases were identified from hospital records and 60 controls were selected to compare
the distribution of individuals with alcohol consumption ≥ 10 units per week. SEH.sav is an
SPSS data file that contains the data from the case control study. A full list of the variables in
the data set is given below.

Variable   Description
ID         Patient identification number
SEH        Case indicator: 0 = Control, 1 = Case
Alcohol    0 = Alcohol consumption < 10 units p/w
           1 = Alcohol consumption ≥ 10 units p/w
Age        Patient age measured in years

Locate the SEH.sav data file and save it to a network drive or your computer (right-click the
online file, select ‘Save as’ and choose the location to save the data to). Open SPSS and load the
data set from where the file was saved [File → Open → Data, then navigate to the location
where you saved SEH.sav and open it] and take a look at the data.
Notice there is one row of data per patient. To investigate the association between alcohol
consumption and SEH case/control status, the data can be summarised in a 2×2
table like this:

                 Alc ≥ 10 units p/w    Alc < 10 units p/w
SEH   Case               a                     b
      Control            c                     d

To get the frequencies for this table, click ‘Analyze → Descriptive Statistics → Crosstabs’.
Move ‘SEH’ into ‘Row(s)’ and ‘Alcohol’ into ‘Column(s)’. Click ‘Statistics’ and tick the
‘Chi-square’ box. Click ‘Continue’. Now click ‘Exact’ and select the ‘Exact’ option.
Click ‘Continue’ followed by ‘OK’ to run the test. Note: SPSS will not
label the rows and columns in the same order as above, so you will need to extract the
frequencies (a, b, c, d) from the SPSS output carefully to complete the table. Fill in the
frequencies below:

                 Alc ≥ 10 units p/w    Alc < 10 units p/w
SEH   Case             ______                ______
      Control          ______                ______
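
As a cross-check on the Crosstabs output, the uncorrected Pearson chi-squared statistic for a 2×2 table can be reproduced from the four cell counts alone. The sketch below is in Python rather than SPSS, and the counts in it are hypothetical placeholders chosen only to match the 20 cases and 60 controls in the study design; they are not the SEH.sav frequencies. The function name `pearson_chi2_2x2` is also just an illustrative choice.

```python
import math

# Hypothetical cell counts, laid out as in the 2x2 table above.
# These are NOT the SEH.sav frequencies; replace them with your own.
a, b = 12, 8    # cases:    alcohol >= 10 units p/w, alcohol < 10 units p/w
c, d = 18, 42   # controls: alcohol >= 10 units p/w, alcohol < 10 units p/w


def pearson_chi2_2x2(a, b, c, d):
    """Return (statistic, p-value) for the uncorrected Pearson chi-squared test."""
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # A 2x2 table has 1 degree of freedom, where P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


chi2, p = pearson_chi2_2x2(a, b, c, d)
print(f"chi-squared = {chi2:.2f}, p = {p:.4f}")
```

With your own frequencies substituted, the statistic should correspond to the ‘Pearson Chi-Square’ row of the SPSS output; the exact p-values that SPSS adds when ‘Exact’ is selected are not reproduced here.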

In a case control study we compare odds because patients are selected on the basis of their
disease status. We do not interpret risks: you could obtain any risk value you wish simply by
varying the number of SEH cases and controls selected, and the proportion of cases in the
study often does not reflect the proportion in the general population. Here, SEH is a rare
condition, yet a quarter of the study patients (20 of 80) are cases.
The odds ratio quantifies the odds of being a case in the exposed group relative to the odds
of being a case in the unexposed group. It is easy to calculate the odds ratio by hand from the
2×2 table above:

    odds ratio = (a × d) / (b × c)

Compute the odds ratio.
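
The hand calculation can be sketched in a few lines of Python. The counts are hypothetical placeholders (not the SEH.sav frequencies), and `odds_ratio` is an illustrative name. The second half shows why risks are not interpretable here: multiplying the number of controls by ten changes the apparent "risk" among the exposed but leaves the odds ratio untouched.

```python
# Hypothetical counts laid out as in the 2x2 table above; NOT the SEH.sav data.
a, b = 12, 8    # cases:    alcohol >= 10 units p/w, alcohol < 10 units p/w
c, d = 18, 42   # controls: alcohol >= 10 units p/w, alcohol < 10 units p/w


def odds_ratio(a, b, c, d):
    """Cross-product odds ratio (a*d)/(b*c) for the 2x2 layout above."""
    return (a * d) / (b * c)


# Equivalent form: odds of being a case in each exposure group.
odds_exposed = a / c
odds_unexposed = b / d
print("odds ratio:", odds_ratio(a, b, c, d))
print("ratio of odds:", odds_exposed / odds_unexposed)

# Sampling ten times as many controls changes the apparent risk among the
# exposed, a / (a + c), but not the odds ratio.
print("apparent risk, original controls:", a / (a + c))
print("apparent risk, 10x controls:     ", a / (a + 10 * c))
print("odds ratio, 10x controls:        ", odds_ratio(a, b, 10 * c, 10 * d))
```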

We can also get SPSS to calculate the odds ratio. Click ‘Analyze → Regression → Binary
Logistic’. Enter ‘SEH’ as the ‘Dependent’ variable and ‘Alcohol’ as a covariate. Click ‘Options’
and tick the box next to ‘CI for exp(B)’ (exp(B) is the odds ratio, so this requests a confidence
interval for it). Click ‘Continue’. Now click ‘Categorical’ to declare alcohol consumption as a
binary categorical variable. Move ‘Alcohol’ over to the ‘Categorical covariates’ box and set
the first category of alcohol (alcohol = 0) as the reference category by selecting ‘First’ and
clicking ‘Change’. We do this so that the odds ratio for Alcohol compares the odds of SEH with
alcohol consumption ≥ 10 units p/w (alcohol = 1) to those with alcohol consumption < 10 units
p/w (alcohol = 0). Click ‘Continue’ then ‘OK’.

Most of the resulting output is not needed for this exercise. Go to the very bottom, to the table
labelled ‘Variables in the Equation’. Look at the column labelled Exp(B) in the row corresponding
to Alcohol. Check that this is the same as the odds ratio you calculated by hand above. Next to
Exp(B) you will also find the 95% confidence interval for Exp(B) that you requested.
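
For a model with a single binary covariate, the confidence interval for Exp(B) can be checked by hand. This is because the standard error of B then equals the square root of the sum of the reciprocal cell counts. The Python sketch below assumes the interval is the usual Wald interval, exp(B ± z × SE), and again uses hypothetical counts rather than the SEH.sav frequencies; `wald_ci_odds_ratio` is an illustrative name.

```python
import math
from statistics import NormalDist

# Hypothetical counts laid out as in the 2x2 table above; NOT the SEH.sav data.
a, b = 12, 8    # cases:    alcohol >= 10 units p/w, alcohol < 10 units p/w
c, d = 18, 42   # controls: alcohol >= 10 units p/w, alcohol < 10 units p/w


def wald_ci_odds_ratio(a, b, c, d, level=0.95):
    """Wald confidence interval for the odds ratio of a 2x2 table."""
    log_or = math.log((a * d) / (b * c))            # this is B for the binary covariate
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # standard error of B
    z = NormalDist().inv_cdf(0.5 + level / 2)       # about 1.96 for a 95% interval
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


lower, upper = wald_ci_odds_ratio(a, b, c, d)
print(f"95% CI for the odds ratio: {lower:.2f} to {upper:.2f}")
```

If the interval excludes 1, the data are inconsistent with the null hypothesis of no association at the corresponding significance level.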

QUESTIONS

1. Check that your hand calculation and the SPSS output agree for the odds ratio.
2. What does your odds ratio mean?
3. What is the 95% CI for the odds ratio?
4. Do the results of the Chi-squared test suggest that there is a significant association
between alcohol consumption and SEH case/control status?

EXERCISE 2

We will now take the analysis one step further and explore the relationship between alcohol
consumption and SEH case/control status adjusted for Age. Go back to the binary logistic
regression set-up (‘Analyze → Regression → Binary Logistic’). Add Age into the model as an
additional covariate and click ‘OK’. Assess the ‘Variables in the Equation’ output.
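
To see what the adjusted model estimates, the sketch below fits a binary logistic regression by Newton-Raphson in plain Python. It is a minimal illustration, not SPSS's own routine. The patient records are synthetic (ages are generated by an arbitrary deterministic rule) and are not the SEH.sav data, and the helpers `solve`, `fit_logistic` and `make_rows` are hypothetical names introduced for this example.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_logistic(X, y, iterations=25):
    """Return [intercept, B1, B2, ...] maximising the logistic log-likelihood."""
    rows = [[1.0] + list(r) for r in X]   # prepend the intercept column
    p = len(rows[0])
    beta = [0.0] * p
    for _ in range(iterations):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for row, yi in zip(rows, y):
            eta = sum(bj * v for bj, v in zip(beta, row))
            mu = 1.0 / (1.0 + math.exp(-eta))   # fitted probability of being a case
            w = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (yi - mu) * row[j]
                for k in range(p):
                    hess[j][k] += w * row[j] * row[k]
        step = solve(hess, grad)
        beta = [bj + s for bj, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-10:
            break
    return beta


def make_rows(seh, alcohol, n, base_age):
    """Build n synthetic patients; ages cycle deterministically over 35 years."""
    return [(seh, alcohol, base_age + (11 * i) % 35) for i in range(n)]


# Synthetic data: 20 cases and 60 controls, as in the study design only.
patients = (make_rows(1, 1, 12, 45) + make_rows(1, 0, 8, 45)
            + make_rows(0, 1, 18, 35) + make_rows(0, 0, 42, 35))
y = [seh for seh, _, _ in patients]
X = [[alcohol, age] for _, alcohol, age in patients]

beta_unadjusted = fit_logistic([[alcohol] for alcohol, _ in X], y)
beta_adjusted = fit_logistic(X, y)
print("unadjusted OR for alcohol:", math.exp(beta_unadjusted[1]))
print("adjusted OR for alcohol:  ", math.exp(beta_adjusted[1]))
print("OR per year of age:       ", math.exp(beta_adjusted[2]))
```

The unadjusted fit reproduces the cross-product odds ratio of the underlying 2×2 table. The adjusted fit gives one Exp(B) per covariate, each interpreted with the other covariate held fixed. For Age, Exp(B) is the odds ratio per additional year.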

QUESTIONS

1. What is the odds ratio for alcohol consumption after adjustment for Age? [Exp(B)]
2. Is alcohol consumption a significant predictor of SEH case/control status after adjustment
for Age?
3. What is the odds ratio for Age after adjustment for alcohol consumption?
4. Is Age a significant predictor of SEH case/control status?
