
You may download this handout and supporting materials at:
http://gseweb.harvard.edu/~faculty/singer/
http://gseacademic.harvard.edu/alda/
http://gseacademic.harvard.edu/~willetjo/
http://www.ats.ucla.edu/stat/examples/alda/

Judith D. Singer & John B. Willett (2006)

Individual Growth Modeling: Modern Methods for Studying Change


Judith D. Singer & John B. Willett
Harvard Graduate School of Education

"Time is the one immaterial object which we cannot influence, neither speed up nor slow down, add to nor diminish."
Maya Angelou

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, Workshop Overview 1, slide 1

The fundamental problem of longitudinal research: The study of continuous TIME: Making time stand still
Eadweard Muybridge (1830-1904), Animal Locomotion (1887)

The first known longitudinal study of growth: The height of the son of Count Filibert Guéneau de Montbeillard (1720-1785)
Recorded his son's height approximately every six months from birth (in 1759) until age 18
[Figure: Height (in cm, 0 to 200) plotted against Age (0 to 20); one point is annotated "oops, measurement error?"]
Scammon, RE (1927) The first seriation study of human growth, Am J of Physical Anthropology, 10, 329-336.

Fast forward to the present: In most fields, the quantity of longitudinal research is exploding
Annual searches for keyword 'longitudinal' in 9 OVID databases, between 1982 and 2005
[Figure: annual number of hits (log scale, 10 to 10,000) by year ('81 to '05), with one line each for medicine, business, biology, psychology, sociology, agriculture, education, zoology, and economics]

But what about the quality? What does today's longitudinal research look like?
Read 150 articles in 10 issues of APA journals published in each of 1999, 2003 and 2006

First, the good news:
  More longitudinal studies are being published
  More of these are truly longitudinal

Now, the bad news:
  Very few of these longitudinal studies use modern analytic methods

[Figure: two bar charts comparing 1999, 2003 and 2006. One shows the number of waves (>1 wave, 2 waves, 3 waves, 4+ waves) on a 0 to 60 scale. The other shows the analytic approach (growth modeling, survival analysis, repeated measures ANOVA, wave-on-wave regression, separate but parallel analyses, set aside waves, combine waves, ignore age heterogeneity) on a 0 to 50 scale]

Part of the problem may well be reviewers' ignorance
Comments received from two reviewers for Developmental Psychology of a paper that fit individual growth models to 3 waves of data on vocabulary size among young children:

Reviewer A: I do not understand the statistics used in this study deeply enough to evaluate their appropriateness. I imagine this is also true of 99% of the readers of Developmental Psychology. Previous studies in this area have used simple correlation or regression which provide easily interpretable values for the relationships among variables. In all, while the authors are to be applauded for a detailed longitudinal study, the statistics are difficult. I thus think Developmental Psychology is not really the place for this paper.

Reviewer B: The analyses fail to live up to the promise of the clear and cogent introduction. I will note as a caveat that I entered the field before the advent of sophisticated growth-modeling techniques, and they have always aroused my suspicion to some extent. I have tried to keep up and to maintain an open mind, but parts of my review may be naïve, if not inaccurate.


What kinds of research questions require longitudinal methods?

Questions about systematic change over time
  Curran et al (1997) studied alcohol use
  82 teens interviewed at ages 14, 15 & 16; alcohol use tended to increase over time
  Children of Alcoholics (COAs) drank more but had no steeper rates of increase over time.
  1. Within-person summary: How does a teen's alcohol consumption change over time?
  2. Between-person comparison: How do these trajectories vary by teen characteristics?
  Method: Individual Growth Model / Multilevel Model for Change

Questions about whether and when events occur
  Capaldi et al (1996) studied age of 1st sex
  180 boys interviewed annually from 7th to 12th grade (30% remained virgins at end of study)
  Boys who experienced early parental transitions were more likely to have had sex.
  1. Within-person summary: When are boys most at risk of having sex for the 1st time?
  2. Between-person comparison: How does this risk vary by teen characteristics?
  Method: Discrete- and Continuous-Time Survival Analysis

Four important advantages of modern longitudinal methods

You can identify temporal patterns in the data
  Does the outcome increase, decrease, or remain stable over time?
  Is the general pattern linear or non-linear?
  Are there abrupt shifts at substantively interesting moments?

You can include time-varying predictors (those whose values vary over time)
  Participation in an intervention
  Family circumstances (employment, marital status, etc.)

You can include interactions with time (to test whether a predictor's effect varies over time)
  Some effects dissipate: they wear off
  Some effects increase: they become more important
  Some effects are especially pronounced at particular times

You have great flexibility in research design
  Not everyone needs the same rigid data collection schedule: cadence can be person specific
  Not everyone needs the same number of waves: you can use all cases, even those with just one wave!
  Design can be experimental or observational
  Designs can be single level (individuals only) or multilevel (e.g., patients within physician practices)

What we're going to cover in this workshop

A word about programming, software and other supplemental materials

www.ats.ucla.edu/stat/examples/alda
[Table: for each chapter, the website lists the datasets and the software packages MLwiN, SPSS, Mplus, SPlus, Stata, HLM, and SAS]
  Ch 1   A framework for investigating change over time
  Ch 2   Exploring longitudinal data on change
  Ch 3   Introducing the multilevel model for change
  Ch 4   Doing data analysis with the multilevel model for change
  Ch 5   Treating time more flexibly
  Ch 6   Modeling discontinuous and nonlinear change
  Ch 7   Examining the multilevel model's error covariance structure
  Ch 8   Modeling change using covariance structure analysis
  Ch 9   A framework for investigating event occurrence
  Ch 10  Describing discrete-time event occurrence data
  Ch 11  Fitting basic discrete-time hazard models
  Ch 12  Extending the discrete-time hazard model
  Ch 13  Describing continuous-time event occurrence data
  Ch 14  Fitting the Cox regression model
  Ch 15  Extending the Cox regression model

Applied Longitudinal Data Analysis website: http://gseacademic.harvard.edu/alda
  materials from past workshops
  videos of past workshops

S-077: Applied Longitudinal Data Analysis
  more fully annotated computer code
  examples of detailed computer output
  course videos

Introducing the Multilevel Model for Change

ALDA, Chapter Three

"When you're finished changing, you're finished"
Benjamin Franklin

John B. Willett & Judith D. Singer
Harvard Graduate School of Education

Chapter 3: Introducing the multilevel model for change
General Approach: We'll go through a worked example from start to finish; we'll save practical data analytic advice for the next session

The level-1 submodel for individual change (§3.2): examining empirical growth trajectories and asking what population model might have given rise to these observations?
The level-2 submodels for systematic interindividual differences in change (§3.3): what kind of population model should we hypothesize to represent the behavior of the parameters from the level-1 model?
Fitting the multilevel model for change to data (§3.4): there are now many options for model fitting, and more practically, many software options.
Interpreting the results of model fitting (§3.5 and §3.6): having fit the model, how do we sensibly interpret and display empirical results?
  Interpreting fixed effects
  Interpreting variance components
  Plotting prototypical trajectories

(ALDA, Chapter 3 intro, p. 45)

Illustrative example: The effects of early intervention on children's IQ
Data source: Peg Burchinal and colleagues (2000) Child Development.

Sample: 103 African American children born to low income families
  58 randomly assigned to an early intervention program
  45 randomly assigned to a control group

Research design
  Each child was assessed 12 times between ages 6 and 96 months
  Here, we analyze only 3 waves of data, collected at ages 12, 18, and 24 months

Research question: What is the effect of the early intervention program on children's cognitive performance?
  Within-individual: How does a child's cognitive performance change between 12 and 24 months?
  Between individuals: Do the trajectories for children in the early intervention program differ from those in the control group? [And, if they do differ, how do they differ?]

(ALDA, Section 3.1, pp. 46-49)

The person-period data set: The fundamental building block of growth modeling
General structure: A person-period data set has one row of data for each period when that particular person was observed

Fully balanced, 3 waves per child
  AGE = 1.0, 1.5, and 2.0 (clocked in years instead of months, so that we assess annual rate of change)
PROGRAM is a dummy variable indicating whether the child was randomly assigned to the special early childhood program (1) or not (0)
COG is a nationally normed scale
  Declines within empirical growth records
  Instead of asking whether the growth rate is higher among program participants, we'll ask whether the rate of decline is lower

(ALDA, Section 3.1, pp. 46-49)
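The person-period layout can be illustrated with a short sketch. The code below is only an assumption about how one might reshape one-row-per-child records into one-row-per-occasion rows; the function name `to_person_period`, the input layout, and the COG scores are hypothetical and are not taken from the study data.

```python
def to_person_period(children, ages_months=(12, 18, 24)):
    """Reshape one-row-per-child records into one row per child per occasion."""
    rows = []
    for child in children:
        for wave, months in enumerate(ages_months, start=1):
            rows.append({
                "ID": child["ID"],
                "WAVE": wave,
                "AGE": months / 12.0,           # clock AGE in years, not months
                "COG": child["COG"][wave - 1],  # score observed on this occasion
                "PROGRAM": child["PROGRAM"],    # time-invariant, repeated on each row
            })
    return rows


# Hypothetical records: the IDs and COG scores are made up for illustration.
children = [
    {"ID": 1, "PROGRAM": 1, "COG": [110, 105, 100]},
    {"ID": 2, "PROGRAM": 0, "COG": [100, 90, 85]},
]

person_period = to_person_period(children)
for row in person_period:
    print(row)
```

Each child contributes three rows, and the time-invariant predictor PROGRAM is simply repeated on every row for that child.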

Examining empirical growth plots to help suggest a suitable individual growth model (by superimposing fitted OLS trajectories)

[Figure: empirical growth plots of COG (50 to 150) against AGE (1 to 2) for eight children (IDs 68, 70, 71, 72, 902, 904, 906, 908), each with a fitted OLS trajectory superimposed]

Overall impression: COG declines over time, but there's some variation in the fit (its quality and shape)
  Many trajectories are smooth and systematic (70, 71, 72, 904, 908)
  Other trajectories are scattered, irregular (and could even be curvilinear?) (68, 902, 906)

Key question when examining empirical growth plots: What type of population individual growth model might have generated these sample data?
  Linear or curvilinear?
  Smooth or jagged?
  Continuous or disjoint?

With just 3 waves of data and many of the empirical growth plots suggesting a linear model would be fine, it makes sense to start with a simple linear individual growth model

(ALDA, Section 3.2, pp. 49-51)

Postulating a simple linear level-1 submodel for individual change: Examining its structural and stochastic portions

$\mathrm{COG}_{ij} = [\pi_{0i} + \pi_{1i}(\mathrm{AGE}_{ij} - 1)] + [\varepsilon_{ij}]$

i indexes persons (i = 1 to 103); j indexes occasions/periods (j = 1 to 3)

Structural portion, which embodies our hypothesis about the shape of each person's true trajectory of change over time
  Key assumption: In the population, $\mathrm{COG}_{ij}$ is a linear function of child i's AGE on occasion j
  $\pi_{0i}$ is the intercept of i's true change trajectory. Because we have centered AGE at 1, $\pi_{0i}$ is i's true value of COG at AGE = 1, his true initial status
  $\pi_{1i}$ is the slope of i's true change trajectory, his yearly rate of change in true COG, his true annual rate of change

Stochastic portion, which allows for the effects of random error from the measurement of person i on occasion j. Usually assume $\varepsilon_{ij} \sim N(0, \sigma_{\varepsilon}^2)$
  $\varepsilon_{i1}$, $\varepsilon_{i2}$, and $\varepsilon_{i3}$ are deviations of i's true change trajectory from linearity on each occasion (including the effects of measurement error & omitted time-varying predictors)

Net result: The individual growth parameters, $\pi_{0i}$ and $\pi_{1i}$, fully describe person i's hypothesized true individual growth trajectory

[Figure: individual i's hypothesized true change trajectory, COG (50 to 150) against AGE (1 to 2), marking the deviations $\varepsilon_{i1}$, $\varepsilon_{i2}$, $\varepsilon_{i3}$ and the change over 1 year]

(ALDA, Section 3.2, pp. 49-51)
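As a rough illustration of the exploratory OLS trajectories used alongside this level-1 submodel, the sketch below fits a straight line to one child's three scores with AGE centered at 1, so the intercept estimates initial status and the slope estimates the annual rate of change. The function name `fit_ols_line` and the scores are hypothetical; this is ordinary least squares for a single child, not the multilevel model itself.

```python
def fit_ols_line(ages, scores, center=1.0):
    """Return (intercept, slope) of the OLS line of scores on (age - center)."""
    x = [a - center for a in ages]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(scores) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, scores))
    slope = sxy / sxx                      # estimated annual rate of change
    intercept = mean_y - slope * mean_x    # estimated COG at AGE = center
    return intercept, slope


# Hypothetical child with exactly linear (made-up) scores at ages 1, 1.5 and 2.
ages = [1.0, 1.5, 2.0]
scores = [110.0, 100.0, 90.0]

intercept, slope = fit_ols_line(ages, scores)
print(f"fitted initial status = {intercept:.2f}, fitted rate of change = {slope:.2f}")
```

Centering AGE at 1 is what makes the intercept interpretable as the fitted value at the first wave rather than at age 0.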

Examining fitted OLS trajectories to help suggest a suitable level-2 model
  Most children decline over time (although there are a few exceptions)
  But there's also great variation in these OLS estimates

[Figure: fitted OLS trajectories for all children, COG (50 to 150) against AGE (1 to 2), with the average OLS trajectory across the full sample, 110 - 10(AGE - 1), superimposed; accompanied by stem-and-leaf displays of fitted initial status, fitted rate of change, and residual variance]

What does this behavior suggest about a suitable level-2 model?
  The level-2 model must capture both the averages of the individual growth parameters and variation about these averages
  And it must also provide a way to represent systematic interindividual differences in change according to variation in predictor(s) (here, PROGRAM participation)

(ALDA, Section 3.2.3, pp. 55-56)

Further developing the level-2 submodel for interindividual differences in change

[Figure: fitted OLS trajectories, COG (50 to 150) against AGE (1 to 2), in separate panels for PROGRAM=0 and PROGRAM=1]

Program participants tend to have:
  Higher scores at age 1 (higher initial status)
  Less steep rates of decline (shallower slopes)
  But these are only overall trends: there's great interindividual heterogeneity

Four desired features of the level-2 submodel(s)
1. Outcomes are the level-1 individual growth parameters $\pi_{0i}$ and $\pi_{1i}$
2. Need two level-2 submodels, one per growth parameter (one for initial status, one for change)
3. Each level-2 submodel must specify the relationship between a level-1 growth parameter and predictor(s), here PROGRAM
   We need to specify a functional form for these relationships at level-2 (beginning with linear but ultimately becoming more flexible)
4. Each level-2 submodel should allow individuals with common predictor values to nevertheless have different individual change trajectories
   We need stochastic variation at level-2, too
   Each level-2 model will need its own error term, and we will need to allow for covariance across level-2 errors

(ALDA, Section 3.3, pp. 57-60)

Level-2 submodels for systematic interindividual differences in change

For the level-1 intercept (initial status):
$\pi_{0i} = \gamma_{00} + \gamma_{01}\,\mathrm{PROGRAM}_i + \zeta_{0i}$

For the level-1 slope (rate of change):
$\pi_{1i} = \gamma_{10} + \gamma_{11}\,\mathrm{PROGRAM}_i + \zeta_{1i}$

Key to remembering subscripts on the gammas (the $\gamma$s)
  First subscript indicates role in level-1 model (0 for intercept; 1 for slope)
  Second subscript indicates role in level-2 model (0 for intercept; 1 for slope)

What about the zetas (the $\zeta$s)?
  They're level-2 residuals that permit the level-1 individual growth parameters to vary stochastically across people
  As with most residuals, we're less interested in their values than their population variances and covariances

(ALDA, Section 3.3.1, pp. 60-61)

Understanding the stochastic components of the level-2 submodels

$\pi_{0i} = \gamma_{00} + \gamma_{01}\,\mathrm{PROGRAM}_i + \zeta_{0i}$
$\pi_{1i} = \gamma_{10} + \gamma_{11}\,\mathrm{PROGRAM}_i + \zeta_{1i}$

Key ideas behind the level-2 models:
  Models posit the existence of an average population trajectory for each program group
  Because the level-2 models also include residuals (the zetas), each child i has his own true change trajectory (defined by $\pi_{0i}$ and $\pi_{1i}$)
  In the figure, the shading is supposed to suggest the existence of many true population trajectories, one per child

[Figure: COG (50 to 150) against AGE (1 to 2) in two panels. PROGRAM=0: the average population trajectory, $\gamma_{00} + \gamma_{10}(\mathrm{AGE} - 1)$, and the population trajectory for child i, $(\gamma_{00} + \zeta_{0i}) + (\gamma_{10} + \zeta_{1i})(\mathrm{AGE} - 1)$. PROGRAM=1: the average population trajectory, $(\gamma_{00} + \gamma_{01}) + (\gamma_{10} + \gamma_{11})(\mathrm{AGE} - 1)$]

Assumptions about the level-2 residuals (first row: initial status; second row: rate of change):
$\begin{bmatrix} \zeta_{0i} \\ \zeta_{1i} \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma_0^2 & \sigma_{01} \\ \sigma_{10} & \sigma_1^2 \end{bmatrix} \right)$

(ALDA, Section 3.3.2, pp. 61-63)
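To make the role of the level-2 residuals concrete, here is a minimal simulation sketch. It draws one pair of residuals from a bivariate normal distribution (via a hand-written 2x2 Cholesky factor) and adds them to the group averages to obtain one child's true intercept and slope. All numeric values, and the function names `draw_level2_residuals` and `true_trajectory`, are hypothetical choices for illustration, not estimates from the data.

```python
import math
import random


def draw_level2_residuals(var0, var1, cov01, rng):
    """Draw (zeta0, zeta1) from a mean-zero bivariate normal distribution."""
    # Cholesky factor of [[var0, cov01], [cov01, var1]]
    l11 = math.sqrt(var0)
    l21 = cov01 / l11 if l11 > 0 else 0.0
    l22 = math.sqrt(max(var1 - l21 ** 2, 0.0))
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return l11 * z1, l21 * z1 + l22 * z2


def true_trajectory(program, gammas, zetas):
    """Return child i's (pi0, pi1) implied by the two level-2 submodels."""
    g00, g01, g10, g11 = gammas
    zeta0, zeta1 = zetas
    pi0 = g00 + g01 * program + zeta0   # true initial status
    pi1 = g10 + g11 * program + zeta1   # true annual rate of change
    return pi0, pi1


rng = random.Random(0)
gammas = (100.0, 5.0, -20.0, 4.0)  # hypothetical fixed effects
for program in (0, 1):
    zetas = draw_level2_residuals(var0=100.0, var1=9.0, cov01=-15.0, rng=rng)
    print(program, true_trajectory(program, gammas, zetas))
```

Setting both residuals to zero recovers the average population trajectory for the chosen program group, which is one way to see what the gammas represent.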

Fitting the multilevel model for change to data
Three general types of software options (whose numbers are increasing over time)

Programs expressly designed for multilevel modeling
MLwiN

Multipurpose packages with multilevel modeling modules

Specialty packages originally designed for another purpose that can also fit some multilevel models

aML

Two sets of issues to consider when comparing (and selecting) packages

8 practical considerations (that affect ease of use/pedagogic value)
  Data input options: level-1/level-2 vs. person-period; raw data or xyz.dataset
  Programming options: graphical interfaces and/or scripts
  Availability of other statistical procedures
  Model specification options: level-1/level-2 vs. composite; random effects
  Automatic centering options
  Wisdom of program's defaults
  Documentation & user support
  Quality of output: text & graphics

8 technical considerations (that affect research value)
  # of levels that can be handled
  Range of assumptions supported (for the outcomes & effects)
  Types of designs supported (e.g., cross-nested designs; latent variables)
  Estimation routines: full vs. restricted; ML vs. GLS (more on this later)
  Ability to handle design weights
  Quality and range of diagnostics
  Speed
  Strategies for handling estimation problems (e.g., boundary constraints)

Advice: Use whatever package you'd like but be sure to invest the time and energy to learn to use it well. Visit http://www.ats.ucla.edu/stat/examples/alda for data, code in the major packages, and more

Examining estimated fixed effects

Fitted model for initial status:
$\hat{\pi}_{0i} = 107.84 + 6.85\,\mathrm{PROGRAM}_i$

Fitted model for rate of change:
$\hat{\pi}_{1i} = -21.13 + 5.27\,\mathrm{PROGRAM}_i$

In the population from which this sample was drawn we estimate that:
  True initial status (COG at age 1) for the average non-participant is 107.84
  For the average participant, it is 6.85 higher
  True annual rate of change for the average non-participant is -21.13
  For the average participant, it is 5.27 higher

Advice: As you're learning these methods, take the time to actually write out the fitted level-1/level-2 models before interpreting computer output. It's the best way to learn what you're doing!

(ALDA, Section 3.5, pp. 68-71)

Plotting prototypical change trajectories

General idea: Substitute prototypical values for the level-2 predictors (here, just PROGRAM=0 or 1) into the fitted models

$\hat{\pi}_{0i} = 107.84 + 6.85\,\mathrm{PROGRAM}_i$
$\hat{\pi}_{1i} = -21.13 + 5.27\,\mathrm{PROGRAM}_i$

PROGRAM = 1:
$\hat{\pi}_{0i} = 107.84 + 6.85(1) = 114.69$
$\hat{\pi}_{1i} = -21.13 + 5.27(1) = -15.86$
$\widehat{\mathrm{COG}} = 114.69 - 15.86(\mathrm{AGE} - 1)$

PROGRAM = 0:
$\hat{\pi}_{0i} = 107.84 + 6.85(0) = 107.84$
$\hat{\pi}_{1i} = -21.13 + 5.27(0) = -21.13$
$\widehat{\mathrm{COG}} = 107.84 - 21.13(\mathrm{AGE} - 1)$

[Figure: the two prototypical trajectories, COG (50 to 150) against AGE (1 to 2), one for PROGRAM = 1 and one for PROGRAM = 0]

Tentative conclusion: Program participants appear to have higher initial status and slower rates of decline.
Question: Might these differences be due to nothing more than sampling variation?

(ALDA, Section 3.5.1, pp. 69-71)
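The substitution behind the prototypical trajectories can be sketched in a few lines. The code below plugs PROGRAM = 0 or 1 into the fitted level-2 models and evaluates the resulting line at the three ages; the coefficients are the estimates reported on the slides (with the rate of change for non-participants read as a decline, that is, negative), while the function names are hypothetical.

```python
def prototypical_trajectory(program):
    """Return (intercept, slope) of the fitted trajectory for a PROGRAM value."""
    intercept = 107.84 + 6.85 * program   # fitted initial status at AGE = 1
    slope = -21.13 + 5.27 * program       # fitted annual rate of change (a decline)
    return intercept, slope


def predicted_cog(program, age):
    """Fitted COG at a given AGE (in years), with AGE centered at 1."""
    intercept, slope = prototypical_trajectory(program)
    return intercept + slope * (age - 1.0)


for program in (0, 1):
    intercept, slope = prototypical_trajectory(program)
    print(f"PROGRAM={program}: intercept={intercept:.2f}, slope={slope:.2f}")
    for age in (1.0, 1.5, 2.0):
        print(f"  AGE={age}: fitted COG = {predicted_cog(program, age):.2f}")
```

Evaluating each line at only a few ages is enough here, because a linear trajectory is fully determined by its intercept and slope.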

Testing hypotheses about fixed effects using single parameter tests

General formulation:
$z = \dfrac{\hat{\gamma}}{\mathrm{ase}(\hat{\gamma})}$

For initial status:
  Average non-participant had a non-zero level of COG at age 1 (surprise!)
  Program participants had higher initial status, on average, than non-participants (probably because the intervention had already started)

For rate of change:
  Average non-participant had a non-zero rate of decline (depressing)
  Program participants had slower rates of decline, on average, than non-participants (the program effect).

Careful: Most programs provide appropriate tests but different programs use different terminology
  Terms like z-statistic, t-statistic, t-ratio, quasi-t-statistic (which are not the same) are used interchangeably

(ALDA, Section 3.5.2, pp. 71-72)
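A single parameter test of this form is simple to compute by hand. The sketch below divides an estimate by its asymptotic standard error and converts the result to a two-sided p-value under a standard normal reference distribution. The estimate 5.27 is the program effect on rate of change from the fitted model; the standard error of 2.5 is a hypothetical placeholder, since the slides shown here do not report it, and the function names are illustrative.

```python
import math


def z_statistic(estimate, ase):
    """Ratio of a parameter estimate to its asymptotic standard error."""
    return estimate / ase


def two_sided_p_value(z):
    """Two-sided p-value assuming a standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


gamma_11 = 5.27      # estimated program effect on rate of change
ase_gamma_11 = 2.5   # hypothetical standard error, for illustration only

z = z_statistic(gamma_11, ase_gamma_11)
p = two_sided_p_value(z)
print(f"z = {z:.3f}, two-sided p = {p:.4f}")
```

Software that reports a t-ratio instead will use a different reference distribution, which is one reason the terminology warning above matters.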

Examining estimated variance components

General idea:
  Variance components quantify the amount of residual variation left, at either level-1 or level-2, that is potentially explainable by other predictors not yet in the model.
  Interpretation is easiest when comparing different models that each have different predictors (which we will do in the next unit).

Level-1 residual variance (74.24***):
  Summarizes within-person variability in outcomes around individuals' own trajectories (usually non-zero)
  Here, we conclude there is some within-person residual variability
  If we had time-varying predictors, they might be able to explain some of this within-person residual variability

Level-2 residual variance:
$\begin{bmatrix} 124.64^{***} & -36.41 \\ -36.41 & 12.29 \end{bmatrix}$
  Summarizes between-person variability in change trajectories (here, initial status and growth rates) after controlling for predictor(s) (here, PROGRAM)
  There are still statistically significant differences in true initial status after controlling for program (124.64***)
  There is no statistically significant residual variance in rates of change to be explained, so it's probably little use to add substantive predictors of change
  The residual covariance between initial status and rates of change is not statistically significant

(ALDA, Section 3.6, pp. 72-74)

Doing data analysis with the multilevel model for change


ALDA, Chapter Four
"We are restless because of incessant change, but we would be frightened if change were stopped"
Lyman Bryson

Judith D. Singer & John B. Willett
Harvard Graduate School of Education

Chapter 4: Doing data analysis with the multilevel model for change
General Approach: Once again, we'll go through a worked example, but now we'll delve into the practical data analytic details

Composite specification of the multilevel model for change (§4.2) and how it relates to the level-1/level-2 specification just introduced
First steps: unconditional means model and unconditional growth model (§4.4)
  Intraclass correlation
  Quantifying proportion of outcome variation explained
Practical model building strategies (§4.5)
  Developing and fitting a taxonomy of models
  Displaying prototypical change trajectories
  Recentering to improve interpretation
Comparing models (§4.6)
  Using deviance statistics
  Using information criteria (AIC and BIC)

Illustrative example: The effects of parental alcoholism on adolescent alcohol use
Data source: Pat Curran and colleagues (1997), Journal of Consulting and Clinical Psychology.

Sample: 82 adolescents
- 37 are children of an alcoholic parent (COAs)
- 45 are non-COAs

Research design
- Each was assessed 3 times, at ages 14, 15, and 16
- The outcome, ALCUSE, was computed as follows:
  - 4 items: (1) drank beer/wine; (2) hard liquor; (3) 5 or more drinks in a row; and (4) got drunk
  - Each item was scored on an 8-point scale (0 = not at all to 7 = every day)
  - ALCUSE is the square root of the sum of these 4 items
- At age 14, PEER, a measure of peer alcohol use, was also gathered

Research question
Do trajectories of adolescent alcohol use differ by (1) parental alcoholism and (2) peer alcohol use?
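The scoring rule for ALCUSE can be sketched in a few lines of Python. This is only an illustration of the rule stated on the slide (square root of the sum of four items, each scored 0 to 7); the function name `alcuse` and the item responses are hypothetical and are not taken from the study data.

```python
import math

def alcuse(item_scores):
    """Square root of the summed item scores (4 items, each scored 0-7)."""
    if len(item_scores) != 4:
        raise ValueError("expected exactly 4 item scores")
    if any(not 0 <= score <= 7 for score in item_scores):
        raise ValueError("each item must be scored from 0 to 7")
    return math.sqrt(sum(item_scores))

# Hypothetical responses to the four items (not real study data).
example_items = [3, 3, 3, 2]
example_score = alcuse(example_items)
print(round(example_score, 2))
```

Because of the square root, the outcome ranges from 0 (all items 0) to the square root of 28 (all items 7).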

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 4, slide 3

What's an appropriate functional form for the level-1 submodel?
(Examining empirical growth plots with superimposed OLS trajectories)

3 features of these plots:
1. Most seem approximately linear (but not always increasing over time)
2. Some OLS trajectories fit well (23, 32, 56, 65)
3. Other OLS trajectories show more scatter (04, 14, 41, 82)

A linear model makes sense:

$$ALCUSE_{ij} = \pi_{0i} + \pi_{1i}(AGE_{ij} - 14) + \varepsilon_{ij}, \quad \text{where } \varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$$

In general form:

$$Y_{ij} = \pi_{0i} + \pi_{1i}\,TIME_{ij} + \varepsilon_{ij}$$

- $\pi_{0i}$ is i's true initial status (i.e., when TIME = 0)
- $\pi_{1i}$ is i's true rate of change per unit of TIME
- $\varepsilon_{ij}$ is the portion of i's outcome that is unexplained on occasion j

(ALDA, Section 4.1, pp. 76-80)
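As a rough illustration of the exploratory OLS trajectories mentioned here, the sketch below fits a straight line to one adolescent's three waves. The helper name `ols_line` is hypothetical; the three ALCUSE values are those listed for ID 3 in the person-period data set shown later in this handout, with TIME = AGE - 14.

```python
def ols_line(times, values):
    """Ordinary least squares intercept and slope for one person's waves."""
    n = len(times)
    t_bar = sum(times) / n
    y_bar = sum(values) / n
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, values))
    sxx = sum((t - t_bar) ** 2 for t in times)
    slope = sxy / sxx
    intercept = y_bar - slope * t_bar
    return intercept, slope

# ID 3: ALCUSE at TIME = 0, 1, 2 (ages 14, 15, 16).
intercept_3, slope_3 = ols_line([0, 1, 2], [1.00, 2.00, 3.32])
print(round(intercept_3, 3), round(slope_3, 3))
```

The intercept estimates this person's initial status at age 14 and the slope estimates the annual rate of change; these person-by-person fits are exploratory only and are not the multilevel model estimates.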

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 4, slide 4

Specifying the level-2 submodels for individual differences in change
Examining variation in OLS-fitted level-1 trajectories by:
- COA: COAs have higher intercepts but no steeper slopes
- PEER (split at mean): Teens whose friends at age 14 drink more have higher intercepts but shallower slopes

[Figure: OLS-fitted trajectories of ALCUSE versus AGE (13 to 17), in four panels: COA = 0, COA = 1, Low PEER, High PEER]

$$\pi_{0i} = \gamma_{00} + \gamma_{01}COA_i + \zeta_{0i} \quad \text{(for initial status)}$$
$$\pi_{1i} = \gamma_{10} + \gamma_{11}COA_i + \zeta_{1i} \quad \text{(for rate of change)}$$

- Level-2 intercepts ($\gamma_{00}$, $\gamma_{10}$): population average initial status and rate of change for a non-COA
- Level-2 slopes ($\gamma_{01}$, $\gamma_{11}$): effect of COA on initial status and rate of change
- Level-2 residuals ($\zeta_{0i}$, $\zeta_{1i}$): deviations of individual change trajectories around predicted averages

$$\begin{bmatrix}\zeta_{0i}\\ \zeta_{1i}\end{bmatrix} \sim N\left(\begin{bmatrix}0\\0\end{bmatrix}, \begin{bmatrix}\sigma_0^2 & \sigma_{01}\\ \sigma_{10} & \sigma_1^2\end{bmatrix}\right)$$

(ALDA, Section 4.1, pp. 76-80)

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 4, slide 5

Developing the composite specification of the multilevel model for change
by substituting the level-2 submodels into the level-1 individual growth model

Level-2 submodels:
$$\pi_{0i} = \gamma_{00} + \gamma_{01}COA_i + \zeta_{0i}, \qquad \pi_{1i} = \gamma_{10} + \gamma_{11}COA_i + \zeta_{1i}$$

Level-1 model:
$$Y_{ij} = \pi_{0i} + \pi_{1i}TIME_{ij} + \varepsilon_{ij}$$

Substituting:
$$Y_{ij} = (\gamma_{00} + \gamma_{01}COA_i + \zeta_{0i}) + (\gamma_{10} + \gamma_{11}COA_i + \zeta_{1i})TIME_{ij} + \varepsilon_{ij}$$

Rearranging:
$$Y_{ij} = [\gamma_{00} + \gamma_{10}TIME_{ij} + \gamma_{01}COA_i + \gamma_{11}(COA_i \times TIME_{ij})] + [\zeta_{0i} + \zeta_{1i}TIME_{ij} + \varepsilon_{ij}]$$

The composite specification shows how the outcome depends simultaneously on:
- the level-1 predictor TIME and the level-2 predictor COA, as well as
- the cross-level interaction, COA × TIME. This tells us that the effect of one predictor (TIME) differs by the levels of another predictor (COA)

The composite specification also:
- Demonstrates the complexity of the composite residual: this is not regular OLS regression
- Is the specification used by most software packages for multilevel modeling
- Is the specification that maps most easily onto the person-period data set

(ALDA, Section 4.2, pp. 80-83)
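One way to convince yourself that the level-1/level-2 and composite specifications are the same model is to evaluate both for arbitrary inputs. The sketch below does this numerically; the function names and all parameter, residual, and predictor values are hypothetical and chosen only for the check.

```python
def two_level(g00, g01, g10, g11, coa, time, z0, z1, eps):
    """Level-2 submodels plugged into the level-1 growth model."""
    pi0 = g00 + g01 * coa + z0          # individual initial status
    pi1 = g10 + g11 * coa + z1          # individual rate of change
    return pi0 + pi1 * time + eps

def composite(g00, g01, g10, g11, coa, time, z0, z1, eps):
    """Composite form: structural part plus composite residual."""
    structural = g00 + g10 * time + g01 * coa + g11 * (coa * time)
    residual = z0 + z1 * time + eps
    return structural + residual

# Hypothetical values, used only to compare the two forms.
args = dict(g00=0.3, g01=0.7, g10=0.3, g11=-0.05,
            coa=1, time=2, z0=0.12, z1=-0.04, eps=0.25)
y_two_level = two_level(**args)
y_composite = composite(**args)
print(y_two_level, y_composite)
```

Note that the composite residual contains TIME, which is why its variance changes over occasions and why the model is not ordinary OLS regression.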

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 4, slide 6

The person-period data set and its relationship to the composite specification

ID    ALCUSE   AGE-14   COA   COA*(AGE-14)
 3     1.00      0       1        0
 3     2.00      1       1        1
 3     3.32      2       1        2
 4     0.00      0       1        0
 4     2.00      1       1        1
 4     1.73      2       1        2
44     0.00      0       0        0
44     1.41      1       0        0
44     3.00      2       0        0
66     1.41      0       0        0
66     3.46      1       0        0
66     3.00      2       0        0

$$ALCUSE_{ij} = [\gamma_{00} + \gamma_{10}(AGE_{ij} - 14) + \gamma_{01}COA_i + \gamma_{11}(COA_i \times (AGE_{ij} - 14))] + [\zeta_{0i} + \zeta_{1i}(AGE_{ij} - 14) + \varepsilon_{ij}]$$
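The columns of the person-period data set can be derived mechanically from the raw measurements. The following is a minimal sketch using plain Python; the tuple layout and the key names (`time`, `coa_x_time`) are illustrative assumptions, while the values are the twelve records shown on this slide (ages 14, 15, 16 per the research design).

```python
# (id, age, alcuse, coa) for the four adolescents shown on the slide.
raw = [
    (3, 14, 1.00, 1), (3, 15, 2.00, 1), (3, 16, 3.32, 1),
    (4, 14, 0.00, 1), (4, 15, 2.00, 1), (4, 16, 1.73, 1),
    (44, 14, 0.00, 0), (44, 15, 1.41, 0), (44, 16, 3.00, 0),
    (66, 14, 1.41, 0), (66, 15, 3.46, 0), (66, 16, 3.00, 0),
]

person_period = []
for person_id, age, alcuse, coa in raw:
    time = age - 14                      # recenter so TIME = 0 at age 14
    person_period.append({
        "id": person_id,
        "alcuse": alcuse,
        "time": time,
        "coa": coa,
        "coa_x_time": coa * time,        # cross-level interaction column
    })

for row in person_period:
    print(row)
```

Each row is one person on one occasion, so every predictor in the structural part of the composite model corresponds to one column.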

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 4, slide 7

Words of advice before beginning data analysis
- Be sure you've examined empirical growth plots and fitted OLS trajectories. You don't want to begin data analysis without being reasonably confident that you have a sound level-1 model.
- Double check (and then triple check) your person-period data set.
  - Run simple diagnostics using statistical programs with which you're very comfortable
  - Once again, you don't want to invest too much data analytic effort in a mis-formed data set
- Don't jump in by fitting a range of models with substantive predictors. Yes, you want to know the answer, but first you need to understand how the data behave, so instead you should take the first steps below.

First steps: Two unconditional models
1. Unconditional means model: a model with no predictors at either level, which will help partition the total outcome variation
2. Unconditional growth model: a model with TIME as the only level-1 predictor and no substantive predictors at level 2, which will help evaluate the baseline amount of change

What these unconditional models tell us:
1. Whether there is systematic variation in the outcome worth exploring and, if so, where that variation lies (within or between people)
2. How much total variation there is both within and between persons, which provides a baseline for evaluating the success of subsequent model building (that includes substantive predictors)

(ALDA, Section 4.4, p. 92+)

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 4, slide 8

The Unconditional Means Model (Model A)
Partitioning total outcome variation between and within persons

Level-1 model: $Y_{ij} = \pi_{0i} + \varepsilon_{ij}$, where $\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$
Level-2 model: $\pi_{0i} = \gamma_{00} + \zeta_{0i}$, where $\zeta_{0i} \sim N(0, \sigma_0^2)$
Composite model: $Y_{ij} = \gamma_{00} + \zeta_{0i} + \varepsilon_{ij}$

- $\gamma_{00}$: grand mean across individuals and occasions
- $\pi_{0i}$: person-specific means
- $\varepsilon_{ij}$: within-person deviations
- $\sigma_\varepsilon^2$: within-person variance
- $\sigma_0^2$: between-person variance

Let's look more closely at these variances.

(ALDA, Section 4.4.1, pp. 92-97)
Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 4, slide 9

Using the unconditional means model to estimate the Intraclass Correlation Coefficient (ICC or $\rho$)
Major purpose of the unconditional means model: To partition the variation in Y into two components
- Estimated within-person variance ($\hat\sigma_\varepsilon^2$): Quantifies the amount of variation within individuals over time
- Estimated between-person variance ($\hat\sigma_0^2$): Quantifies the amount of variation between individuals, regardless of time

The intraclass correlation compares the relative magnitude of these variance components (VCs) by estimating the proportion of total variation in Y that lies between people:

$$\hat\rho = \frac{\hat\sigma_0^2}{\hat\sigma_0^2 + \hat\sigma_\varepsilon^2} = \frac{0.564}{0.564 + 0.562} = 0.50$$

An estimated 50% of the total variation in alcohol use is attributable to differences between adolescents.

Having partitioned the total variation into within-persons and between-persons, let's ask: What role does TIME play?
(ALDA, Section 4.4.1, pp. 92-97)
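The intraclass correlation arithmetic can be reproduced directly from the two variance components reported on this slide (0.564 between persons, 0.562 within persons). The function name below is a hypothetical helper, not part of any package.

```python
def intraclass_correlation(between_var, within_var):
    """Share of total outcome variation that lies between persons."""
    return between_var / (between_var + within_var)

# Variance components from the unconditional means model (Model A).
icc = intraclass_correlation(between_var=0.564, within_var=0.562)
print(round(icc, 2))
```

Under the unconditional means model, this same quantity can also be read as the average correlation between any pair of composite residuals for the same person.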
Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 4, slide 10

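The Unconditional Growth Model (Model B)
A baseline model for change over time

Level-1 model: $Y_{ij} = \pi_{0i} + \pi_{1i}TIME_{ij} + \varepsilon_{ij}$, where $\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$

Level-2 model:
$$\pi_{0i} = \gamma_{00} + \zeta_{0i}, \qquad \pi_{1i} = \gamma_{10} + \zeta_{1i}, \qquad \text{where } \begin{bmatrix}\zeta_{0i}\\ \zeta_{1i}\end{bmatrix} \sim N\left(\begin{bmatrix}0\\0\end{bmatrix}, \begin{bmatrix}\sigma_0^2 & \sigma_{01}\\ \sigma_{10} & \sigma_1^2\end{bmatrix}\right)$$

Composite model (the bracketed term is the composite residual):
$$Y_{ij} = \gamma_{00} + \gamma_{10}TIME_{ij} + [\zeta_{0i} + \zeta_{1i}TIME_{ij} + \varepsilon_{ij}]$$

- $\gamma_{00}$: average true initial status at AGE 14
- $\gamma_{10}$: average true rate of change

Fitted average trajectory (plotted as ALCUSE versus AGE):
$$\widehat{ALCUSE} = 0.651 + 0.271(AGE - 14)$$

What about the variance components from this unconditional growth model?
(ALDA, Section 4.4.2, pp. 97-102)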
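To see what the fitted average trajectory implies at each wave, the sketch below evaluates the fitted equation from this slide (intercept 0.651, slope 0.271, TIME = AGE - 14) at ages 14, 15, and 16. The function name is a hypothetical helper.

```python
def fitted_alcuse(age, intercept=0.651, slope=0.271):
    """Average fitted ALCUSE from the unconditional growth model (Model B)."""
    return intercept + slope * (age - 14)

fitted_by_age = {age: fitted_alcuse(age) for age in (14, 15, 16)}
for age, value in fitted_by_age.items():
    print(age, round(value, 3))
```

These fitted values are on the square-root scale of ALCUSE, so they should not be read as raw item sums.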
Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 4, slide 11

The unconditional growth model: Interpreting the variance components

Level-1 (within-person): There is still unexplained within-person residual variance.

Level-2 (between-persons):
- There is between-person residual variance in initial status (but careful, because the definition of initial status has changed)
- There is between-person residual variance in rate of change (should consider adding a level-2 predictor)
- The estimated residual covariance between initial status and rate of change is not statistically significant

So, what has been the effect of moving from an unconditional means model to an unconditional growth model?
(ALDA, Section 4.4.2, pp 97-102)
Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 4, slide 12

Quantifying the proportion of outcome variation explained

$$R_\varepsilon^2 = \text{proportional reduction in the level-1 variance component} = \frac{0.562 - 0.337}{0.562} = 0.40$$

40% of the within-person variation in ALCUSE is associated with linear time.

$$R_{Y,\hat{Y}}^2 = (r_{Y,\hat{Y}})^2 = (0.21)^2 = 0.043$$

4.3% of the total variation in ALCUSE is associated with linear time.

For later: Extending the idea of proportional reduction in variance components to level 2 (to estimate the percentage of between-person variation in ALCUSE associated with predictors):

$$\text{Pseudo-}R_\zeta^2 = \frac{\hat\sigma_\zeta^2(\text{unconditional growth model}) - \hat\sigma_\zeta^2(\text{later growth model})}{\hat\sigma_\zeta^2(\text{unconditional growth model})}$$

Careful: Don't do this comparison with the unconditional means model (as you can see in this table!).
(ALDA, Section 4.4.3, pp. 102-104)
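The proportional-reduction idea is the same calculation at either level: compare a variance component in a baseline model with the same component in a later model. A minimal sketch, using the level-1 variance components quoted on this slide (0.562 in the unconditional means model, 0.337 in the unconditional growth model); the function name is a hypothetical helper.

```python
def proportional_reduction(baseline_var, later_var):
    """Pseudo-R-squared: share of a baseline variance component removed."""
    return (baseline_var - later_var) / baseline_var

# Level-1 residual variance: Model A (means) versus Model B (growth).
r2_within = proportional_reduction(baseline_var=0.562, later_var=0.337)
print(round(r2_within, 2))
```

For the level-2 version, the baseline should be the corresponding variance component from the unconditional growth model, as the slide cautions. Because later variance components can occasionally be larger than the baseline, this statistic can come out negative.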
Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 4, slide 13

Where we've been and where we're going
What these unconditional models tell us:
1. About half the total variation in ALCUSE is attributable to differences among teens
2. About 40% of the within-teen variation in ALCUSE is explained by linear TIME
3. There is significant variation in both initial status and rate of change, so it pays to explore substantive predictors (COA & PEER)

How do we build statistical models?
- Use all the intuition and skill you bring from the cross-sectional world
  - Examine the effect of each predictor separately
  - Prioritize the predictors: focus on your question predictors, and include interesting and important control predictors
  - Progress towards a final model whose interpretation addresses your research questions
- But because the data are longitudinal, we have some other options
  - Multiple level-2 outcomes (the individual growth parameters): each can be related separately to predictors
  - Two kinds of effects being modeled: fixed effects and variance components
  - Not all effects are required in every model

(ALDA, Section 4.5.1, pp. 105-106)

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 4, slide 14

What will our analytic strategy be?
Because our research interest focuses on the effect of COA, essentially treating PEER as a control, we're going to proceed as follows:

Model C: COA predicts both initial status and rate of change.

Model D: Adds PEER to both level-2 submodels in Model C.

Model E: Simplifies Model D by removing the non-significant effect of COA on change.

(ALDA, Section 4.5.1, pp. 105-106)

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 4, slide 15

Model C: Assessing the uncontrolled effects of COA (the question predictor)
Fixed effects
- Est. initial value of ALCUSE for non-COAs is 0.316 (p<.001)
- Est. differential in initial ALCUSE between COAs and non-COAs is 0.743 (p<.001)
- Est. annual rate of change in ALCUSE for non-COAs is 0.293 (p<.001)
- Est. differential in annual rate of change between COAs and non-COAs is -0.049 (ns)

Variance components
- Within-person VC is identical to B's because no level-1 predictors were added
- Initial status VC declines from B: COA explains 22% of the variation in initial status (but it is still statistically significant, suggesting the need for further level-2 predictors)
- Rate of change VC is unchanged from B: COA explains no variation in change (but it is also still significant, suggesting the need for further level-2 predictors)

Next step?
- Remove COA? Not yet: it is the question predictor
- Add PEER? Yes, to examine the controlled effects of COA
(ALDA, Section 4.5.2, pp. 107-108)
Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 4, slide 16

Model D: Assessing the controlled effects of COA (the question predictor)
Fixed effects of COA
- Est. diff in ALCUSE between COAs and non-COAs, controlling for PEER, is 0.579 (p<.001)
- No sig. difference in rate of change

Fixed effects of PEER
- Teens whose peers drink more at 14 also drink more at 14 (initial status)
- Modest negative effect on rate of change (p<.10)

Variance components
- Within-person VC unchanged (as expected)
- Still sig. variation in both initial status and change: need other level-2 predictors
- Taken together, PEER and COA explain 61.4% of the variation in initial status and 7.9% of the variation in rates of change

Next step?
- If we had other predictors, we'd add them because the VCs are still significant
- Simplify the model? Since COA is not associated with rate of change, why not remove this term from the model?

(ALDA, Section 4.5.2, pp. 108-109)

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 4, slide 17

Model E: Removing the non-significant effect of COA on rate of change
Fixed effects of COA
- Controlling for PEER, the estimated diff in ALCUSE between COAs and non-COAs is 0.571 (p<.001)

Fixed effects of PEER
- Controlling for COA, for each 1-pt difference in PEER, initial ALCUSE is 0.695 higher (p<.001) but rate of change in ALCUSE is 0.151 lower (p<.10)

Variance components
- Variance components are unchanged, suggesting little is lost by eliminating the main effect of COA on rate of change (although there is still level-2 variance left to be predicted by other variables)
- Partial covariance is indistinguishable from 0: after controlling for PEER and COA, initial status and rate of change are unrelated

(ALDA, Section 4.5.2, pp. 109-110)

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 4, slide 18

Where we've been and where we're going

Let's call Model E our tentative final model (based on not just these results but many other analyses not shown here):
- Controlling for the effects of PEER, the estimated differential in ALCUSE between COAs and non-COAs is 0.571 (p<.001)
- Controlling for the effects of COA, for each 1-pt difference in PEER, the average initial ALCUSE is 0.695 higher (p<.001) and the average rate of change is 0.151 lower (p<.10)

Where we're going:
- Displaying prototypical trajectories
- Recentering predictors to improve interpretation
- Alternative strategies for hypothesis testing: comparing models using deviance statistics and information criteria
- Additional comments about estimation

(ALDA, Section 4.5.1, pp. 105-106)

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 4, slide 19

Displaying analytic results: Constructing prototypical fitted plots
Key idea: Substitute prototypical values for the predictors into the fitted models to yield prototypical fitted growth trajectories

Review of the basic approach (with one dichotomous predictor)

Model C:
$$\hat\pi_{0i} = 0.316 + 0.743\,COA_i, \qquad \hat\pi_{1i} = 0.293 - 0.049\,COA_i$$

1. Substitute observed values for COA (0 and 1):

When $COA_i = 0$: $\hat\pi_{0i} = 0.316 + 0.743(0) = 0.316$ and $\hat\pi_{1i} = 0.293 - 0.049(0) = 0.293$
When $COA_i = 1$: $\hat\pi_{0i} = 0.316 + 0.743(1) = 1.059$ and $\hat\pi_{1i} = 0.293 - 0.049(1) = 0.244$

2. Substitute the estimated growth parameters into the level-1 growth model:

When $COA_i = 0$: $\hat{Y}_{ij} = 0.316 + 0.293\,TIME_{ij}$
When $COA_i = 1$: $\hat{Y}_{ij} = 1.059 + 0.244\,TIME_{ij}$

[Figure: prototypical fitted trajectories of ALCUSE versus AGE (13 to 17) for COA = 0 and COA = 1]

What happens when the predictors aren't all dichotomous?

(ALDA, Section 4.5.3, pp. 110-113)
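The two substitution steps for Model C can be sketched as a small function that returns the prototypical intercept and slope for each value of COA, and then evaluates the fitted line at each age. The coefficients are the Model C estimates from this slide (the COA effect on rate of change is taken as -0.049, consistent with the slide's result of 0.244 for COAs); the function names are hypothetical.

```python
def model_c_growth_parameters(coa):
    """Fitted initial status and rate of change for a given COA value."""
    intercept = 0.316 + 0.743 * coa
    slope = 0.293 - 0.049 * coa
    return intercept, slope

def model_c_trajectory(coa, ages=(14, 15, 16)):
    """Fitted ALCUSE at each age, with TIME = AGE - 14."""
    intercept, slope = model_c_growth_parameters(coa)
    return [intercept + slope * (age - 14) for age in ages]

non_coa_params = model_c_growth_parameters(0)
coa_params = model_c_growth_parameters(1)
coa_path = model_c_trajectory(1)
print(non_coa_params, coa_params)
print([round(v, 3) for v in coa_path])
```

Plotting the two resulting lines against age gives the prototypical display described on the slide.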

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 4, slide 20

Constructing prototypical fitted plots when some predictors are continuous
Key idea: Select interesting values of continuous predictors and plot prototypical trajectories by selecting:
1. Substantively interesting values. This is easiest when the predictor has inherently appealing values (e.g., 8, 12, and 16 years of education in the US)
2. A range of percentiles. When there are no well-known values, consider using a range of percentiles (either the 25th, 50th, and 75th or the 10th, 50th, and 90th)
3. The sample mean ± 0.5 (or 1) standard deviation. Best used with predictors with a symmetric distribution
4. The sample mean (on its own). If you don't want to display a predictor's effect but just control for it, use just its sample mean

Remember that exposition can be easier if you select whole number values (if the scale permits) or easily communicated fractions.

PEER: mean = 1.018, sd = 0.726
Low PEER: 1.018 - 0.5(0.726) = 0.655
High PEER: 1.018 + 0.5(0.726) = 1.381

Model E:
$$\hat\pi_{0i} = -0.314 + 0.695\,PEER_i + 0.571\,COA_i, \qquad \hat\pi_{1i} = 0.425 - 0.151\,PEER_i$$

[Figure: prototypical fitted trajectories of ALCUSE versus AGE (13 to 17) for COA = 0 and COA = 1, each at low and high PEER, with the intercepts and slopes used for plotting]

(ALDA, Section 4.5.3, pp. 110-113)
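The intercepts and slopes used for plotting can be recomputed from the Model E equations and the low and high PEER values on this slide. This sketch assumes the Model E intercept is -0.314 and the PEER effect on rate of change is -0.151 (the minus signs appear to have been lost in the extracted slide text, but the negative PEER effect matches the "0.151 lower" wording elsewhere in the handout); the function name is hypothetical.

```python
PEER_MEAN, PEER_SD = 1.018, 0.726
LOW_PEER = PEER_MEAN - 0.5 * PEER_SD     # 0.655
HIGH_PEER = PEER_MEAN + 0.5 * PEER_SD    # 1.381

def model_e_growth_parameters(peer, coa):
    """Fitted initial status and rate of change under Model E (assumed signs)."""
    intercept = -0.314 + 0.695 * peer + 0.571 * coa
    slope = 0.425 - 0.151 * peer
    return intercept, slope

prototypes = {}
for coa in (0, 1):
    for label, peer in (("low", LOW_PEER), ("high", HIGH_PEER)):
        prototypes[(coa, label)] = model_e_growth_parameters(peer, coa)

for key, (intercept, slope) in prototypes.items():
    print(key, round(intercept, 3), round(slope, 3))
```

Because COA does not appear in the rate-of-change equation of Model E, the COA and non-COA lines at the same PEER level are parallel and differ only in elevation (by 0.571).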

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 4, slide 21

How can centering predictors improve the interpretation of their effects?
At level-1, re-centering TIME is usually beneficial
- Ensures that the individual intercepts are easily interpretable, corresponding to status at a specific age
- Often use initial status, but as we'll see, we can center TIME on any sensible value

At level-2, you can re-center by subtracting out:
- The sample mean, which causes the level-2 intercepts to represent average fitted values (mean PEER = 1.018; mean COA = 0.451)
- Another meaningful value, e.g., 12 yrs of ed, IQ of 100

Model F centers only PEER; Model G centers PEER and COA
- Many estimates are unaffected by centering
- As expected, centering the level-2 predictors changes the level-2 intercepts
- F's intercepts describe an average non-COA; G's intercepts describe an average teen

Our preference: Here we prefer Model F because it leaves the dichotomous question predictor COA uncentered

(ALDA, Section 4.5.4, pp. 113-116)
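Why centering changes only the level-2 intercepts can be checked numerically. The sketch below starts from the Model E coefficients quoted earlier in the handout (with the intercept assumed to be -0.314 and the PEER effect on change assumed to be -0.151), centers PEER at its sample mean of 1.018, and confirms that predictions are identical under both parameterizations. All names are hypothetical, and the recentered intercepts are computed algebraically here rather than taken from a fitted Model F.

```python
MEAN_PEER = 1.018

# Model E coefficients with PEER uncentered (signs assumed as noted above).
G00, G_COA, G_PEER_STATUS = -0.314, 0.571, 0.695
G10, G_PEER_CHANGE = 0.425, -0.151

# Centering PEER folds (slope * mean) into each level-2 intercept.
g00_centered = G00 + G_PEER_STATUS * MEAN_PEER
g10_centered = G10 + G_PEER_CHANGE * MEAN_PEER

def predict_uncentered(peer, coa, time):
    pi0 = G00 + G_PEER_STATUS * peer + G_COA * coa
    pi1 = G10 + G_PEER_CHANGE * peer
    return pi0 + pi1 * time

def predict_centered(peer, coa, time):
    cpeer = peer - MEAN_PEER             # centered predictor
    pi0 = g00_centered + G_PEER_STATUS * cpeer + G_COA * coa
    pi1 = g10_centered + G_PEER_CHANGE * cpeer
    return pi0 + pi1 * time

print(round(g00_centered, 3), round(g10_centered, 3))
```

The slopes for PEER and COA are untouched; only the intercepts move, and they now describe a non-COA teen with average PEER.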
Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 4, slide 22

Hypothesis testing: What we've been doing and an alternative approach

Single parameter hypothesis tests
- Simple to conduct and easy to interpret, making them very useful in hands-on data analysis (as we've been doing)
- However, statisticians disagree about their nature, form, and effectiveness
- Disagreement is so strong that some software packages (e.g., MLwiN) won't output them
- Their behavior is poorest for tests on variance components

Deviance based hypothesis tests
- Based on the log likelihood (LL) statistic that is maximized under Maximum Likelihood estimation
- Have superior statistical properties (compared to the single parameter tests)
- Special advantage: permit joint tests on several parameters simultaneously
- You need to do the tests manually because automatic tests are rarely what you want

$$\text{Deviance} = -2[LL_{\text{current model}} - LL_{\text{saturated model}}]$$

- Quantifies how much worse the current model is in comparison to a saturated model
- A model with a small deviance statistic is nearly as good; a model with a large deviance statistic is much worse (we obviously prefer models with smaller deviance)
- Simplification: Because a saturated model fits perfectly, its LL = 0 and the second term drops out, making Deviance = $-2LL_{\text{current}}$

(ALDA, Section 4.6, p. 116)

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 4, slide 23

Hypothesis testing using Deviance statistics
You can use deviance statistics to compare two models if two criteria are satisfied:
1. Both models are fit to the same exact data (beware missing data)
2. One model is nested within the other: we can specify the less complex model (e.g., A) by imposing constraints on one or more parameters in the more complex model (e.g., B), usually, but not always, by setting them to 0

If these conditions hold, then the difference in the two deviance statistics is asymptotically distributed as $\chi^2$, with df = # of independent constraints.

1. We can obtain Model A from Model B by invoking 3 constraints:
$$H_0: \gamma_{10} = 0,\ \sigma_1^2 = 0,\ \sigma_{01} = 0$$
2. Compute the difference in deviance statistics and compare it to the appropriate $\chi^2$ distribution:
$\Delta$Deviance = 33.55 (3 df, p<.001), so reject $H_0$

(ALDA, Section 4.6.1, pp. 116-119)
Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 4, slide 24

Using deviance statistics to test more complex hypotheses
Key idea: Deviance statistics are great for simultaneously evaluating the effects of adding predictors to both level-2 models

1. We can obtain Model B from Model C by invoking 2 constraints:
$$H_0: \gamma_{01} = 0,\ \gamma_{11} = 0$$
2. Compute the difference in deviance statistics and compare it to the appropriate $\chi^2$ distribution:
$\Delta$Deviance = 15.41 (2 df, p<.001), so reject $H_0$

The pooled test does not imply that each level-2 slope is, on its own, statistically significant.

(ALDA, Section 4.6.1, pp. 116-119)
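The p-values attached to the two deviance differences in this handout (33.55 on 3 df and 15.41 on 2 df) can be checked against the chi-square distribution. The sketch below uses only the standard library: `chi2_sf` is a hypothetical helper that evaluates the upper-tail probability for integer degrees of freedom using the closed-form finite sums available in that case. In practice a statistics library would normally be used instead.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for chi-square with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: df = 1 tail via erfc, plus half-integer gamma terms.
    total = math.erfc(math.sqrt(half))
    for i in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * half ** (i - 0.5) / math.gamma(i + 0.5)
    return total

p_a_vs_b = chi2_sf(33.55, 3)   # Model A versus Model B, 3 constraints
p_b_vs_c = chi2_sf(15.41, 2)   # Model B versus Model C, 2 constraints
print(p_a_vs_b < 0.001, p_b_vs_c < 0.001)
```

Both tail probabilities fall below .001, in line with the slides. Note that one of the constraints in the first test sets a variance to zero, which lies on the boundary of the parameter space, so the nominal chi-square reference distribution is only approximate there.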

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 4, slide 25

Comparing non-nested multilevel models using AIC and BIC
You can (supposedly) compare non-nested multilevel models using information criteria.
Information Criteria: AIC and BIC
Each information criterion penalizes the log-likelihood statistic for excesses in the structure of the current model.
The AIC penalty accounts for the number of parameters in the model.
The BIC penalty goes further and also accounts for sample size.

Models need not be nested, but datasets must be the same.

Smaller values of AIC & BIC indicate better fit.
Here's the taxonomy of multilevel models that we ended up fitting in the ALCUSE example. Model E has the lowest AIC and BIC statistics.

Interpreting differences in BIC across models (Raftery, 1995):
0-2: Weak evidence
2-6: Positive evidence
6-10: Strong evidence
>10: Very strong

Careful: Gelman & Rubin (1995) declare these statistics and criteria to be off-target and only by serendipity manage to hit the target.

(ALDA, Section 4.6.4, pp 120-122)
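
To make the two penalties concrete, here is a minimal sketch that assumes the usual deviance-based forms, AIC = Deviance + 2p and BIC = Deviance + ln(n)p, where p is the number of estimated parameters. The deviances, parameter counts, and sample size are hypothetical values chosen for illustration; they are not the ALCUSE results, and the choice of n (people vs. person-period records) is itself an assumption.

```python
import math

def aic(deviance, n_params):
    """Deviance plus 2 per estimated parameter."""
    return deviance + 2 * n_params

def bic(deviance, n_params, sample_size):
    """Deviance plus ln(sample size) per estimated parameter."""
    return deviance + math.log(sample_size) * n_params

# Hypothetical values for illustration only (not taken from the ALCUSE taxonomy)
models = {
    "small": {"deviance": 640.0, "n_params": 6},
    "large": {"deviance": 630.0, "n_params": 10},
}
n = 82  # hypothetical sample size

for name, m in models.items():
    m["aic"] = aic(m["deviance"], m["n_params"])
    m["bic"] = bic(m["deviance"], m["n_params"], n)
    print(name, round(m["aic"], 1), round(m["bic"], 1))

# Smaller is better for both criteria
best_by_aic = min(models, key=lambda k: models[k]["aic"])
best_by_bic = min(models, key=lambda k: models[k]["bic"])
print("AIC prefers:", best_by_aic, "| BIC prefers:", best_by_bic)
```

With these made-up numbers the two criteria disagree, because BIC's heavier penalty outweighs the 10-point drop in Deviance bought by 4 extra parameters.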

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 4, slide 26

A final comment about estimation and hypothesis testing
Two most common methods of estimation:
Maximum likelihood (ML): Seeks those parameter estimates that maximize the likelihood function, which assesses the joint probability of simultaneously observing all the sample data actually obtained (implemented, e.g., in HLM and SAS Proc Mixed).
Generalized Least Squares (GLS) (& Iterative GLS): Iteratively seeks those parameter estimates that minimize the sum of squared residuals (allowing them to be autocorrelated and heteroscedastic) (implemented, e.g., in MLwiN).

A more important distinction: Full vs. Restricted (ML or GLS)
Full: Simultaneously estimate the fixed effects and the variance components. Default in MLwiN & HLM.
Goodness of fit statistics apply to the entire model (both fixed and random effects). This is the method we've used in both the examples shown so far.
Restricted: Sequentially estimate the fixed effects and then the variance components. Default in SAS Proc Mixed.
Goodness of fit statistics apply to only the random effects, so we can only test hypotheses about VCs (and the models being compared must have identical fixed effects).

(ALDA, Section 3.4, pp 63-68; Section 4.3, pp 85-92)

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 4, slide 27

Other topics covered in Chapter Four of ALDA
Using Wald statistics to test composite hypotheses about fixed effects (4.7): a generalization of the parameter estimate divided by its standard error approach that allows you to test composite hypotheses about fixed effects, even if you've used restricted estimation methods
Evaluating the tenability of the model's assumptions (4.8)
Checking functional form
Checking normality
Checking homoscedasticity

Model-Based (empirical Bayes) estimates of the individual growth parameters (4.9): superior estimates that combine OLS estimates with population average estimates, and that are usually your best bet if you would like to display individual growth trajectories for particular sample members

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 4, slide 28

Extending the multilevel model for change


ALDA, Chapter Five

Change is a measure of time Edwin Way Teale

John B. Willett & Judith D. Singer Harvard Graduate School of Education

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 1

Chapter 5: Treating TIME more flexibly
General idea: Although all our examples have been equally spaced, time-structured, and fully balanced, the multilevel model for change is actually far more flexible

Variably spaced measurement occasions (5.1): each individual can have his or her own customized data collection schedule

Varying numbers of waves of data (5.2): not everyone need have the same number of waves of data
Allows us to handle missing data
Can even include individuals with just one or two waves

Including time-varying predictors (5.3)
The values of some predictors vary over time
They're easy to include and can have powerful interpretations

Re-centering the effect of TIME (5.4)
Initial status is not the only centering constant for TIME
Recentering TIME in the level-1 model improves interpretation in the level-2 model

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 2

Example for handling variably spaced waves: Reading achievement over time
Data source: Children of the National Longitudinal Survey of Youth (CNLSY)
Sample: 89 children
Each approximately 6 years old at study start
Research design
3 waves of data collected in 1986, 1988, and 1990, when the children were to be in their 6th yr, in their 8th yr, and in their 10th yr
Of course, not each child was tested on his/her birthday or half-birthday, which creates the variably spaced waves
The outcome, PIAT, is the child's unstandardized score on the reading portion of the Peabody Individual Achievement Test
Not standardized for age so we can see growth over time
No substantive predictors to keep the example simple
Research question
How do PIAT scores change over time?

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 3

What does the person-period data set look like when waves are variably spaced?

Person-period data sets are easy to construct even with variably spaced waves

Three different ways of coding TIME
WAVE: reflects design but has no substantive meaning
AGEGRP: child's expected age on each occasion
AGE: child's actual age (to the day) on each occasion; notice occasion creep: later waves are more likely to be even later in a child's life

We could build models of PIAT scores over time using ANY of these 3 measures for TIME, so which should we use?

(ALDA, Section 5.1.1, pp 139-144)

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 4

Comparing OLS trajectories fit using AGEGRP and AGE
[Figure: OLS trajectories of PIAT score (vertical axis, 0 to 80) against age (horizontal axis, 5 to 12) for nine children. AGEGRP: +s with solid line; AGE: dashed line.]

For many children, especially those assessed near the half-years, it makes little difference

For some children, though, there's a big difference in slope, which is our conceptual outcome (rate of change)

Why ever use rounded AGE? Note that this is what we did in the past two examples, and so do lots of researchers!

(ALDA, Figure 5.1 p. 143)
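
The effect of the TIME coding on a single child's OLS slope can be sketched in a few lines. The child below is hypothetical (the ages and PIAT scores are invented for illustration, and `ols_line` is a helper defined only for this example); the point is simply that rounding ages toward the design values changes the fitted rate of change.

```python
def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line through (x, y)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical child: three waves, assessed early at wave 1 and late at wave 3
agegrp = [6.5, 8.5, 10.5]   # expected age at each wave
age = [6.0, 8.5, 11.0]      # actual age at each wave (invented)
piat = [20, 30, 50]         # invented PIAT scores

_, slope_agegrp = ols_line(agegrp, piat)
_, slope_age = ols_line(age, piat)
print(f"slope using AGEGRP: {slope_agegrp:.2f} points/yr")
print(f"slope using AGE:    {slope_age:.2f} points/yr")
```

Because AGEGRP compresses this child's five years of actual elapsed time into four, the same score gain is spread over a shorter interval and the slope comes out steeper.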

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 5

Comparing models fit with AGEGRP and AGE
By writing the level-1 model using the generic predictor TIME, the specification is identical

Level-1 Model: $Y_{ij} = \pi_{0i} + \pi_{1i} TIME_{ij} + \varepsilon_{ij}$, where $\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$

Level-2 Model: $\pi_{0i} = \gamma_{00} + \zeta_{0i}$ and $\pi_{1i} = \gamma_{10} + \zeta_{1i}$, where $\begin{bmatrix} \zeta_{0i} \\ \zeta_{1i} \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma_0^2 & \sigma_{01} \\ \sigma_{10} & \sigma_1^2 \end{bmatrix} \right)$

Composite Model: $Y_{ij} = \gamma_{00} + \gamma_{10} TIME_{ij} + [\zeta_{0i} + \zeta_{1i} TIME_{ij} + \varepsilon_{ij}]$

Some parameter estimates are virtually identical
Other estimates are larger with AGEGRP:
$\hat{\gamma}_{10}$, the slope, is ½ pt larger, which cumulates to a 2 pt diff over 4 yrs
Level-2 VCs are also larger
AGEGRP associates the data from later waves with earlier ages than observed, making the slope steeper
Unexplained variation for initial status is associated with real AGE

AIC and BIC better with AGE

Treating an unstructured data set as structured introduces error into the analysis
(ALDA, Section 5.1.2, pp 144-146)
Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 6

Example for handling varying numbers of waves: Wages of HS dropouts
Data source: Murnane, Boudett and Willett (1999), Evaluation Review
Sample: 888 male high school dropouts
Based on the National Longitudinal Survey of Youth (NLSY)
Tracked from first job since HS dropout, when the men varied in age from 14 to 17

Research design
Each interviewed between 1 and 13 times
Interviews were approximately annual, but some were every 2 years
Each wave's interview conducted at different times during the year
Both variable number and spacing of waves
Outcome is log(WAGES), inflation-adjusted natural logarithm of hourly wage

Research question
How do log(WAGES) change over time?
Do the wage trajectories differ by ethnicity and highest grade completed?

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 7

Examining a person-period data set with varying numbers of waves of data per person

ID 206 has 3 waves; ID 332 has 10 waves; ID 1028 has 7 waves

# waves   1    2    3-4   5-6   7-8   9-10   >10
N men     38   39   82    166   226   240    97

EXPER = specific moment (to the nearest day) in each man's labor force history
Varying # of waves
Varying spacing
LNW in constant dollars seems to rise over time
(ALDA, Section 5.2.1, pp 146-148)
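
A tabulation of waves per person like the one above can be produced directly from a person-period data set. The sketch below uses a tiny invented data set (the IDs and EXPER values are hypothetical, not the NLSY records) and a helper, `wave_bin`, that is an assumption of this example.

```python
from collections import Counter

# Hypothetical person-period records: (ID, EXPER). One row per person per wave.
rows = [
    ("A", 1.87), ("A", 2.81), ("A", 4.31),
    ("B", 0.13),
    ("C", 0.44), ("C", 1.58), ("C", 2.67), ("C", 3.50), ("C", 4.92),
    ("D", 0.90), ("D", 2.05),
]

def wave_bin(n_waves):
    """Map a wave count onto the slide's categories: 1, 2, 3-4, ..., 9-10, >10."""
    if n_waves <= 2:
        return str(n_waves)
    if n_waves > 10:
        return ">10"
    low = n_waves if n_waves % 2 == 1 else n_waves - 1
    return f"{low}-{low + 1}"

waves_per_person = Counter(person_id for person_id, _ in rows)
people_per_bin = Counter(wave_bin(n) for n in waves_per_person.values())

print(dict(waves_per_person))
print(dict(people_per_bin))
```

Nothing about the data layout has to change when people contribute different numbers of rows; each person simply has as many records as waves.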

Covariates: Race and Highest Grade Completed

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 8

Fitting multilevel models for change when data sets have varying numbers of waves
Everything remains the same; there's really no difference!

Unconditional growth model: On average, a dropout's hourly wage increases with work experience. $100(e^{0.0457}-1) = 4.7$ is the %age change in Y per annum

Fully specified growth model (both HGC & BLACK)
HGC is associated with initial status (but not change)
BLACK is associated with change (but not initial status)
Next, fit Model C, which removes non-significant parameters

Model C: an intermediate final model
Almost identical Deviance as Model B
Effect of HGC: dropouts who stay in school longer earn higher wages on labor force entry (~4% higher per yr of school)
Effect of BLACK: in contrast to Whites and Latinos, the wages of Black males increase less rapidly with labor force experience
Rate of change for Whites and Latinos is $100(e^{0.0489}-1) = 5.0\%$
Rate of change for Blacks is $100(e^{0.0489-0.0161}-1) = 3.3\%$
Significant level-2 VCs indicate that there's still unexplained variation; this is hardly a final model

(ALDA, Table 5.4 p. 149)
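
Because the outcome is a natural logarithm, a slope coefficient converts to a percentage change per year via 100(e^b - 1). The short sketch below reproduces the slide's arithmetic; it assumes the Model C slope is 0.0489 with a BLACK differential of -0.0161 (the values that yield the reported 5.0% and 3.3%), and `pct_change` is a helper name invented here.

```python
import math

def pct_change(coef):
    """Percentage change in the raw outcome per unit of time, given a
    coefficient on the natural-log scale."""
    return 100 * (math.exp(coef) - 1)

unconditional = pct_change(0.0457)           # unconditional growth model
white_latino = pct_change(0.0489)            # Model C, BLACK = 0
black = pct_change(0.0489 - 0.0161)          # Model C, BLACK = 1

print(f"Unconditional: {unconditional:.1f}% per year")
print(f"White/Latino:  {white_latino:.1f}% per year")
print(f"Black:         {black:.1f}% per year")
```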

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 9

Prototypical wage trajectories of HS dropouts
[Figure: prototypical LNW trajectories (vertical axis, 1.6 to 2.4) against EXPER (horizontal axis, 0 to 10) for White/Latino and Black men who were 9th grade or 12th grade dropouts.]

Race
At dropout, no racial differences in wages
Racial disparities increase over time because wages for Blacks increase at a slower rate

Highest grade completed
Those who stay in school longer have higher initial wages
This differential remains constant over time (lines remain parallel)

(ALDA, Section 5.2.1 and 5.2.2, pp 150-156)
Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 10

Practical advice: Problems can arise when analyzing unbalanced data sets
The multilevel model for change is designed to handle unbalanced data sets, and in most circumstances, it does its job well; however:

When imbalance is severe, or lots of people have just 1 or 2 waves of data, problems can occur
You may not estimate some parameters (well)
Iterative fitting algorithms may not converge
Some estimates may hit boundary constraints
Problem is usually manifested via VCs, not fixed effects (because the fixed portion of the model is like a regular regression model)

Software packages may not issue clear warning signs
If you're lucky, you'll get negative variance components
Another sign is too much time to convergence (or no convergence)
Most common problem: your model is overspecified
Most common solution: simplify the model

Many practical strategies discussed in ALDA, Section 5.2.2
Another major advantage of the multilevel model for change: how easy it is to include time-varying predictors
(ALDA, Section 5.2.2, pp 151-156)
Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 11

Example for illustrating time-varying predictors: Unemployment & depression
Source: Liz Ginexi and colleagues (2000), J of Occupational Health Psychology
Sample: 254 people identified at unemployment offices.
Research design: Goal was to collect 3 waves of data per person at 1, 5 and 11 months of job loss. In reality, however, the data set is not time-structured:
Interview 1 was within 1 day and 2 months of job loss
Interview 2 was between 3 and 8 months of job loss
Interview 3 was between 10 and 16 months of job loss
In addition, not everyone completed the 2nd and 3rd interview.
Time-varying predictor: Unemployment status (UNEMP)
132 remained unemployed at every interview
61 were always working after the 1st interview
41 were still unemployed at the 2nd interview, but working by the 3rd
19 were working at the 2nd interview, but were unemployed again by the 3rd
Outcome: CES-D scale, 20 4-pt items (score of 0 to 80)

Research question
How does unemployment affect depression symptomatology?

(ALDA, Section 5.3.1, pp 160-161)
Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 12

A person-period data set with a time-varying predictor

TIME = MONTHS since job loss

UNEMP (by design, must be 1 at wave 1)

ID 7589 has 3 waves, all unemployed
ID 65641 has 3 waves, re-employed after 1st wave
ID 53782 has 3 waves, re-employed at 2nd, unemployed again at 3rd

(ALDA, Table 5.6, p161)
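
In a person-period data set, a time-varying predictor is just another column whose value can differ from row to row within a person. The sketch below uses invented records (the IDs, months, and CES-D scores are hypothetical, not the study's data) to show the layout and to recover each person's employment pattern from the UNEMP column.

```python
from collections import defaultdict

# Hypothetical person-period records: (ID, MONTHS, UNEMP, CESD)
rows = [
    (1, 1.1, 1, 25), (1, 5.0, 1, 16), (1, 11.5, 1, 33),
    (2, 0.8, 1, 27), (2, 3.9, 0, 7),  (2, 10.9, 0, 25),
    (3, 0.4, 1, 22), (3, 3.5, 0, 9),  (3, 12.0, 1, 21),
]

# Collect each person's UNEMP values in time order
unemp_by_person = defaultdict(list)
for person_id, months, unemp, cesd in sorted(rows, key=lambda r: (r[0], r[1])):
    unemp_by_person[person_id].append(unemp)

labels = {
    (1, 1, 1): "unemployed at every interview",
    (1, 0, 0): "working after the 1st interview",
    (1, 1, 0): "unemployed at 2nd, working by 3rd",
    (1, 0, 1): "working at 2nd, unemployed again by 3rd",
}
pattern = {pid: labels.get(tuple(seq), "other") for pid, seq in unemp_by_person.items()}

for pid, label in pattern.items():
    print(pid, unemp_by_person[pid], label)
```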

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 13

Analytic approach: We're going to sequentially fit 4 increasingly complex models

Model A: An individual growth model with no substantive predictors
$Y_{ij} = \pi_{0i} + \pi_{1i} TIME_{ij} + \varepsilon_{ij}$, where $\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$

Model B: Adding the main effect of UNEMP
$Y_{ij} = \gamma_{00} + \gamma_{10} TIME_{ij} + \gamma_{20} UNEMP_{ij} + [\zeta_{0i} + \zeta_{1i} TIME_{ij} + \varepsilon_{ij}]$

Model C: Allowing the effect of UNEMP to vary over TIME
$Y_{ij} = \gamma_{00} + \gamma_{10} TIME_{ij} + \gamma_{20} UNEMP_{ij} + \gamma_{30} UNEMP_{ij} \times TIME_{ij} + [\zeta_{0i} + \zeta_{1i} TIME_{ij} + \varepsilon_{ij}]$

Model D: Also allows the effect of UNEMP to vary over TIME, but does so in a very particular way
$Y_{ij} = \gamma_{00} + \gamma_{20} UNEMP_{ij} + \gamma_{30} UNEMP_{ij} \times TIME_{ij} + [\zeta_{0i} + \zeta_{2i} UNEMP_{ij} + \zeta_{3i} UNEMP_{ij} \times TIME_{ij} + \varepsilon_{ij}]$

As we go through this analysis, we will demonstrate:
Strategies for the thoughtful inclusion of time-varying predictors
Strategies for practical data analysis more generally (you're almost ready to fly solo!)
How both the level-1/level-2 and composite specifications facilitate understanding
The need to simultaneously consider the model's structural (fixed effects) and stochastic components (variance components) and whether you want them to be parallel
(ALDA, Section 5.3.1, pp 159-164)
Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 14

First step: Model A: The unconditional growth model
Let's get a sense of the data by ignoring UNEMP and fitting the usual unconditional growth model

Level-1 Model: $Y_{ij} = \pi_{0i} + \pi_{1i} TIME_{ij} + \varepsilon_{ij}$, where $\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$

Level-2 Model: $\pi_{0i} = \gamma_{00} + \zeta_{0i}$ and $\pi_{1i} = \gamma_{10} + \zeta_{1i}$, where $\begin{bmatrix} \zeta_{0i} \\ \zeta_{1i} \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma_0^2 & \sigma_{01} \\ \sigma_{10} & \sigma_1^2 \end{bmatrix} \right)$

Composite Model: $Y_{ij} = \gamma_{00} + \gamma_{10} TIME_{ij} + [\zeta_{0i} + \zeta_{1i} TIME_{ij} + \varepsilon_{ij}]$

On the first day of job loss, the average person has an estimated CES-D of 17.7

On average, CES-D declines by 0.42/mo

There's significant residual within-person variation

There's significant variation in initial status and rates of change

How do we add the time-varying predictor UNEMP? How can it go at level-2??? It seems like it can go here (in the composite model)


(ALDA, Section 5.3.1, pp 159-164)
Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 15

Model B: Adding time-varying UNEMP to the composite specification
$Y_{ij} = \gamma_{00} + \gamma_{10} TIME_{ij} + \gamma_{20} UNEMP_{ij} + [\zeta_{0i} + \zeta_{1i} TIME_{ij} + \varepsilon_{ij}]$
$\gamma_{00}$: Logical impossibility
$\gamma_{10}$: Population average rate of change in CES-D, controlling for UNEMP
$\gamma_{20}$: Population average difference, over time, in CES-D by UNEMP status

How can we understand this graphically? Although the magnitude of the TV predictor's effect remains constant, the TV nature of UNEMP implies the existence of many possible population average trajectories, such as:
[Figure: four hypothetical population average trajectories of CES-D against months since job loss: remains unemployed; reemployed at 5 months; reemployed at 10 months; reemployed at 5 months and unemployed again at 10.]

What happens when we fit Model B to data?


(ALDA, Section 5.3.1, pp 159-164)
Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 16

Fitting and interpreting Model B, which includes the TV predictor UNEMP
Monthly rate of decline is cut in half by controlling for UNEMP (still sig.)

UNEMP has a large and stat sig effect

Model A is a much poorer fit ($\Delta$Deviance = 25.5, 1 df, p<.001)

Consistently unemployed (UNEMP=1):
$\hat{Y}_j = (12.6656 + 5.1113) - 0.2020 MONTHS_j = 17.7769 - 0.2020 MONTHS_j$

Consistently employed (UNEMP=0):
$\hat{Y}_j = 12.6656 - 0.2020 MONTHS_j$

[Figure: fitted CES-D trajectories (vertical axis, 5 to 20) against months since job loss (horizontal axis, 0 to 14) for UNEMP = 1 and UNEMP = 0.]

What about people who get a job?

What about the variance components?

(ALDA, Section 5.3.1, pp. 162-167)
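
One way to think about people who get a job is to evaluate the fitted Model B equation along a changing UNEMP sequence. The sketch below uses the fixed-effect estimates from the slide (12.6656, 5.1113, and a slope of -0.2020 per month); the function name and the choice of reemployment at 5 months are assumptions made for illustration.

```python
def predict_model_b(months, unemp):
    """Population average CES-D under fitted Model B (fixed effects only)."""
    return 12.6656 + 5.1113 * unemp - 0.2020 * months

months_grid = [0, 2, 4, 6, 8, 10, 12, 14]
reemployed_at = 5  # hypothetical month at which the person finds a job

trajectory = [
    predict_model_b(m, unemp=1 if m < reemployed_at else 0)
    for m in months_grid
]

for m, y in zip(months_grid, trajectory):
    status = "unemployed" if m < reemployed_at else "employed"
    print(f"month {m:2d} ({status}): CES-D = {y:.2f}")
```

Under Model B the person drops from the upper line to the lower line at reemployment, a shift of 5.1113 points, while the slope stays the same on both sides of the switch.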

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 17

Variance components behave differently when you're working with TV predictors
When analyzing time-invariant predictors, we know which VCs will change and how:
Level-1 VCs will remain relatively stable because time-invariant predictors cannot explain much within-person variation
Level-2 VCs will decline if the time-invariant predictors explain some of the between-person variation

When analyzing time-varying predictors, all VCs can change, but:
Although you can interpret a decrease in the magnitude of the Level-1 VCs,
changes in Level-2 VCs may not be meaningful!

Level-1 VC, $\sigma_\varepsilon^2$
Adding UNEMP to the unconditional growth model (A) reduces its magnitude from 68.85 to 62.39
UNEMP explains 9.4% of the within-person variation in CES-D scores

Look what happened to the Level-2 VCs
In this example, they've increased!
Why? Because including a TV predictor changes the meaning of the individual growth parameters (e.g., the intercept now refers to the value of the outcome when all level-1 predictors, including UNEMP, are 0).
We can clarify what's happened by decomposing the composite specification back into a Level-1/Level-2 representation

(ALDA, Section 5.3.1, pp. 162-167)
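
The 9.4% figure is a proportional reduction in the level-1 variance component between the two models. A minimal sketch of that arithmetic, using the two estimates quoted on the slide (the function name is invented for this example):

```python
def proportional_reduction(vc_baseline, vc_new):
    """Share of the baseline variance component removed by the new model."""
    return (vc_baseline - vc_new) / vc_baseline

sigma2_model_a = 68.85  # level-1 VC, unconditional growth model (A)
sigma2_model_b = 62.39  # level-1 VC after adding UNEMP (B)

reduction = proportional_reduction(sigma2_model_a, sigma2_model_b)
print(f"Level-1 variance explained by UNEMP: {100 * reduction:.1f}%")
```

The same calculation applied to the level-2 VCs would give a negative value here, which is one reason the slide warns against interpreting those changes.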

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 18

Decomposing the composite specification of Model B into a L1/L2 specification

$Y_{ij} = \gamma_{00} + \gamma_{10} TIME_{ij} + \gamma_{20} UNEMP_{ij} + [\zeta_{0i} + \zeta_{1i} TIME_{ij} + \varepsilon_{ij}]$

Level-1 Model: $Y_{ij} = \pi_{0i} + \pi_{1i} TIME_{ij} + \pi_{2i} UNEMP_{ij} + \varepsilon_{ij}$
Unlike time-invariant predictors, TV predictors go into the level-1 model

Level-2 Models: $\pi_{0i} = \gamma_{00} + \zeta_{0i}$, $\pi_{1i} = \gamma_{10} + \zeta_{1i}$, $\pi_{2i} = \gamma_{20}$

Model B's level-2 model for $\pi_{2i}$ has no residual! Model B automatically assumes that $\pi_{2i}$ is fixed (that it has the same value for everyone).

Should we accept this constraint? Should we assume that the effect of the person-specific predictor is constant across people?
When predictors are time-invariant, we have no choice
When predictors are time-varying, we can try to relax this assumption
(ALDA, Section 5.3.1, pp. 168-169)
Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 19

Trying to add back the missing level-2 stochastic variation in the effect of UNEMP
Level-1 Model: $Y_{ij} = \pi_{0i} + \pi_{1i} TIME_{ij} + \pi_{2i} UNEMP_{ij} + \varepsilon_{ij}$

Level-2 Models: $\pi_{0i} = \gamma_{00} + \zeta_{0i}$, $\pi_{1i} = \gamma_{10} + \zeta_{1i}$, $\pi_{2i} = \gamma_{20} + \zeta_{2i}$

It's easy to allow the effect of UNEMP to vary randomly across people by adding in a level-2 residual. Check your software to be sure you know what you're doing.

$\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$ and $\begin{bmatrix} \zeta_{0i} \\ \zeta_{1i} \\ \zeta_{2i} \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma_0^2 & \sigma_{01} & \sigma_{02} \\ \sigma_{10} & \sigma_1^2 & \sigma_{12} \\ \sigma_{20} & \sigma_{21} & \sigma_2^2 \end{bmatrix} \right)$

But, you pay a price you may not be able to afford
Adding this one term adds 3 new VCs
If you have only a few waves, you may not have enough data
Here, we can't actually fit this model!!

Moral: The multilevel model for change can easily handle TV predictors, but
Think carefully about the consequences for both the structural and stochastic parts of the model.
Don't just buy the default specification in your software.
Until you're sure you know what you're doing, always write out your model before specifying code to a computer package

So, are we happy with Model B as the final model??? Is there any other way to allow the effect of UNEMP to vary, if not across people, then across TIME?

(ALDA, Section 5.3.1, pp. 169-171)
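
The cost of one extra random effect follows from the size of the level-2 covariance matrix: with q random effects there are q variances and q(q-1)/2 distinct covariances. A small sketch of that count (the function name is an assumption of this example):

```python
def n_level2_vcs(n_random_effects):
    """Unique elements of a symmetric q x q covariance matrix: q(q + 1) / 2."""
    q = n_random_effects
    return q * (q + 1) // 2

before = n_level2_vcs(2)  # random intercept and random TIME slope
after = n_level2_vcs(3)   # plus a random effect for UNEMP

print(f"Level-2 VCs with 2 random effects: {before}")
print(f"Level-2 VCs with 3 random effects: {after}")
print(f"New VCs added: {after - before}")
```

The 3 new components are the variance of the UNEMP residual and its covariances with the intercept and TIME residuals, which is a lot to ask of data with at most 3 waves per person.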

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 20

Model C: Might the effect of a TV predictor vary over time?
When analyzing the effects of time-invariant predictors, we automatically allowed predictors to affect the trajectory's slope
Because of the way in which we've constructed the models with TV predictors, we've automatically constrained UNEMP to have only a main effect influencing just the trajectory's level

To allow the effect of the TV predictor to vary over time, just add its interaction with TIME

$Y_{ij} = \gamma_{00} + \gamma_{10} TIME_{ij} + \gamma_{20} UNEMP_{ij} + \gamma_{30} UNEMP_{ij} \times TIME_{ij} + [\zeta_{0i} + \zeta_{1i} TIME_{ij} + \varepsilon_{ij}]$

Two possible (equivalent) interpretations:
The effect of UNEMP differs across occasions
The rate of change in depression differs by unemployment status

But you need to think very carefully about the hypothesized error structure:
We've basically added another level-1 parameter to capture the interaction
Just like we asked for the main effect of the TV predictor UNEMP, should we allow the interaction effect to vary across people?
We won't right now, but we will in a minute.

What happens when we fit Model C to data?

(ALDA, Section 5.3.2, pp. 171-172)

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 21

Model C: Allowing the effect of a TV predictor to vary over time
Main effect of TIME is now positive (!) & not stat sig ?!

UNEMP*TIME interaction is stat sig (p<.05)
Model B is a much poorer fit than C ($\Delta$Deviance = 4.6, 1 df, p<.05)

Consistently unemployed (UNEMP=1):
$\hat{Y}_j = (9.6167 + 8.5291) + (0.1620 - 0.4652) MONTHS_j = 18.1458 - 0.3032 MONTHS_j$

Consistently employed (UNEMP=0):
$\hat{Y}_j = 9.6167 + 0.1620 MONTHS_j$

[Figure: fitted CES-D trajectories (vertical axis, 5 to 20) against months since job loss (horizontal axis, 0 to 14) for UNEMP = 1 and UNEMP = 0.]

What about people who get a job?

Should the slope of the trajectory for the reemployed be constrained to 0?

(ALDA, Section 5.3.2, pp. 171-172)

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 22

How should we constrain the individual growth trajectory for the re-employed?
Should we remove the main effect of TIME? (which is the slope when UNEMP=0) Yes, but this creates a lack of congruence between the model's fixed and stochastic parts

$Y_{ij} = \gamma_{00} + \gamma_{10} TIME_{ij} + \gamma_{20} UNEMP_{ij} + \gamma_{30} UNEMP_{ij} \times TIME_{ij} + [\zeta_{0i} + \zeta_{1i} TIME_{ij} + \varepsilon_{ij}]$

So, let's better align the parts by having UNEMP*TIME be both fixed and random

$Y_{ij} = \gamma_{00} + \gamma_{20} UNEMP_{ij} + \gamma_{30} UNEMP_{ij} \times TIME_{ij} + [\zeta_{0i} + \zeta_{3i} UNEMP_{ij} \times TIME_{ij} + \varepsilon_{ij}]$

But, this actually fits worse (larger AIC & BIC)!

If we're allowing the UNEMP*TIME slope to vary randomly, might we also need to allow the effect of UNEMP itself to vary randomly?

Model D:

$Y_{ij} = \gamma_{00} + \gamma_{20} UNEMP_{ij} + \gamma_{30} UNEMP_{ij} \times TIME_{ij} + [\zeta_{0i} + \zeta_{2i} UNEMP_{ij} + \zeta_{3i} UNEMP_{ij} \times TIME_{ij} + \varepsilon_{ij}]$

UNEMP has both a fixed & random effect
UNEMP*TIME has both a fixed & random effect

What happens when we fit Model D to data?

(ALDA, Section 5.3.2, pp. 172-173)

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 23

Model D: Constraining the individual growth trajectory among the reemployed

Best fitting model (lowest AIC and BIC)

Consistently unemployed:
$\hat{Y}_j = (11.2666 + 6.8795) - 0.3254 MONTHS_j = 18.1461 - 0.3254 MONTHS_j$

Consistently employed:
$\hat{Y}_j = 11.2666$

What about people who get a job?
(ALDA, Section 5.3.2, pp. 172-173)
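
The fitted Model D equations can be evaluated along any employment history to see what the constraint implies. The sketch below uses the slide's fixed-effect estimates (11.2666, 6.8795, and an UNEMP*TIME slope of -0.3254); the function name and the reemployment month are assumptions chosen for illustration.

```python
def predict_model_d(months, unemp):
    """Population average CES-D under fitted Model D (fixed effects only).

    There is no main effect of TIME, so the trajectory is flat when UNEMP = 0.
    """
    return 11.2666 + 6.8795 * unemp - 0.3254 * unemp * months

reemployed_at = 5  # hypothetical month at which the person finds a job
for m in [0, 2, 4, 6, 8, 10, 12, 14]:
    unemp = 1 if m < reemployed_at else 0
    print(f"month {m:2d} (UNEMP={unemp}): CES-D = {predict_model_d(m, unemp):.2f}")
```

While unemployed, the predicted CES-D falls by 0.3254 per month from 18.1461; once reemployed, it sits at a constant 11.2666 regardless of elapsed time.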
Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 24

Recentering the effects of TIME

All our examples so far have centered TIME on the first wave of data collection
Allows us to interpret the level-1 intercept as individual i's true initial status
While commonplace and usually meaningful, this approach is not sacrosanct.

We always want to center TIME on a value that ensures that the level-1 growth parameters are meaningful, but there are other options
Middle TIME point: focus on the average value of the outcome during the study
Endpoint: focus on final status
Any inherently meaningful constant can be used

(ALDA, Section 5.4, pp. 181-182)
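
For a linear trajectory, recentering TIME changes what the intercept means but leaves the slope alone: subtracting a constant c from TIME moves the intercept to the fitted value at TIME = c. The sketch below illustrates this with OLS on invented data for one person (the values and the `ols_line` helper are assumptions of this example, not study data).

```python
def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line through (x, y)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical person: outcome measured at TIME = 0, 1, 2, 3, 4
time = [0, 1, 2, 3, 4]
outcome = [10.0, 12.5, 14.0, 17.5, 19.0]

centerings = {"initial status": 0, "midpoint": 2, "final status": 4}
fits = {}
for label, c in centerings.items():
    fits[label] = ols_line([t - c for t in time], outcome)
    intercept, slope = fits[label]
    print(f"TIME - {c} ({label}): intercept = {intercept:.1f}, slope = {slope:.1f}")
```

Which centering to prefer depends on which quantity (initial, average, or final status) the level-2 predictors are meant to explain.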

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 25

Example for recentering the effects of TIME
Data source: Tomarken & colleagues (1997) American Psychological Society Meetings

Sample: 73 men and women with major depression who were already being treated with non-pharmacological therapy

Research design
Randomized trial to evaluate the efficacy of supplemental antidepressants (vs. placebo)
Pre-intervention night, the researchers prevented all participants from sleeping
Each person was electronically paged 3 times a day (at 8 am, 3 pm, and 10 pm) to remind them to fill out a mood diary
With full compliance (which didn't happen, of course) each person would have 21 mood assessments (most had at least 16 assessments, although 1 person had only 2 and 1 only 12)
The outcome, POS, is the number of positive moods

Research question:
How does POS change over time?
What is the effect of medication on the trajectories of change?

(ALDA, Section 5.4, pp. 181-183)

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 26

How might we clock and code TIME?
WAVE: Great for data processing, but no intuitive meaning
DAY: Intuitively appealing, but doesn't distinguish readings each day
READING: right idea, but how to quantify?
TIME OF DAY: quantifies distance between readings (could also make unequal)
TIME: days since study began (centered on first wave of data collection)
(TIME-3.33): Same as TIME but now centered on the study's midpoint
(TIME-6.67): Same as TIME but now centered on the study's endpoint

(ALDA, Section 5.4, pp 181-183)
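
One way these alternative clocks might be derived from WAVE is sketched below. It assumes 21 equally spaced readings (3 per day over 7 days), with DAY starting at 0 and TIME OF DAY coded 0, 0.33, 0.67; this coding is an assumption that happens to reproduce the centering constants 3.33 and 6.67 on the slide, and the function name `code_time` is hypothetical.

```python
def code_time(wave, readings_per_day=3):
    """Map a 1-based WAVE number to DAY, READING, TIME OF DAY and TIME."""
    day, reading = divmod(wave - 1, readings_per_day)
    # Assumed equal spacing within the day: 0, 0.33, 0.67
    time_of_day = round(reading / readings_per_day, 2)
    time = round(day + reading / readings_per_day, 2)
    return {
        "wave": wave,
        "day": day,
        "reading": reading + 1,
        "time_of_day": time_of_day,
        "time": time,
    }

for wave in (1, 11, 21):
    print(code_time(wave))
```

Under this coding the first wave has TIME = 0, the middle wave (11) has TIME = 3.33, and the last wave (21) has TIME = 6.67.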

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 27

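Understanding what happens when we recenter TIME
Instead of writing separate models depending upon the representation for TIME, let's use a generic form:
Level-1 Model: $Y_{ij} = \pi_{0i} + \pi_{1i}(TIME_{ij} - c) + \varepsilon_{ij}$, where $\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$

Level-2 Model:

$\pi_{0i} = \gamma_{00} + \gamma_{01} TREAT_i + \zeta_{0i}$
$\pi_{1i} = \gamma_{10} + \gamma_{11} TREAT_i + \zeta_{1i}$

where $\begin{bmatrix} \zeta_{0i} \\ \zeta_{1i} \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma_0^2 & \sigma_{01} \\ \sigma_{10} & \sigma_1^2 \end{bmatrix} \right)$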

Notice how changing the value of the centering constant, c, changes the definition of the intercept in the level-1 model:

When c = 0: $Y_{ij} = \pi_{0i} + \pi_{1i} TIME_{ij} + \varepsilon_{ij}$
$\pi_{0i}$ is the individual's mood at TIME=0. Usually called "initial status"

When c = 3.33: $Y_{ij} = \pi_{0i} + \pi_{1i}(TIME_{ij} - 3.33) + \varepsilon_{ij}$
$\pi_{0i}$ is the individual's mood at TIME=3.33. Useful to think of as "mid-experiment status"

When c = 6.67: $Y_{ij} = \pi_{0i} + \pi_{1i}(TIME_{ij} - 6.67) + \varepsilon_{ij}$
$\pi_{0i}$ is the individual's mood at TIME=6.67. Useful to think about as "final status"

(ALDA, Section 5.4, pp 182-183)
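
A small sketch of why recentering only redefines the intercept: the same straight line is rewritten in terms of (TIME - c), so the slope is untouched and every predicted value is unchanged. The parameter values (160 and -2) are hypothetical and chosen only for illustration.

```python
def recenter(pi0, pi1, c):
    """Rewrite the line pi0 + pi1*TIME in terms of (TIME - c).

    Returns the new (intercept, slope); the intercept becomes the
    value of the line at TIME = c, the slope does not change.
    """
    return pi0 + pi1 * c, pi1

def predict(intercept, slope, time, c=0.0):
    """Predicted outcome for a line centered at c."""
    return intercept + slope * (time - c)

# Hypothetical individual: mood 160 at TIME = 0, declining 2 points per day.
for c in (0.0, 3.33, 6.67):
    intercept, slope = recenter(160.0, -2.0, c)
    print(c, round(intercept, 2), slope, round(predict(intercept, slope, 5.0, c), 2))
```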

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 28

Comparing the results of using different centering constants for TIME
What are affected are the level-1 intercepts

$\gamma_{00}$ assesses level of POS at time c for the control group (TREAT=0)

$\gamma_{01}$ assesses the difference in POS between the groups (TREATment effect):
-3.11 (ns) at study beginning
15.35 (ns) at study midpoint
33.80 * at study conclusion

The choice of centering constant has no effect on:
Goodness of fit indices
Estimates for rates of change
Within person residual variance
Between person residual variance in rate of change

[Figure: fitted POS trajectories (vertical axis 140 to 190) for the Treatment and Control groups plotted against Days (0 to 7)]

(ALDA, Section 5.4, pp 183-186)
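
The three treatment effects above are the same straight line read off at three values of c. The sketch below back-calculates the implied treatment difference in rate of change from the first and last estimates and checks that the midpoint estimate falls on that line, up to rounding in the reported values. The implied slope difference is a derived quantity, not a number reported on the slide.

```python
# Treatment effects on the intercept reported for each centering constant.
EFFECTS = {0.0: -3.11, 3.33: 15.35, 6.67: 33.80}

def implied_slope_difference(effects):
    """Back-calculate the treatment difference in rate of change (per day)."""
    times = sorted(effects)
    first, last = times[0], times[-1]
    return (effects[last] - effects[first]) / (last - first)

def effect_at(c, effects):
    """Treatment effect on the intercept when TIME is centered at c."""
    return effects[0.0] + implied_slope_difference(effects) * c

print(round(implied_slope_difference(EFFECTS), 2))
print(round(effect_at(3.33, EFFECTS), 2))
```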

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 29

You can extend the idea of recentering TIME in lots of interesting ways

Example: Instead of focusing on rate of change, parameterize the level-1 model so it produces one parameter for initial status and one parameter for final status

$Y_{ij} = \pi_{0i}\left(\frac{6.67 - TIME_{ij}}{6.67}\right) + \pi_{1i}\left(\frac{TIME_{ij}}{6.67}\right) + \varepsilon_{ij}$

$\pi_{0i}$: Individual Initial Status Parameter
$\pi_{1i}$: Individual Final Status Parameter

Advantage: You can use all your longitudinal data to analyze initial and final status simultaneously.

(ALDA, Section 5.4, pp 186-188)
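
To see why the two parameters in this form are initial and final status, evaluate the structural part at TIME = 0 and TIME = 6.67. A minimal sketch with hypothetical parameter values (150 and 170):

```python
END = 6.67  # final TIME value in this study

def predict(pi0, pi1, time, end=END):
    """Structural part of the initial/final status parameterization."""
    return pi0 * (end - time) / end + pi1 * time / end

# Hypothetical individual: initial status 150, final status 170.
print(predict(150.0, 170.0, 0.0))    # weight is entirely on pi0
print(predict(150.0, 170.0, END))    # weight is entirely on pi1
print(predict(150.0, 170.0, END / 2))
```

The trajectory is still a straight line; its implied rate of change is (pi1 - pi0) / 6.67.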

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 5, slide 30

Modeling discontinuous and nonlinear change


ALDA, Chapter Six

Things have changed Bob Dylan

Judith D. Singer & John B. Willett Harvard Graduate School of Education

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 6, slide 1

Chapter 6: Modeling discontinuous and nonlinear change

General idea: All our examples so far have assumed that individual growth is smooth and linear. But the multilevel model for change is much more flexible:
Discontinuous individual change (6.1): especially useful when discrete shocks or time-limited treatments affect the life course
Using transformations to model non-linear change (6.2): perhaps the easiest way of fitting non-linear change models
Can transform either the outcome or TIME
We already did this with ALCUSE (which was a square root of a sum of 4 items)

Using polynomials of TIME to represent non-linear change (6.3)
While admittedly atheoretical, it's very easy to do
Probably the most popular approach in practice

Truly non-linear trajectories (6.4)
Logistic, exponential, and negative exponential models, for example
A world of possibilities limited only by your theory (and the quality and amount of data)

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 6, slide 2

Example for discontinuous individual change: Wage trajectories & the GED
Data source: Murnane, Boudett and Willett (1999), Evaluation Review

Sample: the same 888 male high school dropouts (from before)

Research design
Each was interviewed between 1 and 13 times after dropping out
34.6% (n=307) earned a GED at some point during data collection

OLD research questions
How do log(WAGES) change over time?
Do the wage trajectories differ by ethnicity and highest grade completed?

Additional NEW research questions: What is the effect of GED attainment? Does earning a GED:
affect the wage trajectory's elevation?
affect the wage trajectory's slope?
create a discontinuity in the wage trajectory?

(ALDA, Section 6.1.1, pp 190-193)

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 6, slide 3

First steps: Think about how GED receipt might affect an individual's wage trajectory
Let's start by considering four plausible effects of GED receipt by imagining what the wage trajectory might look like for someone who got a GED 3 years after labor force entry (post dropout)

[Figure: four hypothetical LNW trajectories (vertical axis 1.5 to 2.5) plotted against EXPER (0 to 10), with GED receipt marked]
A: No effect of GED whatsoever
B: An immediate shift in elevation; no difference in rate of change
D: An immediate shift in rate of change; no difference in elevation
F: Immediate shifts in both elevation & rate of change
(ALDA, Figure 6.1, p 193)

How do we model trajectories like these within the context of a linear growth model?

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 6, slide 4

Including a discontinuity in elevation, not slope (Trajectory B)
Key idea: It's easy; simply include GED as a time-varying effect at level-1

$Y_{ij} = \pi_{0i} + \pi_{1i} EXPER_{ij} + \pi_{2i} GED_{ij} + \varepsilon_{ij}$

Pre-GED (GED=0):
$Y_{ij} = \pi_{0i} + \pi_{1i} EXPER_{ij} + \varepsilon_{ij}$

Post-GED (GED=1):
$Y_{ij} = (\pi_{0i} + \pi_{2i}) + \pi_{1i} EXPER_{ij} + \varepsilon_{ij}$

[Figure: LNW (1.6 to 2.4) against EXPER (0 to 10), annotated with: LNW at labor force entry, $\pi_{0i}$; common rate of change pre- and post-GED, $\pi_{1i}$; elevation differential on GED receipt, $\pi_{2i}$]

(ALDA, Section 6.1.1, pp 194-195)

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 6, slide 5

Including a discontinuity in slope, not elevation (Trajectory D)
Using an additional temporal predictor to capture the "extra slope" post-GED receipt

$Y_{ij} = \pi_{0i} + \pi_{1i} EXPER_{ij} + \pi_{3i} POSTEXP_{ij} + \varepsilon_{ij}$

$POSTEXP_{ij}$ = 0 prior to GED
$POSTEXP_{ij}$ = "Post GED experience," a new TV predictor that clocks TIME since GED receipt (in the same cadence as EXPER)

Pre-GED (POSTEXP=0):
$Y_{ij} = \pi_{0i} + \pi_{1i} EXPER_{ij} + \varepsilon_{ij}$

Post-GED (POSTEXP clocked in same cadence as EXPER):
$Y_{ij} = \pi_{0i} + \pi_{1i} EXPER_{ij} + \pi_{3i} POSTEXP_{ij} + \varepsilon_{ij}$

[Figure: LNW (1.6 to 2.4) against EXPER (0 to 10), annotated with: LNW at labor force entry, $\pi_{0i}$; rate of change pre-GED, $\pi_{1i}$; slope differential pre-post GED, $\pi_{3i}$]

(ALDA, Section 6.1.1, pp 195-198)
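
The two time-varying predictors used on these slides could be built from EXPER and the experience at which the GED was earned, roughly as follows. This is a sketch under stated assumptions: the function name, the argument `ged_exper`, and the convention that GED switches to 1 at the exact moment of receipt are illustrative choices, not taken from the original data set.

```python
def discontinuity_predictors(exper_values, ged_exper=None):
    """Build GED and POSTEXP for each measurement occasion.

    exper_values: labor force experience (years) at each wave
    ged_exper:    experience at GED receipt, or None if never earned
    """
    rows = []
    for exper in exper_values:
        has_ged = ged_exper is not None and exper >= ged_exper
        rows.append({
            "EXPER": exper,
            "GED": 1 if has_ged else 0,
            # POSTEXP is 0 before the GED and counts time since receipt after
            "POSTEXP": exper - ged_exper if has_ged else 0.0,
        })
    return rows

# Hypothetical person who earned a GED 3 years after labor force entry.
for row in discontinuity_predictors([1.0, 2.0, 3.0, 4.5], ged_exper=3.0):
    print(row)
```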

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 6, slide 6

Including discontinuities in both elevation and slope (Trajectory F)
Simple idea: Combine the two previous approaches

$Y_{ij} = \pi_{0i} + \pi_{1i} EXPER_{ij} + \pi_{2i} GED_{ij} + \pi_{3i} POSTEXP_{ij} + \varepsilon_{ij}$

Pre-GED
$Y_{ij} = \pi_{0i} + \pi_{1i} EXPER_{ij} + \varepsilon_{ij}$

Post-GED
$Y_{ij} = (\pi_{0i} + \pi_{2i}) + \pi_{1i} EXPER_{ij} + \pi_{3i} POSTEXP_{ij} + \varepsilon_{ij}$

[Figure: LNW (1.6 to 2.4) against EXPER (0 to 10), annotated with: LNW at labor force entry, $\pi_{0i}$; rate of change pre-GED, $\pi_{1i}$; constant elevation differential on GED receipt, $\pi_{2i}$; slope differential pre-post GED, $\pi_{3i}$]

(ALDA, Section 6.1.1, pp 195-198)
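
A sketch of the combined (Trajectory F) level-1 structure, showing that the elevation shift and the extra post-GED slope are separate quantities. All four parameter values and the GED timing are hypothetical, chosen only to make the jump and the slope change easy to see.

```python
def lnw(exper, ged_exper, pi0, pi1, pi2, pi3):
    """Structural part of the level-1 model with both discontinuities."""
    ged = 1 if exper >= ged_exper else 0
    postexp = exper - ged_exper if ged else 0.0
    return pi0 + pi1 * exper + pi2 * ged + pi3 * postexp

# Hypothetical parameters; GED earned at EXPER = 3.
PARAMS = dict(ged_exper=3.0, pi0=1.70, pi1=0.04, pi2=0.05, pi3=0.01)

pre_slope = lnw(2.0, **PARAMS) - lnw(1.0, **PARAMS)    # pi1
post_slope = lnw(5.0, **PARAMS) - lnw(4.0, **PARAMS)   # pi1 + pi3
jump = lnw(3.0, **PARAMS) - (PARAMS["pi0"] + PARAMS["pi1"] * 3.0)  # pi2
print(round(pre_slope, 4), round(post_slope, 4), round(jump, 4))
```

Setting pi3 = 0 recovers Trajectory B, and setting pi2 = 0 recovers Trajectory D.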

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 6, slide 7

Many other types of discontinuous individual change trajectories are possible
What kinds of other complex trajectories could be used?
Effects on elevation and slope can depend upon timing of GED receipt (ALDA pp. 199-201)
You might have non-linear changes before or after the transition point
The effect of GED receipt might be instantaneous but not endure
The effect of GED receipt might be delayed
There might be multiple transition points (e.g., on entry in college for GED recipients)

Just like a regular regression model, the multilevel model for change can include discontinuities, nonlinearities and other nonstandard terms
Generally more limited by data, theory, or both, than by the ability to specify the model
Extra terms in the level-1 model translate into extra parameters to estimate

Think carefully about what kinds of discontinuities might arise in your substantive context

How do we select among the alternative discontinuous models?

(ALDA, Section 6.1.1, pp199-201)

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 6, slide 8

Let's start with a "baseline model" (Model A) against which we'll compare alternative discontinuous trajectories
(UERATE-7) is the local area unemployment rate (added in previous chapter as an example of a TV predictor), centered around 7% for interpretability

Benchmark against which we'll evaluate discontinuous models

$Y_{ij} = \pi_{0i} + \pi_{1i} EXPER_{ij} + \pi_{2i}(UERATE_{ij} - 7) + \varepsilon_{ij}$

$\pi_{0i} = \gamma_{00} + \gamma_{01}(HGC_i - 9) + \zeta_{0i}$
$\pi_{1i} = \gamma_{10} + \gamma_{11} BLACK_i + \zeta_{1i}$
$\pi_{2i} = \gamma_{20}$

$\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$ and $\begin{bmatrix} \zeta_{0i} \\ \zeta_{1i} \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma_0^2 & \sigma_{01} \\ \sigma_{10} & \sigma_1^2 \end{bmatrix} \right)$

To appropriately compare this deviance statistic to more complex models, we need to know how many parameters have been estimated to achieve this value of deviance
(ALDA, Section 6.1.2, pp 201-202)

4 random effects 5 fixed effects

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 6, slide 9

How we're going to proceed

Instead of constructing tables of (seemingly endless) parameter estimates, we're going to construct a summary table that presents the specific terms in the model
Baseline just shown

n parameters (for d.f.)

deviance statistic (for model comparison)

(ALDA, Section 6.1.2, pp 202-203)

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 6, slide 10

First steps: Investigating the discontinuity in elevation by adding the effect of GED

B: Add GED as both a fixed and random effect (1 extra fixed parameter; 3 extra random). ΔDeviance=25.0, 4 df, p<.001: keep GED effect

C: But does the GED discontinuity vary across people? (do we need to keep the extra VCs for the effect of GED?) ΔDeviance=12.8, 3 df, p<.01: keep VCs

What about the discontinuity in slope?


(ALDA, Section 6.1.2, pp 202-203)
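
The model comparisons on this slide refer each difference in deviance to a chi-square distribution whose degrees of freedom equal the number of extra parameters. A minimal stdlib-only sketch of that p-value calculation (the helper `chi2_sf` is written here for illustration; in practice a statistics package would supply it):

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: closed-form finite sum
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: start from the df = 1 case and add the recurrence terms
    total = math.erfc(math.sqrt(half))
    for j in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * half ** (j - 0.5) / math.gamma(j + 0.5)
    return total

# Deviance differences and df quoted on the slide.
for label, delta, df in (("A vs B", 25.0, 4), ("B vs C", 12.8, 3)):
    print(label, delta, df, chi2_sf(delta, df))
```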
Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 6, slide 11

Next steps: Investigating the discontinuity in slope by adding the effect of POSTEXP (without the GED effect producing a discontinuity in elevation)

D: Adding POSTEXP as both a fixed and random effect (1 extra fixed parameter; 3 extra random). ΔDeviance=13.1, 4 df, p<.05: keep POSTEXP effect

E: But does the POSTEXP slope vary across people? (do we need to keep the extra VCs for the effect of POSTEXP?) ΔDeviance=3.3, 3 df, ns: don't need the POSTEXP random effects (but in comparison with A still need POSTEXP fixed effect)
(ALDA, Section 6.1.2, pp 203-204)
Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 6, slide 12

What if we include both types of discontinuity?

Examining both discontinuities simultaneously

F: Add GED and POSTEXP simultaneously (each as both fixed and random effects)

Comparison with B shows significance of POSTEXP
Comparison with D shows significance of GED

(ALDA, Section 6.1.2, pp 204-205)

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 6, slide 13

Can we simplify this model by eliminating the VCs for POSTEXP (G) or GED (H)?

Each results in a worse fit, suggesting that Model F (which includes both random effects) is better (even though Model E suggested we might be able to eliminate the VC for POSTEXP)
We actually fit several other possible models (see ALDA) but F was the best alternative. So how do we display its results?

(ALDA, Section 6.1.2, pp 204-205)

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 6, slide 14

Displaying prototypical discontinuous trajectories
(Log Wages for HS dropouts pre- and post-GED attainment)

Race
At dropout, no racial differences in wages
Racial disparities increase over time because wages for Blacks increase at a slower rate

Highest grade completed
Those who stay longer have higher initial wages
This differential remains constant over time

GED receipt has two effects
Upon GED receipt, wages rise immediately by 4.2%
Post-GED receipt, wages rise annually by 5.2% (vs. 4.2% pre-receipt)

[Figure: prototypical LNW trajectories (1.6 to 2.4) against EXPERIENCE (0 to 10) for White/Latino and Black dropouts who left in 9th or 12th grade, with the point at which a GED was earned marked]
(ALDA, Section 6.1.2, pp 204-206)

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 6, slide 15

Modeling non-linear change using transformations
When facing obviously non-linear trajectories, we usually begin by trying transformation:
A straight line, even on a transformed scale, is a simple form with easily interpretable parameters
Since many outcome metrics are ad hoc, transformation to another ad hoc scale may sacrifice little

Earlier, we modeled ALCUSE, an outcome that we formed by taking the square root of the researchers' original alcohol use measurement

We can "detransform" the findings and return to the original scale, by squaring the predicted values of ALCUSE and replotting

The prototypical individual growth trajectories are now non-linear: By transforming the outcome before analysis, we have effectively modeled non-linear change over time

[Figure: detransformed prototypical ALCUSE trajectories against AGE (13 to 17), by COA (0, 1) and PEER (Low, High)]
(ALDA, Section 6.2, pp 208-210)

So how do we know what variable to transform using what transformation?
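
The detransformation step described on this slide can be sketched as follows: a trajectory that is linear on the square-root scale becomes curved once its predicted values are squared. The intercept, slope, and centering age used here are hypothetical, not the fitted ALCUSE estimates.

```python
def predicted_sqrt_scale(age, intercept, slope, age0=14):
    """Linear trajectory on the transformed (square-root) scale."""
    return intercept + slope * (age - age0)

def detransformed(age, intercept, slope, age0=14):
    """Square the prediction to return to the original metric."""
    return predicted_sqrt_scale(age, intercept, slope, age0) ** 2

# Hypothetical trajectory: 0.5 at age 14, rising 0.3 per year.
values = [detransformed(age, 0.5, 0.3) for age in (14, 15, 16)]
increments = [b - a for a, b in zip(values, values[1:])]
print([round(v, 2) for v in values])
print([round(d, 2) for d in increments])  # unequal steps: non-linear change
```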

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 6, slide 16

The "Rule of the Bulge" and the "Ladder of Transformations"
Mosteller & Tukey (1977): EDA techniques for straightening lines

Step 1: What kinds of transformations do we consider?
[Ladder of transformations for a generic variable V; label: compress scale]

Step 2: How do we know when to use which transformation?
1. Plot many empirical growth trajectories
2. You find linearizing transformations by moving "up" or "down" in the direction of the "bulge"


(ALDA, Section 6.2.1, pp. 210-212)
Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 6, slide 17

The effects of transformation for a single child in the Berkeley Growth Study

[Figure: one child's trajectory under alternative transformations; panel labels: Down in TIME, Up in IQ; ladder label: expand scale]

How else might we model non-linear change?

(ALDA, Section 6.2.1, pp. 211-213)

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 6, slide 18

Representing individual change using a polynomial function of TIME
Polynomial of the "zero order" (because $TIME^0 = 1$)
Like including a constant predictor 1 in the level-1 model
Intercept represents vertical elevation
Different people can have different elevations

Polynomial of the "first order" (because $TIME^1 = TIME$)
Familiar individual growth model
Varying intercepts and slopes yield criss-crossing lines

Second order polynomial for quadratic change
Includes both TIME and $TIME^2$
$\pi_{0i}$ = intercept, but now both TIME and $TIME^2$ must be 0
$\pi_{1i}$ = instantaneous rate of change when TIME=0 (there is no longer a constant slope)
$\pi_{2i}$ = curvature parameter; the larger its value, the more dramatic its effect
Peak is called a "stationary point"; a quadratic has 1

Third order polynomial for cubic change
Includes TIME, $TIME^2$ and $TIME^3$
Can keep on adding powers of TIME
Each extra polynomial term adds another stationary point; a cubic has 2

(ALDA, Section 6.3.1, pp. 213-217)
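
The interpretation of the quadratic parameters can be checked numerically. In this sketch the instantaneous rate of change is the derivative pi1 + 2*pi2*TIME, which equals pi1 only at TIME = 0, and the single stationary point sits where that derivative is zero. Parameter values are hypothetical.

```python
def quadratic(time, pi0, pi1, pi2):
    """Second order polynomial trajectory."""
    return pi0 + pi1 * time + pi2 * time ** 2

def instantaneous_rate(time, pi1, pi2):
    """Derivative of the quadratic trajectory with respect to TIME."""
    return pi1 + 2 * pi2 * time

def stationary_point(pi1, pi2):
    """TIME at which the rate of change is zero (requires pi2 != 0)."""
    return -pi1 / (2 * pi2)

# Hypothetical child: rises at first, peaks, then declines.
PI0, PI1, PI2 = 10.0, 4.0, -0.5
peak_time = stationary_point(PI1, PI2)
print(instantaneous_rate(0.0, PI1, PI2), peak_time, quadratic(peak_time, PI0, PI1, PI2))
```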

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 6, slide 19

Example for illustrating use of polynomials in TIME to represent change
Source: Margaret Keiley & colleagues (2000), J of Abnormal Child Psychology

Sample: 45 boys and girls identified in 1st grade
Goal was to study behavior changes over time (until 6th grade)

Research design
At the end of every school year, teachers rated each child's level of externalizing behavior using Achenbach's Child Behavior Checklist:
3 point scale (0=rarely/never; 1=sometimes; 2=often)
24 aggressive, disruptive, or delinquent behaviors
Outcome: EXTERNAL, ranges from 0 to 68 (simple sum of these scores)
Predictor: FEMALE, are there gender differences?

Research question
How does children's level of externalizing behavior change over time?
Do the trajectories of change differ for boys and girls?

(ALDA, Section 6.3.2, p. 217)

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 6, slide 20

Selecting a suitable level-1 polynomial trajectory for change
Examining empirical growth plots (which invariably display great variability in temporal complexity)

Quadratic change (but with varying curvatures)

Linear decline (at least until 4th grade)

Little change over time (flat line?)

Two stationary points? (suggests a cubic)

Three stationary points? (suggests a quartic!!!)

When faced with so many different patterns, how do you select a common polynomial for analysis?

(ALDA, Section 6.3.2, pp 217-220)

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 6, slide 21

Examining alternative fitted OLS polynomial trajectories
Order optimized for each child (solid curves) and a common quartic across children (dashed line)

First impression: Most fitted trajectories provide a reasonable summary for each child's data
Second impression: Maybe these ad hoc decisions aren't the best?
Third realization: We need a common polynomial across all cases (and might the quartic be just too complex)?

Using sample data to draw conclusions about the shape of the underlying true trajectories is tricky; let's compare alternative models

(ALDA, Section 6.3.2, pp 217-220)

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 6, slide 22

[Figure annotations from the preceding slide: Would a quadratic do? Quadratic?]

Using model comparisons to test higher order terms in a polynomial level-1 model

Add polynomial functions of TIME to person period data set

Compare goodness of fit (accounting for all the extra parameters that get estimated)

A: significant between- and within-child variation
B: no fixed effect of TIME but significant var comps. ΔDeviance=18.5, 3 df, p<.01
C: no fixed effects of TIME & $TIME^2$ but significant var comps. ΔDeviance=16.0, 4 df, p<.01
D: still no fixed effects for TIME terms, but now VCs are ns also. ΔDeviance=11.1, 5 df, ns

Quadratic (C) is best choice and it turns out there are no gender differentials at all.
(ALDA, Section 6.3.3, pp 220-223)
Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 6, slide 23

Example for truly non-linear change
Data source: Terry Tivnan (1980) Dissertation at Harvard Graduate School of Education

Sample: 17 1st and 2nd graders
During a 3 week period, Terry repeatedly played a two-person checkerboard game called Fox n' Geese, (hopefully) learning from experience
Fox is controlled by the experimenter, at one end of the board
Children have four geese, that they use to try to trap the fox

Great for studying cognitive development because:
There exists a strategy that children can learn that will guarantee victory
This strategy is not immediately obvious to children
Many children can deduce the strategy over time

Research design
Each child played up to 27 games (each game is a "wave")
The outcome, NMOVES, is the number of moves made by the child before making a catastrophic error (guaranteeing defeat); ranges from 1 to 20

Research question:
How does NMOVES change over time?
What is the effect of a child's reading (or cognitive) ability? READ (score on a standardized reading test)

(ALDA, Section 6.4.1, pp. 224-225)

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 6, slide 24

Selecting a suitable level-1 nonlinear trajectory for change
Examining empirical growth plots (and asking what features should the hypothesized model display?)

A lower asymptote, because everyone makes at least 1 move and it takes a while to figure out what's going on
An upper asymptote, because a child can make only a finite # moves each game
A smooth curve joining the asymptotes, that initially accelerates and then decelerates

These three features suggest a level-1 logistic change trajectory, which unlike our previous growth models will be non-linear in the individual growth parameters
(ALDA, Section 6.4.2, pp. 225-228)
Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 6, slide 25

Understanding the logistic individual growth trajectory (which is anything but linear in the individual growth parameters)

$Y_{ij} = 1 + \frac{19}{1 + \pi_{0i} e^{-\pi_{1i} TIME_{ij}}} + \varepsilon_{ij}$

Upper asymptote in this particular model is constrained to be 20 (1+19)

$\pi_{0i}$ is related to, and determines, the intercept
Higher the value of $\pi_{0i}$, the lower the intercept

$\pi_{1i}$ determines the rapidity with which the trajectory approaches the upper asymptote
When $\pi_{1i}$ is large, the trajectory rises more rapidly
When $\pi_{1i}$ is small, the trajectory rises slowly (often not reaching an asymptote)

[Figure: three panels of NMOVES (0 to 25) against Game (0 to 30), for $\pi_0$ = 150, $\pi_0$ = 15, and $\pi_0$ = 1.5, each showing curves for $\pi_1$ = 0.1, 0.3, and 0.5]

Models can be fit in the usual way, provided your software can do it
(ALDA, Section 6.4.2, pp 226-230)
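
The roles of the two logistic parameters can be explored with a short sketch of the structural part of this level-1 model. The parameter values below are the ones labelled in the slide's panels; the function name `nmoves` is an illustrative choice.

```python
import math

def nmoves(game, pi0, pi1):
    """Structural part of the logistic trajectory with asymptotes 1 and 20."""
    return 1 + 19 / (1 + pi0 * math.exp(-pi1 * game))

# Intercept (game 0) falls as pi0 rises.
for pi0 in (1.5, 15, 150):
    print("pi0 =", pi0, "intercept =", round(nmoves(0, pi0, 0.3), 3))

# Larger pi1 approaches the upper asymptote faster.
for pi1 in (0.1, 0.3, 0.5):
    print("pi1 =", pi1, "game 30 =", round(nmoves(30, 15, pi1), 3))
```

At game 0 the trajectory equals 1 + 19/(1 + pi0), which is why a higher pi0 means a lower intercept.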
Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 6, slide 26

Results of fitting logistic change trajectories to the Fox n' Geese data

Begins low and rises smoothly and non-linearly

Not statistically significant (note small ns), but better READers approach asymptote more rapidly

(ALDA, Section 6.4.2, pp 229-232)

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 6, slide 27

A limitless array of non-linear trajectories awaits (each is illustrated in detail in ALDA, Section 6.4.3)

$Y_{ij} = \alpha_i - \frac{1}{\pi_{1i} TIME_{ij}} + \varepsilon_{ij}$

$Y_{ij} = \pi_{0i} e^{\pi_{1i} TIME_{ij}} + \varepsilon_{ij}$

$Y_{ij} = \alpha_i - \frac{1}{\pi_{1i} TIME_{ij} + \pi_{2i} TIME_{ij}^2} + \varepsilon_{ij}$

$Y_{ij} = \alpha_i - (\alpha_i - \pi_{0i}) e^{-\pi_{1i} TIME_{ij}} + \varepsilon_{ij}$

(ALDA, Section 6.4.3, pp 232-242)

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, ALDA, Chapter 6, slide 28


Using SAS Proc Mixed to fit the multilevel model for change
Time is nature's way of keeping everything from happening at once. Woody Allen

Judith D. Singer & John B. Willett Harvard Graduate School of Education

Judith D. Singer & John B. Willett, Harvard Graduate School of Education, Using SAS Proc Mixed, slide 1

Resources to help you learn how to use SAS Proc Mixed

Textbook Examples: Applied Longitudinal Data Analysis: Modeling Change and Event Occurrence by Judith D. Singer and John B. Willett

Software covered for each chapter: MLwiN, Mplus, SPlus, SPSS, Stata, HLM, SAS (plus Datasets)

Table of contents
Ch 1: A framework for investigating change over time
Ch 2: Exploring longitudinal data on change
Ch 3: Introducing the multilevel model for change
Ch 4: Doing data analysis with the multilevel model for change
Ch 5: Treating time more flexibly
Ch 6: Modeling discontinuous and nonlinear change
Ch 7: Examining the multilevel model's error covariance structure
Ch 8: Modeling change using covariance structure analysis
Ch 9: A framework for investigating event occurrence
Ch 10: Describing discrete-time event occurrence data
Ch 11: Fitting basic discrete-time hazard models
Ch 12: Extending the discrete-time hazard model
Ch 13: Describing continuous-time event occurrence data
Ch 14: Fitting the Cox regression model
Ch 15: Extending the Cox regression model

What we'll do now: Using the specific models we just fit in Chapter Four to demonstrate how to use SAS PROC MIXED to fit these models to data
Model A: The unconditional means model
Model B: The unconditional growth model
Model C: The uncontrolled effects of COA
Model D: The controlled effects of COA

Using SAS Proc Mixed to fit Model A (the unconditional means model)

Level-1 Model: $Y_{ij} = \pi_{0i} + \varepsilon_{ij}$, where $\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$

Level-2 Model: $\pi_{0i} = \gamma_{00} + \zeta_{0i}$, where $\zeta_{0i} \sim N(0, \sigma_0^2)$

Composite Model: $Y_{ij} = \gamma_{00} + \zeta_{0i} + \varepsilon_{ij}$

proc mixed data=one method=ml covtest;
  class id;
  model alcuse = / solution;
  random intercept / subject=id;
run;

The proc mixed statement invokes the procedure, here using the dataset named one. The method=ml option tells SAS to use full maximum likelihood estimation; if you omit this option, SAS uses restricted maximum likelihood by default (as discussed in Chapter 4, slide 27). The covtest option tells SAS to display tests for the variance components; by default, SAS omits these tests (as discussed in Chapter 4, slide 23).

The class id statement tells SAS to treat the variable ID as a categorical (in SAS terms, a classification) variable. If you omit this statement, by default, SAS would treat ID as a continuous variable.

The model statement specifies the structural portion of the multilevel model for change. The specification model alcuse = may seem unusual, but it's the way SAS represents the unconditional means model (see Chapter 4, slide 9). The model includes no explicit predictor but, like any regression model, includes an implicit intercept by default. The /solution option on the model statement tells SAS to display the estimated fixed effects (as well as the associated standard errors and hypothesis tests).

The random statement specifies the stochastic portion of the multilevel model for change. By default, SAS always includes a variance component for the level-1 residuals. In this unconditional means model, the random intercept specification tells SAS to also include a variance component for the intercept (allowing the means to vary across people). The /subject=id option tells SAS that the intercepts (the means in this unconditional means model) should be allowed to vary randomly across individuals (as identified by the classification variable ID).


Results of fitting Model A (the unconditional means model) to data

Model A: Unconditional means model
The Mixed Procedure

Covariance Parameter Estimates

Cov Parm    Subject   Estimate   Standard Error   Z Value   Pr Z
Intercept   ID          0.5639           0.1191      4.73   <.0001
Residual                0.5617          0.06203      9.06   <.0001

Fit Statistics

-2 Log Likelihood            670.2
AIC (smaller is better)      676.2
AICC (smaller is better)     676.3
BIC (smaller is better)      683.4

Solution for Fixed Effects

Effect      Estimate   Standard Error   DF   t Value   Pr > |t|
Intercept     0.9220          0.09571   81      9.63     <.0001
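The two variance components in the Model A output can be combined into an intraclass correlation, the share of total outcome variation that lies between people. The following is a minimal Python sketch of that arithmetic, using the estimates printed in the output; the function and variable names are illustrative and are not SAS output names.

```python
def intraclass_correlation(between_var: float, within_var: float) -> float:
    """Proportion of total outcome variance that lies between people."""
    return between_var / (between_var + within_var)


# Estimates copied from the Model A "Covariance Parameter Estimates" table.
sigma2_0 = 0.5639    # Intercept (between-person) variance
sigma2_eps = 0.5617  # Residual (within-person) variance

icc = intraclass_correlation(sigma2_0, sigma2_eps)
print(f"Intraclass correlation: {icc:.3f}")
```

With these estimates, roughly half of the total variation in ALCUSE is attributable to differences among individuals.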


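Using SAS Proc Mixed to fit Model B (the unconditional growth model)

Level-1 Model: $Y_{ij} = \pi_{0i} + \pi_{1i}(AGE - 14)_{ij} + \varepsilon_{ij}$, where $\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$

Level-2 Model: $\pi_{0i} = \gamma_{00} + \zeta_{0i}$ and $\pi_{1i} = \gamma_{10} + \zeta_{1i}$, where $\begin{bmatrix} \zeta_{0i} \\ \zeta_{1i} \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma_0^2 & \sigma_{01} \\ \sigma_{10} & \sigma_1^2 \end{bmatrix} \right)$

Composite Model: $Y_{ij} = \gamma_{00} + \gamma_{10}(AGE - 14)_{ij} + [\zeta_{0i} + \zeta_{1i}(AGE - 14)_{ij} + \varepsilon_{ij}]$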


proc mixed data=one method=ml covtest;
  class id;
  model alcuse = age_14 / solution;
  random intercept age_14 / type=un subject=id;
run;

As before, SAS implicitly assumes a variance component for the level-1 residuals. But because Model B includes a second random effect to capture the hypothesized level-2 stochastic variation, the random statement must be modified to include this second term, denoted by the temporal predictor AGE_14. The /type=un option, which stands for unstructured, is crucial: it tells SAS not to impose any structure on the variance-covariance matrix for the level-2 residuals.

Model B, the unconditional growth model, includes a single predictor, age_14, representing the slope of the level-1 individual growth trajectory. As before, SAS implicitly understands that the user wishes to include an intercept term. Because the predictor age_14 is centered at age 14 (the first wave of data collection), the intercept now represents initial status.
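To see why centering makes the intercept represent initial status, here is a small Python sketch. It assumes, for illustration only, waves at ages 14, 15, and 16 (the slide states only that age 14 is the first wave), and it borrows the Model B fixed-effect estimates reported in the results for this model.

```python
# Ages at each wave. Only "age 14 is the first wave" comes from the slide;
# the later ages are assumed here for illustration.
ages = [14, 15, 16]
CENTERING_AGE = 14

# Centered time predictor, analogous to the age_14 variable in the SAS code.
age_14 = [age - CENTERING_AGE for age in ages]

# Fixed-effect estimates reported for Model B (Intercept and AGE_14).
intercept, slope = 0.6513, 0.2707

# Fitted average trajectory: intercept + slope * age_14.
fitted = [intercept + slope * t for t in age_14]

for age, t, y_hat in zip(ages, age_14, fitted):
    print(f"age={age}  age_14={t}  fitted alcuse={y_hat:.4f}")
```

Because age_14 equals 0 at the first wave, the fitted value there is the intercept itself; had AGE been entered uncentered, the intercept would instead describe the fitted value at age 0.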


Results of fitting Model B (the unconditional growth model) to data

Model B: Unconditional growth model
The Mixed Procedure

Covariance Parameter Estimates

Cov Parm    Subject   Estimate   Standard Error   Z Value   Pr Z
UN(1,1)     ID          0.6244           0.1481      4.22   <.0001
UN(2,1)     ID        -0.06844          0.07008     -0.98   0.3288
UN(2,2)     ID          0.1512          0.05647      2.68   0.0037
Residual                0.3373          0.05268      6.40   <.0001

Fit Statistics

-2 Log Likelihood            636.6
AIC (smaller is better)      648.6
AICC (smaller is better)     649.0
BIC (smaller is better)      663.1

Solution for Fixed Effects

Effect      Estimate   Standard Error   DF   t Value   Pr > |t|
Intercept     0.6513           0.1051   81      6.20     <.0001
AGE_14        0.2707          0.06245   81      4.33     <.0001
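Two descriptive summaries can be read off the Model B variance components: the proportional reduction in level-1 residual variance relative to Model A, and the correlation between the level-2 residuals for initial status and rate of change. The Python sketch below is one way to compute them from the printed estimates; the variable names are illustrative, and the first quantity is a descriptive pseudo-R-squared rather than a formal test.

```python
import math

# Residual (level-1) variances copied from the Model A and Model B output.
resid_var_a = 0.5617
resid_var_b = 0.3373

# Level-2 variance components copied from the Model B output.
var_intercept = 0.6244     # UN(1,1)
cov_int_slope = -0.06844   # UN(2,1)
var_slope = 0.1512         # UN(2,2)

# Share of within-person variation associated with linear time.
pseudo_r2_within = (resid_var_a - resid_var_b) / resid_var_a

# Covariance rescaled to a correlation.
corr_int_slope = cov_int_slope / math.sqrt(var_intercept * var_slope)

print(f"Pseudo R-squared (within-person): {pseudo_r2_within:.2f}")
print(f"Correlation of intercept and slope residuals: {corr_int_slope:.2f}")
```

Under these estimates, about 40 percent of the within-person variation is associated with linear time, and the residual correlation between initial status and rate of change is modest and negative, in line with the nonsignificant UN(2,1) test in the output.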


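Using SAS Proc Mixed to fit Model C (Uncontrolled effects of COA)

Level-1 Model: $Y_{ij} = \pi_{0i} + \pi_{1i}(AGE - 14)_{ij} + \varepsilon_{ij}$, where $\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$

Level-2 Model: $\pi_{0i} = \gamma_{00} + \gamma_{01}COA_i + \zeta_{0i}$ and $\pi_{1i} = \gamma_{10} + \gamma_{11}COA_i + \zeta_{1i}$, where $\begin{bmatrix} \zeta_{0i} \\ \zeta_{1i} \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma_0^2 & \sigma_{01} \\ \sigma_{10} & \sigma_1^2 \end{bmatrix} \right)$

Composite Model: $Y_{ij} = \gamma_{00} + \gamma_{01}COA_i + \gamma_{10}(AGE - 14)_{ij} + \gamma_{11}COA_i \times (AGE - 14)_{ij} + [\zeta_{0i} + \zeta_{1i}(AGE - 14)_{ij} + \varepsilon_{ij}]$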

proc mixed data=one method=ml covtest;
  class id;
  model alcuse = coa age_14 coa*age_14 / solution;
  random intercept age_14 / type=un subject=id;
run;

Like the companion level-2 model, Model C adds two terms to register the uncontrolled effects of COA: (1) a main effect of COA, which captures the effect on the intercept (initial status); and (2) the cross-level interaction, COA*AGE_14, which captures the effect of COA on the rate of change. All other statements, including the random statement, are unchanged from Model B because we have only added new fixed effects (for COA) and not any new random effects.


Results of fitting Model C (the uncontrolled effects of COA) to data

Model C: Uncontrolled effects of COA
The Mixed Procedure

Covariance Parameter Estimates

Cov Parm    Subject   Estimate   Standard Error   Z Value   Pr Z
UN(1,1)     ID          0.4876           0.1278      3.81   <.0001
UN(2,1)     ID        -0.05934          0.06573     -0.90   0.3666
UN(2,2)     ID          0.1506          0.05639      2.67   0.0038
Residual                0.3373          0.05268      6.40   <.0001

Fit Statistics

-2 Log Likelihood            621.2
AIC (smaller is better)      637.2
AICC (smaller is better)     637.8
BIC (smaller is better)      656.5

Solution for Fixed Effects

Effect        Estimate   Standard Error   DF   t Value   Pr > |t|
Intercept       0.3160           0.1307   80      2.42     0.0179
COA             0.7432           0.1946   82      3.82     0.0003
AGE_14          0.2930          0.08423   80      3.48     0.0008
COA*AGE_14    -0.04943           0.1254   82     -0.39     0.6944
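Because Models B and C were both fit with full maximum likelihood to the same data, and Model B is nested within Model C, their -2 Log Likelihood values can be compared with a likelihood ratio (deviance) test. The Python sketch below illustrates the arithmetic using only the standard library; the helper chi2_sf_even_df is a hypothetical name, and its closed form applies only when the degrees of freedom are even (here 2, for the added COA and COA*AGE_14 terms).

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail chi-square probability; closed form valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2.0
    terms = sum(half**k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * terms


# -2 Log Likelihood values copied from the Model B and Model C fit statistics.
deviance_b = 636.6
deviance_c = 621.2
extra_params = 2  # COA and COA*AGE_14

delta_deviance = deviance_b - deviance_c
p_value = chi2_sf_even_df(delta_deviance, extra_params)

print(f"Deviance difference: {delta_deviance:.1f} on {extra_params} df")
print(f"p-value: {p_value:.5f}")
```

A difference of 15.4 on 2 degrees of freedom yields a p-value well below .001, so the two COA terms jointly improve fit even though the COA*AGE_14 coefficient is not individually significant.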


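Using SAS Proc Mixed to fit Model D (Controlled effects of COA)

Level-1 Model: $Y_{ij} = \pi_{0i} + \pi_{1i}TIME_{ij} + \varepsilon_{ij}$, where $\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$ and $TIME_{ij} = (AGE - 14)_{ij}$

Level-2 Model: $\pi_{0i} = \gamma_{00} + \gamma_{01}COA_i + \gamma_{02}PEER_i + \zeta_{0i}$ and $\pi_{1i} = \gamma_{10} + \gamma_{11}COA_i + \gamma_{12}PEER_i + \zeta_{1i}$, where $\begin{bmatrix} \zeta_{0i} \\ \zeta_{1i} \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma_0^2 & \sigma_{01} \\ \sigma_{10} & \sigma_1^2 \end{bmatrix} \right)$

Composite Model: $Y_{ij} = \gamma_{00} + \gamma_{01}COA_i + \gamma_{02}PEER_i + \gamma_{10}(AGE - 14)_{ij} + \gamma_{11}COA_i \times (AGE - 14)_{ij} + \gamma_{12}PEER_i \times (AGE - 14)_{ij} + [\zeta_{0i} + \zeta_{1i}(AGE - 14)_{ij} + \varepsilon_{ij}]$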

proc mixed data=one method=ml covtest;
  class id;
  model alcuse = coa peer age_14 coa*age_14 peer*age_14 / solution;
  random intercept age_14 / type=un subject=id;
run;

Like the companion level-2 model, Model D adds two terms for PEER, so that the effects of COA are now controlled for PEER: (1) a main effect of PEER, which captures the effect on the intercept (initial status); and (2) the cross-level interaction, PEER*AGE_14, which captures the effect of PEER on the rate of change. All other statements, including the random statement, are unchanged from Model C because we have only added new fixed effects (for PEER) and not any new random effects.


Results of fitting Model D (the controlled effects of COA) to data

Model D: Controlled effects of COA
The Mixed Procedure

Covariance Parameter Estimates

Cov Parm    Subject   Estimate   Standard Error   Z Value   Pr Z
UN(1,1)     ID          0.2409          0.09259      2.60   0.0046
UN(2,1)     ID        -0.00612          0.05500     -0.11   0.9115
UN(2,2)     ID          0.1391          0.05481      2.54   0.0056
Residual                0.3373          0.05268      6.40   <.0001

Fit Statistics

-2 Log Likelihood            588.7
AIC (smaller is better)      608.7
AICC (smaller is better)     609.6
BIC (smaller is better)      632.8

Solution for Fixed Effects

Effect         Estimate   Standard Error   DF   t Value   Pr > |t|
Intercept       -0.3165           0.1481   79     -2.14     0.0356
COA              0.5792           0.1625   82      3.56     0.0006
PEER             0.6943           0.1115   82      6.23     <.0001
AGE_14           0.4294           0.1137   79      3.78     0.0003
COA*AGE_14     -0.01403           0.1248   82     -0.11     0.9107
PEER*AGE_14     -0.1498          0.08564   82     -1.75     0.0840
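The fixed-effect estimates can be plugged into the structural part of the composite model to obtain fitted trajectories for prototypical individuals. The Python sketch below does this for children of non-alcoholic and alcoholic parents at a single PEER value. The PEER value of 1.0 and the time points 0, 1, 2 (ages 14 to 16) are assumptions chosen for illustration, and predicted_alcuse is a hypothetical helper name.

```python
# Fixed-effect estimates copied from the Model D "Solution for Fixed Effects" table.
GAMMA = {
    "intercept": -0.3165,
    "coa": 0.5792,
    "peer": 0.6943,
    "age_14": 0.4294,
    "coa*age_14": -0.01403,
    "peer*age_14": -0.1498,
}


def predicted_alcuse(coa: float, peer: float, age_14: float) -> float:
    """Fitted ALCUSE from the fixed (structural) part of the composite model."""
    initial_status = GAMMA["intercept"] + GAMMA["coa"] * coa + GAMMA["peer"] * peer
    rate_of_change = (
        GAMMA["age_14"] + GAMMA["coa*age_14"] * coa + GAMMA["peer*age_14"] * peer
    )
    return initial_status + rate_of_change * age_14


PEER_VALUE = 1.0        # hypothetical PEER score, chosen only for illustration
TIME_POINTS = (0, 1, 2) # assumed to correspond to ages 14, 15, 16

for coa in (0, 1):
    trajectory = [predicted_alcuse(coa, PEER_VALUE, t) for t in TIME_POINTS]
    formatted = ", ".join(f"{y:.3f}" for y in trajectory)
    print(f"COA={coa}, PEER={PEER_VALUE}: {formatted}")
```

Holding PEER constant, the two trajectories differ by the COA coefficient at age_14 = 0 and have nearly the same slope, which mirrors the large COA main effect and the negligible COA*AGE_14 interaction in the output.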
