
The social and phenotypic heterogeneity of autism: identifying clusters in a large
population-based sample
Christine Fountain
Department of Sociology and Anthropology

Fordham University
Address Correspondence to: Christine Fountain, Department of Sociology and Anthropology,

Fordham University, 113 W 60th St., New York, NY 10023, Phone: 646-293-3959, email:
cfountain1@fordham.edu.
Acknowledgements: I thank Peter Bearman, Ka-Yuet Liu, Marissa King, Keely Cheslack-
Postava, Soumya Mazumdar, and Alix Winter for their contributions to this work. This research
was supported by the NIH Director's Pioneer Award program, part of the NIH Roadmap for
Medical Research, through grant number 1 DP1 OD003635-01 and the National Institute of
Mental Health award number R21MH096122. Partial computing support for this research came
from a Eunice Kennedy Shriver National Institute of Child Health and Human Development
research infrastructure grant, R24 HD042828, to the Center for Studies in Demography &
Ecology at the University of Washington.
Working Paper Draft: Do not cite or distribute

Abstract
Autism is a spectrum disorder characterized by myriad combinations of behavioral symptoms
and severity, and a heterogeneous set of risk factors. Although the heterogeneity of autism is a
truism, most research fails to methodologically account for it. In this paper, I identify and
describe five typical autism subgroups in a population-level dataset consisting of the birth
records of all children with autism born in California in 1992-2005. Using cluster analysis, I find
groups of children with similar attributes on socioeconomic, biological, and autism symptom
variables. These clusters represent consistent and coherent groups and reveal important
associations between sets of characteristics. These clusters are also shown to have clear and
meaningful temporal patterns, and particular autism subtypes have risen and fallen in relative
size as the diagnostic context has changed. Administrative boundaries relevant to the diagnosis
of and provision of services for autism also show variability in their cluster composition. Cluster
analysis reveals not only the way that social and biological factors combine to jointly create this
heterogeneous disorder, but also how diagnostic patterns vary over space and time.

Introduction
Autism is a neurodevelopmental disorder characterized by deficits of communication and
social interaction, as well as repetitive and stereotyped behaviors, typically diagnosed in early
childhood. Autism is considered a spectrum disorder, including persons with varying
constellations of symptoms of varying severity. Even within the core autism diagnosis, there is
substantial heterogeneity of autistic symptoms and associated features. A diagnosis based on the
Diagnostic and Statistical Manual (DSM) requires a minimum number of symptoms from a list of twelve,
distributed among social, communication, and repetitive behavior dimensions, as well as delayed
or abnormal functioning in at least one of these dimensions. Aside from variations in the severity
of these symptoms, this leaves a great deal of room for variability in the portfolio of symptoms
and behaviors captured by an autism diagnosis. Comorbidities are also common; persons with
autism may also carry diagnoses of mental retardation, ADHD, epilepsy, anxiety disorders, or
other conditions. In addition, there are several symptoms and traits, experienced by many
persons with autism, aside from the core features. These include gastrointestinal disorders, food
preferences and sensitivities, sensory hypersensitivities or abnormalities, motor coordination
abnormalities, and sleep problems as well as unusual skills.
Thus, the variety of core and secondary symptoms that may be expressed by persons with
autism is enormous. The multiplicity of autism phenotypes, not to mention autism etiologies,
points to the heterogeneity of this disorder. Even more complex is how this heterogeneity
unfolds over time, as children's symptoms change through childhood and adolescence (Fountain,
Winter, and Bearman 2012).

The process by which autism is identified and diagnosed has also changed over the past
three decades. Public awareness of autism has increased, and the stigma associated with autism
arising from the mistaken belief that autism was a psychogenic disorder has decreased. Against
this background, autism prevalence has risen dramatically, if unevenly.
Of particular sociological interest is the possibility for social factors to intervene in the
diagnostic process. Children are typically diagnosed at a pre-school age, and diagnoses are based
solely on behavioral symptoms. Further, there is some ambiguity in distinguishing autism from
other communication disorders and developmental delays, particularly at the high and low
functioning ends of the autism spectrum and at the youngest ages. This ambiguity helps create
the potential for the understanding and resources of parents to play a role in which children get
which diagnoses. This has been amplified by changes in the diagnostic criteria, and increasing
awareness of autism among parents, teachers, and caregivers.
In this paper I explore the social and phenotypic heterogeneity in autism using a unique
population-level data set on the annual evaluations of all children with autism born in California
in 1992-2005 and enrolled with the Department of Developmental Services. Linking birth
certificate records to autism caseload records, I am able to connect parental socioeconomic
background, neighborhood characteristics, and information on the autism diagnosis and
symptoms. I use clustering methods to describe the subgroups of children with autism found
in the data. These coherent groups reveal the ways that social and behavioral factors combine in
complex and shifting ways to produce the oft-observed heterogeneity of autism.
Roadmap
In this paper, I first describe what is known about the heterogeneity of autism and its
association with key social and biological risk factors. Next, I apply a k-medoids clustering
algorithm to identify the subgroups within this population of children with autism. After
describing these groups, I examine their correlates: when, and in what contexts, are these
subtypes likely to be found? Finally, I explore the implications of my findings for autism
research in general.
Background
Autism prevalence and heterogeneity
In recent years, autism has become more visible as both incidence and prevalence have
increased. Once considered a very rare disorder, with a rate of about 4 per 10,000 population
until the 1970s (Fombonne 2005), recent prevalence estimates have reached 1 in 68 children
aged eight (Developmental Disabilities Monitoring Network Surveillance Year 2010 Principal
Investigators 2014). There are many reasons -- and much disagreement about their relative
importance -- for this increase.
Risk factors for autism
Autism's heterogeneity is likely associated with different etiologies. Recent studies using
high resolution analysis techniques found inherited and de novo deletions and duplications in a
wide range of locations on the genome in autism cases as compared to controls, suggesting
heterogeneous genetic predispositions. Consistent with this idea, family aggregation studies
suggest that the three main characteristics of autism have different levels of heritability (Freitag
2006; Georgiades et al. 2007; Ronald et al. 2006).
Autism has also been associated with various factors relating to the prenatal and perinatal
environment and socioeconomic status that may also contribute to varying symptom
presentations, including gestation length, birth weight, labor and birth complications, short
inter-pregnancy intervals, and multiple births (Larsson et al. 2005; King and Bearman 2011;
Hultman, Sparen, and Cnattingius 2002; Croen, Grether, and Selvin 2002). Parental
characteristics such as advanced parental age, education, and history of schizophrenia are also
associated with increased autism risk (King et al. 2009; Croen et al. 2007; Durkin et al. 2008;
Larsson et al. 2005; King and Bearman 2011; Croen, Grether, and Selvin 2002) and may point to
mechanisms that lead to heterogeneous symptom presentation.
Socioeconomic Status and Autism
Although many of the substantial set of risk factors are biological, many others are at
least partly social in nature. It has been firmly established that children born to older mothers and
fathers are at higher risk of autism (Croen et al. 2007; Durkin et al. 2008; King et al. 2009).
Although this is certainly in large part for biological reasons (e.g. the higher risk of de novo copy
mutations and riskier pregnancies that come with older parental age), there may also be a social
component as children born to older parents are scrutinized more carefully for these symptoms.
For example, one analysis comparing autistic children conceived with assisted reproductive
technologies (ART) (who tended to have significantly older parents) to those without found that
the assisted reproduction group were diagnosed earlier and with milder symptoms, but that this
difference disappeared when controlling for socioeconomic factors, particularly parental age and
education (Schieve et al. 2015). This suggests that the key difference between these groups may
be in ascertainment, not phenotype.
Further, the increasing prevalence of autism may then be in part a result of the broad
sociological phenomenon of increasing parental ages (Liu, Zerubavel, and Bearman 2010). The
rising use of ART may also be contributing to this trend, by pushing on the
upper bound of the fertile age range. Recent evidence suggests that ART conceptions are at
higher risk of autism, as well as other developmental disorders (Fountain et al. 2015; Hvidtjørn
et al. 2011; Hvidtjørn et al. 2009). In addition to resulting in older parents and smaller families,
this may also contribute to shifts in the spacing of children. Some research has suggested that
short inter-pregnancy intervals, perhaps arising from delayed fertility, may also be associated
with increased autism risk (Cheslack-Postava, Liu, and Bearman 2011).
Increased parental education has been associated with autism risk (Croen, Grether, and
Selvin 2002; Durkin et al. 2010; King and Bearman 2011; Larsson et al. 2005). Many studies
have also found racial and ethnic disparities in autism rates (ADDMN 2007; Centers for Disease
Control and Prevention 2006; Fountain and Bearman 2011; Liptak et al. 2008; Mandell et al.
2009; Shattuck et al. 2009). There is no current evidence for a genetic explanation for these
racial and ethnic differences. Rather, most researchers consider race and ethnicity to be a proxy for other
socioeconomic variables, including wealth and income, education, and culture (Burchard et al.
2003; Link et al. 1998). In some cases cultural or language differences may contribute to these
differences; in addition, teachers and caregivers can interpret and make decisions on symptoms
based partly on race (Mandell et al. 2009; Palmer et al. 2009).
In fact, unlike almost every other known disease or disorder, autism shows a reversed
socioeconomic gradient, such that the socioeconomically disadvantaged tend to have lower risk,
likely due to underdiagnosis. One of the most stable social facts is the socioeconomic gradient
for health and mortality. Specifically, higher status people -- whether status is based on education,
income, occupation, or Nobel Prizes and Academy Awards -- have better health and lower risk of
death (Marmot 2004; Rablen and Oswald 2008; Redelmeier and Singh 2001). This stylized fact
has been true across all societies and groups where it has been studied, across time, and persisted
in the face of enormous social change and medical advances. The reasons for this are myriad,
including differential access to health care and insurance, knowledge and efficacy, health
behaviors, cultural and linguistic barriers, and social capital (Pescosolido 1992).
The resistance of this pattern to change in time, technology, and scale has motivated
some medical sociologists to argue that socioeconomic status is a fundamental cause of health
(Link and Phelan 1995). This concept is in contrast to the mainstream approach of epidemiology,
which focuses on modifiable, proximate risk factors. The fundamental cause theory argues that
risk factors such as nutrition, exercise, and exposure to toxic substances directly affect health
outcomes, but that they themselves are fundamentally caused by the social conditions in which
these individuals are embedded. They point to the fact that reducing or eliminating some of the
proximate causes of poor health or even the diseases disproportionately affecting the poor does
little to eliminate the association between SES and mortality. For example, the 19th century poor
often experienced poor nutrition and sanitation, as well as overcrowding. These risk factors made
them especially vulnerable to typhoid, smallpox, tuberculosis, diphtheria, and other infectious
diseases. However, neither the improvement in social conditions that came with economic
development nor the effective eradication of these diseases due to widespread vaccination and
medical treatments, has reduced the socioeconomic gradient for mortality (Phelan, Link, and
Tehranifar 2010). In developed countries, infectious diseases have been replaced by cancers and
chronic conditions like heart disease and diabetes. Thus, they argue, the focus on modifiable
intervening risk factors will not reduce the mortality gap (although it may reduce mortality, as
did the war against infectious diseases) if there is no change in the underlying social conditions.
Because resources can be used in an adaptable and flexible way as conditions change,
there is no one mechanism linking SES to health, and interventions that focus on proximate
causes are likely to be ineffective as new mechanisms will replace the old ones. Perversely, new
information and technologies can actually exacerbate the SES gradient, if high-status people are
better able to harness these advances to improve their health. One example is smoking behavior.
Prior to 1954, when scientists began to definitively establish that smoking caused cancer, there
was little socioeconomic gradient to smoking or to the knowledge that smoking was unhealthy.
After this time, however, a strong socioeconomic gradient in both knowledge and behavior
opened up as more educated people were more likely to believe that smoking causes cancer and
to change their behavior in accordance (Link and Phelan 2009).
In the case of autism, this type of mechanism may account for the peculiarly reversed
socioeconomic gradient, much as it has done previously for certain types of cancers (Link et al.
1998). If the risk factors for a disease are not easily modifiable or understood, as is true of breast
cancer and autism, then people are unable to effectively use the resources that come with high
socioeconomic status to avoid the disease. However, SES may translate into differential
screening and identification of the disease, as it has with mammography. Similarly, although no
one knows how to reduce the risk of having a child with autism, parental resources do make a
difference when it comes to obtaining a diagnosis (Durkin et al. 2010; Fountain, King, and
Bearman 2010; King and Bearman 2011; Mandell et al. 2009; Russell, Steer, and Golding 2010).
Thus, the reversed SES gradient pattern for autism is consistent with the theory of SES as a
fundamental cause of health.
Spatial Distribution and Social Influence
Research on social influence and the social diffusion of health behaviors and outcomes
has a long history in sociology and has experienced a recent resurgence (Cacioppo, Fowler, and
Christakis 2009; Christakis and Fowler 2007, 2008; Fowler and Christakis 2008; Liu, King, and
Bearman 2010). This vein of research emphasizes the influence on health of the social context --
particularly the web of social relationships -- in which one is embedded. Some of the
latest work has found that smoking, obesity, and happiness, as well as loneliness, spread through
social networks (Christakis and Fowler 2007, 2008; Fowler and Christakis 2008), although there
has also been some criticism of the methodology of this work (Cohen-Cole and Fletcher 2008;
Lyons 2010; Shalizi and Thomas 2011).
Although autism is not, strictly speaking, a contagious disease1, knowledge and
information about the existence and symptoms of this once-rare disorder may spread through
social ties. Similarly, the ways that one is influenced by the behaviors, values, desires, and
persuasion of one's social connections are known as social influence. This may also play an
important role in the spread of autism, from the decline in stigmatization of the disorder to the
receipt of advice and counsel from kin, friends, neighbors, and teachers. In a recent paper, Liu,
King and Bearman (2010) find that living close to a child with autism increases the chance, all
else equal, that a child will be diagnosed in the next year. They argue that the reason for this
relationship is the sharing of information locally on symptoms as well as finding doctors and
obtaining diagnoses and services.
Aside from social influence, the characteristics of local neighborhoods can have
important consequences for autism diagnoses. The density and visibility of autism in an area, the
availability of professionals qualified to diagnose and treat autism, the resources and experience
located in the local school system, and the socioeconomic composition of a neighborhood, among
other factors, can affect the rate of diagnosis as well as the timing of those diagnoses (Fountain et
al. 2010; King and Bearman 2011). Autism cases are not spread evenly over space (Mazumdar et
al. 2010, 2013), and the social influence processes identified by Liu et al. (2010) can amplify the
disparities in local autism resources, contributing to variability in local diagnosis regimes
(Liu and Bearman 2012).

1 There is some evidence that a subset of autism cases may be caused by prenatal exposure to
viruses, including congenital rubella or, less compellingly, influenza (Chess 1971, 1977; Shi et al.
2003).

Prior Research on Autism Clusters
There has been a small amount of past research using clustering methods to identify
subtypes of autism (Eaves, Ho, and Eaves 1994; Prior et al. 1998; Stevens et al. 2000). However,
in addition to covering only short time periods, these studies have been based on small and
non-representative samples (Eaves et al. 1994; Stevens et al. 2000), and have focused exclusively on
behavioral symptoms without including socioeconomic or familial variables (Prior et al. 1998).
This research has identified coherent groups within their samples, although these symptom
groupings do not necessarily map onto standard autism diagnostic categories. In a previously
published paper I used a related strategy, group-based trajectory modeling, to identify
subgroups with similar longitudinal symptom trajectories (Fountain et al. 2012). In this work I found
six unique trajectories, one of which was characterized by a surprisingly large amount of
improvement. Although the groups were identified based solely on the patterns of change in
symptoms, these trajectories were highly correlated with socioeconomic factors, such that more
advantaged children were higher functioning and more likely to display this pattern of marked
improvement.
In summary, different contexts can lead to different autism risk factors as well as
different kinds of children being diagnosed. If we think of a risk factor as something that shifts a
child's chance of being diagnosed with autism by a fixed amount, then we will have an incomplete
understanding of the phenomenon. Risk is specific to particular social contexts in ways that
matter greatly to our measured estimates of the prevalence of conditions like autism (Fountain
and Bearman 2011; King and Bearman 2011). We need to understand the diversity of autism in
not just its symptom expression, but also its social demography. This must include attention to
how the salience of risk factors changes over space and time. However, it also requires applying a
more diverse set of methods to this problem in order to appreciate the variability in children
diagnosed with autism.
Data
The data for this paper consist of birth and administrative records for all California
children with autism who were born from 1992 through 2007. The California Department of
Developmental Services (DDS) provides diagnoses and services to persons with developmental
disabilities including autism through its system of 21 Regional Centers. Although enrollment is
voluntary, the strong financial incentive to obtain services through the DDS means that the vast
majority of persons with autism in California are enrolled, making the DDS the largest
administrative source of data on autism diagnoses (Croen, Grether, Hoogstrate, et al. 2002).
Services and support are provided to persons with Autistic Disorder (DSM-IV code 299.0), but
not to those with other spectrum disorders or pervasive developmental disorders unless they have
another qualifying condition or substantial disability.
DDS caseload records were matched to birth certificate records, resulting in 42,362
children born 1992-2007 and diagnosed with autism before 2011. Linkage was conducted using
deterministic and probabilistic matching in Link Plus (Division of Cancer Prevention and
Control 2007); uncertain matches were reviewed manually. Ninety-one percent of DDS files for
children ever diagnosed with autism were successfully linked to birth records; typically, those
not linked were born outside of California and moved in later.

Variables extracted from birth records include maternal age at birth and education level,
child's sex, and birth weight (less than 2,500 grams was considered low birth weight). From the DDS
records I obtained information on the features of the autism at diagnosis, including presence of
comorbid intellectual disability, age of child at diagnosis, and symptom severity. Information on
symptoms comes from the Client Development Evaluation Report (CDER), which is given to
each client at entry into the DDS system and approximately every year while they remain on the
caseload. Through the CDER, DDS clients are evaluated for symptom severity and function
across a variety of dimensions. The evaluative element of the CDER is designed to help
determine appropriate services and needs, not as a diagnostic instrument. However, these items
contain useful information on the presence and severity of core autism symptoms. In this
analysis, I use items measuring verbal communication and social interaction, respectively, two of
the three main dimensions of autism symptoms. To create scores for communication and social
function, I summed the five communication and three social items, collected at entry into the
DDS system, weighting each item equally in the index (further information can be obtained from
the author). The highest and lowest quintiles on each index are categorized as high and low
functioning, respectively. In 2008 the DDS revised the CDER, reducing the items used to create
these indices to a single item each. This lack of comparability for those diagnosed before and
after 2008 is problematic, but these items are still the main source of information on social and
communication function. Collapsing the five ordered response options for each item, I combine
responses to create categories consisting of the approximately 20% highest and lowest
functioning children on these items. Robustness checks are conducted to confirm that these
categories are substantively similar across the years.
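
The index construction described above can be illustrated with a minimal Python sketch. This is not the code used for the analysis: the item scores are invented, the function names are hypothetical, and the sketch assumes that higher item scores indicate better functioning, which the text does not state. It sums equally weighted items and then labels roughly the bottom and top fifth of the resulting index.

```python
def summed_index(items):
    """Equal-weight index: the plain sum of one child's item scores."""
    return sum(items)


def quintile_cutoffs(scores, share=0.20):
    """Return the index values bounding the bottom and top `share` of cases."""
    ordered = sorted(scores)
    k = max(1, round(share * len(ordered)))  # number of cases in each tail
    return ordered[k - 1], ordered[-k]


def categorize(score, lo_cut, hi_cut):
    """Assumption: a higher score means higher functioning."""
    if score <= lo_cut:
        return "low"
    if score >= hi_cut:
        return "high"
    return "medium"


# Hypothetical data: ten children, five communication items each, scored 1-5.
children = [
    [1, 1, 1, 1, 2],
    [1, 2, 1, 2, 2],
    [2, 2, 2, 2, 2],
    [2, 2, 3, 2, 3],
    [3, 3, 2, 3, 3],
    [3, 3, 3, 3, 4],
    [4, 3, 4, 3, 4],
    [4, 4, 4, 4, 4],
    [4, 5, 4, 5, 4],
    [5, 5, 5, 4, 5],
]

scores = [summed_index(child) for child in children]
lo_cut, hi_cut = quintile_cutoffs(scores)
groups = [categorize(s, lo_cut, hi_cut) for s in scores]
```

Because tied scores fall on the same side of a cutoff, the tails need not contain exactly 20% of cases, which is consistent with the approximate shares reported in Table 1.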

I exclude all children born after 2005 in order to ensure that there is complete
ascertainment through at least age six for all cases. Listwise deletion was also used to eliminate
cases with missing data on key clustering variables, resulting in a final analysis sample of 36,180
children. Summary statistics on this sample are presented in Table 1. Since this sample is
composed of children with autism, it differs from the general population of children; for
example, the proportion of female children is much lower, and of low birthweight births much
higher, than for California births in general.
Table 1. Sample Description for Key Clustering Variables

Variable                           N        %
Female                          6148    16.99
Low Birth Weight (<2500 g)      3001     8.29
Maternal Age
  Young (<25)                   8330    23.02
  Middle (25-35)               19665    54.35
  Older (>35)                   8185    22.62
Maternal Education
  < High School                 6500    17.97
  HS or Some College           18919    52.29
  College Grad                 10761    29.74
Diagnosed Late (Age 5+)        12653    34.97
Intellectual Disability Dx      8957    24.76
Social Functioning
  Low (<20 percentile)          8442    23.33
  Medium (20-80 percentile)    20686    57.18
  High (>80 percentile)         7052    19.49
Communication Functioning
  Low (<20 percentile)          6892    19.05
  Medium (20-80 percentile)    20827    57.56
  High (>80 percentile)         8461    23.39
Total                          36180   100.00
Methods
Motivation for Clustering Methods

The main approach used in this paper is cluster analysis, which is used to identify
subgroups within a population which tend to have similar attributes (Everitt et al. 2011a). The
idea is to find and group together similar observations in order to describe the subgroups in the
data. This method is used widely in many fields, including biology, genetics, marketing, and
computer science, although in the social sciences it is much less common. Cluster analysis
differs from regression analysis in that its purpose is mainly descriptive: it is not intended to
capture causal effects, but rather to show what characteristics of observations tend to hang
together. These groups can then be used to find associations with other outcomes, in conjunction
with regression or other methods, such as the particular contexts or mechanisms that produced
each subtype.
Clustering is especially useful for heterogeneous data in which there may be no typical
or average case. An excellent sociological example is the analysis of the diverse causation of
migration from Mexico to the US (Garip 2012). Autism is a perfect example of this sort of
phenomenon. During the study period, the diagnostic criteria for autism have changed multiple
times (King and Bearman 2009), the visibility and awareness of autism has increased and stigma
decreased, treatment options have expanded, and prevalence has risen, accompanied by a rise in
milder cases. Thus, there are likely to be multiple paths to an autism diagnosis during this period,
resulting in multiple types of children with autism. Regression, which models variation
around a conditional mean, is poorly suited to capturing this heterogeneity.
Steps in Cluster Analysis
The clustering analysis was accomplished using the cluster package for R. The first step
in identifying clusters is to choose a set of relevant variables on which the researcher expects the
data to cluster. I have created 12 dummy variables that capture the characteristics of the child,
the mother, and the autism symptoms and diagnosis (see Table 1 for a description of these
variables). Although paternal age has been shown in many studies to be an important risk factor
for autism, due to high rates of missing data on this variable and high correlation with maternal
age I do not include it as a clustering variable. [Note: this selection is still in progress, and the
final variables and clustering solution may change.] Next, these variables are used to produce a
distance matrix. This is an N x N matrix containing, for each observation, its dissimilarity from
every other observation. This dissimilarity can be thought of as a social distance, or a measure of
how many or how few variables each of the observations have in common. The distance matrix
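was calculated based on Manhattan distances, which are well-suited to binary variables: for 0/1
variables, the Manhattan distance between two observations is simply the number of variables on
which they differ.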
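
As an illustration of the distance matrix described above, the following Python sketch computes pairwise Manhattan distances for a few invented children. It is a sketch only, not the R code used in the analysis; the function names and the toy rows of 0/1 dummies are hypothetical.

```python
def manhattan(a, b):
    """Sum of absolute differences across variables."""
    return sum(abs(x - y) for x, y in zip(a, b))


def distance_matrix(rows):
    """Symmetric N x N matrix of pairwise Manhattan distances."""
    n = len(rows)
    d = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = manhattan(rows[i], rows[j])
    return d


# Hypothetical children, each described by four 0/1 dummy variables.
toy_rows = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 1],
]

D = distance_matrix(toy_rows)
# D[0][1] is 1 because the first two children differ on one variable;
# D[0][2] is 4 because the first and third differ on all four.
```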
This distance matrix is then plugged into a clustering algorithm. I chose the k-medoids
method as implemented in the R pam function, a robust version of the popular k-means algorithm
(Everitt et al. 2011b). Briefly, the basic approach is to use the dissimilarities to identify the
center of each cluster. Each observation is iteratively assigned to the cluster whose center it is
closest to, the distances are recalculated, and a new center is identified. The final clustering solution
then minimizes the total distance between the observations and the centers of their respective
clusters.
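
The assign-and-update logic described above can be sketched in a few lines of Python. This is a simplified alternating variant of k-medoids written for illustration; it is not the pam function from the R cluster package, which uses separate BUILD and SWAP phases, and the function names, starting medoids, and toy dissimilarities are all hypothetical.

```python
def assign(dist, medoids):
    """Label each observation with the index of its nearest medoid."""
    return [
        min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
        for i in range(len(dist))
    ]


def k_medoids(dist, initial_medoids, max_iter=100):
    """Alternate between assignment and medoid update until stable."""
    medoids = list(initial_medoids)
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        new_medoids = []
        for c, old in enumerate(medoids):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                new_medoids.append(old)  # keep the old center for an empty cluster
                continue
            # New center: the member with the smallest total distance to the rest.
            best = min(members, key=lambda m: sum(dist[m][j] for j in members))
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(dist, medoids)


# Toy dissimilarity matrix built from six points on a line (two obvious groups).
positions = [0, 1, 2, 10, 11, 12]
toy_dist = [[abs(p - q) for q in positions] for p in positions]

medoids, labels = k_medoids(toy_dist, initial_medoids=[0, 1])
```

Because the center of each cluster is an actual observation rather than an average, the method needs only the dissimilarity matrix, which is why it pairs naturally with binary data.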
An important step is determining the number of clusters. The analyst must specify the
number of clusters as an input to the k-medoids algorithm, but this quantity is a key question of
interest. I ran the clustering algorithm across a range of k values from 2 through 12. The
best-fitting solution, based on the criteria of mean silhouette index (which compares the cohesion of
a cluster to its separation from other clusters, and is higher when clusters are well-characterized) as well as
parsimony, was five clusters (Everitt et al. 2011b).2

2 More information on cluster validation is available from the author.
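
To make the silhouette criterion concrete, the sketch below computes a mean silhouette index from a dissimilarity matrix and compares two candidate solutions. It is illustrative Python rather than the R code used in the analysis, and the toy data and hand-specified labelings are hypothetical; in practice each labeling would come from running the clustering algorithm at a given k. For each observation, a is its mean distance to other members of its own cluster, b is its mean distance to the nearest other cluster, and the silhouette is (b - a) / max(a, b).

```python
def mean_silhouette(dist, labels):
    """Mean silhouette index; requires at least two clusters."""
    n = len(dist)
    clusters = set(labels)
    total = 0.0
    for i in range(n):
        own = [j for j in range(n) if labels[j] == labels[i] and j != i]
        if not own:
            continue  # a singleton contributes a silhouette of 0
        a = sum(dist[i][j] for j in own) / len(own)
        b = min(
            sum(dist[i][j] for j in range(n) if labels[j] == c) / labels.count(c)
            for c in clusters
            if c != labels[i]
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / n


# Toy dissimilarities from six points on a line, with two candidate solutions.
positions = [0, 1, 2, 10, 11, 12]
toy_dist = [[abs(p - q) for q in positions] for p in positions]
candidates = {
    2: [0, 0, 0, 1, 1, 1],
    3: [0, 0, 0, 1, 1, 2],
}

silhouettes = {k: mean_silhouette(toy_dist, labs) for k, labs in candidates.items()}
best_k = max(silhouettes, key=silhouettes.get)
```

In this toy case the two-cluster solution scores higher. The sketch selects on the silhouette alone, whereas the analysis above also weighs parsimony.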

After creating the clusters, the analyst must examine them to see if they make substantive
sense given what is known about the context. To do this, I examine the composition of each
cluster in order to understand what kinds of children have been aggregated into each group, and
which variable values are most salient to each. Finally, I plot these clusters over time and
administrative boundaries in order to reveal the temporal and spatial patterning of each subtype
of autism case, and to assess whether they are more prevalent in particular time periods or
regional centers.
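
The two descriptive steps in this paragraph, tabulating the composition of each cluster and tracking cluster shares across years or administrative units, can be sketched as follows. This is a minimal Python illustration under invented data; the function names, the two dummy variables, and the birth years are hypothetical and do not come from the analysis.

```python
from collections import Counter, defaultdict


def cluster_composition(rows, labels):
    """Proportion of members with each 0/1 attribute, by cluster."""
    sizes = Counter(labels)
    totals = {c: [0] * len(rows[0]) for c in sizes}
    for row, lab in zip(rows, labels):
        for k, value in enumerate(row):
            totals[lab][k] += value
    return {c: [t / sizes[c] for t in totals[c]] for c in sizes}


def cluster_share_by_group(groups, labels):
    """Share of each cluster within each group (e.g., birth year or regional center)."""
    counts = defaultdict(Counter)
    for g, lab in zip(groups, labels):
        counts[g][lab] += 1
    return {
        g: {c: n / sum(cnt.values()) for c, n in cnt.items()}
        for g, cnt in counts.items()
    }


# Hypothetical data: four children, two dummies, cluster labels, birth years.
toy_rows = [[1, 0], [1, 1], [0, 1], [0, 1]]
toy_labels = [0, 0, 1, 1]
toy_years = [1992, 1993, 1992, 1992]

composition = cluster_composition(toy_rows, toy_labels)
shares = cluster_share_by_group(toy_years, toy_labels)
```

The first result has the same layout as the proportion columns of a cluster composition table; the second gives, for each year or center, the relative size of each cluster.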
Results
Table 2 contains the composition of each cluster for the five-cluster solution. Some
variables are more salient to some clusters than others; next I discuss the key characteristics of
each cluster. [To do: calculate significance tests comparing each cluster to the overall sample
means.]
Beginning with Cluster 1 (about 21% of the group), it is immediately obvious that the
mothers are particularly salient to this classification. All of the children in this group have
mothers who are college graduates, and they are substantially older than those in the other clusters: 37%
are over 35 (compared to 23% for the population), and very few (<3%) are under 25. Children in
this group are unlikely to be diagnosed late, and have about half the rate of intellectual
disabilities as the population in general. Although they are not especially likely to be high
functioning, they are unlikely to be low functioning (especially on the communication domain).
We can think of these children as the autism cases associated with delayed fertility among the
highly educated. These cases tend to be less severe than others, unaccompanied by intellectual
disabilities, and identified earlier rather than later.

Table 2. Composition of Clusters (N and proportion of each cluster)
                              Cluster 1     Cluster 2     Cluster 3     Cluster 4     Cluster 5
                                   N  Prop       N  Prop       N  Prop       N  Prop       N  Prop
Female                         1,302  0.17   1,367  0.16     747  0.14     747  0.16   1,768  0.18
Low Birthweight (<2500 g)        673  0.09     689  0.08     482  0.09     351  0.08     806  0.08
Maternal Age
  Young (<25)                    196  0.03   1,242  0.14     989  0.19   3,620  0.79   2,283  0.23
  Middle (25-35)               4,591  0.61   5,500  0.62   3,051  0.59     734  0.16   5,789  0.58
  Older (>35)                  2,777  0.37   2,074  0.24   1,167  0.22     219  0.05   1,948  0.19
Maternal Education
  < High School                    0  0.00     619  0.07     826  0.16   3,502  0.77   1,553  0.15
  HS or Some College               0  0.00   6,317  0.72   3,118  0.60   1,017  0.22   8,467  0.85
  College Grad                 7,564  1.00   1,880  0.21   1,263  0.24      54  0.01       0  0.00
Diagnosed Late (Age 5+)        1,069  0.14   7,709  0.87     741  0.14   3,134  0.69       0  0.00
Intellectual Disability Dx       954  0.13   1,505  0.17   3,492  0.67   1,461  0.32   1,545  0.15
Social Functioning
  Low (<20 percentile)         1,071  0.14     704  0.08   3,986  0.77   1,070  0.23   1,611  0.16
  Medium (20-80 percentile)    5,211  0.69   4,935  0.56   1,098  0.21   2,608  0.57   6,834  0.68
  High (>80 percentile)        1,282  0.17   3,177  0.36     123  0.02     895  0.20   1,575  0.16
Communication Functioning
  Low (<20 percentile)           544  0.07      69  0.01   4,254  0.82     784  0.17   1,241  0.12
  Medium (20-80 percentile)    5,858  0.77   1,925  0.22     941  0.18   3,324  0.73   8,779  0.88
  High (>80 percentile)        1,162  0.15   6,822  0.77      12  0.00     465  0.10       0  0.00
Total (share of sample)        7,564  0.21   8,816  0.24   5,207  0.14   4,573  0.13  10,020  0.28
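The population benchmarks quoted in the text (for example, 23% of mothers over 35) are not printed in Table 2, but they can be recovered by summing the counts across clusters and dividing by the total sample size. The short sketch below does this for three rows; the dictionary keys and variable names are illustrative only, and the counts are copied from Table 2.

```python
# Cluster sizes and per-cluster counts (Clusters 1-5), copied from Table 2.
cluster_sizes = [7564, 8816, 5207, 4573, 10020]
rows = {
    "Maternal age > 35": [2777, 2074, 1167, 219, 1948],
    "Intellectual disability dx": [954, 1505, 3492, 1461, 1545],
    "Diagnosed late (age 5+)": [1069, 7709, 741, 3134, 0],
}

total = sum(cluster_sizes)  # full sample size

# Overall share = pooled count across clusters / full sample size.
overall = {name: sum(counts) / total for name, counts in rows.items()}

for name, share in overall.items():
    print(f"{name}: {share:.3f}")
```

Comparing each cluster's proportion with these pooled values reproduces statements such as Cluster 1 having about half the overall rate of intellectual disability diagnosis.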
Cluster 2 (24%) consists of quite mild, high functioning cases. In particular, these children
have mild communication symptoms (77% are categorized as high functioning on this domain) and
few are low functioning on either domain (8% on social and 1% on communication). They were also
very likely to be diagnosed at age 5 or later, which likely reflects the difficulty in identifying
their less severe symptoms as autism. Although the DDS does not provide services to those with an
Asperger's diagnosis³, these children may be closer to that part of the autism spectrum. Their
mothers do not stand out as dramatically as those in cluster 1; however, these children are
unlikely to have very young or poorly educated mothers.
Cluster 3 (14%), in contrast, contains the more severe cases. 77% and 82% are low functioning on
the social and communication domains, respectively, and two-thirds have a diagnosis of intellectual
disability. Although the differences are not enormous, these children are also more likely to be
male and to have been born with low birthweight than the overall sample. They have mothers who
are neither particularly old nor young, and are of average education (although a bit less likely
than average to have a college degree).
Cluster 4 (13%) is the smallest cluster, and as with cluster 1 the characteristics of the
mother are highly salient to classification. These mothers tend to be young (79% are under 25)
and poorly educated (77% did not complete high school and only 1% graduated from college).
This group has a higher than average rate of intellectual disability diagnosis, but their autism was
frequently identified at an older age (69% were diagnosed at age 5 or later). Although their social
and communication symptoms are not especially likely to be categorized as low functioning, they
are unlikely to be in the high functioning social category. [Note to self: should investigate the
role of race for this category.] It seems likely that the cognitive deficits of this group, combined
with the lack of maternal resources, delayed the identification of autism. This group is composed
mainly of relatively disadvantaged children whose autism is comorbid with intellectual disability.

3 The distinction between Asperger's Syndrome and Autistic Disorder is that people with
Asperger's have social deficits combined with restricted interests or repetitive behaviors,
but do not have sufficient communication deficits to be diagnosed with Autism.

Finally, Cluster 5 (28%) is the largest cluster and is closest to the sample mean on many
variables. The children's symptoms are similar to those of cluster 1, in that they have few
intellectual disability diagnoses, tend to be neither very high nor low functioning, and were all
diagnosed by age 4. The main difference is in the characteristics of their mothers: 85% graduated
from high school but not from college, and they are much less likely to be over age 35. More
than any other cluster, this group represents the typical modern autism case, and as Figure 1
shows below, it has risen most dramatically during the study period.
Figure 1 displays the cluster proportions of all children in the sample by birth year. By
tracing the rise (or fall) of particular sub-groups in the California autism population we can begin
to understand how the composition of autism cases has changed over time, and perhaps find
clues to the mechanisms producing each type of diagnosis. The first thing to notice about this
graph is that clusters 1 and 5 have risen the most dramatically and steadily during this period.
Recall that these are the two groups with non-severe symptoms, little ID, and early diagnosis.
These kinds of cases are making up an increasing proportion of the population; combined they are
18% of 1992 births but 66% of 2005 births.
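The quantity plotted in Figure 1 is, for each birth year, the share of that year's cases falling in each cluster. A minimal sketch of that tabulation follows; the function name `cluster_shares_by_year` and the toy records are hypothetical and are not the study data, so the resulting numbers illustrate only the calculation.

```python
from collections import Counter, defaultdict

def cluster_shares_by_year(records):
    """Map birth year -> {cluster: share of that year's cases}."""
    counts = defaultdict(Counter)
    for birth_year, cluster in records:
        counts[birth_year][cluster] += 1
    shares = {}
    for year, by_cluster in sorted(counts.items()):
        n = sum(by_cluster.values())  # all cases born in this year
        shares[year] = {c: k / n for c, k in sorted(by_cluster.items())}
    return shares

# Toy (birth_year, cluster) records for illustration; NOT the study data.
toy = [(1992, 2), (1992, 2), (1992, 3), (1992, 5),
       (2005, 1), (2005, 5), (2005, 5), (2005, 2)]

shares = cluster_shares_by_year(toy)

# Combined share of clusters 1 and 5 in each birth year.
combined_1_and_5 = {y: s.get(1, 0) + s.get(5, 0) for y, s in shares.items()}
print(combined_1_and_5)
```

Normalizing within birth year, rather than across the whole sample, is what allows a cluster to grow in absolute numbers while shrinking as a proportion of annual cases.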

[Figure 1: line chart. Vertical axis: Proportion of Annual Cases (0 to 0.5). Horizontal axis: Birth
Year (1992-2005). Series: 1 Older Educated Mothers; 2 Late Dx, Good Communication; 3 Severe
Cases with ID; 4 Young Low Education Mothers; 5 "Typicals".]
Figure 1. Proportion of All Cases in Each Cluster, By Birth Year

On the other hand, the pattern for cluster 2 shows a steady decline from the most
common cluster in 1992 to the least common in 2005. This reflects the declining age of diagnosis
over time as autism is more likely to be identified at a pre-school age, as has been documented in
other studies (Fountain, King, and Bearman 2011). This group also tends to be high functioning
on communication. Although the symptom severity of the autism population in California has
tended to become milder, the decline of this cluster may reflect the changing association of
symptoms with socioeconomic status. In recent years, higher functioning, particularly on the
communication domain, has become increasingly associated with socioeconomic status. Thus, one
possibility is that during the study period less severe symptoms have become more closely
associated with maternal age and education, leading these children to be assigned to groups 1
and 5 rather than 2.
Clusters 3 and 4 show less dramatic temporal patterns. Both have remained relatively
steady, with perhaps a slight decline over time as a proportion of the population. Group 3,
composed of the most severe cases with comorbid ID, has a slight bump in 1995. At the risk of
overinterpreting what may be a random fluctuation, this bump comes directly after the 1994
revision to autism diagnostic criteria in the DSM-IV, which other researchers have linked to an
increase in autism cases among those with ID (King and Bearman 2009). Group 4, which has
young and poorly educated mothers, also shows a slight downward trend. Although the number
of individuals in this cluster has risen over time, as a proportion of the population it has
diminished as the total autism population grew more quickly. These children continue to exist in
the population, although autism has increasingly become a diagnosis for the children of more
educated parents.
Next, we examine how these clusters map onto the system of regional centers. These are
the local institutions that are responsible for establishing or confirming autism diagnosis,
monitoring symptoms and needs, and coordinating service provision for clients. Although all
regional centers use standardized diagnostic and evaluative instruments and follow the same
best practices guidelines, they can also differ in important ways, such as the personalities and
preferences of local leadership as well as the demographic composition of the regional center
catchment area.
In Figure 2 below, each bar depicts the cluster composition of a regional center (RC).
Bar width is scaled by the size of the autism caseload of that RC, and the bars are presented in
alphabetical order. There is substantial variability in the kinds of autism clients diagnosed and
served in each RC. For example, cluster 1 (older, highly educated mothers) is much more
common in RCs located in well-off areas such as Orange County, Golden Gate (San Francisco)
and LA's West Side. On the other hand, cluster 4 (young, poorly educated mothers) is most
common among RCs in the agricultural Kern and Central Valley as well as East and South
Central LA. The severe cases with comorbid ID in cluster 3 are most common in the agricultural
Valley Mountain RC (containing Stanislaus and San Joaquin Counties) as well as San Diego and
Orange County. It is unclear why this cluster is so large in southern California; more
analysis of the association between neighborhood characteristics and these clusters is needed to
understand the contextual production of autism subtypes.

Figure 2. Cluster Composition of Client Population by Regional Center
[Additional analyses to come:
Clusters by the characteristics of their neighborhoods: density of pediatricians and child
psychiatrists, density of autism cases, socioeconomic composition.]
Discussion
In this paper I have identified and described a heterogeneous set of five sub-groups within
the population of children with autism born in California between 1992 and 2005. These groups
differ in phenotypic, biological, and sociological ways. Although this analysis is descriptive in
nature, the varying associations among these sets of factors combine to reveal important patterns,
such as the way that symptom severity and maternal education are often linked.
Moreover, the sorting of children into these subgroups is not uniform over space and
time. Some sub-types were more common in the early years of the autism epidemic, and other
groups have become dominant as the prevalence of autism has risen. This is no coincidence: as
visibility has increased, treatment options improved, and stigma diminished, autism has become
less a very severe disorder often accompanied by ID, and more common in its milder forms. As
the autism population has changed, its association with socioeconomic variables has changed as
well. So as parents increasingly delay fertility, we also observe the rapid rise of non-severe
autism among the children of older, highly-educated mothers.
These findings also have important implications for autism research in general. If there is
no average or typical autism case, with others normally distributed about the center, but rather
a set of distinct and variable subgroups, then we need to be careful about the inferences made
based on regression analysis. At the very least, we need to take time and place more seriously in
our analyses and avoid clumping together individuals who were diagnosed under very different
regimes. The context in which autism is being diagnosed is changing, and so the kinds and
magnitudes of risk factors are changing as well. Several other papers in this area have
demonstrated the importance of the shifting context of autism, with respect to immigration policy
(Fountain and Bearman 2011), parental age (Liu, Zerubavel, and Bearman 2010; King et al. 2009),
diagnostic change (King and Bearman 2009), and social influence (Liu, King, and Bearman 2010).
This research does have some limitations. The data come from linked administrative
datasets that were collected for non-scientific purposes, and thus are not always ideal. Although
maternal education does appear to be highly salient to the identification of these clusters, it is
only a proxy for socioeconomic status, and more detailed information on this would be
preferable. Similarly, the variables capturing severity of autism symptoms were chosen to assess
service needs, and are not consistent across the study period. Finally, as a descriptive technique,
clustering cannot evaluate causal explanations; it can only find patterns in the data and
suggest avenues for future study.
The next step in this research is to look at the spatial and socioeconomic distribution of
these autism sub-groups in a more detailed way, by locating these children in their
neighborhoods of birth and diagnosis. We will then be able to link in data from other sources on the
density of pediatricians and child psychiatrists, number of other autism cases, as well as census
data on the socioeconomic composition of neighborhoods in order to understand how the
characteristics of places, including neighborhood resources, contribute to the production of these
clusters.
Conclusion
This is the first cluster analysis of autism on a large and representative state-wide
population. It is also the first to jointly consider the phenotypic, biological, and social factors that
contribute to autism diagnoses. As such, it has revealed not only the patterns of association
between these factors among children with autism, but also their variation over space and time.

References
ADDMN. 2007. Prevalence of Autism Spectrum Disorders-Autism and Developmental
Disabilities Monitoring Network, 14 Sites, United States, 2002. Morbidity and Mortality
Weekly Report Surveillance Summaries 56:12-28.
Burchard, Esteban Gonzalez et al. 2003. The Importance of Race and Ethnic Background in
Biomedical Research and Clinical Practice. New England Journal of Medicine 348(12):1170-75.
Cacioppo, John T., James H. Fowler, and Nicholas A. Christakis. 2009. Alone in the Crowd:
The Structure and Spread of Loneliness in a Large Social Network. Journal of
Personality and Social Psychology 97(6):977-91.
Centers for Disease Control and Prevention. 2006. Mental Health in the United States: Parental
Report of Diagnosed Autism in Children Aged 4-17 Years--United States, 2003-2004.
MMWR. Morbidity and Mortality Weekly Report 55(17):481-86.
Cheslack-Postava, Keely, Kayuet Liu, and Peter S. Bearman. 2011. Closely Spaced Pregnancies
Are Associated With Increased Odds of Autism in California Sibling Births. Pediatrics
127(2):246-253.
Chess, Stella. 1971. Autism in Children with Congenital Rubella. Journal of Autism and
Childhood Schizophrenia 1(1):33-47.
Chess, Stella. 1977. Follow-up Report on Autism in Congenital Rubella. Journal of Autism
and Childhood Schizophrenia 7(1):69-81.
Christakis, Nicholas A. and James H. Fowler. 2007. The Spread of Obesity in a Large Social
Network over 32 Years. New England Journal of Medicine 357(4):370-79.
Christakis, Nicholas A. and James H. Fowler. 2008. The Collective Dynamics of Smoking in a
Large Social Network. New England Journal of Medicine 358(21):2249-58.
Cohen-Cole, E. and J. M. Fletcher. 2008. Detecting Implausible Social Network Effects in Acne,
Height, and Headaches: Longitudinal Analysis. BMJ 337(dec04 2):a2533-a2533.
Croen, Lisa A., Judith K. Grether, Jenny Hoogstrate, and Steve Selvin. 2002. The Changing
Prevalence of Autism in California. Journal of Autism and Developmental Disorders
32(3):207-15.
Croen, Lisa A., Judith K. Grether, and Steve Selvin. 2002. Descriptive Epidemiology of Autism
in a California Population: Who Is at Risk? Journal of Autism and Developmental
Disorders 32(3):217-24.
Croen, Lisa A., Daniel V. Najjar, Bruce Fireman, and Judith K. Grether. 2007. Maternal and
Paternal Age and Risk of Autism Spectrum Disorders. Archives of Pediatrics &
Adolescent Medicine 161(4):334-40.

Developmental Disabilities Monitoring Network Surveillance Year 2010 Principal Investigators.
2014. Prevalence of Autism Spectrum Disorder among Children Aged 8 Years - Autism
and Developmental Disabilities Monitoring Network, 11 Sites, United States, 2010.
Morbidity and Mortality Weekly Report. Surveillance Summaries (Washington, D.C.:
2002) 63 Suppl 2:1-21.
Division of Cancer Prevention and Control. 2007. Link Plus. Atlanta, GA: Centers for Disease
Control and Prevention.
Durkin, Maureen S. et al. 2008. Advanced Parental Age and the Risk of Autism Spectrum
Disorder. American Journal of Epidemiology 168(11):1268-76.
Durkin, Maureen S. et al. 2010. Socioeconomic Inequality in the Prevalence of Autism
Spectrum Disorder: Evidence from a U.S. Cross-Sectional Study. PLoS ONE
5(7):e11551.
Eaves, Linda C., Helena H. Ho, and David M. Eaves. 1994. Subtypes of Autism by Cluster
Analysis. Journal of Autism and Developmental Disorders 24(1):3-22.
Everitt, Brian S., Sabine Landau, Morven Leese, and Daniel Stahl. 2011a. An Introduction to
Classification and Clustering. Pp. 1-13 in Cluster Analysis. John Wiley & Sons, Ltd.
Retrieved January 2, 2016
(http://onlinelibrary.wiley.com/doi/10.1002/9780470977811.ch1/summary).
Everitt, Brian S., Sabine Landau, Morven Leese, and Daniel Stahl. 2011b. Optimization
Clustering Techniques. Pp. 111-42 in Cluster Analysis. John Wiley & Sons, Ltd.
Retrieved January 2, 2016
(http://onlinelibrary.wiley.com/doi/10.1002/9780470977811.ch5/summary).
Fombonne, E. 2005. Epidemiology of Autistic Disorder and Other Pervasive Developmental
Disorders. The Journal of Clinical Psychiatry 66:3.
Fountain, C., M. D. King, and P. S. Bearman. 2011. Age of Diagnosis for Autism: Individual
and Community Factors across 10 Birth Cohorts. Journal of Epidemiology and
Community Health 65(6):503-10.
Fountain, C., A. S. Winter, and P. S. Bearman. 2012. Six Developmental Trajectories
Characterize Children with Autism. Pediatrics 129(5):e1112-e1120.
Fountain, Christine et al. 2015. Association between Assisted Reproductive Technology
Conception and Autism in California, 1997-2007. American Journal of Public Health
105(5):963-71.
Fountain, Christine and Peter Bearman. 2011. Risk as Social Context: Immigration Policy and
Autism in California. Sociological Forum (Randolph, N.J.) 26(2):215-40.

Fountain, Christine, Marissa D. King, and Peter S. Bearman. 2010. Age of Diagnosis for
Autism: Individual and Community Factors across 10 Birth Cohorts. Journal of
Epidemiology and Community Health.
Fowler, James H. and Nicholas A. Christakis. 2008. Dynamic Spread of Happiness in a Large
Social Network: Longitudinal Analysis over 20 Years in the Framingham Heart Study.
BMJ 337(dec04 2):a2338-a2338.
Freitag, C. M. 2006. The Genetics of Autistic Disorders and Its Clinical Relevance: A Review
of the Literature. Molecular Psychiatry 12(1):2-22.
Garip, Filiz. 2012. Discovering Diverse Mechanisms of Migration: The Mexico-US Stream
1970-2000. Population and Development Review 38(3):393-433.
Georgiades, Stelios et al. 2007. Structure of the Autism Symptom Phenotype: A Proposed
Multidimensional Model. Journal of the American Academy of Child & Adolescent
Psychiatry 46(2):188-96.
Hultman, Christina M., Par Sparen, and Sven Cnattingius. 2002. Perinatal Risk Factors for
Infantile Autism. Epidemiology 13(4):417-23.
Hvidtjørn, D. et al. 2009. Cerebral Palsy, Autism Spectrum Disorders, and Developmental
Delay in Children Born after Assisted Conception: A Systematic Review and Meta-
Analysis. Archives of Pediatrics and Adolescent Medicine 163(1):72.
Hvidtjørn, D. et al. 2011. Risk of Autism Spectrum Disorders in Children Born after Assisted
Conception: A Population-Based Follow-up Study. Journal of Epidemiology and
Community Health 65(6):497-502.
Liu, Kayuet, Noam Zerubavel, and Peter Bearman. 2010. Social Demographic Change and
Autism. Demography 47(2):327-43.
King, Marissa D. and Peter S. Bearman. 2009. Diagnostic Change and the Increased Prevalence
of Autism. International Journal of Epidemiology 38(5):1224-34.
King, Marissa D. and Peter S. Bearman. 2011. Socioeconomic Status and the Increased
Prevalence of Autism in California. American Sociological Review 76(2):320-46.
King, Marissa D., Christine Fountain, Diana Dakhlallah, and Peter S. Bearman. 2009. Estimated
Autism Risk and Older Reproductive Age. American Journal of Public Health 99(9):1673-79.
Larsson, Heidi Jeanet et al. 2005. Risk Factors for Autism: Perinatal Factors, Parental
Psychiatric History, and Socioeconomic Status. American Journal of Epidemiology
161(10):916-25.
Link, B. G., M. E. Northridge, J. C. Phelan, and M. L. Ganz. 1998. Social Epidemiology and
the Fundamental Cause Concept: On the Structuring of Effective Cancer Screens by
Socioeconomic Status. The Milbank Quarterly 76(3):375-402.

Link, B. G. and J. Phelan. 1995. Social Conditions as Fundamental Causes of Disease. Journal
of Health and Social Behavior 35:80-94.
Link, Bruce G. and Jo Phelan. 2009. The Social Shaping of Health and Smoking. Drug and
Alcohol Dependence 104(Supplement 1):S6-S10.
Liptak, Gregory S. et al. 2008. Disparities in Diagnosis and Access to Health Services for
Children with Autism: Data from the National Survey of Children's Health. Journal of
Developmental and Behavioral Pediatrics: JDBP 29(3):152-60.
Liu, Ka-Yuet, Marissa D. King, and Peter S. Bearman. 2010. Social Influence and the Autism
Epidemic. American Journal of Sociology 115(5):1387-1434.
Liu, Kayuet and Peter S. Bearman. 2012. Focal Points, Endogenous Processes, and Exogenous
Shocks in the Autism Epidemic. Sociological Methods & Research. Retrieved October 1,
2012 (http://smr.sagepub.com/content/early/2012/09/17/0049124112460369).
Lyons, Russell. 2010. The Spread of Evidence-Poor Medicine via Flawed Social-Network
Analysis. 1007.2876. Retrieved August 1, 2011 (http://arxiv.org/abs/1007.2876).
Mandell, David S. et al. 2009. Racial/ethnic Disparities in the Identification of Children with
Autism Spectrum Disorders. American Journal of Public Health 99(3):493-98.
Marmot, M. G. 2004. The Status Syndrome: How Social Standing Affects Our Health and
Longevity. Macmillan.
Mazumdar, Soumya, Marissa D. King, Ka-Yuet Liu, Noam Zerubavel, and Peter S. Bearman.
2010. The Spatial Structure of Autism in California, 1993-2001. Health & Place
16(3):539-46.
Mazumdar, Soumya, Alix Winter, Ka-Yuet Liu, and Peter Bearman. 2013. Spatial Clusters of
Autism Births and Diagnoses Point to Contextual Drivers of Increased Prevalence.
Social Science & Medicine (1982) 95:87-96.
Palmer, Raymond F., Tatjana Walker, David S. Mandell, Bryan Bayles, and Claudia S. Miller.
2009. Explaining Low Rates of Autism Among Hispanic Schoolchildren in Texas.
American Journal of Public Health AJPH.2008.150565.
Pescosolido, Bernice A. 1992. Beyond Rational Choice: The Social Dynamics of How People
Seek Help. American Journal of Sociology 97(4):1096.
Phelan, Jo C., Bruce G. Link, and Parisa Tehranifar. 2010. Social Conditions as Fundamental
Causes of Health Inequalities. Journal of Health and Social Behavior 51(1 suppl):S28-S40.
Prior, Margot et al. 1998. Are There Subgroups within the Autistic Spectrum? A Cluster
Analysis of a Group of Children with Autistic Spectrum Disorders. The Journal of Child
Psychology and Psychiatry and Allied Disciplines 39(06):893-902.

Rablen, Matthew D. and Andrew J. Oswald. 2008. Mortality and Immortality: The Nobel Prize
as an Experiment into the Effect of Status upon Longevity. Journal of Health Economics
27(6):1462-71.
Redelmeier, Donald A. and Sheldon M. Singh. 2001. Survival in Academy Award-Winning
Actors and Actresses. Annals of Internal Medicine 134(10):955-962.
Ronald, Angelica et al. 2006. Genetic Heterogeneity between the Three Components of the
Autism Spectrum: A Twin Study. Journal of the American Academy of Child and
Adolescent Psychiatry 45(6):691-99.
Russell, Ginny, Colin Steer, and Jean Golding. 2010. Social and Demographic Factors That
Influence the Diagnosis of Autistic Spectrum Disorders. Social Psychiatry and
Psychiatric Epidemiology. Retrieved December 15, 2010
(http://www.springerlink.com/content/a67371l826m1xl76/).
Schieve, Laura A. et al. 2015. Does Autism Diagnosis Age or Symptom Severity Differ Among
Children According to Whether Assisted Reproductive Technology Was Used to Achieve
Pregnancy? Journal of Autism and Developmental Disorders 45(9):2991-3003.
Shalizi, C. R. and A. C. Thomas. 2011. Homophily and Contagion Are Generically Confounded
in Observational Social Network Studies. Sociological Methods & Research 40(2):211-39.
Shattuck, Paul T. et al. 2009. Timing of Identification among Children with an Autism
Spectrum Disorder: Findings from a Population-Based Surveillance Study. Journal of
the American Academy of Child and Adolescent Psychiatry 48(5):474-83.
Shi, Limin, S. Hossein Fatemi, Robert W. Sidwell, and Paul H. Patterson. 2003. Maternal
Influenza Infection Causes Marked Behavioral and Pharmacological Changes in the
Offspring. Journal of Neuroscience 23(1):297-302.
Stevens, Michael C. et al. 2000. Subgroups of Children With Autism by Cluster Analysis: A
Longitudinal Examination. Journal of the American Academy of Child & Adolescent
Psychiatry 39(3):346-52.

Copyright of Conference Papers -- American Sociological Association is the property of
American Sociological Association and its content may not be copied or emailed to multiple
sites or posted to a listserv without the copyright holder's express written permission.
However, users may print, download, or email articles for individual use.
