
3

READ MY LIPS: MEASURING PERSONALITY THROUGH LEGISLATIVE SPEECH

The key to studying the role of each dimension of the Big Five on legislator behavior lies in
measuring legislator personality in a consistent, reliable manner. While psychologists have
developed a number of questionnaire inventories to assess Big Five traits for respondents in
both surveys and the laboratory, there are several reasons why such techniques would likely
be inappropriate to study the personalities of elected officials. Thus, the task at hand is to find
an appropriate way to measure personality traits over time and, to the extent possible with
elected officials, to generate estimates for all sitting members of the U.S. Congress.
In this chapter, we present a method that links traditional psychometric approaches with
advances in machine learning in order to assess personality traits based on speeches and text.
First, we review the traditional psychometric methods for assessing personality traits. We
argue that these on their own are inappropriate for the study of elected officials. Leveraging a
unique psychological study (Pennebaker & King 1999) that measures both traditional survey
inventories and provides written respondent corpora, we show how personality traits may
be culled from text. Thereafter, we apply this method to a body of legislative speeches in
the U.S. Congress over the past two decades. Last, in light of political science research that
extracts ideology from legislative speech, we show that this method is not simply regurgitating
a measure of legislator preferences.


3.1 LIMITATIONS OF EXISTING APPROACHES FOR ELECTED OFFICIALS

Despite the tremendous advances in the measurement of the Big Five, most political science applications using these metrics involve surveys of voters (e.g. Caprara, Barbaranelli &
Zimbardo 2002, Gerber et al. 2011a). While enlightening, these studies tell us precious little
(if anything) about the personality traits of elected officials. An obvious, simple solution to
this deficit would be to survey legislators ourselves. Unfortunately, such an approach would
likely be impractical for a number of reasons. First, survey or lab-based instruments are only
implementable in the present, thus precluding us from being able to look at the dynamics
of personality over time. Second, even if we restrict ourselves to contemporary Congresses,
there is no reason to believe that legislators would be willing to take such inventories. Even
if responses were possible to obtain, such estimates would be subject to selection bias and,
possibly, strategic responses to questions.
We are certainly not the first scholars to recognize this limitation, which is likely
to blame for the absence of virtually any systematic study of legislator personality traits. To
our knowledge, only one study (and a recent one at that) has attempted to apply traditional
survey-based inventories with legislators (Dietrich et al. 2012). The study was focused on
state legislators from only three states: Maine, Arizona, and Connecticut. In line with our
concerns above, response rates were low, ranging from 17% to 26% of legislators. Perhaps
even more troubling, the responses of legislators to the Big Five displayed huge desirability
bias. On each of the dimensions, in excess of 77% (and in many cases, more than 90%) of
legislators responded in a way that conveys the positive side of each dimension. Indeed, for
the cases of Agreeableness, Openness, and Emotional Stability, the percentages of legislators
identifying as non-agreeable, closed, and neurotic were in the low single digits.


While these results may be accurate, they more plausibly suggest that survey instruments, even when
feasible to administer, will yield low response rates and uninformative personality profile
estimates. Other methods are needed.¹

3.2 USING TEXT TO MEASURE PERSONALITY TRAITS

To deal with the limitations discussed in the previous section, we draw on a recent literature in machine learning that seeks to connect personality traits with both written and spoken
words (Golbeck, Robles, Edmondson & Turner 2011, Li & Chignell 2010, Mairesse, Walker,
Mehl & Moore 2007, Mairesse & Walker 2008, Mairesse & Walker 2010, Schuller, Steidl, Batliner, Burkhardt, Devillers, Müller & Narayanan 2013). This new and exciting literature uses
traditional psychometric personality inventories in conjunction with written texts, Tweets,
and auditory transcriptions to train predictive models for personality. Once the known personalities of a subset of authors are calibrated with their linguistic usage, virgin texts can be
assessed for personality content, even in the absence of the true personalities as measured by
traditional inventories.
In a foundational piece in this literature, Mairesse et al. (2007) develop a widely-applicable
method for generating personality estimates from speech and text. Using Pennebaker &
King's (1999) corpus of nearly 1.9 million words from laboratory experiments, and Mehl,
Gosling & Pennebaker's (2006) corpus of approximately 100,000 words from recorded conversations, Mairesse et al. (2007) train a number of machine learning models to best predict
personality traits. Machine learning methods are a class of models that seek to predict an observed output with optimal combinations of features. The models are trained on a subset
of data and the estimates from this process are used to predict the rest of the data using only
right-hand-side variables. A simple example would be to perform a linear regression of some
known dependent variable $y$ on a collection of independent variables $x_1, x_2, \ldots, x_n$ for, say,
¹ Also see Ramey, Klingler & Hollibaugh (Forthcoming).


half a sample of data. Then, using the estimated regression coefficients, we would generate
predictions of $y$ using only the $x_j$'s for the remaining sample.
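This split-sample logic can be sketched in a few lines; the data below are simulated purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: y depends linearly on two right-hand-side variables plus noise.
X = rng.normal(size=(200, 2))
y = X @ np.array([2.0, -1.0]) + rng.normal(scale=0.1, size=200)

# "Train" on the first half of the sample.
X_train, y_train = X[:100], y[:100]
X_test, y_test = X[100:], y[100:]

# Estimate regression coefficients (with an intercept) by least squares.
design = lambda M: np.column_stack([np.ones(len(M)), M])
beta, *_ = np.linalg.lstsq(design(X_train), y_train, rcond=None)

# Predict the held-out half using only the right-hand-side variables.
y_pred = design(X_test) @ beta
```

With low noise, the out-of-sample predictions track the held-out outcomes closely; the same logic underlies predicting personality for authors whose survey scores were never observed.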
In this case, Mairesse et al. (2007) use linguistic features of both written and spoken language
to predict personality traits. Crucially for our purposes, words are categorized according to Pennebaker, Francis & Booth's (2001) Linguistic Inquiry and Word Count (LIWC) dictionary
(2001 edition), as well as Coltheart's (1981) MRC Psycholinguistic Database (MRCPD).² Doing
so allows scholars to generalize to different domains. Both the LIWC and MRCPD search
for linguistic features in a collection of texts, such as the number of second-person pronouns,
punctuation marks, six-letter words, and more. After preprocessing the data using these dictionaries, Mairesse et al. (2007) train several machine learning algorithms on a random subsample
of the data. They find that Support Vector Machines for Regression (SMOreg) best recover
personality measures of respondents in written trials. Therefore, we opt to use SMOreg
in what follows. That said, for our purposes, which model we employ seems to make little
substantive difference.
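As a concrete illustration, scikit-learn's epsilon-SVR can stand in for Weka's SMOreg (both implement support vector regression); the feature counts and trait scores below are simulated, not the Pennebaker data.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(1)

# Stand-ins for LIWC/MRCPD feature counts (rows = authors, columns = categories).
features = rng.poisson(lam=5.0, size=(300, 10)).astype(float)
# Stand-in survey-based trait scores on a 1-7 scale, loosely tied to two features.
trait = 4.0 + 0.1 * (features[:, 0] - features[:, 1]) + rng.normal(scale=0.2, size=300)

# Standardize the features, then fit support vector regression on a subsample.
model = make_pipeline(StandardScaler(), SVR(kernel="linear"))
model.fit(features[:200], trait[:200])

# Score "virgin" texts whose authors never took a personality inventory.
predicted = model.predict(features[200:])
```
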
Again, it is crucial to note that the Mairesse et al. (2007) approach does not rely on specific
words, but rather the psycholinguistic properties of the words as measured through the LIWC
and MRCPD. To illustrate this more concretely, consider an example from popular culture.
It is well known that the transition from the rock music of the 1980s to that of the 1990s was
a stark and difficult one. Whilst the glam rock of the late 1980s was focused on living life and
having fun, the 1990s world of grunge was "a darker, heavier type of sound" (Witmer 2010,
24). One might even say that the personalities conveyed by the singers and authors of the
songs of the different eras were quite different.
Though the content of any two songs will be inevitably different, the use of the LIWC
allows us to abstract away from those specifics and hone in on the relative frequencies of
word categories. These categories are broad and known to be correlated with personality
traits (Pennebaker, Francis & Booth 2001). To that end, Figure 3.1 presents bar graphs of
the top fifteen LIWC categories for two archetypal songs from each era: Bon Jovi's "You
Give Love (a Bad Name)" and Nirvana's "Smells Like Teen Spirit."

[Figure 3.1: LIWC Comparison of Bon Jovi and Nirvana. Paired bar charts of feature counts for the top fifteen LIWC categories in (a) "You Give Love a Bad Name" and (b) "Smells Like Teen Spirit."]

² The categories are found in the appendix to this chapter.

We notice readily that the
top word categories are quite different across the songs. Bon Jovi tends to sing a lot about
the present tense and emotions, with references to others, sex, and doing things. Nirvana
uses a lot of cognitive mechanisms, relativistic language, and, when emotions are discussed at
all, negative emotion words. Notably, Bon Jovi focuses more on the second person whereas
Nirvana focuses on the first person.
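The underlying operation is simple dictionary lookup: map each token to zero or more categories and tally the hits. The toy dictionary below is hypothetical; the real LIWC 2001 dictionary covers roughly 2,300 words and word stems.

```python
from collections import Counter

# Hypothetical mini-dictionary; the real LIWC maps words/stems to many categories.
CATEGORIES = {
    "you": ["You", "Othref"],
    "i": ["Self"], "me": ["Self"],
    "love": ["Posemo", "Affect"],
    "bad": ["Negemo", "Affect"],
}

def category_counts(text):
    """Tally LIWC-style category hits, abstracting away from the specific words."""
    counts = Counter()
    for word in text.lower().split():
        for category in CATEGORIES.get(word.strip(".,!?\"'"), []):
            counts[category] += 1
    return counts

counts = category_counts("You give love a bad name")
```

Two lyrically unrelated songs can thus be compared on the same axes, because only the category frequencies, not the words themselves, enter the analysis.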
This example, however simple it may be, illustrates how two distinct texts about two different subjects may be processed using the LIWC to gain insight into the personalities and


cognition of the singers and songwriters. It is this principle that we rely on as we seek to
measure the manifest personalities of politicians.

3.3 MEASURING PERSONALITY: FROM SPEECHES TO SCORES

Given the generality of the Mairesse et al. (2007) approach, we seek to apply their method to
legislative speech. To do this, we need legislator speeches to feed into the pre-trained models.
Since we desire personality traits for legislators over as wide a time period as possible, the
Congressional Record is perhaps the single best option available. There are, of course, issues
with this. Since speeches in the Congressional Record are public, they might be written and
delivered strategically. Specifically, legislators' speeches may seek to convey ideological preferences, constituency preferences, or to mimic some sort of generic leadership profile. The first
two concerns are easily dismissed; we demonstrate below that our measures of personality,
while correlating with standard measures of legislator ideal points, explain small proportions
of the overall variance. This suggests (a) that personality traits are not simply a recitation of
ideology in different terms and (b) that they are able to explain additional facets of behavior. The
last concern is only an issue if legislators try to portray insincere personality profiles in their
speeches. Again, this is likely not an issue. Since we are using the entire corpus of speeches delivered by every legislator, it would be exceedingly difficult to maintain a false profile over thousands
of words of speech. Indeed, one-shot surveys are likely more susceptible to such short-term
strategic manipulation. Even if these problems plague our data, they would attenuate our
results. Additionally, if some legislators were being sincere whilst others were being strategic,
the attenuation would be even worse and our estimated personality traits would likely explain
little of legislator behavior.
These concerns aside, we might still worry that the psycholinguistic content of legislator
floor speeches might differ substantially from the Pennebaker essay corpus. To address this,
Figure 3.2 plots the hyperbolic arcsine-transformed mean LIWC category usage from the Pen-


nebaker data against the hyperbolic arcsine-transformed mean usage of members of the 114th
House of Representatives. The hyperbolic arcsine can be thought of as a logarithmic transformation for data with zeros; we use it because some categories are much larger than others and,
as such, visualization of the relationship is greatly improved. Each variable label is a LIWC
2001 category. As we see, the correlation on the untransformed scale is approximately 0.99;
the rank correlation coefficient across the two is 0.72 (p < 0.001). Consequently, though the
substantive content is surely different, legislators' speeches do not differ psycholinguistically
from those of the laboratory participants.
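The transform itself is one call in most numerical libraries; the category means below are simulated stand-ins for the two corpora.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated mean usage per LIWC category in two corpora (heavy-tailed, near zero for some).
lab_means = rng.lognormal(mean=1.0, sigma=1.5, size=74)
floor_means = lab_means * rng.lognormal(mean=0.0, sigma=0.3, size=74)

# arcsinh(x) = log(x + sqrt(x^2 + 1)): behaves like log for large x,
# but unlike log it is defined (and equal to 0) at x = 0.
lab_t, floor_t = np.arcsinh(lab_means), np.arcsinh(floor_means)

def rank_corr(a, b):
    """Spearman rank correlation as the Pearson correlation of ranks (no ties here)."""
    rank = lambda v: np.argsort(np.argsort(v))
    return np.corrcoef(rank(a), rank(b))[0, 1]

pearson = np.corrcoef(lab_means, floor_means)[0, 1]  # untransformed scale
rho = rank_corr(lab_t, floor_t)
```
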

[Figure 3.2: Comparing LIWC (2001) Usage between the Pennebaker Corpus and Floor Speeches. Scatterplot of mean LIWC category usage (hyperbolic arcsine transform), Pennebaker and King corpus against House member mean usage in 2014; each point is a LIWC 2001 category. Correlation = 0.99.]

All of these concerns having been addressed, we apply Mairesse et al.'s (2007) SMOreg
model to the entire corpus of legislative floor speech by every sitting member of the House
of Representatives and Senate from the 104th–113th Congresses. The estimation procedure
is straightforward. First, we process legislators' speeches in a given Congress (or year, as appropriate) through both the LIWC 2001 and MRCPD to get counts and proportions of word
usage across all LIWC and MRCPD categories. We process the speeches by time period so as
to account for any time-dependent language (e.g., references to terrorism in the aftermath of
September 11, 2001). Second, we standardize the LIWC and MRCPD results from the previous step and plug these into the Mairesse et al. (2007) corpus-level models. This process allows
for better within-domain comparability. For example, all legislators might use more six letter
words than any lab respondent. Standardizing allows us to compare usage relative to the mean
legislator.
Third, we repeat this process for every Congress (or year). Fourth, and most
critically, we jackknife legislator $i$'s personality on dimension $d$ in Congress $c$
($c = 1, 2, \ldots, C_i$, where $C_i$ is the number of Congresses in which legislator $i$ served).
Specifically, legislator $i$'s jackknifed score on dimension $d$ in Congress $c$ is
$$\tilde{\theta}_{idc} = \frac{1}{C_i - 1} \sum_{c' \neq c} \hat{\theta}_{idc'}.$$
In words, a legislator's personality during a particular Congress is the average of their estimated
personality during all other time periods. This correction addresses potential endogeneity
between language and desired behavior within a specific timeframe. For example, a legislator might use agreeable language to acquire cosponsors in a given Congress whether or not
s/he is actually Agreeable. We have tried other ways of getting around this issue, including
computing a single lifetime arithmetic mean for all members. Doing this is not substantively
different from our approach and unnecessarily complicates the empirical tests in subsequent
chapters.
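The leave-one-out averaging is a one-liner; the trait scores below are hypothetical, not actual estimates.

```python
import numpy as np

def jackknife_scores(scores_by_congress):
    """Score for Congress c = mean of the legislator's scores in all *other* Congresses."""
    scores = np.asarray(scores_by_congress, dtype=float)
    return (scores.sum() - scores) / (len(scores) - 1)

# Hypothetical per-Congress estimates for one legislator on one trait.
raw = np.array([3.8, 4.0, 3.9, 4.2, 4.1])
jackknifed = jackknife_scores(raw)
```
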
Last, we generate measures of uncertainty using a sentence-level bootstrap (Lowe &
Benoit 2011). Specifically, assume that legislator $i$ uses $N_{ic}$ sentences during Congress $c$. For
each legislator, we resample $N_{ic}$ sentences with replacement from their corpus of language


during the given timeframe (Efron & Tibshirani 1994). At the Congress level, we conduct
100 bootstraps per member and compute the empirical 95% confidence interval. To measure
uncertainty in the jackknifed estimates, we take the legislator's estimates by Congress across
each of the 100 bootstraps, calculate the jackknife as described above, and then compute the
empirical 95% confidence interval.
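A sentence-level bootstrap of this kind can be sketched as follows; the scoring function here is a toy stand-in (mean sentence length) for the full LIWC/MRCPD-plus-SMOreg pipeline.

```python
import numpy as np

def bootstrap_ci(sentences, score_fn, n_boot=100, seed=0):
    """Resample the sentences with replacement n_boot times, rescore each
    resample, and return the empirical 95% interval of the scores."""
    rng = np.random.default_rng(seed)
    n = len(sentences)
    stats = []
    for _ in range(n_boot):
        sample = [sentences[i] for i in rng.integers(0, n, size=n)]
        stats.append(score_fn(sample))
    return np.percentile(stats, [2.5, 97.5])

# Toy stand-in scorer: mean words per sentence.
def mean_length(sents):
    return float(np.mean([len(s.split()) for s in sents]))

sents = ["I yield the floor.", "We must act now to pass this bill.",
         "The bill is deeply flawed."] * 40
lo, hi = bootstrap_ci(sents, mean_length)
```
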
[Figure 3.3: Senate Scores over Time (selected members). Jackknifed scores with 95% confidence intervals on each of the five traits (Openness, Conscientiousness, Extraversion, Agreeableness, Emotional Stability), by Congress, for Senators Boxer, Grassley, McConnell, and Reid.]

The resulting jackknifed scores are henceforth referred to as Elite LingUistiC Individual
Difference EstimATION (ELUCIDATION) scores. Crucially, we are agnostic as to whether
ELUCIDATION scores are measures of sincere legislator personality. As with ideal-point


estimates based on roll calls, we simply consider these estimates as revealed and potentially
strategic preferences. Estimates of the traits and confidence intervals are presented for key
Senate members (who served during the entire period of our data) in Figure 3.3. The scale
for each trait ranges from 1 to 7. As we see, the estimates are stable and precise.
Moreover, the results make intuitive sense with both our core cognitive constraint framework
as well as traditional understandings of the Big Five. For example, Senate Minority Leader
Harry Reid (D-NV) is considerably more Extraverted than his Republican counterpart, Mitch
McConnell (R-KY), a finding consistent with the perspectives of even cursory observers of
American politics. Critically, Reid's (McConnell's) assuming the post of Democratic (Republican) leader in the 109th (110th) Congress failed to produce any noticeable differences in
their trait estimates. This suggests that our estimates are robust to concerns about potential
strategic conveyance of artificial leadership profiles.
Since some Congressmembers are certainly more talkative than others, we might be concerned that the precision of our estimates is strong only for the most vocal members. To
assuage these concerns, Figure 3.4 presents a plot of the empirical confidence interval width
for the House of Representatives against speech word counts by legislator; the associated figure for the Senate is qualitatively similar. On average, a typical member has around 11,000
words per year documented in the Congressional Record. As the graph shows, after around
5,000 words, the width of the confidence interval is about 0.3 or less; in other words, the 95%
interval is the legislator's point estimate ± 0.15 on a 7-point scale. Consequently, the estimates
are extremely precise for almost all legislators during the time period analyzed.


Figure 3.4: Word Count and Precision

[Five panels, one per trait, plot the 95% confidence interval width of each House member's personality estimate against his or her speech word count.]

Note: Points are individual members. The smoothed line is a loess-smoothed trend. The
vertical red lines mark the median member's word count (approximately 11,000 words per year);
horizontal red lines mark the average confidence interval width (approximately 0.3). Thus,
most members' personality estimates are their point estimates ± 0.15.
3.4 VALIDITY OF THE ESTIMATES

3.4.1 Strategic Misrepresentation and Authorship Concerns

As we noted above, these estimates are based on revealed information. Since they are strategic
actors, legislators have the opportunity to manipulate their language so as to convey fake
personality profiles. While this sort of strategic misrepresentation could prove problematic,


we ultimately do not think it is much of an issue. In this section, we explore why our approach
is, at worst, not inferior to survey-based approaches and, at best, strictly preferred to them.
The issue of misrepresentation is not unique to either political elites or to our text-based
personality estimation procedure. The NEO-PI-R personality inventory (Costa & McCrae
1992a) discussed in Chapter 2 is often criticized for the fact that respondents are keenly aware
of what the positive response is for each item; virtually any respondent would desire to be seen
as Open, Conscientious, Extraverted, Agreeable, and Emotionally Stable. In the political
science literature, this phenomenon manifested itself in recent work by Dietrich et al. (2012). In
this piece, the authors administer a Big Five questionnaire to state legislators in three states.
Unsurprisingly, overwhelming majorities of legislators pool on the positive responses for
each Big Five dimension. While any observer of American politics knows that legislators
are not a random slice of the American electorate in any sense, such extreme skew in the
distribution of Big Five traits suggests a major flaw in the survey approach.
That said, our language-based approach is not necessarily free from the same concerns. If
legislators actively respond to Big Five inventories knowing which side of the scale conveys
positive valence, could they not do the same with their language? Surely, legislators are aware
that using certain sorts of words could convey positive or negative information about themselves, and screen their speech accordingly. However, it is unlikely that legislators (or anyone,
for that matter) know of all the ties between their speech and their latent personality traits.
For example, it seems unlikely that legislators screen their speeches for too many (or too few)
six letter words, commas, or first person pronouns. As a result, we should expect that all
legislators would avoid (or emphasize) certain words that are commonly associated with positive valence but they would differ in their usage on the countless set of words whose valence
value is unknown. Combining features for which legislators collectively pool on positive
valence and those in which they unconsciously convey their true traits should produce estimates that skew toward valence pooling. Empirically, it turns out that legislators are largely
unaware of the psycholinguistic properties of their speech. Our analysis of legislative speech


using LIWC and MRCPD categories shows substantial variation across legislators. As a result,
this concern does not have much bite.
Perhaps more worrisome is the issue of speechwriting. Even if the author is not gaming his
or her language, the author may not even be the legislator delivering the speech. Since legislators
almost surely farm out some of their speechwriting to staffers, how can we be certain that our
estimates are even capturing legislator personality? This concern, unlike the strategic issue,
is fairly easy to address (and refute) empirically. First, few members have a full-time speechwriter on their professional staff; Congressional disbursements show that only the
party leaders and a few more senior Congressmembers have paid speechwriters.³ Of course, a
legislator could farm out speechwriting to any staffer, not just a dedicated speechwriter. At
the same time, however, staff turnover is extremely high in Congress. For example, we examined both the Q3 2009 and Q3 2011 Congressional disbursement reports to quantify these
rates. The average House member lost more than one-third of his or her professional staff
in this two-year window; the interquartile range for retention was [0.55, 0.74]. Given such
high levels of churn, we should expect that who is writing speeches for Congressmembers
will change considerably over time. Since different speechwriters have different personalities
and writing styles, we should see considerable variation in our estimates. Fortunately, we do
not. As Figure 3.3 shows, for some of the longest-serving and highest-ranking members of
the Senate during our time period, there is clear stability in the jackknifed personality scores.
Indeed, as we noted above, neither Reid nor McConnell was party leader at the start of our
data, and yet their estimates are consistent over time. This suggests that whoever was writing their speeches maintained a clear, discernible linguistic pattern over time. This finding is
consistent with the advice that the Congressional Research Service (CRS) provides to speechwriters: "Congressional speechwriters should make every effort to become familiar with the
speaking style of the Member for whom they are writing, and adjust their drafts accordingly"
(Neale 1998).
³ See, e.g., http://disbursements.house.gov/2013q4/2013q4_singlevolume.pdf.


3.4.2 Face Validity

Having addressed the potential theoretical issues with our approach, it is important to examine the face validity of our estimates. Since we do not know the true personalities of
Congressmembers, and since surveying them is problematic (or impossible, in the case of the
deceased), we will need to validate our estimates indirectly. To that end, we follow a long tradition in the political psychology literature examining the linkages between personality and
ideology at the mass level (Gerber et al. 2010, Gerber et al. 2011a, Mondak 2010). This literature has found strong and consistent links between Openness and liberalism and between
both Conscientiousness and Emotional Stability and conservatism. Findings on the linkages
between Extraversion and Agreeableness and ideology are more mixed. Mondak (2010) finds
no links between Extraversion and ideology, and only a weak and model-dependent relationship between Agreeableness and liberalism. Gerber et al. (2011a) shed some light on this
by showing that, when ideology is separated into two dimensions (economic and social),
Agreeableness is linked with social conservatism but not economic liberalism. Similarly, Gerber et al. (2010) find strong linkages between ideology and Openness, Conscientiousness, and
Emotional Stability in the same directions as the rest of the literature, but the connections
between left-right ideology and both Agreeableness and Extraversion are more nuanced.
Given the findings of this literature, we can indirectly assess the face validity of our estimates
by replicating their analyses at the elite level. To do this, we measure ideology using Groseclose, Levitt & Snyder's (1999) inflation-adjusted Americans for Democratic Action (ADA)
scores. ADA scores are measures of legislator liberalism that come from the ADA's annual
scorecard, which is itself a compilation of around twenty votes that the organization views as
critical to assessing legislator liberalism in a given year. The Groseclose, Levitt & Snyder
(1999) adjustment makes the scores comparable across years by correcting for distortions in the
scale over time. Since ADA scores measure liberalism, we should expect a positive relationship
between Openness and the ADA score and negative relationships between Conscientiousness,


Table 3.1: OLS Models of Personality and House ADA Score (1996-2008)

                      Model 1    Model 2    Model 3    Model 4
Openness               22.606     22.455     21.315     21.339
                       (1.695)    (1.669)    (1.689)    (1.667)
Conscientiousness     -20.545    -17.932    -20.615    -18.161
                       (1.448)    (1.448)    (1.437)    (1.439)
Extraversion           15.436     13.958     15.468     14.081
                       (1.097)    (1.089)    (1.088)    (1.082)
Agreeableness          15.940     13.478     16.389     14.028
                       (2.923)    (2.889)    (2.900)    (2.872)
Emotional Stability   -21.465    -17.533    -22.022    -18.272
                       (1.821)    (1.831)    (1.808)    (1.823)
Male                              17.257                16.145
                                  (1.630)               (1.628)
Age                                           0.467      0.407
                                             (0.060)    (0.060)
Constant               14.588      6.644     34.908     24.885
                       (8.230)    (8.140)    (8.575)    (8.521)
R²                      0.104      0.131      0.119      0.142
Adj. R²                 0.103      0.130      0.117      0.141
Num. obs.               3602       3602       3602       3602

Standard errors in parentheses. Two-tailed tests: ***p < 0.01, **p < 0.05, *p < 0.1.

Emotional Stability, and the ADA score. Additionally, as ADA scores tap into the underlying
economic conflict between the Democratic and Republican parties (which approximates the contemporary liberal–conservative dimension), we should find a positive relationship between
Agreeableness and the ADA score (Crespin & Rohde 2010, Poole & Rosenthal 1997).⁴ Finally, since the literature is mixed on the relationship between ideology and Extraversion, we
remain agnostic regarding the expected sign (or significance). Table 3.1 presents several OLS
models of ideology as a product of personality and demographic traits.
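A specification like Model 1 amounts to ordinary least squares with classical standard errors; a minimal sketch follows, using simulated stand-in data rather than the actual ADA scores or trait estimates.

```python
import numpy as np

def ols(y, X):
    """OLS with a constant, classical standard errors, and R-squared."""
    Z = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    n, k = Z.shape
    sigma2 = resid @ resid / (n - k)
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(Z.T @ Z)))
    r2 = 1.0 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
    return beta, se, r2

# Simulated stand-ins: five trait scores per member and an ADA-like outcome.
rng = np.random.default_rng(3)
traits = rng.normal(loc=4.0, scale=0.5, size=(500, 5))  # O, C, E, A, ES
ada = 15 + traits @ np.array([22.0, -20.0, 15.0, 16.0, -21.0]) + rng.normal(scale=25.0, size=500)

beta, se, r2 = ols(ada, traits)
```
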
All four traits (i.e., all the Big Five save Extraversion) with expected relationships have statistically significant coefficients in the expected directions in each model. Additionally, in line
with previous literature, the coefficients on Extraversion are of smaller magnitudes than those
⁴ No other trait has these divergent effects.


for Openness, Conscientiousness, and Emotional Stability, and this holds for all four models
(though the relationship between ideology and Extraversion is still a point of contention).
Moreover, these results allay one natural concern with using potentially ideologically-tinged
legislative speeches to estimate personality, in that our personality estimates may be simply
summaries of legislator ideology. However, our model R²s in Table 3.1 are not large, suggesting that personality traits alone do not account for large proportions of the variance in ideology.
Whatever theoretical concerns exist about the dependence of the Big Five on ideology
or the ideological content of the legislative record, then, they do not appear to be a problem in practice.

3.4.3 Read My Lips: Conclusion

In this chapter, we introduced a method to measure legislator personality using speech. While
almost certainly imperfect, all the major issues with this approach, namely strategic concerns
and speechwriter effects, were shown to be essentially inconsequential. Moreover, given the
strategic, temporal, and practical problems with the main rival approach to measuring
elite personality, surveys, we remain confident in the utility of our approach. Given the
flexibility and generality of the linguistic method used herein, we expect that scholars of other
legislative institutions (e.g., state legislatures, non-American legislatures) and, more generally,
elite behavior, will be able to apply our technology to those other settings.
That said, the key to understanding and analyzing elite behavior is not just the methodological approach described in this chapter. The theoretical framework from Chapter 2 is
equally important, as it provides scholars of elite institutional behavior with a structure for
forming precise predictions for elite behavior. To that end, in the rest of the manuscript (Parts
II and III), we show how the theoretical framework from Chapter 2 and the empirical framework in the current chapter can be used to systematically analyze legislator behavior across
the Congressional lifecycle.


APPENDIX TO CHAPTER 3

LIWC (2001) Categories

Standard Linguistic Dimensions

Category                 Abbreviation   Examples
Word count               wc
Words/sentence           wps
Sentences ending in ?    qmarks
Unique words             unique
Dictionary words         dic
Words > 6 letters        sixltr
Total pronouns           pronoun        I, them, itself
1st pers singular        i              I, me, mine
1st pers plural          we             We, us, our
Total 1st person         self           I, we, me
Total 2nd person         you            You, you'll
Total 3rd person         other          She, their, them
Negations                negate         No, not, never
Assents                  assent         Yes, OK, mmhmm
Articles                 article        A, an, the
Prepositions             prep           To, with, above
Numbers                  number         Second, thousand

Personal Concerns

Category                       Abbreviation   Examples
Occupation                     occup          Work, class, boss
School                         school         Class, student, college
Job or work                    job            Employ, boss, career
Achievement                    achieve        Earn, hero, win
Leisure                        leisure        Cook, chat, movie
Home                           home           Apartment, kitchen, family
Sports                         sports         Football, game, play
Television and movies          tv             TV, sitcom, cinema
Music                          music          Tunes, song, cd
Money                          money          Cash, taxes, income
Metaphysical issues            metaph         God, heaven, coffin
Religion                       relig          Altar, church, mosque
Death and dying                death          Bury, coffin, kill
Physical states and functions  physcal        ache, breast, sleep
Body states, symptoms          body           ache, heart, cough
Sex and sexuality              sexual         lust, penis, sex
Eating, drinking, dieting      eating         eat, swallow, taste
Sleeping, dreaming             sleep          asleep, bed, dreams
Grooming                       groom          wash, bath, clean

LIWC (2001) Categories (Continued)

Psychological Processes

Category                          Abbreviation   Examples
Affective processes               affect         Happy, cried, abandon
Positive emotion                  posemo         Love, nice, sweet
Positive feelings                 posfeel        Happy, joy, love
Optimism and energy               optim          Certainty, pride, win
Negative emotion                  negemo         Hurt, ugly, nasty
Anxiety or fear                   anx            Worried, fearful, nervous
Anger                             anger          Hate, kill, annoyed
Sadness or depression             sad            Crying, grief, sad
Cognitive processes               cogmech        cause, know, ought
Causation                         cause          because, effect, hence
Insight                           insight        think, know, consider
Discrepancy                       discrep        should, would, could
Inhibition                        inhib          block, constrain, stop
Tentative                         tentat         maybe, perhaps, guess
Certainty                         certain        always, never
Sensory and perceptual processes  senses         Observing, heard, feeling
Seeing                            see            View, saw, seen
Hearing                           hear           Listen, hearing
Feeling                           feel           Feels, touch
Social processes                  social         Mate, talk, they, child
Communication                     comm           Talk, share, converse
Other references to people        othref         1st pl, 2nd, 3rd person pronouns
Family                            family         Daughter, husband, aunt
Friends                           friend         Buddy, friend, neighbor
Humans                            human          Adult, baby, boy

Relativity

Category            Abbreviation   Examples
Time                time           Hour, day, o'clock
Past tense verb     past           Walked, were, had
Present tense verb  present        Walk, is, be
Future tense verb   future         Will, might, shall
Space               space          Around, over, up
Up                  up             Up, above, over
Down                down           Down, below, under
Inclusive           incl           And, with, include
Exclusive           excl           But, without, exclude
Motion              motion         Arrive, car, go

Experimental Dimensions

Category       Abbreviation   Examples
Swear words    swear          Damn, piss, fuck
Nonfluencies   nonflu         Er, hm, umm
Fillers        fillers        Blah, Imean, youknow

MRCPD Categories

Category                    Abbreviation
Age of Acquisition          AOA
Brown Frequency             BROWN_FREQ
Concreteness                CONC
Familiarity                 FAM
Imagability                 IMAG
K-F Frequency               K_F_FREQ
K-F Number of Categories    K_F_NCATS
K-F Number of Samples       K_F_NSAMP
Meaningfulness (Colorado)   MEANC
Meaningfulness (Paivio)     MEANP
Number of letters           NLET
Number of phonemes          NPHON
Number of syllables         NSYL
T-L Frequency               T_L_FREQ
