doi:10.1093/jopart/muy048
Article
Abstract
In this study, we examine organizational responses to performance management in the public sector by studying Korean public agencies’ responses to their annual performance feedback. In doing so, we employed a regression discontinuity design that exploits the relationship between performance grades and the numeric inputs that determine the grades to uncover the impact of performance management on performance. Evidence suggests that the social and historical aspirations of public organizations significantly influence their performance improvement, as predicted by behavioral theory. We also report evidence supporting the switching aspiration hypothesis; organizations performing below the mean performance of similar others aspire to the average, whereas organizations performing above the mean aspire to improve performance relative to their own historical positions. Overall, our findings provide broad support for the existence of negativity bias in public managers’ decision making as well as for the relevance of behavioral theory and bounded rationality in the context of public administration.
© The Author(s) 2018. Published by Oxford University Press on behalf of the Public Management Research Association. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Journal of Public Administration Research and Theory, 2018, Vol. xx, No. xx
Since an organization’s behavior is shaped by the discrepancy between performance and aspiration, understanding how organizations set this reference point is central to studying the decision making of any organization. […]

[…] letter grades provide an opportunity to examine the effectiveness of performance feedback in motivating public organizations. We take advantage of this opportunity and employ a regression discontinuity (RD) design. […]

[…] and often-conflicting incentives during the process. We thus used the term “organization,” leaving the identification of the complex dynamics within organizations as a topic for future research.

[…] 2001) or politicians’ responsibility attribution process to public managers (Nielsen and Moynihan 2017). Lastly, a body of scholarship has used behavioral theory in the tradition of Cyert and March (1963) to […]
[…] in Korea. These agencies include state-owned utilities (electricity, gas, coal, railway, airport, seaport, highway, water, etc.) and other organizations that are de facto controlled by the central government ministries of Korea.3 The institutional structure of the Korean public sector is unique in that the size of these agencies, as measured by the sum of their budgets, is even larger than the central government’s entire budget. In 2013, the Korean central government’s budget expenditure amounted to about 356,000 billion Korean won, whereas the budget expenditure of all these agencies combined amounted to about 575,000 billion Korean won (which comprises 44.3% of the total gross domestic product). Due to their unusually large size, discussions on how to make these organizations more efficient, effective, and innovative were often part of the top policy agendas of the Korean government.

In tandem with the “new public management” (NPM) wave of management reforms across the world, the Korean government introduced a performance evaluation system for these agencies in 1983. This system underwent a major change in 2007, but it remains the central government’s primary instrument for holding these organizations accountable (Hong 2016; Hong and Kim 2017). The performance of these agencies is evaluated both qualitatively and quantitatively. Qualitative evaluation is conducted by an independent committee of experts including professors, accountants, and lawyers, all appointed by the regulator. Quantitative evaluation is based on a predefined formula agreed upon between the regulator and the agencies before the evaluation. Both qualitative and quantitative evaluations produce scores on a 0- to 100-point scale, the average of which produces the final score on a 0–100 scale. In this study, we linearly transformed this final score so that it has a mean of zero and a standard deviation of one. We performed this “standardization” to make the final scores comparable over time.

The results of the performance evaluation are published annually, and the performance of each agency is evaluated based on its final score as A (excellent), B (average), C (average), D (poor), or E (very poor). Each of these grades shows the position of an agency’s performance relative to the average score of all the evaluated agencies. Figure 2 shows the process of determining the grading. As seen in figure 2, the average score of all the evaluated agencies lies between the scores for grades B and C.4 Grade A is given only if the score is greater than the average plus one standard deviation,5 grade D if the score is below the average minus one standard deviation, and grade E if the score is below the average minus two standard deviations. As the final grade is determined by considering the average and standard deviation of all scores, an organization should do better than the others to receive a satisfactory grade.

3 As of August 2014, there were 303 agencies in Korea and 116 of them were subject to performance evaluation. The list of agencies subject to performance evaluation is determined by the Korean central government based on the size of the agencies, the nature of the business they operate, and the government’s policy considerations.
4 Of note, the average and the standard deviation are calculated for two separate groups. Among the 116 agencies, 55 had fewer than 500 employees and were smaller than the rest. The average, the standard deviation, and therefore the grades are determined separately for these small entities and the rest.
5 If the score is greater than the average plus two standard deviations, grade S is given. However, no agency has received an S grade since 2012.
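As a concrete illustration, the grading rule described above can be sketched in a few lines of Python. The exact B/C split is not fully specified in the text (only that the overall average lies between the B and C scores), so splitting at the mean is our assumption, and the scores below are invented:

```python
import statistics

def assign_grade(score, mean, sd):
    """Map a final score to a letter grade using the mean +/- k*SD
    cutoffs described in the text (grade S per footnote 5)."""
    z = (score - mean) / sd
    if z > 2:
        return "S"  # above average plus two SDs (footnote 5)
    if z > 1:
        return "A"  # above average plus one SD
    if z >= 0:
        return "B"  # assumption: B/C split at the mean
    if z >= -1:
        return "C"
    if z >= -2:
        return "D"  # below average minus one SD
    return "E"      # below average minus two SDs

scores = [55.0, 62.0, 70.0, 74.0, 78.0, 90.0]  # hypothetical final scores
mean = statistics.mean(scores)
sd = statistics.pstdev(scores)  # population SD; the text does not specify which
grades = [assign_grade(s, mean, sd) for s in scores]
```

Per footnote 4, this rule would be applied separately to the small-agency and large-agency groups.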
As of 2014 and 2015, 116 agencies were subject to the evaluation.6

6 One observation (Korea Securities Depository) was removed from our sample as it was subject to evaluation only in 2014, which made it impossible to construct the dependent variable (i.e., performance improvement).

The results of the performance evaluation have both symbolic and substantive importance for the agencies’ executives and employees. First, they have an impact on the executives, who usually aim to climb to significant positions such as becoming a minister or a key advisor to the president. These executives show a keen interest in receiving a “good” grade. To these ambitious executives, performance feedback has critical symbolic value; an A grade validates their expertise and management skills. Second, the regulator provides monetary rewards to executives and employees of the evaluated agencies that receive A (excellent), B (average), or C (average). The monetary reward for achieving grade A is twice as much as that for achieving average grades. There is no financial incentive attached to receiving grades D and E. In fact, agencies that received a D or E may face formal sanctions; if a chief executive had worked for longer than six months as of the year-end, he or she may be fired upon receiving either an E or D grade for two consecutive years.

Empirical Strategy

The method by which the final grade is assigned provides a unique opportunity for researchers to estimate the impact of performance feedback. The performance of the 116 agencies is evaluated as a continuous score, but the results are published as five grades. As the regulator does not publish the performance scores on a continuous scale, the agencies know only their grades but not their scores. In this set-up, we first show a simple association between the performance-aspiration gap calculated using the grades the agencies received and their subsequent performance improvements. However, a legitimate concern here is that agencies that received a certain grade (e.g., an A grade) may be significantly different from agencies that received another grade close to A (e.g., a B grade) in many characteristics unobservable to researchers.

To resolve this problem, we implement an RD approach by exploiting the sharp discontinuities between grades. Essentially, we assume that the distribution of unobservable characteristics of agencies changes smoothly across grades. Based on this assumption, we examine whether a meaningful difference exists in terms of future performance improvement between the agencies located just above and just below each of the thresholds (i.e., cutoffs or discontinuities) between grades. For instance, we estimate the impact of positive feedback by looking at the difference in future performance improvement between organizations that received an A grade with their scores barely above the cutoff and those that failed to receive an A (i.e., among those that received a B) as their scores fell slightly short of it.

In our case, the RD approach can produce a credible estimate of performance feedback since the cutoffs that classify the five grades are plausibly randomly
and exogenously set, and so is the resulting change in future performance. The five grades are determined by the distribution (i.e., average scores and standard deviation) of the evaluated scores of the 116 agencies.

[…] relative to historical aspiration. The model includes a set of control variables X and an error term ν. In equation (1), we estimate the impact of the discrepancy between performance and social aspiration […]

[…] those that received a D or E (i.e., D = 1), and the rest (i.e., A = 0 and D = 0). Meanwhile, historical aspiration is measured using the performance data of the previous 2 years. In the […]

[…] at the webpage of the regulator. However, the performance score on a 0–100 scale remains confidential and is not disclosed to the public or to the evaluated agencies. We obtained this confidential data set of the […]

[Table of descriptive statistics and correlations (Variables, Mean, SD, pairwise correlations 1.–9.); body not recovered in extraction.]
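The indicator coding just described can be sketched as follows. The grade-to-point mapping and the weight placed on the more recent of the two previous years are our assumptions, since this excerpt does not report them:

```python
GRADE_POINTS = {"E": 1, "D": 2, "C": 3, "B": 4, "A": 5}

def social_indicators(grade):
    """A = 1 for grade A (above social aspiration); D = 1 for grades
    D or E (below social aspiration); the rest have A = 0, D = 0."""
    return {"A": int(grade == "A"), "D": int(grade in ("D", "E"))}

def historical_aspiration(prev1, prev2, w=0.6):
    """Weighted average of the previous two years' grades; the weight
    w on the more recent year is an assumption, not from the article."""
    return w * GRADE_POINTS[prev1] + (1 - w) * GRADE_POINTS[prev2]

def below_historical(current, prev1, prev2):
    """H = 1 unless the current letter grade improves on the historical
    reference point ('1 up' or better), in which case H = 0."""
    return int(GRADE_POINTS[current] <= historical_aspiration(prev1, prev2))
```

For example, an agency moving from two years of C grades to a B counts as above its historical aspiration (H = 0), while one slipping from B grades to a C counts as below it (H = 1).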
[…] organizations that received a higher grade. As we defined grades D or E as “performance below social aspiration (D = 1)” while grade A was “performance above social aspiration (A = 1),” figure 3A shows that an improvement in performance is driven by organizations whose performance lies below social aspiration. In figure 3B, the horizontal axis measures the degree to which an organization’s grade improved in relation to its historical aspiration. As explained, the level of historical aspiration is the weighted average of the performance of the previous 2 years. For instance, in figure 3B, “3 down” is the case when an organization’s grade went down three grades from the measured level of historical aspiration (for instance, from grade A to D), whereas “3 up” is the case when an organization’s grade went up three grades from the historical reference point (for instance, from grade D to A). As can be seen, if an organization receives a grade lower than the historical reference point, there is generally a greater increase in performance in the following year. In what follows, we consider “performance above historical aspiration” if an organization improved in terms of its letter grade. Thus, in figure 3B, three categories (i.e., “1 up,” “2 up,” and “3 up”) qualify for “performance above historical aspiration (H = 0),” whereas the rest are coded “performance below historical aspiration (H = 1).”

OLS Results

We now present the results of the OLS model with control variables and fixed effects. Equations (1) and (2) are estimated and the results are reported in table 2.

[Table 2. Dependent variable: performance improvement following feedback. Robust standard errors in parentheses. All models include year fixed effect and dummy variables indicating three distinctive types of agencies (i.e., state-owned enterprises, quasi-governmental organizations, and small institutions with fewer than 500 employees). *p < .10, **p < .05. HA, historical aspiration; SA, social aspiration. Table body not recovered in extraction.]

In column 1 of table 2, there is clear evidence that organizations with performance below the social aspiration level experience a larger subsequent performance improvement, whereas organizations with performance above social aspiration experience a decrease in their performance. This result is consistent with our findings in figure 3A. In column 2 of table 2, we include the performance-historical aspiration gap and its interactions with the performance-social aspiration gap variables. The coefficients suggest that organizations with performance below historical aspiration show a greater degree of subsequent performance improvement, and this improvement is largest and clearest among organizations with performance above the social aspiration level.

In figure 4, we show the estimates from column 2 of table 2 graphically to refocus attention on the theoretical claim about switching aspirations. The graph shows the relationship between performance improvement and the various indicators of the performance-social aspiration gap and the performance-historical aspiration gap. Specifically, the vertical axis represents the dependent variable, the level of performance improvement, whereas the horizontal axis shows the three subgroups of organizations depending on their performance-social aspiration gap: organizations that received an A (i.e., A = 1), those that received a D or E (i.e., D = 1), and the rest. The two lines describe the relationship between these two variables separately for the organizations with performance above historical aspiration (i.e., H = 0) and those with performance below historical aspiration (i.e., H = 1).

Two things are immediately apparent from figure 4. First, all else being constant, the level of performance improvement is higher when performance lies below social aspiration. This is evident from the negative slopes of the two lines. Second, all else being constant, the level of performance improvement is higher when performance lies below historical aspiration. This can be verified visually by comparing the heights of the two lines; the line indicating performance below historical aspiration lies above the line indicating performance above historical aspiration. The most important observation, however, is that the gap between the two lines becomes greater as organizational performance improves. Overall, this result supports the switching aspiration hypothesis. Evidence suggests that low- and high-performing organizations generally consider different aspirations; organizations performing below the average of their social comparison group aspire to the average, whereas organizations performing above it aspire to improve performance relative to their own historical positions. Overall, figure 4 mirrors the theoretical expectations that appear in figure 1.
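An interaction specification in the spirit of column 2 of table 2 can be sketched on synthetic data as below. The actual models also include control variables, year fixed effects, and agency-type dummies, and the coefficients here are invented purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 300
A = (rng.random(n) < 0.2).astype(float)                # above social aspiration
D = ((rng.random(n) < 0.25) & (A == 0)).astype(float)  # below social aspiration
H = (rng.random(n) < 0.5).astype(float)                # below historical aspiration

# Synthetic outcome: below-aspiration organizations improve more, and the
# historical-aspiration effect is strongest among high (A) performers
y = -0.2 * A + 0.4 * D + 0.3 * H + 0.2 * A * H + rng.normal(0, 0.1, n)

# OLS via least squares: intercept, A, D, H, A*H, D*H
X = np.column_stack([np.ones(n), A, D, H, A * H, D * H])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
```

With enough observations the recovered coefficients approximate the data-generating values, which is the pattern (a widening gap between the H lines at higher performance) that figure 4 visualizes.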
9 In the case of the cutoff that distinguishes grades D and E, we had no observations in the vicinity of the cutoff, which makes it impossible to produce a meaningful RD estimate.
10 For the nonparametric smoothing, we used the lowess command in STATA.

Discussion

The results presented here contribute to recent scholarly discussions regarding the effectiveness of performance management systems. First, our findings offer
empirical support for some of the hypotheses formulated in Meier et al. (2015) and Nicholson-Crotty et al. (2017). For instance, Nicholson-Crotty et al. (2017) present a model in which high performers take greater risks than average performers. However, relying on a single measure of performance that bundles social and historical aspirations, they were not clear on how a “satisficing” agent can have greater motivation for innovation even though its performance exceeds the average. In contrast, our model is fully consistent with this behavioral assumption, yet open to the possibility that high performers may experience a greater level of motivation. Specifically, as in figures 1 and 4, the association between an organization’s motivation for improvement and its performance level (relative to its peers) is U-shaped whenever its performance is below historical aspirations.

Second, our results also shed light on the debate on whether the performance improvement of low-performing organizations is largely due to the system’s formal threats or sanctions placed on them. In the studied case, the low-performing agencies that received grades D and E received sanctions; if a chief executive had worked for longer than 6 months by the year-end, he or she may be fired upon receiving either an E grade or D grade for two consecutive years. During the studied period, the chief executives of several agencies received such sanctions. In figure 7B, we show that the large performance improvement of low performers is not driven by these agencies. In fact, agencies that received sanctions performed worse than the rest (figure 7A). This evidence provides support for the notion that performance management may significantly affect performance without the presence of extrinsic rewards or accountability pressure (Kelman and Friedman 2009; Kelman et al. 2012; Moynihan and Pandey 2010; but see Rouse et al. 2013).

However, if the provision of sanctions has limited explanatory power, then what may explain the negativity bias observed in public organizations’ behavior? As Simon (1947) proposed, the bounded rationality of public managers may be one source of
[Table of RD estimates. Dependent variable: performance improvement following feedback. Robust standard errors in parentheses. All models include a linear trend of the forcing variable, year fixed effect, and dummy variables indicating three distinctive types of agencies (i.e., state-owned enterprises, quasi-governmental organizations, and small institutions with fewer than 500 employees). *p < .10, **p < .05. HA, historical aspiration; SA, social aspiration. Table body not recovered in extraction.]

[Second table of RD estimates, columns (1) and (2). Dependent variable: performance improvement following feedback. Notes identical to the preceding table. Table body not recovered in extraction.]
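The RD specifications summarized in the table notes above (a jump at the grade cutoff plus a linear trend of the forcing variable) can be sketched as below; the data, cutoff, and bandwidth are synthetic, and the real models add fixed effects and agency-type dummies:

```python
import numpy as np

def rd_estimate(score, y, cutoff, bandwidth):
    """Parametric RD within a bandwidth: regress the outcome on a
    treatment indicator and a linear trend of the (centered) forcing
    variable; returns the estimated jump at the cutoff."""
    keep = np.abs(score - cutoff) <= bandwidth
    s = score[keep] - cutoff           # centered forcing variable
    treat = (s >= 0).astype(float)     # e.g., received the higher grade
    X = np.column_stack([np.ones(s.size), treat, s])
    beta, *_ = np.linalg.lstsq(X, y[keep], rcond=None)
    return beta[1]  # coefficient on the treatment indicator

rng = np.random.default_rng(7)
score = rng.uniform(-1.0, 1.0, 500)
# Synthetic outcome with a true jump of -0.5 at the cutoff
y = 0.3 * score - 0.5 * (score >= 0) + rng.normal(0, 0.1, 500)
effect = rd_estimate(score, y, cutoff=0.0, bandwidth=0.5)
```

In this simulated example the estimator recovers a negative jump at the cutoff, mirroring the paper’s finding that barely clearing a grade threshold reduces subsequent improvement.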
Before we conclude, we note three issues regarding our empirical analyses. First, we concede that the studied agencies vary in terms of size, affiliated industry, core public functions, and so on. Moreover, our results are based on the analysis of a relatively short 2-year panel. Although we addressed this issue by including key control variables, it is possible that the proposed model might suffer from omitted variable bias. However, we believe that this bias would not be significant due to the unique quasi-experimental setting of our empirical strategy. That is, although these agencies are different, they all have equal chances to be located either just above or just below the threshold; in other words, whether an agency is located just above or just below (i.e., the treatment) is almost randomly assigned. We argue that such a randomized assignment of the treatment may significantly reduce the possibility of bias. Second, we need to reiterate that this study explored the short-run impacts of performance feedback. One may question whether a year is long enough for public organizations to respond to the feedback. However, given that previous studies of education accountability systems investigate the impact of performance feedback by allowing only 4–6 months for school administrators to respond, and yet find significant impacts (e.g., Rockoff and Turner 2010), we believe that organizations can produce meaningful changes over the course of a year. Third, we abstain from making a normative judgment on whether the improvement in performance grades is necessarily a socially desirable outcome. Prior research has shown evidence of a “performance paradox,” whereby performance management systems produce unintended consequences at the expense of improved short-term outcomes (Dias and Maynard-Moody 2007; Hong 2016). This study could not verify whether the studied system creates such perverse incentive structures. We add the caveat, therefore, that performance grades may be influenced, at least to some extent, by factors other than organizational learning and innovation, such as goal displacement or impression management (e.g., Bohte and Meier 2000; Wayne and Liden 1995).

Nevertheless, we continue to believe that future work may further advance our understanding of performance management by addressing the abovementioned issues. Specifically, empirical evidence of the long-run impacts of performance management systems (based on long panel data sets) is rare. Scholars may also fill this gap in the literature by studying the mid- or long-term dynamics of the relationship among public managers, elected officials, and voters. In this regard, our study’s findings will be of greater use if supported by future works that could provide deeper insights into the ways in which public organizations behave.

References

Ammons, David N., and Dale J. Roenigk. 2014. Benchmarking and interorganizational learning in local government. Journal of Public Administration Research and Theory 25:309–35.
Andrews, Rhys. 2014. Performance management and public service improvement. PPIW Report (3). Public Policy Institute for Wales.
Askim, Jostein, Åge Johnsen, and Knut-Andreas Christophersen. 2007. Factors behind organizational learning from benchmarking: Experiences from Norwegian municipal benchmarking networks. Journal of Public Administration Research and Theory 18:297–320.
Behn, Robert D. 2003. Why measure performance? Different purposes require different measures. Public Administration Review 63:586–606.
Behn, Robert D. 2014. The PerformanceStat potential: A leadership strategy for producing results. Washington, DC: Brookings Institution Press.
Bohte, John, and Kenneth J. Meier. 2000. Goal displacement: Assessing the motivation for organizational cheating. Public Administration Review 60:173–82.
Bourdeaux, Carolyn, and Grace Chikoto. 2008. Legislative influences on performance management reform. Public Administration Review 68:253–65.
Boyne, George A., Oliver James, Peter John, and Nicolai Petrovsky. 2009. Democracy and government performance: Holding incumbents accountable in English local governments. The Journal of Politics 71:1273–84.
Brewer, Gene A., and Sally Coleman Selden. 2000. Why elephants gallop: […]
Heinrich, Carolyn J., and Gerald Marschke. 2010. Incentives and their dynamics in public sector performance management systems. Journal of Policy Analysis and Management 29:183–208.
Holm, Jakob Majlund. 2017. Double standards? How historical and political […] group advocacy. Journal of Public Administration Research and Theory 27:269–83.
McDermott, Kathryn A. 2004. Incentives, capacity, and implementation: Evidence from Massachusetts education reform. Journal of Public Administration Research and Theory 16:45–65.
Meier, Kenneth J., Nathan Favero, and Ling Zhu. 2015. Performance gaps and […]
Olsen, Asmus Leth. 2015. Negative performance information causes asymmetrical evaluations and elicits strong responsibility attributions. Paper […]