
Technical report

December 2016

Rapid evidence assessment of the research literature on the effect of performance appraisal on workplace performance

The CIPD is the professional body for HR and people
development. The not-for-profit organisation champions
better work and working lives and has been setting the
benchmark for excellence in people and organisation
development for more than 100 years. It has more than
140,000 members across the world, provides thought
leadership through independent research on the world of
work, and offers professional training and accreditation for
those working in HR and learning and development.

Contents
Foreword
Introduction
1 Methodology
2 Findings
3 Synthesis
Conclusion
Limitations
Endnotes
References
Appendix 1
Appendix 2

Acknowledgements
We would like to thank the team behind this research at the Center for Evidence-Based Management (CEBMa).
The report was written by Dr Eric Barends, Barbara Janssen and Pietro Marenco, with the support of Professor
Rob Briner and Professor Denise Rousseau, all of CEBMa.

About CEBMa
The Center for Evidence-Based Management is a non-profit member organisation dedicated to promoting evidence-based practice in the field of management. It provides support and resources to managers, consultants, organisations, teachers, academics and others interested in learning more about evidence-based management.

1 | Rapid evidence assessment of the research literature on the effect of performance appraisal on workplace performance
Foreword

As a professional body, the CIPD helps HR professionals and functions to develop effective strategies and practices in people management. We believe that applied research is a crucial step to achieving this. We thus see an important part of our role as making quality research available, distilling it into accessible forms and drawing out practical implications.

This technical report presents the methods and findings of a rapid evidence assessment (REA), a truncated form of systematic review, on the topic of performance appraisal. It is accompanied by another technical report of an REA on goal setting (Barends et al 2016). The insight and implications of both technical reports, which are written by the Center for Evidence-Based Management (CEBMa), are discussed in the discussion report, Could do Better: Assessing what works in performance management (Gifford 2016).1

In publishing this technical report, we provide a step-by-step account of the evidence our REA uncovered on what is meant by performance appraisal, how it is assumed to work and what influences its effectiveness. A number of the academic papers referenced are accessible through the EBSCO online journals portal for CIPD members.2

We hope this report provides a useful reference and pointer to further reading on this important aspect of performance management.

Jonny Gifford
Adviser, Organisational Behaviour
CIPD

Introduction

This technical report presents the methods and findings of a rapid evidence assessment (REA), a truncated form of systematic review, on the topic of performance appraisal. It is accompanied by another technical report of an REA on goal setting. The insight and implications of both technical reports are discussed in the discussion report, Could do Better: Assessing what works in performance management. All reports are available at cipd.co.uk/coulddobetter

Rationale for this review

Despite the relevance of performance ratings within the domain of human resource management, both academics and practitioners have always had a somewhat uneasy relationship with them. Some academics question whether performance appraisals provide meaningful information, whereas others have even suggested that undertaking such reviews should be discontinued entirely (Hoffman et al 2012). Given the widespread use of performance appraisals within management practice, the CIPD approached the Center for Evidence-Based Management (CEBMa) to undertake a review to understand what is known in the scientific literature about the reliability and validity of performance appraisal and the way in which this may impact workplace performance. This review will present an overview of this evidence.

Main question: What does the review answer?

What is known in the scientific literature about the impact of performance appraisal on workplace performance?

Supplementary questions

Other issues raised, which will form the basis of our conclusion to the main question above, are:

1 What is meant by performance appraisal? (What is it?)
2 What is the assumed causal mechanism? (How is it supposed to work?)
3 What is the effect of performance appraisal on workplace performance?
4 What is known about possible moderators and/or mediators that affect the relationship between performance appraisal and workplace performance?
5 What is known about the reliability and validity of performance appraisal?


1 Methodology

Search strategy: How was the research evidence sought?

The following four databases were used to identify studies: ABI/INFORM Global, Business Source Premier, PsycINFO and Web of Science. The following generic search filters were applied to all databases during the search:

1 scholarly journals, peer-reviewed
2 published 1980–2016 for meta-analyses, and 2000–16 for primary studies
3 articles in English.

A search was conducted using combinations of different search terms, such as ‘performance appraisal’, ‘performance review’, ‘performance evaluation’, ‘annual review’ and ‘employee evaluation’. In addition, the references listed in the studies retrieved were screened in order to identify additional articles for possible inclusion in the REA.

We conducted 17 different search queries and screened the titles and abstracts of more than 250 studies. An overview of all search terms and queries is provided in Appendix 1.

Selection process: How were the studies selected?

Two reviewers worked independently to identify which studies should be included. The inter-rater agreement was 92.5%. Where the reviewers disagreed on selection, a third reviewer – with no prior knowledge of the initial reviewers’ assessments – assessed whether the study was appropriate for inclusion. The decision of the third reviewer was final.

Selection took place in two phases. First, the titles and abstracts of the 250+ studies identified were screened for their relevance to this review. In case of doubt or lack of information, the study was included. Duplicate publications were removed. This first phase yielded 41 secondary studies (meta-analyses) and 48 primary studies.

Second, studies were selected based on the full text of the article according to the following inclusion criteria:

1 type of studies: quantitative, empirical studies
2 measurement: (a) studies in which the effect of performance appraisal on organisational outcomes was measured, or (b) studies in which the effect of moderators and/or mediators on performance appraisal was measured
3 context: studies related to workplace settings
4 level of trustworthiness: studies that were graded level C or above.

In some cases where influential studies were referenced that had not been identified in our search because of the search terms used, we included these additional studies as ‘bycatch’.

This second phase yielded 23 secondary studies and 37 primary studies. An overview of the selection process is provided in Appendix 2.

Critical appraisal: What is the quality of the studies included?

In almost any situation it is possible to find a scientific study to support or refute a theory or a claim, and sometimes to quite a large degree. It is therefore important to determine which studies are trustworthy (that is, valid and reliable) and which are not. The trustworthiness of a scientific study is first determined by its methodological appropriateness.

For cause-and-effect claims (that is, if we do A, will it result in B?), a study has a high methodological appropriateness when it fulfils the three conditions required for causal inference: co-variation, time–order relationship, and elimination of plausible alternative causes (Shaughnessy and Zechmeister 1985). A study that uses a control group, random assignment and a before-and-after measurement is therefore regarded as the ‘gold standard’.3 Non-randomised studies and before–after studies come next in terms of appropriateness. Cross-sectional studies (surveys) and case studies are regarded as having the greatest chance of showing bias in the outcome and therefore sit lower down in the ranking in terms of appropriateness. Meta-analyses in which statistical analysis techniques are used to pool the results of controlled studies are therefore regarded as the most appropriate design.

To determine the methodological appropriateness of the research design of the studies included, the classification system of Shadish et al (2002) and Petticrew and Roberts (2006) was used. The four levels of appropriateness used for the classification are shown in Table 1.

It should be noted, however, that the level of methodological appropriateness as explained in Table 1 is relevant only in assessing the validity of a cause-and-effect relationship that might exist between an intervention (for example performance appraisal) and its outcomes (performance), which is the purpose of this review. A case study, for instance, is a strong design for assessing why an effect has occurred or how an intervention might be (un)suitable in a particular context; it does a poor job of assessing the existence or strength of a cause-and-effect relationship (Donnelly and Trochim 2007).

In addition, a study’s trustworthiness is determined by its methodological quality (its strengths and weaknesses). For instance, was the sample size large enough and were reliable measurement methods used? To determine methodological quality, all the studies included were systematically assessed on explicit quality criteria. Based on a tally of the number of weaknesses, the trustworthiness was downgraded and the final level was determined as follows: a downgrade of one level if two weaknesses were identified; a downgrade of two levels if four weaknesses were identified, and so on.

Finally, the effect sizes were identified. An effect (for example a correlation, Cohen’s d or omega) can be statistically significant but may not necessarily be of practical relevance: even a trivial effect can be statistically significant if the sample size is big enough. For this reason, the effect size – a standard measure of the magnitude of the effect – of the studies included was assessed. To determine the magnitude of an effect, Cohen’s rules of thumb (Cohen 1988) were applied. According to Cohen, a ‘small’ effect is an effect that is visible only through careful examination. A ‘medium’ effect, however, is one that is ‘visible to the naked eye of the careful observer’. Finally, a ‘large’ effect is one that anybody can easily see because it is substantial.

Outcome of the critical appraisal

The overall quality of the studies included was moderate to high. Most of the 23 secondary studies were based on cross-sectional studies and were therefore graded level B or lower, with only seven qualified as level A. Of the 37 primary studies, 20 qualified as randomised controlled studies and were therefore graded level A. The remaining 17 studies concerned quasi-experimental or longitudinal designs and were graded level B or lower.

Table 1: Four levels of appropriateness used for classification

Design | Level
Systematic review or meta-analysis of randomised controlled studies 4 | AA
Systematic review or meta-analysis of non-randomised controlled before–after studies | A
Randomised controlled study | A
Systematic review or meta-analysis of non-randomised controlled or before–after studies | B
Non-randomised controlled before–after study | B
Interrupted time series | B
Systematic review or meta-analysis of cross-sectional studies | C
Controlled study without a pre-test or uncontrolled study with a pre-test | C
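
The grading steps described in the methodology (a starting level from Table 1, a downgrade of one level per two weaknesses, and a verbal label for effect size) can be sketched in a few lines of Python. This is an illustrative sketch only, not the reviewers' actual procedure or tooling. The function names are hypothetical, the "below C" result for studies that drop off the scale is an assumption, and the numeric cut-offs for Cohen's d (0.2, 0.5, 0.8) are the conventional benchmarks from Cohen (1988), which the report cites but does not quote. The label "negligible" for values under 0.2 is likewise an assumption.

```python
# Levels from Table 1, ordered from most to least appropriate.
LEVELS = ["AA", "A", "B", "C"]


def downgraded_level(design_level: str, weaknesses: int) -> str:
    """Apply the downgrade rule: one level down per two weaknesses.

    Returns "below C" (an assumed label, not from the report) when the
    downgrade runs past the lowest level in Table 1.
    """
    if design_level not in LEVELS:
        raise ValueError(f"unknown level: {design_level!r}")
    if weaknesses < 0:
        raise ValueError("weaknesses must be non-negative")
    steps = weaknesses // 2  # 2 weaknesses -> 1 level, 4 -> 2 levels, ...
    index = LEVELS.index(design_level) + steps
    return LEVELS[index] if index < len(LEVELS) else "below C"


def cohen_label(d: float) -> str:
    """Label the magnitude of Cohen's d using the conventional benchmarks.

    The thresholds 0.2 / 0.5 / 0.8 are Cohen's (1988) rules of thumb;
    the sign of d is ignored because only magnitude is being labelled.
    """
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"  # assumed label for effects under 0.2


if __name__ == "__main__":
    # A randomised controlled study (level A) with two weaknesses.
    print(downgraded_level("A", 2))
    # An effect of d = 0.5 under the conventional benchmarks.
    print(cohen_label(0.5))
```

Under these assumptions, a single weakness leaves the level unchanged, since the rule as stated only downgrades in steps of two weaknesses.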

2 Findings

Figure 1: Links in the performance appraisal chain
[Diagram of a chain: Past performance, Objectives, Rating, Judgement, Feedback, Future performance]

Question 1: What is meant by performance appraisal?

Performance appraisal is one of the most widely studied topics in the domain of management. Research on performance appraisal dates back at least as far as the early 1920s and has continued to the present day. A search in ABI/INFORM on the term ‘performance appraisal’ in the title or abstract yields more than 1,200 results of peer-reviewed papers published in scholarly journals, spanning a period of six decades. One of the earliest academic papers that explicitly uses the term performance appraisal is ‘Appraisal of Job Performance’ by Stephen Halbe, published in 1951. Since then many definitions have been put forward. One of the most widely used definitions is provided by Griffin and Ebert (2004), who describe performance appraisal as the ‘formal evaluation of an employee’s job performance in order to determine the degree to which the employee is performing effectively’ (p216). Other definitions point out that it is typically an evaluation process in which quantitative scores based on predetermined criteria are assigned and shared with the employee being evaluated (for example, DeNisi and Pritchard 2006). In addition, most authors emphasise that performance appraisal is a process composed of several elements. When examined closely, most definitions seem to share the following elements (see Figure 1):

• past performance
• establishing goals/objectives
• rating based on predetermined criteria
• judgement
• formal feedback of judgement
• future performance.

Although the primary purpose of such an appraisal is to enhance the performance or productivity of employees (and thus the organisation), most organisations use appraisals for either administrative or developmental reasons. Developmental performance appraisals are used to identify an employee’s strengths and weaknesses and their training needs, whereas performance appraisals for administrative reasons are used to decide on salary and promotion issues, to validate selection criteria, to decide on termination of contracts and redundancies, or to meet legal requirements.

Question 2: What is the assumed causal mechanism? (How is it supposed to work?)

The assumed causal mechanism of performance appraisal is based on three theories: social comparison theory (Festinger 1954), feedback intervention theory (Kluger and DeNisi 1996), and equity theory (Adams 1965).

Social comparison theory suggests that individuals tend to compare themselves with others to make judgements regarding their performance. They are concerned not only about their performance in an absolute sense, but also about how they measure up in relation to relevant peers. In addition, this theory posits that individuals have a strong desire to improve their performance when faced with unfavourable comparative information.

Feedback intervention theory suggests that when confronted with a discrepancy between what they wish to achieve and the feedback received, individuals are strongly motivated to attain a higher level of performance.5 The practice of performance appraisal therefore assumes that informing an employee about the discrepancies between the organisation’s standard and their current performance – implying that they are achieving lower than most other colleagues – will motivate the employee to achieve a higher level of performance.

Finally, equity theory states that employees compare themselves with each other in terms of input and outcomes (Walster et al 1978). High-performers, seeing that poor performers get lower appraisal scores – and, as a consequence, receive lower rewards – might feel that an equitable balance is being established and be motivated to continue their high-quality work, whereas underperformers are motivated to put in more effort to achieve at a higher level.

Question 3: What is the effect of performance appraisal on workplace performance?

To measure the effect of performance appraisal on workplace performance would require an evaluation of a large number of populations and contexts where performance appraisal was applied, and the measurement of a wide range of performance outcomes, preferably by means of a meta-analysis of a large number of double-blind, randomised controlled studies. Such studies do not exist, and might well be too difficult to carry out.

However, there is wide consensus among both scholars and practitioners that performance appraisal, in general, can have a positive impact on a wide range of organisational outcomes, such as task performance, productivity, organisational citizenship behaviour, satisfaction and commitment. As stated above, both social comparison theory and feedback intervention theory posit that providing feedback to employees regarding their relative performance can enhance employee productivity.6

The scientific literature on performance feedback interventions, however, suggests a caveat. Several researchers have pointed out that feedback may not always be effective. In fact, several meta-analyses have demonstrated that feedback interventions have highly variable effects on performance – in some situations, feedback improves performance, but in other situations it has no apparent effect or even harms it (Kluger and DeNisi 1996, Smither et al 2005). This finding suggests that the relationship between performance appraisal and performance outcome is often complicated and is contingent upon a large number of moderators and mediators.7 As a consequence, the key question is not, ‘What is the effect of performance appraisal on workplace performance?’, but ‘Given the target group, the objectives and the context involved, what are the factors moderating or mediating the effect of performance appraisal that need to be taken into account?’

Figure 2: ‘Distribution (histogram) of 607 effects (ds) of feedback intervention on performance’ (Kluger and DeNisi 1996)
Adapted with permission from the American Psychological Association
[Histogram: frequency of d on the vertical axis; effect size d on the horizontal axis from -3 to 7, running from negative effect to positive effect]

Figure 3: Moderators and mediators
[Diagram of the chain Rating, Judgement, Feedback, Employee reaction, Future performance, with the following moderators and mediators: purpose (administrative or developmental), personality variables, perceived fairness, relationship quality, rating method, perceived usefulness, participation]

Question 4: What is known about possible moderators and/or mediators that affect the relationship between performance appraisal and workplace performance? 8

As previously stated, one of the primary purposes of performance appraisals is to provide employees with clear feedback, which is intended to positively affect workplace performance. However, in their meta-analysis, Kluger and DeNisi (1996) found that although performance feedback generally improves performance, in more than one third of studies, feedback actually lowered performance. Similar results have been reported in meta-analyses of multi-source feedback: some of the studies included reported performance improvements, while some did not, and others reported inconclusive results (Seifert et al 2003, Smither et al 2005). These findings suggest that the effect of performance appraisal is moderated and/or mediated by several factors (see Figure 3).

1 Reactions to feedback, rather than the feedback itself, influence performance (level A)
As previously stated, research has found that although feedback generally improves performance, in more than one third of the studies, feedback lowered performance. Several theoretical models propose that employees’ reactions to feedback likely determine the extent to which they will use it to improve performance (for example, Ilgen et al 1979, Murphy and Cleveland 1995). Employees have several behavioural options when confronted with a discrepancy between what they wish to achieve and the appraisal feedback received. For example, they can accept the feedback and put in more effort to improve their performance, but they can also reject the feedback, feel angry and/or disappointed, and shift their attention away from their tasks. In the meta-analysis by Kluger and DeNisi (1996), it was found that the last option is likely when the feedback threatens an employee’s self-esteem. A similar result is reported in the meta-analysis by Smither et al (2005): employees who express positive emotions immediately after receiving feedback show higher performance ratings, but those who express negative emotions show lower performance ratings.

2 Personality variables moderate reaction to the feedback (level n/a)9
There is no doubt that personality variables moderate the reaction to (negative) feedback, but they fall outside the focus of this REA. Among the personality variables that are known to be involved in the reaction to feedback are self-esteem (for example, Ilgen et al 1979), locus of control (for example, Ilgen et al 1979), tendency for cognitive interference (Kuhl 1992, Mikulincer 1989), altruism (Korsgaard et al 1994) and openness to feedback (Smither et al 2005).

3 The perceived fairness of the performance appraisal process has a medium to large moderating effect on future performance (level A)
A fair process is widely regarded as a prerequisite for the effectiveness of performance appraisal, a construct that in academia is often referred to as procedural justice. This reflects ‘the perceived fairness of decision-making processes and the degree to which they are consistent, accurate, unbiased, and open to voice and input’ (Colquitt et al 2013). Empirical research has demonstrated that when procedures are perceived as fair, reactions are favourable, largely irrespective of the outcome. This interaction effect is called the fair process effect and has been shown empirically in several studies in different contexts (for a review, see Brockner and Wiesenfeld 1996).

Surprisingly, this REA yielded only two studies that directly examined the relationship between the perceived fairness of the performance appraisal procedure and future performance. A before–after study found that performance appraisal incorporating the principles of fairness and due process tends to positively affect employees’ reactions to feedback and their resulting overall job performance (Jawahar 2010). In addition, a recent randomised controlled study confirmed this finding and demonstrated that employees’ perceptions of fairness had an effect on the relationship between feedback and overall task performance (Budworth et al 2015).
and Cleveland 1995). Employees

However, as mentioned above, several meta-analyses have demonstrated that perceived procedural justice more generally has a medium to large moderating effect on organisational outcomes, such as performance,10 productivity, satisfaction and commitment (Cohen-Charash and Spector 2001, Viswesvaran et al 2002).

4 Both rating format and rating method have small to large moderating effects on perceived fairness, self-efficacy and ability to improve (level A)
A randomised controlled study (Bartol et al 2001) found that rating segmentation (that is, the number of alternative appraisal categories available for rating employee performance) affects employees’ perception of fairness. More specifically, moderate segmentation (five categories) resulted in higher self-efficacy regarding employees’ ability to improve their performance and higher goals than a low segmentation (three categories). Another randomised controlled study demonstrated that a substantially lower degree of fairness was reported when a forced distribution rating system was used for administrative purposes, especially when there was reduced variability in ratees’ (actual) task performance (Schleicher et al 2009).

5 Feedback perceived as useful improves perceptions of fairness (level B*)
A four-year longitudinal controlled study found that feedback interviews perceived as useful improve perceptions of fairness. In contrast, when such interviews are perceived to be unhelpful, the impact on justice perception is negative (Linna et al 2012).

6 Negative feedback adversely affects perceived fairness (level C*), whereas feedback that focuses only on positive aspects has a medium positive effect on both perceived fairness and overall job performance (level A*)
The outcome of a longitudinal study suggests that employees who receive negative performance-appraisal feedback report lower perceptions of fairness. This effect even persists six months after the performance appraisal (Lam et al 2002). In addition, a recent randomised controlled study demonstrates that employees who receive feedback that focuses only on positive aspects (such as the employee’s strength and accomplishments)11 perform significantly better on the job four months later than employees who receive a traditional performance appraisal interview (Budworth et al 2015).

7 Participation has a medium to large moderating effect on perceived fairness (level B)
A meta-analysis of 32 studies (Cawley et al 1998) suggests that participation in the performance-appraisal process has a large, positive effect on perceived fairness of the appraisal, perceived utility and motivation to improve after the appraisal. This effect was stronger for value-expressive participation (that is, for the sake of having one’s ‘voice’ heard) than for instrumental participation (that is, for the purpose of influencing the end result). This finding is consistent with a large meta-analysis of more than 200 studies, demonstrating that voice has a medium positive effect on an employee’s perception of procedural justice (Colquitt et al 2001).

8 The quality of the relationship between manager and employee has a substantial moderating effect on the perceived fairness of the performance appraisal (level B)
A recent meta-analysis of 69 studies demonstrates that the quality of the relationship between the manager and the employee is strongly related to the employee’s reaction to the performance appraisal (Pichler 2012). In addition, a longitudinal study finds that the quality of leader–member exchange (LMX) is a strong predictor for perceived fairness (Elicker et al 2006). LMX theory states that managers often have a special relationship with an inner circle of trusted employees, to whom they give higher levels of responsibility, decision influence and access to resources. In return, these employees work harder and are more committed to task objectives. The findings suggest that they are more likely to perceive the performance appraisal as fair.

Question 5: What is known about the reliability and validity of performance appraisal?

Performance appraisal is assumed to improve individual performance and organisational outcomes. To do this, it is essential that performance ratings are accurate and unbiased. However, there is a substantial body of research demonstrating that the accuracy of performance ratings can be influenced by a large number of different factors, bringing about poor rating quality and, as a result, affecting the reliability and validity of performance appraisal. The reliability and validity of performance ratings, also referred to as rating accuracy, have traditionally been assessed within the following three categories:

Figure 4: Rating accuracy
[Diagram of the chain Past performance, Rating, Judgement, Feedback, annotated with rater-centric errors, ratee-centric errors and system-centric errors]

1 rater-centric rating errors (errors in judgement that occur when a person evaluates another person’s performance, for example rater bias and contrast effects)
2 ratee-centric rating errors (errors in judgement that occur because the person being evaluated deliberately influences the rater’s perception)
3 system-centric rating errors (errors in judgement that are due to flawed procedures or inaccurate rating scales).

Rater-centric rating errors

1 Employees’ contextual performance has a large positive effect on job performance ratings (level A)
Although task performance has been the traditional focus of research, individual work performance is considered to be more than meeting prescribed work goals. Researchers therefore distinguish an additional dimension of performance that is referred to as ‘contextual’ performance: extra-role behaviours in which employees go beyond their formal job requirements, such as taking on extra tasks, showing initiative or helping colleagues. Although several labels for this type of performance exist (for example, organisational citizenship behaviour, extra-role performance or non-job-specific task performance), all refer to types of behaviour that go beyond the formally prescribed work goals (Koopmans et al 2011). Several controlled studies and meta-analyses have demonstrated that an employee’s contextual performance influences the perception of their overall job performance by their managers and, as a result, boosts performance ratings. Employees who voluntarily help others with work-related problems, make constructive suggestions to improve the efficiency of work processes, or co-operate with others to serve the interests of the organisation therefore tend to receive substantially higher performance ratings (Podsakoff et al 2013).

2 Managers’ implicit person theory regarding the malleability of personal attributes has a large effect on how they rate their employees (level A*)
Implicit person theory (IPT) concerns a person’s implicit beliefs about the malleability of personal attributes. Put differently, some people implicitly believe that personal attributes such as ability or behaviour are largely fixed, whereas others think that personal attributes can change or develop over time. A randomised placebo-controlled study finds that managers who implicitly believe that personal attributes are fixed tend to give lower ratings for good performance when their employees have previously been given a negative performance rating (Heslin et al 2005). This finding suggests that such managers tend to pay less attention to the actual performance of an employee once they have formed an impression. Conversely, managers who implicitly believe that personal attributes can change or develop tend to give higher ratings for good performance when their employees have previously been given a negative performance rating, suggesting that they base the grading on a more conscientious consideration of the performance (instead of their initial impression). This finding explains why some managers acknowledge an improvement in an employee’s performance more than others.

3 Managers’ power level has a large to moderate effect on how they rate both others and themselves (level A)
A meta-analysis of 46 studies indicates that as a manager’s power level grows, their evaluation

of others becomes increasingly negative, whereas evaluations of themselves become ever more positive (Georgesen and Harris 1998). This finding suggests that performance evaluations by supervisors should be considered in light of their hierarchical position and power level.

4 Rater training has medium to large positive effects on rating accuracy (level A)
A meta-analysis of 29 studies demonstrates that rater training may have positive effects on rating accuracy (Woehr and Huffcutt 1994). Rater-error training and performance-dimension training both appear to be moderately effective at reducing halo error (the tendency to make inappropriate generalisations based on one aspect of a person’s job performance) and somewhat less effective with respect to leniency (the tendency to evaluate all employees as outstanding and to give inflated ratings rather than true assessments of performance). Frame-of-reference training and behavioural-observation training appear to be the most effective (single) types of training, with results indicating a large positive effect on rating accuracy. A combination of different types of rater training showed mixed effects.

5 Male employees who experience a conflict between family and work receive lower performance ratings (level A*)
A randomised controlled study demonstrates that men who are involved in family caretaking events that result in an absence from the workplace receive lower overall performance ratings and lower reward recommendations than men who do not, whereas ratings of women are unaffected (Butler and Skattebo 2004). The gender of the rater(s) does not moderate the sex bias.

6 The outcome of managers’ own performance appraisal has a large effect on the way in which they evaluate their employees (level A)
A combination of studies (including a randomised controlled study) demonstrated that managers who receive positive feedback about their performance subsequently rate their employees significantly higher than managers who receive negative feedback regarding their own performance (Latham et al 2008). Surprisingly, this effect even occurred when managers knew their own evaluation was bogus.

7 Introverted employees evaluate their extroverted and disagreeable colleagues’ performance substantially lower (level A)
A cross-sectional study suggests that introverted (but not extroverted) peers consistently evaluate extroverted and disagreeable (but not introverted and agreeable) colleagues’ performance as lower (Erez et al 2015). This finding is replicated by the same researchers in a randomised controlled study, which in addition demonstrated that introverts’ sensitivity to the personal traits of other people and general impressions mediate this effect. This finding suggests that employees high in extroversion and disagreeableness12 should be made aware that their trait-relevant behaviour may have a profoundly negative impact on how they are perceived by their introverted colleagues, which as a result may lead to reduced performance evaluation for collective accomplishments.

8 Whether or not an employee was hired or recommended by the rater has a large effect on performance rating (level A*)
Results from a randomised controlled study show that ratings are upwardly biased when participants rate the performance of a person they had originally selected (Slaughter and Greguras 2008). This finding suggests that supervisors who are responsible for hiring employees rate them more favourably than (a) candidates whose pre-selection information they never viewed and (b) candidates they did not recommend for hiring. This bias in evaluations may unfairly reward or promote some employees, in that such rewards are not based on actual performance.

9 Rater liking has small to moderate effects on performance rating (level B)
A recent meta-analysis of 40 studies demonstrates that raters tend to evaluate those they like substantially more positively than those they dislike (Sutton et al 2013). The relationship between likeability and performance ratings, however, was weaker for ratings of organisational citizenship behaviour (OCB) than for ratings of task performance. In addition, the degree of likeability was more strongly related to supervisor and subordinate ratings than to peer ratings. Surprisingly, the relationship between likeability and performance ratings was not moderated by the purpose (for example administrative vs developmental) of the rating. These findings suggest that rater liking, for good or ill, appears to play a key role in the performance-rating process.

10 A rater’s personality13 has small to medium effects on performance rating (level A/B)
A recent meta-analysis indicates that personality traits such as agreeableness, extroversion and emotional stability have a small to moderate positive effect on performance ratings (Harari et al 2015). In addition, a randomised controlled study shows that highly agreeable individuals tend to be

more lenient when rating people with poor performance, especially when they anticipate feedback/future collaboration (Randall and Sharples 2012). A controlled study, however, indicates that higher levels of conscientiousness are associated with lower performance ratings (Spence and Keeping 2010). In the meta-analysis the effect of rater personality is moderated by both purpose (attenuated when ratings are collected for administrative purposes and strengthened when ratings are collected for developmental purposes) and accountability (attenuated when accountability is high and strengthened when accountability is low). Cumulatively, raters’ personality traits account for between 6% and 22% of the variance in performance ratings.

11 Gender bias has small to moderate effects on performance rating (level B)
A meta-analysis of 32 studies shows little evidence of overall gender bias in performance appraisals in actual work settings (Bowen et al 2000). However, when only men served as raters, there were substantial pro-male biases. When the raters were a mix of men and women, the latter rated slightly more highly. In addition, measures considered masculine (for example leadership, implementation) produced a pro-male bias, while measures viewed as feminine (for example communication, interpersonal sensitivity) produced a pro-female bias.

12 Race bias has a small effect on performance rating (level C)
A meta-analysis of 74 studies indicates that white raters assigned higher ratings to white ratees than to black ratees. Black raters also assigned higher ratings to black ratees than to white ratees (Kraiger and Ford 1985). This race bias tended to be more likely in organisational settings where black employees composed a small percentage of the workforce.

13 There is a small positive effect of disability on performance rating for employees with a disability (level A+)
A meta-analysis of 13 controlled studies shows an overall (small) positive effect of disability on performance evaluations of people with disabilities (Ren et al 2008). However, an overall (small) negative effect was found on both performance expectations and hiring decisions for people with disabilities.

Ratee-centric rating errors

1 Employees’ tactics for influencing raters such as ingratiation or self-promotion have a moderate effect on performance ratings (level C)
Several studies find that employees deliberately or unconsciously try to influence their manager to achieve a higher performance rating, especially when they face job insecurity (Huang et al 2013). Whereas theory has specified a number of types of tactics designed to influence, most of the research (Gordon 1996, Higgins et al 2003) tends to focus on the tactics of ingratiation (flattery and carrying out favours in order to enhance managerial liking) and self-promotion (appearing competent on the job and making managers aware of one’s performance). A longitudinal study, however, indicates that self-promotion tactics have a negative effect on a manager’s liking of the employee, which tends to result in a lower performance rating, whereas ingratiation tactics have a positive effect on a manager’s liking of an employee, and subsequently result in a higher performance rating (Dulebohn et al 2004).

2 Employees’ organisational citizenship behaviour has a moderate to large positive effect on performance ratings (level A)
A recent meta-analysis based on 81 studies and a combined sample size of more than 31,000 employees indicates that organisational citizenship behaviour (OCB) has a medium to large effect on managers’ evaluation of job performance (Podsakoff et al 2013). This effect is assumed to occur because: (a) managers may provide higher evaluations to employees exhibiting OCBs as a form of reciprocity (Podsakoff et al 1993); (b) OCBs are interpreted as behavioural manifestations of commitment and/or loyalty (Allen and Rush 2001); and (c) managers tend to like these individuals more (Lefkowitz 2000). Although this effect can be regarded as a form of rater bias, it is categorised as a ratee-centric effect, as OCB is sometimes used by employees to positively influence the outcome of the performance appraisal (Dulebohn et al 2005).

3 Employees’ political skills have a small positive effect on performance ratings (level C*)
A recent longitudinal study suggests that an employee’s ability to effectively understand others at work, and use such knowledge to influence others to act in ways that enhance their personal and/or organisational objectives, is positively related to performance rating (Hung et al 2012). This ability, also referred to as political skills, was even found to moderate negative effects of employee behaviour. For instance, it was found that employees’ voice behaviour (proactively challenging the status quo and making constructive suggestions) may have a negative effect on performance ratings. This negative effect, however,

was found to be moderated by an employee’s political skills, suggesting that employees who have low levels of political skill are likely to experience the negative effects of voice behaviour on their performance ratings.

System-centric rating errors

1 The purpose of the performance appraisal moderates the performance rating (level A)
More than 60 years ago, Taylor and Wherry (1951) hypothesised that performance-appraisal ratings for administrative purposes, such as pay rises or promotions, would be more lenient than ratings obtained for employee-development purposes. Over the past decades this hypothesis has been confirmed in a large number of studies. A meta-analysis of 22 studies demonstrates that performance-appraisal ratings obtained for administrative purposes are, on average, one third of a standard deviation larger than those obtained for employee-development purposes (Jawahar and Williams 1997). In fact, randomised controlled studies have demonstrated that managers tend to apply a different decision process when making performance-evaluation decisions for administrative reasons (Pesta et al 2005). When presented with a performance-appraisal judgement for developmental reasons, managers are more likely to consider all the performance-related behaviours of the employee, and will use examples of both bad performance and good performance to reach their conclusions. When making promotion-related decisions, however, managers tend to consider only examples of poor performance and use a threshold to make the yes/no decision. As a result, employees may perceive the evaluative decision as less accurate and thus as unfair.

2 The reliability of individual performance measures depends on the type of measurement, the source of measurement and the level of job complexity (level A/B)
Meta-analyses have demonstrated that, in general, subjective and objective measures of employee performance are not interchangeable (Bommer et al 1995). For instance, in cases of complex jobs, objective measures lack test–retest reliability (Sturman et al 2005). In addition, it was found that employees’ self-ratings tend to be higher than the ratings of managers and peers (Harris and Schaubroeck 1988, Heidemeijer and Moser 2009). This finding confirms the outcome of previous research, indicating that employees tend to overestimate their own level of performance relative to that of others in the organisation (Harris and Schaubroeck 1988, Mabe and West 1982). Finally, several meta-analyses demonstrate that the inter-rater reliability of (subjective) performance ratings by peers tends to be lower than the reliability of ratings by managers, and that these reliabilities are even lower for complex jobs (Conway and Huffcutt 1997, Heidemeijer and Moser 2009) or for job dimensions that are difficult to measure (for example leadership, interpersonal competence) (Viswesvaran et al 1996, 2002).

3 Both rating format and rating method have small to large effects on the rating (level A/B)
A meta-analysis of 23 studies demonstrates that the correlation between (subjective) ratings of managers and (objective) performance measures is higher when a relative rating format is used (comparing the employee with other employees) than when an absolute rating format is used (comparing the employee with a standard). In addition, a higher correlation is found when a composite rating method is used (a rating made on a multi-item scale with scores averaged to a final grade) than when an overall rating method is used (a rating made on a one-item scale) (Heneman 1986). Finally, a recent randomised controlled study demonstrates that performance ratings based on the consensus of multiple raters tend to be more accurate than individual ratings (Picardi 2015).

4 The medium used to report the outcome of the performance appraisal has a large effect on the rating (level A*)
A randomised controlled study demonstrates that raters tend to give more negative appraisals when using email than when using traditional paper-form methods (Kurtzberg et al 2005). In addition, a controlled study shows that employees evaluated with traditional paper-form methods report higher levels of quality for the ratings than employees evaluated with an online system (Payne et al 2009).

5 Accountability substantially affects both rating outcomes and rating accuracy (level A)
Several randomised controlled studies demonstrate that rating outcomes are affected when raters are (or feel) accountable for their rating. For instance, raters whose rating is to be checked by an expert provide substantially lower ratings relative to control raters (Roch 2005, Roch and McNall 2007). However, the opposite effect is found when managers have to justify their rating in a face-to-face meeting with the employee; these managers rate their employees substantially more positively (Klimoski and Inks 1990, Spence and Keeping 2010). Finally, it has been found (Palmer and Feldman 2005) that (perceived) accountability has positive effects on rating accuracy, such as contrast

effects (the tendency for a rater to evaluate a person relative to other individuals rather than on-the-job requirements, see also below) and halo effects (the tendency to make inappropriate generalisations from one aspect of a person’s job performance).

6 Contrast effects affect rating accuracy, but not always in a negative way (level A)
A contrast effect occurs when the performance of one employee (the anchor) has an effect on the evaluation of a subsequent performance of that employee or another employee (the target). For example, if the anchor performance is poor and the target performance is average, the contrast effect drives the evaluation of the target performance in a positive direction. The finding of contrast effects is particularly robust in situations in which the anchor performance is either very poor or very good and the target performance is average (Smither et al 1988). Some researchers claim that contrast effects are an important source of rating error (Rowe 1967), whereas others maintain that the implications of contrast effects for actual rating situations are negligible (Hakel et al 1970). In fact, a randomised controlled study showed that the relationship between contrast effects and rating accuracy can even be positive,14 suggesting that not all rater-centric errors are an indicator of the quality of the rating (Becker and Miller 2002).
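
Several findings in this section are expressed as standardised effect sizes, for example the 'one third of a standard deviation' difference between administrative and developmental ratings (Jawahar and Williams 1997). As a minimal sketch of what such a figure means, the Python below computes a standardised mean difference (Cohen's d with a pooled standard deviation) for two small sets of ratings. The ratings and the function name are invented for illustration only; they are not data or methods taken from any study cited in this report.

```python
from math import sqrt
from statistics import mean, variance


def standardised_mean_difference(group_a, group_b):
    """Cohen's d: difference in means divided by the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two ratings")
    # Pooled variance weights each group's sample variance by its degrees of freedom.
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)


# Hypothetical ratings on a 10-point scale (illustrative values, not study data).
administrative_ratings = [4, 7, 10]
developmental_ratings = [3, 6, 9]

d = standardised_mean_difference(administrative_ratings, developmental_ratings)
print(f"standardised mean difference = {d:.2f}")  # about 0.33, one third of a standard deviation
```

In this invented example the two groups differ by one scale point and each has a standard deviation of three points, so the difference amounts to one third of a standard deviation.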

3 Synthesis

Figure 5: Synthesis of findings into the performance appraisal chain

[Figure 5 labels: Purpose; Past performance; Objectives; Rating; Judgement; Feedback; Reaction; Future performance; Rating accuracy; Perceived fairness]

Over the last 50 years, there has been a considerable number of studies on the topic of performance appraisal. In fact, performance appraisal may well be one of the most widely studied topics in the domain of management, with research on it dating back at least as far as the early 1920s, and continuing to the present day. After critically selecting and assessing the available empirical studies, we can conclude that the scientific evidence is rich in both quantity and quality.

In addition, the assumed positive effect of performance appraisal on work performance is grounded in three well-established social theories. However, this REA did not yield any randomised controlled studies that measured the direct effect of performance appraisal on workplace performance. Instead, the best available evidence consists of a large number of high-quality studies that focus on (one or multiple) separate elements of the appraisal process, such as rating, judgement, feedback or perceived fairness. The outcome of these studies is unequivocal and suggests that the relationship between performance appraisal and work performance is contingent upon a large number of moderators and mediators.

The outcome of this REA indicates that employee reaction to feedback is one of the most important mediators in the performance-appraisal process. In fact, there is strong evidence that employees’ reaction to feedback, and not feedback per se, determines the extent to which their performance will improve. How an employee will react to the feedback on their appraisal, however, is strongly moderated by the perceived fairness of the appraisal process: when the procedure is perceived to be just, employee reactions are more likely to be favourable, largely irrespective of the outcome. Perceived fairness, in turn, is moderated by several other variables, of which perceived usefulness, rating method, rating accuracy, focus of the feedback, level of employee participation, and quality of the relationship between manager and employee seem to have the largest impact. A second

factor that moderates employees’ reaction to the feedback comprises personality variables such as self-esteem, locus of control and openness to feedback.

In addition to these mediating and moderating factors, the outcome of this REA indicates that there is a large number of variables that potentially affect the accuracy of the performance rating, which as a result may threaten the validity of the appraisal outcome (judgement) and seriously affect the fairness of the appraisal process. There is strong evidence that when a person evaluates another person’s performance, systematic errors in judgement occur. Raters are often biased by a wide range of variables, such as implicit person theory, power level, relationship with the ratee, personality traits, gender, and the outcome of their own performance appraisal. Raters, however, cannot be held responsible for all these errors. The evidence clearly indicates that ratees actively attempt to change a rater’s judgement by using influence tactics such as ingratiation, self-promotion and political skills. In addition, errors in judgement occur because of the way in which the performance appraisal is conducted (procedures) or the performance is evaluated (measurement). Variables such as type and source of measurement, rating method, as well as aspects such as accountability and job complexity, may all affect the accuracy of the performance rating and, as a result, affect the validity and fairness of the appraisal process.

Finally, there is strong evidence that the purpose of the performance appraisal (administrative versus developmental) strongly affects the way in which raters evaluate and judge a person’s performance. In fact, the evidence indicates that when making a performance-appraisal judgement for developmental reasons, raters apply a different decision-making process from when they are making a judgement for administrative reasons. As a result, employees may perceive the evaluative decision as less accurate and thus unfair.
with the ratee, personality traits,

Conclusion

Based on the evidence found, we conclude that performance appraisals can have positive effects on work performance, but that these effects are highly contingent upon a wide range of moderating factors. These include:

1 Reactions to feedback, rather than the feedback itself, influence performance (level A).
2 Personality variables moderate reaction to the feedback (level n/a).15
3 The perceived fairness of the performance appraisal process moderates the impact on future performance (medium to large effect; level A).
4 Both rating format and rating method moderate the effect on perceived fairness, self-efficacy and ability to improve (small to large effect; level A).
5 Feedback perceived as useful improves perceptions of fairness (level B*).
6 Negative feedback adversely affects perceived fairness (level C*), whereas feedback that focuses only on positive aspects contributes to perceived fairness and overall job performance (medium effect; level A*).
7 Participation contributes to perceived fairness (medium to large effect; level B).
8 The quality of the relationship between manager and employee contributes to the perceived fairness of the appraisal (substantial effect; level B).

We also identify a range of factors that can undermine or strengthen the reliability of performance measures in appraisals. These are grouped into three categories. First, rater-centric errors include:

1 Employees’ contextual performance influences their performance ratings (large effect; level A).
2 Managers’ implicit person theory regarding the malleability of personal attributes influences how they rate their employees (large effect; level A*).
3 Managers’ power level influences how they rate both others and themselves (large to moderate effect; level A).
4 Rater training contributes to rating accuracy (medium to large effect; level A).
5 Male employees who experience a conflict between family and work receive lower performance ratings (level A*).
6 The outcome of managers’ own performance appraisal influences how they evaluate their employees (large effect; level A).
7 Introverted employees evaluate their extroverted and disagreeable colleagues’ performance lower (large effect; level A).
8 Whether or not an employee was hired or recommended by the rater influences the performance rating (large effect; level A*).
9 If raters like people, they rate their performance higher (small to moderate effect; level B).
10 A rater’s personality16 influences performance ratings (small to medium effect; level A/B).
11 Gender bias influences performance ratings (small to moderate effect; level B).
12 Race bias influences performance ratings (small effect; level C).
13 The performance of employees with a disability is rated higher (small effect; level A+).

Ratee-centric rating errors include:

1 Employees’ tactics for influencing raters, such as ingratiation or self-promotion, affect performance ratings (moderate effect; level C).
2 Employees’ organisational citizenship behaviour contributes to performance ratings (moderate to large effect; level A).
3 Employees’ political skills contribute to performance ratings (small effect; level C*).

Finally, system-centric rating errors include:

1 The purpose of the appraisal moderates the performance rating (level A).
2 The type of measurement, source of measurement and level of job complexity affect the reliability of performance measures (level A/B).
3 Both rating format and rating method influence how positive ratings are (small to large effect; level A/B).
4 The medium used to report the outcome of the appraisal influences the performance rating (large effect; level A*).
5 Accountability affects both rating outcomes and rating accuracy (large effect; level A).
6 Contrast effects affect rating accuracy, but not always in a negative way (level A).

Limitations

This REA aims to provide a balanced assessment of what is known in the scientific literature about the effects of performance appraisal on individual work performance by using the systematic review method to search and critically appraise empirical studies. However, in order to be ‘rapid’, concessions were made in relation to the breadth and depth of the search process, such as the exclusion of unpublished studies, the use of a limited number of databases and a focus on empirical research published in the period 1980 to 2016 for meta-analyses and the period 2000 to 2016 for primary studies. In addition, the search for empirical studies was based only on terms such as ‘performance appraisal’, ‘performance review’, ‘employee evaluation’, and so on, and related terms such as ‘feedback’, ‘judgement’, or ‘perceived fairness’ were not included. As a consequence, some relevant studies may have been missed.

A second limitation concerns the critical appraisal of the studies included, which did not incorporate a comprehensive review of the psychometric properties of the tests, scales and questionnaires used. In addition, it should be noted that most of the studies included used performance ratings as an outcome measure, not actual performance, so the evidence is often indirect.

A third limitation concerns the fact that the evidence on several moderators is based on only one study (findings marked with an asterisk). Although most of these studies were well controlled or even randomised, no single study can be considered to be strong evidence; it is merely indicative.

Finally, this REA focused only on high-quality studies, that is, studies with a control group and/or a before-and-after measurement. For this reason, more than 50 cross-sectional studies were excluded. As a consequence, new, promising findings that are relevant for practice may have been missed.

Given these limitations, care must be taken not to treat the findings presented in this REA as conclusive.

Endnotes

1 All reports are available at cipd.co.uk/coulddobetter


2 See www.cipd.co.uk/knowledge/journals
3 It should be noted that randomised controlled studies are often conducted in an artificial (lab-type) setting – with students carrying out prescribed work tasks
– which may restrict their generalisability. Non-randomised studies in a field setting – with employees carrying out their normal tasks within an organisational
setting – on the other hand, have a lower level of trustworthiness, but can still be useful for management practice.
4 In a meta-analysis, statistical analysis techniques are used to pool the results of individual studies numerically in order to achieve a more accurate estimate
of the effect. Most studies defined as systematic reviews include a meta-analysis. The difference between a systematic review and a meta-analysis is therefore
mainly semantic. Indeed, in medicine a meta-analysis is often called a systematic review.
5 It should be noted that in feedback intervention theory, the definition of ‘performance’ includes a wide spectrum of tasks, such as physical tasks, cognitive
tasks, complying with regulations and so forth.
6 It would, of course, make sense to differentiate between the performance of the individual, team and organisation. However, in most of the studies included,
only individual performance is taken into account.
7 One may ask what type of performance these studies measured. After all, in performance-appraisal programmes, ratings can be collected on many
performance dimensions (for example communication, risk-taking, customer focus, sales, decisiveness, and so on). Both meta-analyses included a large
number of primary studies that examined the effect of feedback interventions on performance. In the meta-analysis by Smither et al, several studies reported
separate scores for multiple dimensions, but these scores turned out to be highly correlated. In the meta-analyses by Kluger and DeNisi, a wide range of
performance dimensions were taken into account (for example physical tasks, knowledge tasks, creativity, new tasks, rule-following, vigilance tasks, quality vs
quantity, and so on), but only stronger effects were found for memory tasks and weaker effects for physical tasks and adherence to regulations. This suggests
that the findings of these meta-analyses may apply to a wide range of performance outcomes.
8 A moderator is a variable that affects the direction and/or strength of the relation between an independent or predictor variable (in this case performance
appraisal) and an outcome variable (work performance). Put differently, moderators indicate when or under what conditions a particular effect can be
expected. For this reason, they are also referred to as ‘boundary conditions’. A mediator, by contrast, is a variable that specifies how or why a particular effect
or relationship occurs. Thus, if you remove the effect of the mediator, the relationship between the independent or predictor variable (in this case performance
appraisal) and the outcome variable (work performance) will no longer exist. In short, moderators specify when a certain effect will hold, whereas mediators
determine how or why the effect occurs. The moderators and mediators are presented here in order of evidence quality and effect size, with the highest-quality
evidence and greatest effect first.
9 The studies mentioned here are not included in this REA, so their quality was not evaluated.
10 Because of the large number of contexts in which perceived fairness was studied, the studies included in these meta-analyses used a wide variety of
performance measures that came from various sources. For example, the measure of work performance included official performance ratings as they appeared
in organisational files, and in-role behaviour ratings. A separate analysis was conducted for organisational citizenship behaviour.
11 This type of feedback is also known as ‘feedforward’ (see Kluger and Nir 2010).
12 Extroversion and agreeableness are two separate personality traits. Extroversion tends to be manifested in outgoing, talkative, energetic behaviour,
whereas introversion is manifested in a more reserved and solitary approach. Agreeableness is a personality trait manifesting itself in individual behavioural
characteristics that are perceived as kind, sympathetic, co-operative, warm and considerate.
13 One may wonder how personality can be measured. There is a wide consensus among researchers that the best way to describe personality is by using the
‘Big Five’ personality traits. These are: openness (inventive/curious vs consistent/cautious), conscientiousness (efficient/organised vs easy-going/careless),
extroversion (outgoing/energetic vs solitary/reserved), agreeableness (friendly/compassionate vs analytical/detached) and neuroticism (sensitive/nervous vs
secure/confident). The validity of the Big Five factors has been replicated numerous times in different languages and cultural contexts, and several validated
measurement tools exist (for example, the NEO-PI-R). Other measurements of personality, such as the Myers-Briggs Type Indicator (MBTI), are widely used,
but their stability, validity and reliability are questionable at best.
14 To date, no plausible explanation for this finding has been put forward.
15 The studies mentioned here are not included in this REA, so their quality was not evaluated.
16 See note 13 above.
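
The distinction drawn in note 8 between moderators and mediators can be illustrated with a small numerical sketch. The data below are invented purely for illustration and are not taken from any study in this REA; the variable names (`appraisal`, `mediator`, `performance`) and the helper functions are hypothetical. In the first part, the slope relating appraisal to performance differs between two levels of a moderator. In the second part, the relationship between appraisal and performance disappears once the mediator is controlled for, which is the full-mediation case described in the note.

```python
def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def residuals(xs, ys):
    """Part of ys that a straight-line fit on xs does not explain."""
    b = slope(xs, ys)
    a = sum(ys) / len(ys) - b * sum(xs) / len(xs)
    return [y - (a + b * x) for x, y in zip(xs, ys)]


# Invented predictor scores (illustrative only).
appraisal = [0, 1, 2, 3]

# Moderation: the same predictor has a different slope at each moderator level.
performance_low_moderator = [0, 1, 2, 3]
performance_high_moderator = [0, 3, 6, 9]
slope_low = slope(appraisal, performance_low_moderator)
slope_high = slope(appraisal, performance_high_moderator)

# Mediation: performance depends on appraisal only through the mediator.
mediator = [1, 0, 1, 4]
performance = [2 * m for m in mediator]

# Relationship between appraisal and performance, ignoring the mediator.
total_effect = slope(appraisal, performance)

# Relationship left over after the mediator is partialled out of both variables.
direct_effect = slope(residuals(mediator, appraisal),
                      residuals(mediator, performance))

print(slope_low, slope_high)
print(total_effect, direct_effect)
```

With these invented numbers the slope is three times larger at the high moderator level, while the appraisal-performance relationship falls to zero once the mediator is taken into account.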

References

ADAMS, J.S. (1965) Inequity in social exchange. Advances in Experimental Social Psychology. Vol 2. pp267–99.

ALLEN, T.D., and RUSH, M.C. (2001) The influence of ratee gender on ratings of organizational citizenship behavior. Journal of Applied Social Psychology. Vol 31, No 12. pp2561–87.

BARENDS, E., JANSSEN, B. and VELGHE, C. (2016) Rapid evidence assessment of the research literature on the effect of goal setting on workplace performance. London: Chartered Institute of Personnel and Development. Available at: www.cipd.co.uk/coulddobetter. [Accessed 22 November 2016].

BARTOL, K.M., DURHAM, C.C. and POON, J.M. (2001) Influence of performance evaluation rating segmentation on motivation and fairness perceptions. Journal of Applied Psychology. Vol 86, No 6. pp1106–19.

BECKER, G. and MILLER, C. (2002) Examining contrast effects in performance appraisals: using appropriate controls and assessing accuracy. Journal of Psychology. Vol 136, No 6. pp667–83.

BOMMER, W.H., JOHNSON, J.L., RICH, G.A., PODSAKOFF, P.M. and MACKENZIE, S.B. (1995) On the interchangeability of objective and subjective measures of employee performance: a meta-analysis. Personnel Psychology. Vol 48, No 3. pp587–605.

BOWEN, C.-C., SWIM, J.K. and JACOBS, R.R. (2000) Evaluating gender biases on actual job performance of real people: a meta-analysis. Journal of Applied Social Psychology. Vol 30, No 10. pp2194–2215.

BROCKNER, J. and WIESENFELD, B.M. (1996) An integrative framework for explaining reactions to decisions: interactive effects of outcomes and procedures. Psychological Bulletin. Vol 120, No 2. p189.

BUDWORTH, M.H., LATHAM, G.P. and MANROOP, L. (2015) Looking forward to performance improvement: a field test of the feedforward interview for performance management. Human Resource Management. Vol 54, No 1. pp45–54.

BUTLER, A.B. and SKATTEBO, A. (2004) What is acceptable for women may not be for men: the effect of family conflicts with work on job-performance ratings. Journal of Occupational and Organizational Psychology. Vol 77. pp553–64.

CAWLEY, B.D., KEEPING, L.M. and LEVY, P.E. (1998) Participation in the performance appraisal process and employee reactions: a meta-analytic review of field investigations. Journal of Applied Psychology. Vol 83, No 3. pp615–33.

COHEN, J. (1988) Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.

COHEN-CHARASH, Y. and SPECTOR, P.E. (2001) The role of justice in organizations: a meta-analysis. Organizational Behavior and Human Decision Processes. Vol 86, No 2. pp278–321.

COLQUITT, J.A., CONLON, D.E., WESSON, M.J., PORTER, C.O. and NG, K.Y. (2001) Justice at the millennium: a meta-analytic review of 25 years of organizational justice research. Journal of Applied Psychology. Vol 86, No 3. p425.

COLQUITT, J.A., SCOTT, B.A., RODELL, J.B., LONG, D.M., ZAPATA, C.P., CONLON, D.E. and WESSON, M.J. (2013) Justice at the millennium, a decade later: a meta-analytic test of social exchange and affect-based perspectives. Journal of Applied Psychology. Vol 98, No 2. p199.

CONWAY, J.M. and HUFFCUTT, A.I. (1997) Psychometric properties of multisource performance ratings: a meta-analysis of subordinate, supervisor, peer, and self-ratings. Human Performance. Vol 10, No 4. pp331–60.

DONNELLY, J. and TROCHIM, W. (2007) The research methods knowledge base. Mason, Ohio: Atomic Dog Publishing.

DENISI, A.S. and PRITCHARD, R.D. (2006) Performance appraisal, performance management and improving individual performance: a motivational framework. Management and Organization Review. Vol 2, No 2. pp253–77.

DULEBOHN, J.H., SHORE, L.M., KUNZE, M. and DOOKERAN, D. (2005) The differential impact of OCBS and influence tactics on leader reward behavior and performance ratings over time. Organizational Analysis. Vol 13, No 1. pp73–90.

DULEBOHN, J.H., MURRAY, B. and FERRIS, G.R. (2004) The vicious and virtuous cycles of influence tactic use and performance evaluation outcomes. Organizational Analysis. Vol 12, No 1. pp53–74.

ELICKER, J.D., LEVY, P.E. and HALL, R.J. (2006) The role of leader-member exchange in the performance appraisal process. Journal of Management. Vol 32, No 4. pp531–51.

EREZ, A., SCHILPZAND, P., LEAVITT, K., WOOLUM, A.H. and JUDGE, T.A. (2015) Inherently relational: interactions between peers’ and individual’s personalities impact reward giving and appraisal of individual performance. Academy of Management Journal. Vol 58, No 6. p1761.

FESTINGER, L. (1954) A theory of social comparison processes. Human Relations. Vol 7, No 2. pp117–140.

FOLGER, R., KONOVSKY, M.A. and CROPANZANO, R. (1992) A due process metaphor for performance appraisal. Research in Organizational Behavior. Vol 14. p129.

GEORGESEN, J.C. and HARRIS, M.J. (1998) Why’s my boss always holding me down? A meta-analysis of power effects on performance evaluations. Personality and Social Psychology Review. Vol 2, No 3. pp184–95.

GIFFORD, J. (2016) In search of the best available evidence. London: Chartered Institute of Personnel and Development. Available at: www.cipd.co.uk/knowledge/strategy/analytics [Accessed 23 November 2016].

GORDON, R.A. (1996) Impact of ingratiation on judgments and evaluations: a meta-analytic investigation. Journal of Personality and Social Psychology. Vol 71. pp54–70.

GRIFFIN, R. and EBERT, R.J. (2004) Business essentials. London: Prentice Hall.

HAKEL, M.D., OHNESORGE, J.P., and DUNNETTE, M.D. (1970) Interviewer evaluations of job applicants’ resumes as a function of the qualifications of the immediately preceding applicants: an examination of contrast effects. Journal of Applied Psychology. Vol 54, No 1. p27.

HALBE, S. (1951) Appraisal of job performance. National Industrial Conference Board. Vol 121.

HARARI, M.B., RUDOLPH, C.W. and LAGINESS, A.J. (2015) Does rater personality matter? A meta-analysis of rater Big Five performance rating relationships. Journal of Occupational and Organizational Psychology. Vol 88, No 2. pp387–414.

HARRIS, M.M. and SCHAUBROECK, J. (1988) A meta-analysis of self-supervisor, self-peer, and peer-supervisor ratings. Personnel Psychology. Vol 41, No 1. pp43–62.

HEIDEMEIJER, H. and MOSER, K. (2009) Self-other agreement in job performance ratings: a meta-analytic test of a process model. Journal of Applied Psychology. Vol 94, No 2. pp353–70.

HENEMAN, R.L. (1986) The relationship between supervisory ratings and results-oriented measures of performance: a meta-analysis. Personnel Psychology. Vol 39, No 4. p811.

HESLIN, P.A., LATHAM, G.P. and VANDEWALLE, D. (2005) The effect of implicit person theory on performance appraisals. Journal of Applied Psychology. Vol 90, No 5. pp842–56.

HIGGINS, C.A., JUDGE, T.A. and FERRIS, G.R. (2003) Influence tactics and work outcomes: a meta-analysis. Journal of Organizational Behavior. Vol 24. pp89–106.

HOFFMAN, B., GORMAN, A., BLAIR, C., MERIAC, J., OVERSTREET, B. and ATCHLEY, K. (2012) Evidence for the effectiveness of an alternative multisource performance rating methodology. Personnel Psychology. Vol 65. pp531–63.

HUANG, G.-H., XIONG-YING, N., ZHAO, H.H., ASHFORD, S.J. and LEE, C. (2013) Reducing job insecurity and increasing performance rating impression management matter? Journal of Applied Psychology. Vol 98, No 5. pp852–62.

HUNG, H.-K., YEH, R.-S. and SHIH, H.-Y. (2012) Voice behavior and performance ratings: the role of political skill. International Journal of Hospitality Management. Vol 31, No 2. pp442–50.

ILGEN, D.R., FISHER, C.D. and TAYLOR, M.S. (1979) Consequences of individual feedback on behavior in organization. Journal of Applied Psychology. Vol 64. pp349–71.

ILGEN, D., BARNES-FARELL, J. and MCKELLIN, D. (1993) Performance appraisal process in the 1980s: what has it contributed to appraisals in use? Organizational Behaviour and Human Decision Processes. Vol 54. pp321–68.

IQBAL, M.Z., AKBAR, S. and BUDHWAR, P. (2015) Effectiveness of performance appraisal: an integrated framework. International Journal of Management Reviews. Vol 17, No 4. pp510–33.

JAWAHAR, I.M. (2010) The mediating role of appraisal feedback reactions on the relationship between rater feedback-related behaviors and ratee performance. Group and Organization Management. Vol 35, No 4. pp494–526.

JAWAHAR, I.M. and WILLIAMS, C. (1997) Where all the children are above average: the performance appraisal purpose effect. Personnel Psychology. Vol 50.

KLIMOSKI, R. and INKS, L. (1990) Accountability forces in performance appraisal. Organizational Behavior and Human Decision Processes. Vol 45. pp194–208.

KLUGER, A.N. and DENISI, A. (1996) The effects of feedback interventions on performance: a historical review, a meta-analysis, and a preliminary feedback intervention theory. Psychological Bulletin. Vol 119, No 2. p254.

KLUGER, A.N. and NIR, D. (2010) The feedforward interview. Human Resource Management Review. Vol 20. pp235–46.

KOOPMANS, L., BERNAARDS, C.M., HILDEBRANDT, V.H., SCHAUFELI, W.B., DE VET HENRICA, C.W. and VAN DER BEEK, A.J. (2011) Conceptual frameworks of individual work performance: a systematic review. Journal of Occupational and Environmental Medicine. Vol 53, No 8. pp856–66.

KORSGAARD, A., MEGLINO, B.M. and LESTER, S.W. (1994, August) The virtue of being altruistic: the role of the value of helping and concern in individuals’ reactions to feedback from others. Paper presented at the 1994 meeting of the Academy of Management, Dallas, TX.

KRAIGER, K. and FORD, J.K. (1985) A meta-analysis of ratee race effects in performance ratings. Journal of Applied Psychology. Vol 70, No 1. pp56–65.

KUHL, J. (1992) A theory of self-regulation: actions vs. state orientation, self-discrimination, and some applications. Applied Psychology: An International Review. Vol 41. pp97–129.

KURTZBERG, T.R., NAQUIN, C.E. and BELKIN, L.Y. (2005) Electronic performance appraisals: the effects of e-mail communication on peer ratings in actual and simulated environments. Organizational Behavior and Human Decision Processes. Vol 98, No 2. pp216–26.

LAM, S., YIK, M. and SCHAUBROECK, J. (2002) Responses to formal performance appraisal feedback: the role of negative affectivity. Journal of Applied Psychology. Vol 87, No 1. pp192–201.

LATHAM, G.P., BUDWORTH, M.-H., YANAR, B. and WHYTE, G. (2008) The influence of a manager’s own performance appraisal on the evaluation of others. International Journal of Selection and Assessment. Vol 16, No 3. pp220–28.

LEFKOWITZ, J. (2000) The role of interpersonal affective regard in supervisory performance ratings: a literature review and proposed causal model. Journal of Occupational and Organizational Psychology. Vol 73. pp67–85.

LINNA, E.M., VAN DEN BOS, K., KIVIMAKI, M., PENTTI, J. and VAHTERA, J. (2012) Can usefulness of performance appraisal interviews change organizational justice perceptions? A 4-year longitudinal study among public sector employees. International Journal of Human Resource Management. Vol 23, No 7. pp1360–75.

MABE, P. and WEST, S. (1982) Validity of self-evaluation of ability: a review and meta-analysis. Journal of Applied Psychology. Vol 67, No 3. pp280–96.

MIKULINCER, M. (1989) Cognitive interference and learned helplessness: the effects of off-task cognitions on performance following unsolvable problems. Journal of Personality and Social Psychology. Vol 57. pp129–35.

MURPHY, K.R. and CLEVELAND, J. (1995) Understanding performance appraisal: social, organizational, and goal-based perspectives. London: Sage.

PALMER, J. and FELDMAN, J. (2005) Accountability and need for cognition effects on contrast, halo and accuracy in performance ratings. Journal of Psychology. Vol 136, No 2. pp119–37.

PAYNE, S.C., HORNER, M.T., BOSWELL, W.R., SCHROEDER, A.N. and STINE-CHEYNE, K.J. (2009) Comparison of online and traditional performance appraisal systems. Journal of Managerial Psychology. Vol 24, No 6. pp526–44.

PESTA, B., KASS, D. and DUNNEGAN, K. (2005) Image theory and the appraisal of employee performance: to screen or not to screen? Journal of Business and Psychology. Vol 19, No 3.

PETTICREW, M. and ROBERTS, H. (2006) How to appraise the studies: an introduction to assessing study quality. In: PETTICREW, M. and ROBERTS, H. (eds) Systematic reviews in the social sciences: a practical guide (pp125–63). Hoboken, NJ: Wiley.

PETTIJOHN, C.E., PETTIJOHN, L.S. and D’AMICO, M. (2001) Characteristics of performance appraisals and their impact on sales force satisfaction. Human Resource Development Quarterly. Vol 12. pp127–46.

PETTIJOHN, C., PETTIJOHN, L.S., TAYLOR, A.J. and KEILLOR, B.D. (2001) Are performance appraisals a bureaucratic exercise or can they be used to enhance sales-force satisfaction and commitment? Psychology and Marketing. Vol 18, No 4. pp337–64.

PICARDI, C. (2015) The effects of multi-rater consensus on performance rating accuracy. Journal of Strategic Human Resource Management. Vol 4, No 2.

PICHLER, S. (2012) The social context of performance appraisal and appraisal reactions: a meta-analysis. Human Resource Management. Vol 51, No 5. pp709–32.

PODSAKOFF, N.P., WHITING, S.W., WELSH, D.T. and MAI, K.M. (2013) Surveying for ‘artifacts’: the susceptibility of the OCB–performance evaluation relationship to common rater, item, and measurement context effects. Journal of Applied Psychology. Vol 98, No 5. pp863–74.

PODSAKOFF, P.M., MACKENZIE, S.B. and HUI, C. (1993) Organizational citizenship behaviors and managerial evaluations of employee performance: a review and suggestions for future research. In: FERRIS, G.R. and ROWLAND, K.M. Research in personnel and human resources management (pp1–40). Greenwich, CT: JAI Press.

RANDALL, R. and SHARPLES, D. (2012) The impact of rater agreeableness and rating context on the evaluation of poor performance. Journal of Occupational and Organizational Psychology. Vol 85, No 1. pp42–59.

REN, L.R., PAETZOLD, R.L. and COLELLA, A. (2008) A meta-analysis of experimental studies on the effects of disability on human resource judgments. Human Resource Management Review. Vol 18, No 3. pp191–203.

ROCH, S. (2005) An investigation of motivational factors influencing performance ratings. Journal of Managerial Psychology. Vol 20, No 8. pp695–711.

ROCH, S. and MCNALL, L. (2007) An investigation of factors influencing accountability and performance ratings. Journal of Psychology. Vol 141, No 5. pp499–523.

ROWE, P.M. (1967) Order effects in assessment decisions. Journal of Applied Psychology. Vol 51, No 2. p170.

ROWLAND, C.A. and HALL, R.D. (2012) Organizational justice and performance: is appraisal fair? EuroMed Journal of Business. Vol 7, No 3. pp280–93.

SCHLEICHER, D.J., BULL, R.A. and GREEN, S.G. (2009) Rater reactions to forced distribution rating systems. Journal of Management. Vol 35, No 4. pp899–927.

SEIFERT, C.F., YUKL, G. and MCDONALD, R.A. (2003) Effects of multisource feedback and a feedback facilitator on the influence behavior of managers toward subordinates. Journal of Applied Psychology. Vol 88, No 3. p561.

SHADISH, R., COOK, T.D. and CAMPBELL, D.T. (2002) Experimental and quasi-experimental designs for generalized causal inference. Boston: Houghton, Mifflin and Company.

SHAUGHNESSY, J.J. and ZECHMEISTER, E.B. (1985) Research methods in psychology. New York: Alfred A. Knopf.

SLAUGHTER, J.E. and GREGURAS, G.J. (2008) Bias in performance ratings: clarifying the role of positive versus negative escalation. Human Performance. Vol 21, No 4. pp414–26.

SMITHER, J., LONDON, M. and REILLY, R. (2005) Does performance improve following multisource feedback? A theoretical model, meta-analysis and review of empirical findings. Personnel Psychology. Vol 58, No 1. pp33–66.

SMITHER, J., REILLY, R. and BUDA, R. (1988) The effects of prior performance information on ratings of present performance: contrast versus assimilation revisited. Journal of Applied Psychology. Vol 73. pp487–96.

SPENCE, J.R. and KEEPING, L.M. (2010) The impact of non-performance information on ratings of job performance: a policy-capturing approach. Journal of Organizational Behavior. Vol 31, No 4. p587.

STURMAN, M.C., CHERAMIE, R.A. and CASHEN, L.H. (2005) The impact of job complexity and performance measurement on the temporal consistency, stability, and test-retest reliability of employee job performance ratings. Journal of Applied Psychology. Vol 90, No 2. pp269–83.

SUTTON, A.W., BALDWIN, S.P., WOOD, L. and HOFFMAN, B.J. (2013) A meta-analysis of the relationship between rater liking and performance ratings. Human Performance. Vol 26, No 5. pp409–29.

TAYLOR, E.K. and WHERRY, R.J. (1951) A study of leniency in two rating systems. Personnel Psychology. Vol 4, No 1. pp39–47.

TJAHJONO, H.K. (2014) The fairness of organizations’ performance appraisal social capital and the impact toward affective commitment. International Journal of Administrative Science and Organization. Vol 21, No 3.

VISWESVARAN, C., ONES, D.S. and SCHMIDT, F.L. (1996) Comparative analysis of the reliability of performance ratings. Journal of Applied Psychology. Vol 81, No 5. pp557–74.

VISWESVARAN, C., SCHMIDT, F.L. and ONES, D.S. (2002) The moderating influence of job performance dimensions on convergence of supervisory and peer ratings of job performance: unconfounding construct-level convergence and rating difficulty. Journal of Applied Psychology. Vol 87, No 2. pp345–54.

VISWESVARAN, C., SCHMIDT, F.L. and ONES, D.S. (2005) Is there a general factor in ratings of job performance? A meta-analytic framework for disentangling substantive and error influences. Journal of Applied Psychology. Vol 90, No 1. pp108–31.

WALSTER, E., WALSTER, G.W. and SCOTT, W.G. (1978) Equity: theory and research. Boston: Allyn and Bacon.

WOEHR, D.J. and HUFFCUTT, A.I. (1994) Rater training for performance appraisal: a quantitative review. Journal of Occupational and Organizational Psychology. Vol 67, No 3. pp189–205.

Appendix 1

Search terms and hits


ABI/Inform Global, Business Source Elite, PsycINFO
Peer-reviewed, scholarly journals, May 2016
Search terms                                                     ABI     BSP     PSY
S1: ti(performance) AND ti(apprais*)                             706     883     577
S2: ab(“performance appraisal*”)                               1,093   1,328     810
S - : ti(performance) AND ti(review*)                           500+ > nr
S3: ti(“performance review*”)                                     39      58      14
S4: ab(“performance review”)                                     150     289      50
S - : ti(performance) AND ti(evaluat*)                         1500+ > nr
S5: ti(“performance evaluation*”)                              1,143   2,055     248
S6: ab(“performance evaluation*”)                              2,464   4,274     727
S7: ti(performance) AND ti(rating*)                              326     382     626
S8: ab(performance rating*)                                      160     217     197
S9: ti(employee) AND ti(apprais*)                                 81     100      61
S10: ab(“employee appraisal”)                                     16      22      21
S11: S1 – S10                                                  4,936   7,887   2,489
S12: S11 AND filter ti(meta-analy*) OR ab(meta-analy*) OR         32      34      33
     ti(“systematic review”) OR ab(“systematic review”)
S13: S1 OR S3 (OR S5 - PsycINFO) OR S7 OR S9                   1,060   1,318   1,432
S14: S13 AND filter ab(study OR studies OR empirical OR          244     299       –
     experiment* OR control* OR longitudinal) and limited to
     2000 - 2016
S15: S13 AND filter quantitative study and limited to              –       –     444
     2000 - 2016
S16: S14 AND filter ab(longitudinal) OR ab(experiment*) OR        40      52      62
     ab(control*)
S17: S14 OR S15 NOT S16                                          204     247     382
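
The final search step (S17) can be checked against the counts reported above. The following is a minimal sketch, assuming that S17 is read as "(S14 OR S15) NOT S16" and that every S16 record is also contained in the S14 or S15 set for the same database; the dictionary layout and the function name `remaining_hits` are illustrative and not part of the original search protocol.

```python
# Hit counts copied from the search table in Appendix 1.
# "base" is S14 for ABI and BSP, and S15 for PsycINFO (the only set reported there).
hits = {
    "ABI": {"base": 244, "S16": 40, "S17": 204},
    "BSP": {"base": 299, "S16": 52, "S17": 247},
    "PSY": {"base": 444, "S16": 62, "S17": 382},
}


def remaining_hits(base, excluded):
    """Records left after removing a subset (assumption: excluded is a subset of base)."""
    return base - excluded


# For each database, compare the computed remainder with the reported S17 count.
s17_matches = {
    database: remaining_hits(row["base"], row["S16"]) == row["S17"]
    for database, row in hits.items()
}

print(s17_matches)
```

Under these assumptions the reported S17 counts equal the S14 or S15 count minus the S16 count in all three databases.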

Appendix 2

Selection process
Meta-analyses or systematic reviews

ABI Inform n = 32; BSP n = 34; PsycINFO n = 33
Articles obtained from search*: n = 101
Abstracts screened for relevance, duplicates removed: excluded n = 60
Critical appraisal & text screened for relevance: excluded n = 18
Included studies: n = 23

Meta-analyses or systematic reviews

ABI Inform n = 40; BSP n = 52; PsycINFO n = 62
Articles obtained from search*: n = 156
Abstracts screened for relevance, duplicates removed: excluded n = 108
Critical appraisal & text screened for relevance: excluded n = 11
Included studies: n = 37

* The total number of articles obtained from the search includes a few additional articles that were referenced in articles found (but not identified
directly in our search) and judged to be worthy of inclusion.
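
The arithmetic of the two selection flows can be verified with a short sketch. The counts are copied from the diagrams above; the field names and the function `remaining_after_screening` are illustrative. The gap between the articles obtained and the summed database hits is computed simply as a difference; reading that gap as the number of additional referenced articles mentioned in the footnote is an assumption, since duplicates are only removed at the abstract-screening step.

```python
# Counts copied from the two selection flows in Appendix 2.
flows = [
    {"database_hits": (32, 34, 33), "obtained": 101,
     "excluded_abstract": 60, "excluded_full_text": 18, "included": 23},
    {"database_hits": (40, 52, 62), "obtained": 156,
     "excluded_abstract": 108, "excluded_full_text": 11, "included": 37},
]


def remaining_after_screening(flow):
    """Articles left after both exclusion steps."""
    return flow["obtained"] - flow["excluded_abstract"] - flow["excluded_full_text"]


# Does each flow end at the reported number of included studies?
flow_is_consistent = [remaining_after_screening(f) == f["included"] for f in flows]

# Difference between articles obtained and the sum of database hits
# (assumed here to be the additional articles noted in the footnote).
additional_articles = [f["obtained"] - sum(f["database_hits"]) for f in flows]

print(flow_is_consistent)
print(additional_articles)
```

Both flows are internally consistent: 101 − 60 − 18 = 23 and 156 − 108 − 11 = 37.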

Chartered Institute of Personnel and Development
151 The Broadway London SW19 1JQ United Kingdom
T +44 (0)20 8612 6200 F +44 (0)20 8612 6201
E cipd@cipd.co.uk W cipd.co.uk
Incorporated by Royal Charter
Registered as a charity in England and Wales (1079797) and Scotland (SC045154)
Issued: December 2016 Reference: 7357 Appraisal © CIPD 2016
