
The evolution of the performance appraisal process
Danielle S. Wiese and M. Ronald Buckley
Michael F. Price College of Business, University of Oklahoma, Norman,
Oklahoma, USA

Everyone has had their performance appraised in some context. The
performance appraisal process can be traced back, at least, for many thousands
of years. In spite of all of the thought given to this process, many of its flaws are
intractable. Quite frankly, most people have been evaluated at work and,
according to most anecdotes, find the experience uncomfortable and
unproductive. Moreover, those people who evaluate performance generally do
not report it as a particularly enjoyable or productive experience. Why do we
continue to use performance appraisals if most of our affect toward the process
is negative? The primary reason is that these systems are fundamental to a
number of important organizational decisions regarding pay, promotion, etc.
Worldwide, performance appraisals are used in nearly all organizations.
Corporations use different tools and have a number of goals for performance
appraisals, often resulting in some confusion as to the true purpose of
performance appraisal systems. However, at its core, the performance appraisal
process allows an organization to measure and evaluate an individual
employee’s behavior and accomplishments over a specific period of time
(DeVries et al., 1981).
For centuries, organizations survived quite well without formal performance
appraisal systems, which raises the question “Why do formal performance
appraisal systems exist?” As organizations evolve toward large organizations
with professional management, a more formal performance appraisal system
serves as an asset in administrative decision making. Regardless of the system
in place, decisions must be made regarding who receives raises and promotions
and who is terminated. These decisions are aided by a process that monitors
and evaluates an employee’s progress and allows for intra-organizational
comparisons of individual performance. Thus, the answer is that formal
systems simply have replaced informal ones. These formal performance
appraisal systems are not perfect and they continue to rely primarily upon
human information processing and judgment – imperfect processes, at best.
There are many advantages to using a formal system if performance
appraisals are designed and used properly (Murphy and Cleveland, 1995). It
facilitates organizational decisions such as reward allocation, promotions/
demotions, layoffs/recalls, and transfers. It may also assist managers in
developing employees. It serves to assist individual employees’ decisions
regarding career choices and the subsequent direction of individual time and
effort. Additionally, performance appraisals may increase employee
commitment and satisfaction, due to improvements in organizational
communication.
Journal of Management History, Vol. 4 No. 3, 1998, pp. 233-249. © MCB University Press, 1355-252X
A properly administered performance appraisal system may be an asset to
an organization. However, if the tools and goals of the performance appraisal
process are incongruent with organizational goals, the resulting performance
appraisal system may, in fact, be a detriment to effective organizational
functioning (Barrett, 1967). Furthermore, in a team environment, some believe
individual performance appraisals interfere with teamwork by
overemphasizing the individual. In fact, many have suggested (for example,
Deming) that there is no need for performance appraisal in the organizations of
the future. Additionally, ineffective performance appraisal systems may result
in mixed messages concerning which aspects of job performance are most and
least important, due to the oblique contingency between individual behavior
and organizational rewards. Finally, due to the differing (and often conflicting)
needs of stakeholders (the organization, appraiser, and employee), the process
itself is often a source of unmet expectations for all concerned (Murphy and
Cleveland, 1995).
The purpose of this paper is to outline the historical evolution of the
performance appraisal process. Our goal is to synthesize the progress (or lack
thereof) which has been made in this process, while critically analyzing
collective contributions to increasing the effectiveness with which behavior is
both observed and evaluated. In order to accomplish this, the appropriate
starting point is the earliest examples of processes which approximate those
which are currently identified as performance appraisal systems.

Early history
Although not called performance appraisal, the Bible has many examples
where the evaluation of individual performance is an important issue. “The
Lord has filled him (Bezalel) with the Spirit of God, in wisdom and
understanding, in knowledge and all manner of workmanship to design artistic
works, to work in gold and silver and bronze, in carving wood, and to work in
all manner of artistic workmanship” (Exodus 35:31-33). In this instance,
Moses selected the man who was known to be the most skilled craftsman from
the tribes of Israel to build and furnish the tabernacle of the Lord in
approximately 1350 BC. Merit exams were given for selection and promotion
decisions as early as the Han Dynasty, 206 BC-220 AD (Wren, 1994).
Furthermore, in the early third century AD, “Imperial Raters” were employed
by emperors of the Wei dynasty to rate the performance of the official family
members. Commentary on the appraisals of the “Imperial Raters” mirrors the
sentiments of today’s critics, stating that “The Imperial Rater of Nine Grades
seldom rates men according to their merits but always according to his likes
and dislikes” (Patten, 1977). In 1648, it is reported that the Dublin (Ireland)
Evening Post evaluated legislators by using a rating scale based upon personal
qualities (Hackett, 1928).
Most likely, the early 1800s marked the beginning of performance appraisals
in industry with Robert Owen’s use of “silent monitors” in the cotton mills of
Scotland (Wren, 1994). Silent monitors were blocks of wood with different colors
painted on each visible side and placed above each employee’s work station. At
the end of the day, the block was turned so that a particular color, representing
a grade (rating) of the employee’s performance, was facing the aisle for
everyone to see. Anecdotal evidence indicates that this practice had a
facilitating influence on subsequent behavior.

Formal performance appraisal begins in United States


Shift in ideas
In 1813, an Army General submitted an evaluation of each of his men to the U.S.
War Department. This is generally looked upon as the start of formal
performance appraisal in the United States. The Army General used a global
rating, with descriptions of his men such as “a good-natured man” or “a knave
despised by all” (Bellows and Estep, 1954). The Federal Civil Service of the
United States began giving merit ratings, also known as efficiency ratings, in
the late 1800s (Graves, 1948; Lopez, 1968; Petrie, 1950). In the 1840s and 1850s,
Congress required efficiency ratings of clerks which contained information on
competence, faithfulness and attention (White, 1954). However, these reports
were not used for selection, retention or promotion, which continued to be at the
discretion of the bureau head and Secretary of the department. In response to
the public concern for economy and efficiency, a Division of Efficiency was
created within the Civil Service Commission in 1912 (Van Riper, 1958). In the
late nineteenth and early twentieth centuries, performance appraisals were used
primarily by military and government organizations – due to their large size,
hierarchical structure, geographic dispersal, and the necessity to promote the
top performers to higher organizational levels. At this time, most private
organizations used informal measures to evaluate individual performance and
make subsequent administrative decisions.
Development of performance appraisals in United States industry began
with early work in salesman selection by industrial psychologists at Carnegie-
Mellon University (Scott et al., 1941), who used trait psychology to develop a
man-to-man rating system. The army used this system during World War I to
assess officer performance. After the war, business leaders, impressed by
the achievements of the army researchers, hired many of the men who had been
associated with the work in man-to-man appraisals. Industry wanted to use the
contributions of this new breed of psychologists (Scott et al., 1941). In 1922,
Donald Paterson introduced the graphic-rating scale to the general
psychological community (Landy and Farr, 1983). After this introduction,
numerous innovations in types of rating scales and techniques for scale
construction (see Likert, 1961) were created, leading to the increased popularity
of the graphic- or trait- rating scale (Patten, 1977; Van Riper, 1958).

Goal
Historically, performance appraisals have been used for administrative
purposes, such as retention, discharge, promotion, and salary administration
decisions (DeVries et al., 1981; Murphy and Cleveland, 1995; Patten, 1977).
However, in this early era, with weak human resource management
departments and a lack of understanding of performance appraisal systems,
administrative decisions were often made independently of, and even ran
counter to, performance appraisals (Whisler and Harper, 1962). In addition to,
and perhaps because of, supervisors who did not take performance appraisals
seriously, the unions of this era advocated seniority-based decisions over
performance-based decisions. Thus, a loose correlation between appraisal
results and administrative decisions was permitted, which gave individual
supervisors discretionary power in relation to human resource outcomes (e.g.
promotions, salary increases).

Tools
The first tools used were global ratings and global essays (DeVries et al., 1981).
In global ratings, the rater provides an overall estimate of performance without
distinctions among any performance dimensions. Typical ratings include
“outstanding”, “satisfactory” and “needs improvement”. For global essays, a
rater responds narratively to a question such as “What is your overall
evaluation of this person over the last year?” The subjectivity of both methods
and the variability of the essay method made it difficult to use these tools to
make quality decisions. In addition, unless the essay is done correctly with a
great deal of detail, it is not particularly useful for developmental feedback. The
lack of reference to job-related behaviors would make these tools almost
certainly subject to legal action in today’s business environment.
The next tool widely used was the man-to-man ranking procedure,
developed for the US Army in 1914 (Scott and Clothier, 1923). The Army used
five scales to rank its officers: physical qualities; intelligence; leadership;
personal qualities; and the general value to the service. The rater chooses 12 to 25
officers of the same rank as the officer being rated. The rater then ranks these
officers from highest to lowest based on one of the five scales and selects five
officers to use as the standard for judgment ((1) highest, (2) middle, (3) lowest, (4)
between highest and middle and (5) between middle and lowest). Values are
assigned to each of the five “standardized” officers and the ratee is assigned a
value by comparing him with these officers. Each rater makes his/her own
scale, resulting in a complex system which fails to account for individual
differences in scale construction.
Evolving from the man-to-man system was the judgmental rank order
procedure (DeVries et al., 1981). Raters provide an overall evaluation of
performance by checking a box which places each ratee in a certain percentage
of all ratees (top 25 per cent, top 50 per cent, bottom 50 per cent, bottom 25 per
cent). A rater may also list each employee’s name in order of effectiveness on
individual dimensions or distribute his/her employees along a scale on the basis
of total performance. While ranking employees does force distinction between
ratees, these methods are qualitative, making it difficult to judge how much
better the performance of one employee is over another and nearly impossible
to compare ratings across divisions. Also, while the top performers and bottom
performers will remain at the extremes, the employees in the middle may not
have truly differentiable performance. Therefore, its utility in administrative
decisions is questionable. It is also not very effective for feedback and is subject
to legal issues due to the use of overall job performance ratings without
reference to job-related dimensions.
The final tool to gain popularity during this time was the graphic- or trait-
rating scale. Benjamin (1952) reported that 87 per cent of a sample of 130
companies used these types of rating scales and they continue to be one of the
most common rating tools in use today. With this tool, the rater indicates on a
numerical scale the degree to which the ratee possesses certain personality
traits. The performance dimensions are usually ill-defined (and difficult to
measure) traits such as leadership, initiative, cooperation, judgment, creativity,
resourcefulness, innovativeness, and dependability. Because of the global,
non-job-specific nature of these traits, graphic-rating scales have not withstood legal
scrutiny (Bernardin and Beatty, 1984) and are not very useful in providing
performance feedback. Additionally, these vague performance dimensions call
on the rater to link observed behavior with the appropriate personality trait,
making rater error prevalent (Bernardin and Buckley, 1981). However, the
positive aspects of using the trait-rating scales are that they are inexpensive
and relatively easy to develop and administer, the results are quantifiable, the
rater examines more than one performance dimension, and because they are
standardized, the results are comparable across individuals and across
divisions (Cascio, 1991).
Prior to World War II, performance appraisal systems tended to exclude top
management, generally used graphic-rating scales and had just one or two
forms for all employees regardless of the job performed or skills necessary
(Spriegel, 1962). These systems appraised individuals on the basis of previously
established performance dimensions, using a standard, numerical scoring
system. They focused on past actions instead of future goals and were always
conducted by the supervisor with little input from the employee. These
shortcomings caused the military and industry to search for more accurate and
useful performance appraisal systems (DeVries et al., 1981).

Systems to reduce rater error and increase value


Shift in ideas
As the United States’ involvement in World War II became imminent, an
immediate need to promote top ranking officers to serve as generals was
apparent. However, the efficiency reports which had been used up to this point
rated half of the officers as “superior” or “best”. Additionally, efficiency reports
contained useless and naive statements about officers’ behavior. With these
issues in mind, psychologists were assembled to assist the army in the war
effort by improving its rating system. The resulting tools were the forced-choice
and critical incident methods (Sisson, 1948; Flanagan, 1954).

Tools
In the forced-choice method, a number of sets of statements, phrases or words
describing job performance are presented to the rater. In each set of four
statements, two appear favorable and two unfavorable. However, only one of
the favorable statements adds to the score and only one of the unfavorable
statements detracts from the score. Personnel research on prior successful and
unsuccessful performances determines the value added or subtracted for each
statement. The values are not known to the rater, who chooses the statements
which he/she believes to be most characteristic/descriptive of the employee.
Due to the rater’s ignorance of item values, the forced-choice method is
designed to reduce rater bias, creating more accurate ratings. This method
should also make the rater’s job easier, by focusing on observed behaviors
rather than personality traits or overall evaluations. Additionally, it establishes
objective standards of comparison between individuals (Richardson, 1949;
Sisson, 1948). However, a problem with this method is that raters may resent a
tool which only provides them with two very negative statements from which to
choose, forcing them to make derogatory comments about an employee
(Barrett, 1967). The raters also do not like the secrecy of the method because the
returned scores are just numbers without any true explanations supporting
them. The forced-choice method is a poor tool for individual development in
performance appraisal interviews. Therefore, when the raters are completing
the forms, they attempt to control the process by determining the item scores,
which undermines the method. Additionally, this method is expensive to
develop and provides a global indication of merit, rather than ratings of specific
dimensions of performance, causing confusion as to which performance is
acceptable and which is not (Cascio, 1991; Patten, 1977). Berkshire and
Highland (1953) noted “Ratees may resent a rating system that really rates.
Whatever the cause, forced-choice has not won wide acceptance in industry or
government.”
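The scoring logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the statement wordings, the weights of +1, 0 and -1, and the names Statement, validate_set and score_forced_choice are invented here and do not come from the Army instrument or from the sources cited. The number of statements picked per set is also left open, since the text does not fix it.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Statement:
    text: str        # wording shown to the rater (invented for illustration)
    favorable: bool  # whether the statement appears favorable to the rater
    weight: int      # hidden key value; the rater never sees it


def validate_set(statements: Sequence[Statement]) -> None:
    """Check the structure described in the text: four statements, two that
    appear favorable and two unfavorable, with one scored item of each kind."""
    if len(statements) != 4:
        raise ValueError("each set must contain four statements")
    fav = sorted(s.weight for s in statements if s.favorable)
    unf = sorted(s.weight for s in statements if not s.favorable)
    if len(fav) != 2 or len(unf) != 2:
        raise ValueError("each set needs two favorable and two unfavorable statements")
    if not (fav[0] == 0 and fav[1] > 0):
        raise ValueError("exactly one favorable statement may add to the score")
    if not (unf[0] < 0 and unf[1] == 0):
        raise ValueError("exactly one unfavorable statement may detract from the score")


def score_forced_choice(sets: Sequence[Sequence[Statement]],
                        picks: Sequence[Sequence[int]]) -> int:
    """Sum the hidden weights of the statements the rater picked in each set."""
    total = 0
    for statements, chosen in zip(sets, picks):
        validate_set(statements)
        total += sum(statements[i].weight for i in chosen)
    return total


# Hypothetical two-set form; weights are assumptions, not researched values.
EXAMPLE_SETS: List[List[Statement]] = [
    [
        Statement("Plans work ahead of deadlines", True, 1),
        Statement("Is well liked by colleagues", True, 0),
        Statement("Overlooks errors in reports", False, -1),
        Statement("Speaks little in meetings", False, 0),
    ],
    [
        Statement("Dresses neatly", True, 0),
        Statement("Explains decisions to subordinates", True, 1),
        Statement("Prefers to work alone", False, 0),
        Statement("Delays difficult decisions", False, -1),
    ],
]
```

Because both favorable statements in a set look equally flattering, a rater who wants to inflate a score cannot tell which pick will count, which is the bias-reduction property the method relies on.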
The critical incident method was developed to train pilots in take-offs and
landings. The behaviors which were crucial to success or failure were observed
and meticulously collected during World War II. In order to use this technique
in performance appraisals, supervisors must document positive and negative
behavioral events (incidents) that occur during a given performance period.
The supervisor then uses the observations to review the employee’s
performance on a list of “go” and “no-go” behaviors related to a job, which does
not necessarily result in an overall rating or appraisal. The encouragement of
the supervisors to collect observed behavior information contributes to the
accuracy of the technique. However, because the supervisor observes and
records what he/she believes are behaviors which are critical to job
performance, rater inference is also a part of this technique. Additionally, the
method is complex, costly and time-consuming (DeVries et al., 1981; Patten,
1977). Finally, some believe that these results are misleading because only the
extreme and unusual elements are reported at the expense of steady day-to-day
performance, which is the real substance of an employee’s effectiveness
(Barrett, 1967). Thus, although the forced-choice and critical incident methods
are methodologically and substantively better than the tools used earlier, their
complexity and difficult application to individual development preclude their
use today (Flanagan, 1954).

Management by objectives for performance appraisal


Shift in ideas
By the early 1950s, 61 per cent of organizations regularly used performance
appraisals, compared with only 15 per cent immediately after World War II
(Spriegel, 1962). The primary tool was the trait-rating system, which focuses on
past actions, using a standard, numerical scoring system to appraise people on
the basis of a previously established set of dimensions (DeVries et al., 1981).
Many in the government were becoming dissatisfied with this method because
it used static measures of performance, was not closely related to employee
development and was too closely tied with reductions in force and removals
(Van Riper, 1958). Additionally, this rating system causes the manager to play
the role of judge, which is inconsistent with the roles of leader and coach
necessary to focus on and achieve both the employee’s and the organization’s
goals (McGregor, 1957). The performance appraisal problems associated with
these conflicting roles, together with the initiation of widespread
manager appraisals after World War II, gave impetus to the need
to update performance appraisal systems (DeVries et al., 1981). Recognition of
the limitations of performance appraisal systems in the 1950s led to the
development of new systems based on management by objectives.

Goals
It has been suggested that the purpose of a performance appraisal system
should be employee development and feedback (see Fedor, 1991). It has been
shown that individuals are motivated to seek feedback (if it is seen as a valuable
resource) to reduce uncertainty and to provide information relevant to self
evaluations (Ashford, 1986). There is also evidence that performance feedback
(if given appropriately) can lead to substantial improvements in future
performance (Guzzo et al., 1985; Kopelman, 1986; Landy et al., 1982). Feedback
can be a useful tool for development, especially if it is specific and behaviorally-
oriented, as well as both problem-oriented and solution-oriented (Murphy and
Cleveland, 1995). Therefore, many believe that performance appraisal systems
should provide meaningful feedback, rather than exclusively be used to make
judgments about the employee. However, when the same performance
appraisal system is used for administrative decision making (e.g. raises,
promotions) and for feedback, both functions may suffer (Cleveland et al., 1989;
Meyer et al., 1965).

Tools
As a result of his study of managerial practices in General Motors, Peter
Drucker first proposed Management by Objectives in The Practice of
Management in 1954. Douglas McGregor then applied this practice to
performance appraisals in 1957 in his article “An uneasy look at performance
appraisal”. McGregor recommended that employees be appraised on the basis
of short-term goals, rather than traits, which are jointly set by the employee and
the manager. The first step in the process is for the employee to arrive at a clear
statement of responsibilities of his/her position as they actually are in practice.
This statement is reviewed by the manager, and modified until both employee
and supervisor agree that the list is adequate. The employee then assesses
his/her strengths and weaknesses and, working from the statement of
responsibilities, establishes his/her goals for the projected evaluation period.
These goals are specific, measurable, time bounded and joined to an action plan
(McConkie, 1979). The employee and the manager meet to discuss and modify
the goals until both believe they are satisfactory. The final step occurs at the end
of the evaluation period or whenever there is a major change in the work
situation. Subordinates make appraisals of performance based on the
accomplishment of goals set forth at the beginning of the appraisal period. The
supervisor and employee have an appraisal interview in which they examine
the subordinate self-appraisal and set new goals for the next appraisal period
(Patten, 1977; DeVries et al., 1981). The typical cycle includes setting of
objectives, negotiation, implementation, discussion, changing directions, and
eventual measuring of accomplishment or failure (Kindall and Gatza, 1963).
McGregor believed that the management by objectives approach to
performance appraisal had many advantages. First, it redefined the role of
manager from judge to helper, permitting guidance needed for personal
development. Second, the technique focuses on what the employee produces as
a result of performance, increasing employee acceptance relative to
performance appraisal systems based on traits. Third, it shifts the orientation
toward future actions instead of past behaviors (DeVries et al., 1981; Patten,
1977). Management by objectives was an accepted practice in private industry
in the 1970s and was used primarily for managers (Hay Associates, 1976).
Although management by objectives has many positive features, its
limitations need to be understood. The primary issue that needs to be addressed
by the organization is the high level of management commitment and time
required to reorient the thinking of employees (Patten, 1977). Communication of
this commitment needs to be clear in order to prevent the complexity of the
system from turning initial excitement into confusion and disillusionment,
culminating in eventual disinterest and failure. Additionally, the purpose for
the new system needs to be clearly recognized, because while management by
objectives is a useful tool for performance planning and feedback, it is not
easily used for administrative decisions (DeVries et al., 1981).
A high degree of job analysis and inferential skill is needed to determine
which performance dimensions to measure and the goal achievement standards
to use. Initially, the goals and objectives which are set tend to be easily
quantified, easily achieved and not necessarily central to the job (Murphy and
Cleveland, 1995). Levinson (1970) discovered a tendency for objective-setting
measures such as sales dollars or number of units produced to result in a
disregard for less quantifiable aspects of job performance such as customer
service and quality work. Thus, if objectives are activity-focused instead of
output-centered (means rather than ends), this method is ineffective. There is
also a tendency for managers to ignore factors which are outside the employee’s
control, but which often affect goal accomplishment, leaving the employee
responsible for goal completion regardless of external influences (Goodale,
1977). In addition to external factors, managerial jobs are often measured in
terms of unit, rather than individual, objectives, which requires that individuals
be held accountable for outcomes requiring interdependent employee efforts
(Levinson, 1970; Schneier and Beatty, 1978). These are only a few of the common
errors associated with MBO (Kleber, 1972), but they help to illustrate the
complexity of this performance appraisal method.

Focus on behaviorally-based ratings


Shift in ideas
Performance appraisal tools which are popular among raters, such as the global
essays, man-to-man ratings and graphic-rating scales, tend to be questionable in
terms of reliability, validity, and discriminability. Conversely, the forced-choice
method, which reduces rater errors while increasing the quality of the
psychometric properties, is often resisted by raters (Cascio, 1991). In an attempt
to produce a tool that was psychometrically sufficient (valid, reliable,
discriminating and useful), as well as accepted by raters, Smith and Kendall
(1963) devised the Behaviorally Anchored Rating Scales (BARS). This tool
replaced numerical or adjective anchors, which are used in the graphic- or trait-
rating scales, with behavioral examples of actual work behaviors. BARS
allowed supervisors to rate employees on observable behavioral dimensions,
instead of on a scale from 1 to 5 or “excellent” to “poor”. Numerous spin-offs to
BARS have been developed. The contribution of these developments has been
an emphasis on the behavioral bases of performance ratings.

Tools
The first tool to focus on behaviors was the Behaviorally Anchored Rating
Scales (BARS), designed by Smith and Kendall (1963). BARS development is a
long and arduous process, involving many steps and many people. From this
process, performance dimensions are more clearly defined and are based on
more observable behaviors. For example, a very high rating for a teacher in the
lecture performance dimension might be “lecturer uses concrete examples to
clarify answers”, a higher than average rating might be “lecturer’s response
repeats a point in the lecture” and a very low rating might be “lecturer insults
or verbally attacks questioner” (Murphy and Constans, 1987). Despite the time
and expense of developing a BARS tool, research has not shown that this
method is more accurate than graphic-rating scales (Schwab et al., 1975). Thus,
the goal of having sound psychometric properties was not achieved.
Another tool which used behavioral examples was the Mixed Standard
Scales (MSS), designed by Blanz and Ghiselli (1972). Each scale is designed to
measure two performance dimensions, instead of one (as in BARS). For
example, in a six-item grouping for a lecturer/teacher, items 1, 4 and 5 might
refer to behaviors that represent a “response to questions” performance
dimension and items 2, 3 and 6 might refer to behaviors that represent a
“speaking style” performance dimension (Murphy and Constans, 1987). For
each performance dimension, there is one item describing good performance,
one describing average performance and one describing poor performance. For
each item, the rater is asked to respond whether the employee’s performance is
better than, about equal to or worse than the behavior described in each item.
While the rater’s task of filling out the form is simpler than with other
methods, the scoring system is so complex that the results may not be
understood, just as in the forced-choice method (Murphy and Cleveland, 1995).
Although BARS were found to result in more accurate rating of performance
than MSS (Benson et al., 1988), neither method alone fulfilled the needs of
performance raters – accuracy, ease of use, employee needs for information and
development.
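To make the MSS scoring step concrete, the following Python sketch applies the seven-point key commonly attributed to Blanz and Ghiselli to the six-item lecturer example above. The key is reproduced from general textbook accounts, not from this article, so it should be read as an assumption. The assignment of items to the good, average and poor levels, the sample responses, and the names SCORING_KEY, LAYOUT and score_form are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Response symbols: "+" better than, "0" about equal to, "-" worse than the
# behavior described. Key order is (good item, average item, poor item).
# Assumed key: the seven logically consistent patterns, scored 7 down to 1.
SCORING_KEY: Dict[Tuple[str, str, str], int] = {
    ("+", "+", "+"): 7,
    ("0", "+", "+"): 6,
    ("-", "+", "+"): 5,
    ("-", "0", "+"): 4,
    ("-", "-", "+"): 3,
    ("-", "-", "0"): 2,
    ("-", "-", "-"): 1,
}

# Hypothetical layout: item number -> (dimension, level of the statement).
# Items 1, 4, 5 and 2, 3, 6 follow the grouping in the text; which item is
# good, average or poor is invented for illustration.
LAYOUT: Dict[int, Tuple[str, str]] = {
    1: ("response to questions", "good"),
    4: ("response to questions", "average"),
    5: ("response to questions", "poor"),
    3: ("speaking style", "good"),
    2: ("speaking style", "average"),
    6: ("speaking style", "poor"),
}


def score_form(layout: Dict[int, Tuple[str, str]],
               responses: Dict[int, str]) -> Dict[str, Optional[int]]:
    """Regroup the shuffled items by dimension and look up each pattern.
    A pattern missing from the key is logically inconsistent and scores None."""
    by_dimension: Dict[str, Dict[str, str]] = {}
    for item, (dimension, level) in layout.items():
        by_dimension.setdefault(dimension, {})[level] = responses[item]
    return {
        dimension: SCORING_KEY.get((r["good"], r["average"], r["poor"]))
        for dimension, r in by_dimension.items()
    }


# Hypothetical responses from one rater, keyed by item number.
RESPONSES = {1: "-", 2: "+", 3: "0", 4: "+", 5: "+", 6: "+"}
```

The rater only ever answers six three-way questions, while the regrouping and pattern lookup happen out of sight, which illustrates why the form is easy to complete but the resulting scores are hard for raters to interpret.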
Behavior observation scales were intended to improve on BARS (Latham and
Wexley, 1977). This scale uses the same class of items as the MSS, but asks the
rater to describe how frequently specific employee behaviors or critical
incidents occurred over the appraisal period (Murphy and Cleveland, 1995).
While it was designed to remove subjectivity, research suggests that this
method is as subjective as, if not more subjective than, trait-ratings or overall
evaluations (Murphy et al., 1982; Murphy and Constans, 1988). Other tools
which attempted to reduce rater error are the distributional measurement
model (Kane and Lawler, 1979) and the performance distribution assessment.
Thus, this era, which focused its efforts on reducing rater error, produced a
great deal of literature and a number of tools, but progress in improving
performance appraisals has been lacking. In fact, the beginning of the shift
away from rater error reduction research was a classic article by Bernardin and
Pence (1980) which demonstrated that decreases in rater error were
accompanied by decreases in the accuracy with which performance was
evaluated.

Legal issues
Shift in ideas
Passage of the Civil Rights Act of 1964 and the 1966 and 1970 Equal
Employment Opportunity Commission Guidelines for Regulation of Selection
procedures created a need for improvement in organizational appraisal
practices. These legal considerations exerted strong pressure on organizations
to formalize, validate, and organize appraisal systems (Murphy and Cleveland,
1995). The typical practices of the past such as use of personality traits in
appraisals, loose relationships between performance appraisal ratings and
human resources outcomes, and a dearth of specific job-related behavior in
evaluations were becoming targets of increasing amounts of federal regulation
and litigation.
In 1978, Uniform Guidelines on employee selection procedures were adopted
by four major federal enforcement agencies. These guidelines stated that
employers cannot discriminate against any protected group by using a selection
device (including performance appraisals) for any personnel decisions/
practices that result in selection, training, transfer, retention or promotion of
employees (DeVries et al., 1981). It was not necessary to establish an intention to
discriminate to prove discrimination. The presence (or absence) of a
disproportionate number from a protected group is defined as prima facie
evidence of adverse impact. The Uniform Guidelines require that any test or
selection device which shows adverse impact or discriminates against any
group protected by Title VII of the 1964 Civil Rights Act be validated or
demonstrated to be related to job performance.
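As a rough illustration of how a disproportionate outcome might be screened for, the sketch below compares selection rates across groups using the four-fifths (80 percent) rule of thumb described in the Uniform Guidelines. The function names, group labels and counts are hypothetical and are not drawn from the sources cited here; a flag from this check is only a screening signal, not a legal determination.

```python
def selection_rates(outcomes):
    """Return the selection rate (selected / considered) for each group.

    `outcomes` maps a group label to a (selected, considered) pair.
    """
    return {group: selected / considered
            for group, (selected, considered) in outcomes.items()}


def four_fifths_flags(outcomes, threshold=0.8):
    """Flag groups whose selection rate falls below `threshold` times
    the rate of the group with the highest selection rate."""
    rates = selection_rates(outcomes)
    highest = max(rates.values())
    if highest == 0:
        # Nobody was selected in any group, so there is no ratio to compare.
        return {group: False for group in rates}
    return {group: (rate / highest) < threshold
            for group, rate in rates.items()}


# Hypothetical promotion counts: (promoted, considered)
example = {"group_a": (48, 80), "group_b": (12, 40)}
flags = four_fifths_flags(example)
```

In this made-up example, group_a is promoted at a rate of 0.6 and group_b at 0.3, so group_b's rate is half of the highest rate and would be flagged for closer review.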
Experts offered guidance to protect organizations from legal challenges,
and the structure of performance appraisals changed (Bernardin
and Beatty, 1984): performance appraisals should be based on specific
dimensions, defined in terms of behaviors, which have been established as
relevant through job analysis. Raters should receive training or instruction,
have adequate opportunities to observe the performance they are evaluating
and appraise their employees frequently. Feedback should be given to the ratee
and an appeals process should be in place. If possible, multiple raters should be
used to avoid rater bias and extreme ratings should be supported by
documentation. If organizations follow these guidelines for their performance
appraisal system, the likelihood of a successful discrimination case is reduced.

Performance appraisals today


Tools and goals
The evolution of performance appraisal systems has expanded the number of
available performance appraisal methodologies. Today, performance appraisals
are expected to serve a number of purposes simultaneously. Unfortunately, the
tools presently available are incapable of serving the myriad different purposes
of organization stakeholders.
When discussing the uses of performance appraisal, it is important to
distinguish between organizational goals, rater goals, and ratee goals.
Cleveland et al. (1989) described four types of uses of performance appraisal:
between person, within person, systems maintenance and documentation.
Between person uses are what have been referred to as administrative
purposes, consisting of recognition of individuals’ performance to make
decisions regarding salary administration, promotions, retention, termination,
layoffs, etc. Within person uses are those identified in MBO, such as feedback on
performance strengths and weaknesses to identify training needs and
determine assignments and transfers. Performance appraisals also serve
organizational goals, which are referred to as systems maintenance uses.
Examples of this type of purpose are workforce planning, determining
organizational training needs, evaluating goal achievement, identifying
organizational developmental needs, assisting in goal identification, evaluating
the personnel system and reinforcing the authority structure. Finally,
documentation purposes are to meet the legal requirements by documenting
personnel decisions and conducting validation research on the performance
appraisal tools. Organizations are attempting to meet all these needs
simultaneously while continuing to use tools that were designed for one type of
purpose. Thus, while organizations believe they need a performance appraisal
system, they are unsatisfied with the results. This dissatisfaction has
historically motivated researchers to try to improve performance appraisals
and continues to do so.

Summary of recent research issues


In the 1970s, research focused on (1) developing rating scales that were valid
and reliable and (2) training those who performed the performance appraisals
to reduce rating errors and improve observation skills (Bernardin and Buckley,
1981; Ilgen et al., 1993). In the late 1970s, concentration on reducing rater error
had reached a point of diminishing returns – decreasing rater error resulted in
decreases in rating accuracy (Bernardin and Pence, 1980). The focus shifted
from improving psychometric properties to understanding how the rater
processes information about the employee and how this mental processing
influences the accuracy of performance appraisal (Landy and Farr, 1980). The
best idea to come out of this line of research was the suggestion made by
Bernardin and Buckley (1981) that written diaries which contain critical
incidents of performance be kept by supervisors and serve as a basis for
performance appraisal ratings. The result of this suggestion was a closer
correspondence between the observation of behavior and the subsequent rating
of observed behavior and a decreasing reliance on the fallible memory of
supervisors. Although rating scale formats, training and other technical
qualities of performance appraisal influence the quality of ratings, the quality
of performance appraisals is also strongly affected by the administrative
context in which they are used (Ilgen et al., 1993; Murphy and Cleveland, 1995).
Effective managers recognize performance appraisal systems as a tool for
managing, rather than a tool for measuring, subordinates. They may use
performance appraisals to motivate, direct and develop subordinates and to
maximize access to important resources in the organization. A number of
factors affect the ratings, such as the purpose of the appraisal, the extent to
which the information is shared with other employees, other comparison
measures and the manager’s desire to be liked by the employee. Cognitive
processing issues and concern for rater error became secondary issues.
Emphasis was placed upon the internal and external environmental factors
which influence the performance appraisal process.

Effects of contextual factors
Performance appraisals were developed when organizations were large and
hierarchically organized, when market and organizational environments were
relatively stable, when the workforce was homogenous and relatively well
qualified, and when long-term employment was the norm (Murphy and
Cleveland, 1995). For organizations today, both the internal and external
environments are dynamic. Organizations are becoming more decentralized and
the ratio of managers to non-managerial employees is shrinking. The social,
political and technical environments in which the organizations exist are
increasingly turbulent. The workforce is no longer homogenous and not
necessarily well prepared for complex jobs. Additionally, employees are
increasingly likely to change jobs, organizations and even careers.
Because of the changing definitions of jobs and their roles in the
organization, appraisal forms should reflect the important aspects of the work
performed in each functional area. The increased number of lateral transfers
(due to flatter organizations) suggests that the appraisal system should focus
on the strengths and weaknesses of the individual. Performance appraisals
should be used to identify a feasible set of quality workers or candidates,
instead of the best person in an organization. Performance appraisal goals need
to become more comprehensive – goals which are beneficial to both individual
and organization. For example, instead of just helping an organization make
decisions concerning an individual, performance appraisals should be used to
help an individual make personal decisions regarding his/her current
performance and provide strategies for future development. Because it is
difficult for the supervisor and peers to observe each person’s performance, the
selection of the people who perform the appraisals may be influenced by the
increasing span of control caused by flatter organizations. An employee’s
performance may be evaluated by managers, peers, subordinates, self,
suppliers, customers and other relevant sources.
Presently, performance appraisals are used for individuals; however, more
companies are moving to team/work group approaches, which may necessitate a
change from individual appraisals to the use of both individual and group
performance appraisals. For teams, the organization must determine who will be appraised,
how appraisals will be used and the definition of performance. To avoid feelings
of inequity and to assist in administrative decisions, individual appraisals
should be given as well as the team evaluation. Performance appraisals are
likely to evaluate a core set of performance dimensions common to a functional
unit, combined with an individual’s contribution to and interaction with his/her
work team. The definition of job performance needs to expand to include
team-oriented behaviors. Additionally, as teams change based on evolving
organizational demands, performance appraisal cycles may become more
closely linked with task/project cycles instead of following, as they do at
present, a somewhat regular schedule (e.g. annually).
Other non-traditional issues which need to be addressed are telecommuting,
flexible and compressed work schedules, and any work performed outside of
the observation of supervisors. These situations give the manager a limited
opportunity to observe the worker’s performance, making it unlikely that
performance appraisals will be valuable sources of information regarding the
quality of performance or the steps that workers should take to improve work
quality. Therefore, a shift in the basis of the evaluation from behaviors to results
may be necessary. Temporary workers are another modern-day phenomenon.
Because the duration of employment is relatively short, there is little
opportunity to motivate and socialize employees or help them develop skills.
Logically, it may be wise for someone who manages a number of temporary
employees to spend little time evaluating employee performance.

Conclusion
After decades of research, where is the performance appraisal process today?
Have the tools and the processes advanced to the point of accurately and
effectively measuring the performance of employees? The answer is “probably
not”. Frankly, not much has changed since the classic work of Barrett (1966/67).
His book was one of the first to provide much of the advice given to users of
performance appraisal today. Those with a serious interest in performance
appraisal need to read this book. It is as applicable today as it was when first
published. Subsequent researchers have delved into the minutiae of the
performance appraisal process with negligible substantive contributions.
Researchers have focused on reducing the errors in the tools, but have not
been particularly concerned with what the tools are actually measuring – what
we are observing and how well we are observing it. Academicians have delved
into the mind of the rater and tried to determine the processes used to evaluate
employees. They attempt to train the rater to improve observational skills. They
tell the rater to keep a performance diary. While these ideas are theoretically
sound, in practice the amount of time this exercise demands is too great.
Additionally, researchers studying the rating process appear to have
neglected the political aspect of performance appraisal. Often, the goal of the
rater is not to evaluate the performance of the employee, but to keep the
employee satisfied and not to deleteriously influence employee morale. The
manager also has to be concerned about his/her own image. If a number of
employees receive negative ratings, that reflects poorly on the manager. Thus,
the goals of the manager may be different from those that the organization is
trying to achieve through the performance appraisal process. Therefore,
research on performance appraisals needs to turn to learning more about the
conditions that encourage raters to use the performance appraisal system as it
was intended to be used.
Even though the process is unsatisfactory for most people in industry,
performance appraisals serve a number of valuable organizational purposes.
Because our culture believes that people should be rewarded for outstanding
performance, yet people do not like to receive negative feedback, performance
appraisal systems are very complicated. Organizations need to understand the
strengths and weaknesses associated with each of the tools and determine
which goals they want to accomplish. They need to realize that a single tool
cannot be used over a diverse series of jobs. Once performance appraisals are
seen as a tool for managing resources, the research focus should shift to
matching the appropriate tools with the desired outcomes. Until then,
businesses will continue to use the performance appraisal systems in use today,
and hopefully through Divine Providence, the best people will generally rise to
the top.

References
Ashford, S.J. (1986), “Feedback-seeking in individual adaptation: a resource perspective”,
Academy of Management Journal, Vol. 29, pp. 465-87.
Barrett, R.S. (1967), Performance Rating, Science Research Associates, Inc., Chicago, IL.
Bellows, R.M. and Estep, M.F. (1954), Employment Psychology: The Interview, Rinehart, New
York, NY.
Benjamin, R., Jr (1952), “A survey of 130 merit-rating plans”, Personnel, Vol. 29, pp. 289-94.
Benson, P.G., Buckley, M.R. and Hall, S. (1988), “The impact of rating scale format on rater accuracy:
an evaluation of the mixed standard scale”, Journal of Management, Vol. 14, pp. 415-23.
Berkshire, J.R. and Highland, R.W. (1953), “Forced choice performance rating – a methodology
study”, Personnel Psychology, Vol. 6, pp. 355-78.
Bernardin, H.J. and Buckley, M.R. (1981), “Strategies in rater training”, Academy of Management
Review, Vol. 6, pp. 205-13.
Bernardin, H.J. and Beatty, R.W. (1984), Performance Appraisal: Assessing Human Behavior At
Work, Kent, Boston, MA.
Bernardin, H.J. and Pence, E.C. (1980), “Rater training: creating new response sets and decreasing
accuracy”, Journal of Applied Psychology, Vol. 65, pp. 60-6.
Blanz, F. and Ghiselli, E.E. (1972), “The mixed standard scale: a new rating system”, Personnel
Psychology, Vol. 25, pp. 185-99.
Cascio, W.F. (1991), Applied Psychology In Personnel Management, Prentice-Hall, Englewood
Cliffs, NJ.
Cleveland, J.N., Murphy, K.R. and Williams, R.E. (1989), “Multiple uses of performance appraisal:
prevalence and correlates”, Journal of Applied Psychology, Vol. 74, pp. 130-5.
DeVries, D.L., Morrison, A.M., Shullman, S.L. and Gerlach, M.L. (1981), Performance Appraisal On
The Line, Center for Creative Leadership, Greensboro, NC.
Drucker, P.F. (1954), The Practice of Management, Harper, New York, NY.
Fedor, D.B. (1991), “Recipient responses to performance feedback: a proposed model and its
implications”, Research in Personnel and Human Resources Management, Vol. 9 (annually),
pp. 73-120.
Flanagan, J.C. (1954), “The critical incidents technique”, Psychological Bulletin, Vol. 51, pp. 327-58.
Goodale, J.G. (1977), “Behaviorally-based rating scales: toward an integrated approach to
performance appraisal”, in Hammer, W.C. and Schmidt, F.L. (Eds), Contemporary Problems in
Personnel (rev. ed.), St Clair, Chicago, IL.
Graves, W.B. (1948), “Efficiency rating systems: their history, organization and functioning”,
Hearings, Efficiency Rating System for Federal Employees, 80th Congress, 2nd Session,
Government Printing Office, Washington, D.C.
Guzzo, R.A., Jette, R.D. and Katzell, R.A. (1985), “The effects of psychologically-based intervention
programs on worker productivity”, Personnel Psychology, Vol. 38, pp. 275-93.
Hackett, J.D. (1928), “Rating legislators”, Personnel, Vol. 7 No. 2, pp. 130-1.
Hay Associates (1976), “Accent on appraisal”, Management Memo, No. 293.
Ilgen, D.R., Barnes-Farrell, J.L. and McKellin, D.B. (1993), “Performance appraisal process
research in the 1980s: what has it contributed to appraisals in use?”, Organizational Behavior
and Human Decision Processes, Vol. 54, pp. 321-68.
Kane, J.S. and Lawler, E.E. (1979), “Performance appraisal effectiveness: its assessment and
determinants”, in Staw, B. (Ed.), Research in Organizational Behavior, Vol. 1, JAI Press,
Greenwich, CT.
Kindall, A.F. and Gatza, J. (1963), “Positive program for performance appraisal”, Harvard
Business Review, Vol. 41 No. 6, pp. 153-62.
Kleber, T.P. (1972), “Forty common goal-setting errors”, Human Resource Management, Vol. 11
No. 3, pp. 10-13.
Kopelman, R.E. (1986), “Objective feedback”, in Locke, E.A. (Ed.), Generalizing From Laboratory
To Field Settings, Lexington Books, Lexington, MA.
Landy, F.J. and Farr, J.L. (1980), “Performance rating”, Psychological Bulletin, Vol. 87, pp. 72-107.
Landy, F.J. and Farr, J.L. (1983), The Measurement of Work Performance: Methods, Theory, and
Applications, Academic Press, New York, NY.
Landy, F.J., Farr, J.L. and Jacobs, R.R. (1982), “Utility concepts in performance measurement”,
Organizational Behavior and Human Performance, Vol. 30, pp. 15-40.
Latham, G.P. and Wexley, K.N. (1977), “Behavioral observation scales”, Personnel Psychology, Vol.
30, pp. 255-68.
Levinson, H. (1970), “Management by whose objectives”, Harvard Business Review, Vol. 48 No. 4,
pp. 125-34.
Likert, R.L. (1961), New Patterns of Management, McGraw-Hill, New York, NY.
Lopez, F.M. (1968), Evaluating Employee Performance, Public Personnel Association, Chicago, IL.
McConkie, M.L. (1979), “A clarification of the goal setting and appraisal processes in MBO”,
Academy of Management Review, Vol. 4, pp. 29-40.
McGregor, D. (1957), “An uneasy look at performance appraisal”, Harvard Business Review, Vol.
35 No. 3, pp. 89-94.
Meyer, H.H., Kay, E. and French, J. (1965), “Split roles in performance appraisal”, Harvard
Business Review, Vol. 43, pp. 123-9.
Murphy, K.R. and Cleveland, J.N. (1995), Understanding Performance Appraisal: Social,
Organizational and Goal-Based Perspectives, Sage, Thousand Oaks, CA.
Murphy, K.R. and Constans, J.I. (1987), “Behavioral anchors as a source of bias in rating”, Journal
of Applied Psychology, Vol. 72, pp. 523-79.
Murphy, K.R. and Constans, J.I. (1988), “Psychological issues in scale format research: Behavioral
anchors as a source of bias in rating”, in Cardy, R., Peiffer, S. and Newman, J. (Eds), Advances
in Information Processing in Organizations, Vol. 3, JAI Press, Greenwich, CT.
Murphy, K.R., Martin, C. and Garcia, M. (1982), “Do behavioral observation scales measure
observation?”, Journal of Applied Psychology, Vol. 67, pp. 562-7.
Patten, T.H., Jr (1977), Pay: Employee Compensation and Incentive Plans, Free Press, London.
Petrie, F.A. (1950), “Is there something new in efficiency rating?”, Personnel Administrator, Vol.
13, pp. 24.
Richardson, M.W. (1949), “Forced-choice performance reports: a modern merit-rating method”,
Personnel, Vol. 26, pp. 205-12.
Schneier, C.E. and Beatty, R.W. (1978), “The influence of role prescriptions on the performance
appraisal processes”, Academy of Management Journal, Vol. 21, pp. 129-34.
Schwab, D.P., Heneman, H.G., III and DeCotiis, T.A. (1975), “Behaviorally anchored rating scales:
a review of the literature”, Personnel Psychology, Vol. 28, pp. 549-62.
Scott, W.D. and Clothier, R.C. (1923), Personnel Management: Principles, Practices, and Points of
View, A.W. Shaw, New York, NY.
Scott, W.D., Clothier, R.C. and Spriegel, W.R. (1941), Personnel Management, McGraw-Hill, New
York, NY.
Sisson, E.D. (1948), “Forced-choice: the new army rating”, Personnel Psychology, Vol. 1, pp. 365-81.
Smith, P.C. and Kendall, L.M. (1963), “Retranslation of expectations: an approach to the
construction of unambiguous anchors for rating scales”, Journal of Applied Psychology, Vol.
47, pp. 149-55.
Spriegel, W.R. (1962), “Company practices in appraisal of managerial performance”, Personnel,
Vol. 39, pp. 77.
Van Riper, P. (1958), History of the United States Civil Service, Row, Peterson and Co., Evanston, IL.
Whisler, T.L. and Harper, S.F. (Eds), (1962), Performance Appraisal: Research And Practice, Holt,
Rinehart & Winston, New York, NY.
White, L.D. (1954), The Jacksonians, The Macmillan Co., New York, NY.
Wren, D.A. (1994), The Evolution of Management Thought, John Wiley & Sons, Inc., New York,
NY.
