
"Quality of Measurement": The Implicit Legal Cornerstone of HR Assessments


Author: Lange, Rense; Houran, James
Abstract: HR assessments are often constructed and validated using statistical techniques that do not guarantee legal compliance, due to unwitting risks of poor measurement quality and biased results. Such pitfalls can be addressed using approaches grounded in modern Item Response Theory (IRT). This approach provides superior measurement, enhanced protection against discriminatory outcomes, as well as the capability for human resources professionals to capture idiosyncratic information about test-takers to guide evidence-based behavioral interviews, reference checking, and development programs.
Full text:
Preemployment screening and selection assessments are arguably a best-practice component of a comprehensive due diligence process for determining candidates' technical competence for specific roles and cultural fit within
particular organizations. It seems likely that many readers of the Employee Relations Law Journal have
completed some type of human resources (HR) assessment at some point, currently use them professionally, or
work with clients who use them. Much has been written in the academic and lay literature about the proper use of HR assessments: the careful measurement of salient traits or characteristics in candidates (or incumbents), as well as a systematic protocol for the actual administration of assessments, both to minimize the obvious risk of adverse impact. Adverse impact, in the context of employment, refers to employment practices that appear neutral but have a disproportionately negative, discriminatory effect on a protected group.
Under the Equal Employment Opportunity Commission (EEOC) guidelines, adverse impact is defined as a
substantially different rate of selection in hiring, promotion, or other employment decision that works to the
disadvantage of members of a race, sex, or ethnic group. The EEOC and other federal enforcement agencies have adopted a rule of thumb under which they generally will consider a selection rate for any race, sex, or ethnic group that is less than four-fifths (4/5ths), or 80 percent, of the selection rate for the group with the highest selection rate as a substantially
different rate of selection. The selection rates for males and females are compared, and the selection rates for
race and ethnic groups are compared with the selection rate of the race or ethnic group with the highest
selection rate.
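To make the four-fifths rule concrete, the following is a minimal sketch of the calculation; the group labels and applicant counts are hypothetical, and the rule remains a guideline rather than a definitive legal test.

```python
def adverse_impact_check(selected, applicants):
    """Apply the EEOC four-fifths (80 percent) rule of thumb.

    selected / applicants: dicts mapping a group label to counts.
    Returns each group's selection rate, its impact ratio relative to the
    highest-rate group, and whether that ratio falls below 0.80.
    """
    rates = {g: selected[g] / applicants[g] for g in applicants}
    highest = max(rates.values())
    return {
        g: {
            "selection_rate": round(rate, 3),
            "impact_ratio": round(rate / highest, 3),
            "below_four_fifths": rate / highest < 0.80,
        }
        for g, rate in rates.items()
    }

# Hypothetical example: 48 of 120 male applicants hired vs. 30 of 100 female applicants.
print(adverse_impact_check({"men": 48, "women": 30}, {"men": 120, "women": 100}))
# women: rate 0.30, impact ratio 0.30 / 0.40 = 0.75 < 0.80 -> flagged under the rule of thumb.
```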
H. Beau Baez's 2013 article, "Personality Tests in Employment Selection: Use with Caution," in the Cornell HR
Review provides a succinct, timely, and cogent synopsis of the legal issues and practical considerations in the
systematic use of assessments in the workplace. Yet despite proper content and administration of HR
assessments, adverse impact or other discrimination still may occur unintentionally, not due to improper or
careless HR practices per se, but rather due to some fairly technical issues with the selection tools being used.
Such issues can be avoided only when HR professionals possess the background information to appreciate
their importance, and it is thus crucial that HR professionals, hiring managers, and performance management
consultants become more educated and discerning about the scientific validity of assessments, that is, their
psychometric properties. Ultimately, the psychometric quality and corresponding legal defensibility and practical
relevance of any HR assessment is grounded in the fundamental issue of quality of measurement, a topic rarely discussed, if at all, in ways relatable to HR practitioners and other professionals who rely on the outputs from assessments. Moreover, most HR professionals likely are not aware of this implicit legal cornerstone of assessments, nor do they understand the practical risks of making business decisions, such as hiring or promotion, based on assessments with poor psychometrics.
These basic issues will be framed in the form of an informal technical comparison of the two major theoretical
frameworks currently being used in industrial and organizational psychology (and indeed in psychological and
educational testing in general). First, starting early in the last century, Classical Test Theory (CTT)
conceptualizes test scores as a location on a continuous variable, and a test-taker's "true score" can be
obtained, at least theoretically, as the long term mean of their observed scores along this dimension. For an
overview, readers are referred to Lord and Novick's (1968) classic work. Secondly, Item Response Theory
(IRT), on the other hand, shifts the focus to the responses of particular test-takers to specific items (a term
interchangeable with test question or survey statement), where items "difficulties" and test-takers' abilities (or
trait levels) are regarded as locations along a common underlying latent (i.e., unobservable) variable. In the
Rasch (1960/1980) IRT model, which is used here throughout, item difficulty and test-taker ability are the only
parameters needed to capture the likelihood of observing a particular set of responses. Other IRT formulations
exist that include additional item parameters. However, such parameters add little of substance, while
needlessly complicating the presentation. Therefore, the following presentations are limited to the simpler
Rasch IRT model.
Speaking to adverse impact and fairness, readers will learn that IRT makes it feasible to present issues of
measurement quality, assessment equivalency, and assessment bias that otherwise typically remain invisible in
information provided by vendors trying to differentiate and promote their products. Measurement quality must be
addressed first and foremost above any other issue, and IRT has tremendous value for tackling a variety of
other HR and business-related research questions. For instance, IRT provides powerful quality control tools to detect both "aberrant" (misfitting) items and test-takers (e.g., due to poor instructions provided to test-takers, cheating, or scoring errors). Also, assessment results may be skewed by item and test "biases" that cause subsets of test-takers with identical trait levels to show group-specific response patterns. By identifying such issues, IRT allows assessment constructors to address fairness and score quality directly within the measurement model itself, rather than as an afterthought, as is often the case in CTT-based approaches. Finally, the practical power and potential of IRT in HR assessment is
illustrated by explaining how this approach extracts valuable information on employment candidates that CTT
approaches inherently miss.
IRT STATISTICS: THE GOLD STANDARD IN MEASUREMENT QUALITY
IRT now dominates educational testing, especially in applications where results have real-world, high-stakes
implications for those being tested, for example, with well-known achievement and accreditation assessments
such as the GRE, MCAT, GMAT, and LSAT. Unfortunately, its application is lagging in HR assessment. Here,
assessment providers almost universally use traditional, CTT-based statistics to develop and validate their
products. While vendors subsequently market them as satisfying the quality standards set forth by the
Standards for Educational and Psychological Testing1 and the Uniform Guidelines for Employee Selection
Procedures,2 in fact, there is no guarantee that CTT-based approaches provide valid or unbiased measures of
test-takers' traits, attitudes, or knowledge areas. Thus, organizations that use such tools are unsuspectingly at
legal risk, for example, noncompliance with EEOC Title VII discrimination standards and adverse impact.
The psychometric reasons behind the change toward IRT are captured nicely by the following tongue-in-cheek
account provided by Professor Ben Wright, one of the earliest and most influential proponents of Rasch scaling.
Although phrased in terms of IQ scores, the following applies equally to HR assessment:
Ever since I was old enough to argue with my pals over who had the best IQ (I say "best" because some
thought 100 was perfect and 60 was passing), I have been puzzled by mental measurement. We were mixed up
about the scale. IQ units were unlike any of those measures of height, weight, and wealth with which we were
learning to build a science of life. Even that noble achievement, 100 percent [correct], was ambiguous. One
hundred might signify the welcome news that we were smart. Or it might mean the test was easy. Sometimes
we prayed for easier tests to make us smarter.3
This anecdote illustrates the absurd situations that arise when the difficulty of test or assessment questions is
confused with the knowledge or ability of the test-taker. By way of explanation, assume that new hires this year
score higher on Leadership than last year's new hires. Does this mean that candidates did better this year? Or,
does this mean that the new assessment is easier than the previous one? Might it be possible that the new
assessment is so much easier that the current year's candidates actually did worse than last year's candidates?
Or, just maybe the same online assessment was used from one year to the next, allowing this year's candidates
an opportunity to learn about the content and therefore better game the assessment. Similar issues come into play when
comparing groups of applicants. For instance, if women score higher on average than men, how can anyone be
sure that the assessment is not biased against men?
Wright then showed how IRT can solve the issues raised above, thereby illustrating that from the start IRT is
concerned with fairness in identifying the source of similarities and differences in test or assessment scores.
Just as a standard ruler should measure only the length of an object and not its weight, size, age, or color, HR
assessments should measure only the test-takers' ability or knowledge versus their gender, religion, ethnicity,
sexual orientation, or economic backgrounds.
Within the CTT framework, establishing the technical quality of questionnaires, tests and their subtests, and
assessments in general relies heavily on correlations computed over raw scores, as these form the basis for
factor analysis and internal consistency statistics (e.g., KR-20 or Coefficient Alpha). Although early warnings
against this practice can be found in the literature, a statistical technique called item-level factor analysis
continues to be used first to identify distinct "subscales" within larger item sets. Given acceptable inter-item
correlations, it is then assumed that the (sometimes weighted) "raw" scores of the items in such clusters yield
valid indices of whatever the items appear to have in common. For example, one might have items addressing
issues regarding Service Orientation, Leadership, or Emotional Intelligence. Respondents' answers might be
scored as either 0 or 1. In the context of Service Orientation, checking the answer "yes" to the question "I like
helping customers" might be treated as 1 while checking "no" might yield 0.
Although this scoring appears reasonable, it ignores the fact that some questions are more important than
others. Case in point: on a math test, knowing the square root of 36 suggests greater knowledge of mathematics than does (just) knowing that 1 + 1 = 2, and ignoring this fact introduces a distortion and loss of information, reminiscent of toddlers' "insight" that having three coins makes for more money than having two paper bills. To make matters worse, test answers also have a chance component, as equally capable individuals typically give different answer patterns, and this prevents the use of simple item weighting schemes, since such weights would have to be estimated without already knowing the desired total.
Some Instructive Examples
The following provides illustrations of some of the central differences between CTT and Rasch IRT. It relies heavily on graphical examples to avoid overreliance on technical details. That said, important technicalities are included to provide the documentation required in a reference work of this nature.
1. Rasch IRT conceives of test answers as the probabilistic outcome of test-takers (i) with some ability (or trait level) A_i responding to items (j) with difficulty levels D_j, as

log[ P_ij / (1 - P_ij) ] = A_i - D_j    (1)

where P_ij denotes the probability of person i answering item j correctly, and log(...) denotes the natural logarithm (i.e., to the base e). Note that the item and person parameters are expressed in a common metric, whose units, given the left-hand side of Eq. 1, are referred to as "logits." Given appropriate test data, software exists to
obtain maximum likelihood estimates of the various A and D in Eq. 1, but a presentation of the estimation
process is beyond the scope of this article and interested readers are instead referred to Wright and Stone
(1979). It is further noted that Rasch (1960/1980) scaling also can model ordered multicategory responses,
sometimes called "Likert-scales," but this formulation is not further discussed here.
Figure 1 shows P_ij as obtained by solving Eq. 1 for hypothetical items 1, 2, and 3 with difficulties D_1 = -1, D_2 = 0, and D_3 = 2, with A ranging from -2 to 4. Note that the probability of observing a correct answer (or an answer indicative of greater trait levels) ranges from 0 to 1, and that P_ij always increases with A. The rate of increase is maximal whenever D = A, and at that point P_ij = 0.5. Naturally, real tests and assessments have more than three questions, and a test can be characterized by the spread and locations of the item parameters D relative to test-takers' trait levels A. In most well-designed tests and assessments, at least 10 items are used per sub-factor, and the average of respondents' A parameters exceeds the items' D average by about 0.5 logits. Also, the distribution of A values often exhibits greater "spread" than does the D distribution, as quantified by these parameters' standard deviations (SD). For instance, in a typical academic subject test, SD_D might be 1.0 to 1.5, with SD_A less than 2.
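As an illustration of Eq. 1, the short sketch below computes P_ij for the three hypothetical items of Figure 1 (D = -1, 0, and 2) at a few ability levels; the specific values are chosen only for illustration.

```python
import math

def rasch_probability(ability, difficulty):
    """Rasch model (Eq. 1): probability of a correct or endorsed response."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

difficulties = [-1.0, 0.0, 2.0]          # items 1-3 from Figure 1
for ability in [-2.0, 0.0, 2.0, 4.0]:    # a few points along the A range
    probs = [round(rasch_probability(ability, d), 2) for d in difficulties]
    print(f"A = {ability:+.1f}: P = {probs}")

# At A = 0, item 2 (D = 0) yields P = 0.50, as noted in the text;
# probabilities always increase with A and decrease with item difficulty.
```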
2. Assuming that Equation 1 holds, it becomes possible to obtain a maximum-likelihood estimate of test-takers' abilities (or trait levels) as a function of the number of correct test answers. It is crucial to note that the raw-score (R) to trait-level (A) transformation is always nonlinear. This is illustrated in Figure 2 for a hypothetical test with 31 items whose D follow a normal distribution with M_D = 0 and SD_D = 1. As is shown along the X-axis, one can get from 0 to 31 questions correct on such a test. Two major conclusions can be drawn from Figure 2, and these conclusions hold for all tests with a fixed number of questions.
a. Getting one more (or one fewer) question "right" at extreme raw sums changes the estimated trait level A in logits (shown along the Y-axis) more than does a similar change near the middle of the dimension. For instance, one additional "point" increases A by about 1 logit at the lower extreme, but by only about 0.25 logits near the middle of the raw score dimension. This makes the relation between raw scores and test-takers' estimated traits or abilities inherently nonlinear, and thus the use of raw sum scores, as in CTT, distorts measurement.
b. As is indicated by the vertical bars, the standard error of estimate (SE_A) for the estimated A values is smallest for intermediate raw scores (near 15) and greatest for very low or very high raw scores (near 0 and 31). In fact, the SE values vary from about 1 logit near the extremes to about 0.3 logits near the middle. Again, this pattern is universal, indicating that the reliability of "scores" varies across the raw score dimension. In other words, the reliability of any test cannot be captured by a single reliability index. This means that test-takers with different raw scores receive qualitatively different treatment because their test results have inherently different degrees of reliability. This, in turn, will affect the precision of any decisions that are made using this test.
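These two conclusions can be reproduced with a short maximum-likelihood routine; the sketch below assumes a hypothetical 31-item test with normally distributed item difficulties (M_D = 0, SD_D = 1) and skips the extreme raw scores of 0 and 31, whose maximum-likelihood estimates are unbounded.

```python
import math
import random

def ability_from_raw_score(raw_score, difficulties, tol=1e-6):
    """Newton-Raphson ML estimate of ability A (and its SE) for a given raw score."""
    ability = 0.0
    for _ in range(100):
        probs = [1.0 / (1.0 + math.exp(-(ability - d))) for d in difficulties]
        expected = sum(probs)                             # expected raw score at this A
        information = sum(p * (1.0 - p) for p in probs)   # test information at this A
        step = (raw_score - expected) / information
        ability += step
        if abs(step) < tol:
            break
    return ability, 1.0 / math.sqrt(information)          # SE_A = 1 / sqrt(information)

random.seed(1)
difficulties = [random.gauss(0.0, 1.0) for _ in range(31)]   # M_D = 0, SD_D = 1
for r in [1, 5, 15, 25, 30]:                                  # skip raw scores 0 and 31
    a, se = ability_from_raw_score(r, difficulties)
    print(f"raw score {r:2d}: A = {a:+.2f} logits, SE = {se:.2f}")
# SE is smallest near the middle raw scores and largest near the extremes,
# and equal raw-score increments translate into unequal logit increments.
```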
3. One might ask whether it is possible to change the preceding findings by using items with different statistical properties, and flattening the distribution of item difficulties is undoubtedly the most promising approach. Figure 3 indicates that while it is possible to counteract differential reliability, the correction is poor and it comes at an unacceptable price. In particular, the solid lines in the top part of this figure show the SE_A for estimated A ranging from -5 to +5 logits. The line labeled "Normal test" shows SE_A for items with SD_D = 1, and also plotted is SE_A for a "Wide" test with SD_D = 5. One can see that increasing the spread of the items, as in the "Wide" test, indeed decreases the disparity among the SE_A, thereby equalizing test scores' reliability. But the correction is imperfect, and the overall SE_A increases considerably (and thus reliability decreases) for intermediate A. This is the region where one typically finds most of the test-takers, and hence there is little to be gained from using a "Wide" test.
Furthermore, the dots in the "Normal" and "Wide" lines each correspond to a particular raw score (not listed). One
can see that the latter test produces coarser measures (i.e., A estimates) because near the middle the
differences between the dots along the X-axis are far greater for the "Wide" test than for the "Normal" test.
4. IRT rests on the assumption that items' properties are invariant. Specifically, items' (relative) difficulties
should be the same across all sub-populations. Bias occurs whenever some of the items' locations differ among
subgroups. For instance, an assessment of Service Orientation would show gender bias if equally service-oriented men and women had different endorsement rates of a relevant question or statement, for example,
"The customer is always right." Such a situation would introduce two major problems. First, this means that the
relation between "customers being right" and other issues like "a competitive attitude," "exceeding others'
expectations," and "attending promptly to customer's complaints" differ between the sexes. Apart from the fact
that men and women simply might see "Service Orientation" in different ways (and this should be investigated),
it also introduces fairness issues because the overall test scores of men and women are thereby affected. This,
in turn, might affect hiring decisions.
Although a complete presentation is beyond the scope here, the potential for adverse impact and other
challenges to legal defensibility are illustrated effectively via the results of a simulation of an assessment with 21
uniformly distributed items and assuming that some criterion was reached at 1.85 Logits. Relative to the
"Original" version, a second "Biased" version was created in which the difficulty parameters (D) of the five most
difficult items had been decreased by 0.75 Logits. Such effects might well occur due to unfortunate selection of
item calibration samples. The results indicate that biasing just five items can affect hiring decisions materially.
Specifically, because the biased items are "easier," Figure 4 shows that a higher raw score is needed to achieve a similar trait-level estimate. As a result, to reach a hypothetical "Criterion" value of 1.85, the "Original" test requires getting 16 items right, whereas the biased test requires getting 17 items right.
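The effect of shifting those five items can be checked with a small simulation. The uniform difficulty range assumed below (-2 to +2 logits) is illustrative rather than the item set behind Figure 4, so the exact cutoffs it prints need not match the article's; the point is that the biased version consistently demands a higher raw score to clear the same 1.85-logit criterion.

```python
import math

def estimate_ability(raw_score, difficulties):
    """Newton-Raphson ML ability estimate for a raw score on a fixed item set."""
    a = 0.0
    for _ in range(100):
        probs = [1.0 / (1.0 + math.exp(-(a - d))) for d in difficulties]
        info = sum(p * (1.0 - p) for p in probs)
        a += (raw_score - sum(probs)) / info
    return a

def cutoff_raw_score(difficulties, criterion):
    """Smallest raw score whose estimated ability reaches the criterion."""
    n = len(difficulties)
    return next(r for r in range(1, n) if estimate_ability(r, difficulties) >= criterion)

# 21 items with difficulties spread uniformly over [-2, +2] logits (illustrative).
original = [-2.0 + 0.2 * j for j in range(21)]
# "Biased" version: the five most difficult items made 0.75 logits easier.
biased = original[:-5] + [d - 0.75 for d in original[-5:]]

criterion = 1.85
print("Original cutoff:", cutoff_raw_score(original, criterion), "items right")
print("Biased cutoff:  ", cutoff_raw_score(biased, criterion), "items right")
# The biased test requires at least one additional correct answer to reach 1.85 logits.
```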
5. IRT-based applications have seen a revolution in recent years.4 On the quantitative side, Web-based
systems now make it possible to test literally thousands of people within a single day, to compute their scores,
and to report the results quickly. Qualitative changes have occurred as well, including computer adaptive testing
(CAT), in which the questions are tailored to match each test-taker's achievement level, and diagnostic testing,
which aims to identify an individual's particular strengths and weaknesses. Therefore, it is only a matter of time
before the advantages of IRT-based methods begin to be reflected in the practices of all serious HR
assessment services, just as they have been adopted in other professional testing domains.
For instance, many of the issues addressed in Examples 1 through 3, but not 4, can be avoided by the use of
CAT. In this approach, test-takers are presented with different items that were selected to provide the most
information concerning each individual test-taker. CAT requires the use of computers as it involves considerable
computation and searches of item banks. Essentially, CAT dynamically computes a test-taker's trait level A based on all available information, and it then presents as the next item the one that is nearest to this estimate. By using maximally informative items, CAT produces trait estimates A with the same reliability (same SE_A) as would a standard fixed-length test, while using fewer items. In fact, as is illustrated by the top dashed line in Figure 3, CAT achieves the same precision as the "Normal" 30-item test discussed earlier by using just 16 items. Alternatively, when the same number of items is used in CAT (i.e., 30), superior reliability (smaller SE_A) is obtained. Given a sufficiently large item bank, the reliability of CAT-based estimates of A remains invariant across the entire range of interest. Thus, in contrast to the case for fixed-length tests, CAT provides equally reliable estimates, regardless of test-takers' trait levels.
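The item-selection loop described above can be sketched in a few lines of code. The item bank, the simulated examinee, and the fixed 16-item stopping rule below are hypothetical stand-ins rather than any vendor's implementation.

```python
import math
import random

def rasch_p(ability, difficulty):
    """Rasch probability of a correct or endorsed response."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def ml_estimate(responses, lo=-5.0, hi=5.0):
    """Ability estimate via bisection on the score function sum(s - P), which is
    monotone in ability; all-correct or all-incorrect interim records simply
    return a search bound instead of diverging."""
    def score(a):
        return sum(s - rasch_p(a, d) for d, s in responses)
    for _ in range(40):
        mid = (lo + hi) / 2.0
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def cat_session(bank, true_ability, max_items=16):
    """Toy CAT loop: administer the unused item closest to the current ability
    estimate, score it, and re-estimate until max_items have been used."""
    unused, responses, ability = list(bank), [], 0.0
    for _ in range(max_items):
        item = min(unused, key=lambda d: abs(d - ability))   # most informative item
        unused.remove(item)
        answer = 1 if random.random() < rasch_p(true_ability, item) else 0
        responses.append((item, answer))
        ability = ml_estimate(responses)
    info = sum(rasch_p(ability, d) * (1 - rasch_p(ability, d)) for d, _ in responses)
    return ability, 1.0 / math.sqrt(info)

random.seed(7)
bank = [x / 10.0 for x in range(-30, 31)]   # hypothetical 61-item bank, -3 to +3 logits
estimate, se = cat_session(bank, true_ability=1.2)
print(f"estimated A = {estimate:+.2f} logits (SE = {se:.2f}) after 16 adaptive items")
```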
IRT PRACTICAL BENEFITS AND CLOSING CONSIDERATIONS
The technical details above were necessary to explain and underscore the critical importance of measurement
quality in assessments. Unreliability and question bias, even when unintentional, are legitimate challenges to an assessment's legal defensibility, as these can contribute to adverse impact or general noncompliance with EEOC Title VII discrimination standards. For this reason, CTT-based assessments are outdated and should be
avoided. Along with a desire to avoid legal issues related to a lack of reliability, validity, and fairness, there are
practical reasons for switching from CTT to IRT, especially since online testing typically produces sufficiently
large datasets to expand the approach into many new directions. For example, the flexibility afforded by the
Web-based administration of assessments easily allows for the introduction and calibration of new questions or
question types without losing continuity and without the need to recompute baselines or previously established
cutoff scores.
Moreover, the availability of a variety of person fit statistics allows outlying (i.e., aberrant) respondents and responses to be identified5 and leveraged for diagnostic and classification purposes. IRT does not merely reveal that a particular rating or answer is low or high in some population or sample of interest. Rather, it identifies low or high ratings relative to what is to be expected given a particular individual's trait level. Thus, the same rating might be too low or too high, depending on the test-taker's trait level. Of course, as is the case for most HR assessments, individuals' scores also can be compared to particular benchmarks, norm groups, and populations, an effective exercise when there are validated success profiles for particular positions and within the context of specific company cultures.
Analyzing every answer in a test-taker's record in this way yields an in-depth profile of an individual's particular views on issues related to the variable under consideration. This information can be used in two major ways. First, it serves to differentiate between respondents with very similar scores on a particular trait or variable. For instance, whereas before we might have had to conclude that Applicants A and B are indistinguishable because they have the same total scores, we can now investigate their idiosyncratic differences in detail. Second, the finding of outlying answers provides vital information to HR professionals and hiring managers that can be used effectively during interviews or in follow-up with a candidate's professional references.6 This IRT fit analysis also
can be extremely valuable to HR professionals working with incumbents, because it identifies performance
goals and training issues that need to be addressed in order to increase an individual's trait level of Service
Orientation (or any other trait or competency). In essence, such fit analyses can serve as the foundation for
evidence-based onboarding, training, and development action plans.
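One common flavor of such a fit analysis uses standardized residuals and an outfit-style mean square, as in the sketch below; the candidate record and flagging threshold are hypothetical, and operational person-fit statistics (e.g., those in Wright and Stone, 1979) are more elaborate.

```python
import math

def rasch_p(ability, difficulty):
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def person_fit(ability, responses, flag_at=2.0):
    """Standardized residuals plus an outfit-style mean square; flags answers
    that are surprising given the person's estimated trait level."""
    flagged, squared = [], []
    for difficulty, answer in responses:
        p = rasch_p(ability, difficulty)
        z = (answer - p) / math.sqrt(p * (1.0 - p))   # standardized residual
        squared.append(z * z)
        if abs(z) >= flag_at:
            flagged.append((difficulty, answer, round(z, 2)))
    outfit = sum(squared) / len(squared)              # values near 1.0 indicate good fit
    return outfit, flagged

# Hypothetical candidate with estimated A = +1.5 logits who nevertheless rejects
# two "easy" Service Orientation statements (difficulties -1.5 and -1.0).
record = [(-1.5, 0), (-1.0, 0), (-0.5, 1), (0.0, 1), (0.5, 1), (1.0, 1), (1.5, 1), (2.0, 0)]
outfit, flags = person_fit(1.5, record)
print(f"outfit mean square = {outfit:.2f}; unexpected answers: {flags}")
# The flagged easy items are exactly the kind of idiosyncrasies worth probing
# in a behavioral interview or reference check.
```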
Given the increased competition among the 2,000-plus businesses that offer HR assessments, vendors
increasingly are pressured to differentiate themselves in their marketing and sales materials. Practitioners
understandably prefer assessments that are simple, user-friendly, and consistent with their own philosophy
toward candidate selection (e.g., screening for personality versus competency). Building on this foundation,
assessment vendors often differentiate or promote their products in three main ways: (1) stating their number of
years in business in the field of industrial-organizational psychology, (2) listing the academic credentials of their
staff, and (3) referencing a body of apparently supportive research and literature for their assessments. Each of these marketing tactics aims to instill customer confidence in the scientific validity of a vendor's products. Other
vendors provide more detailed points of consideration when HR professionals consider competing products.
Table 1 summarizes these issues around a predictable set of questions.
In conclusion, although all of the questions in the first column of this table deserve consideration by HR professionals and other professionals when evaluating competing assessments, column two indicates that each issue is effectively meaningless in the absence of demonstrably high quality of measurement. The same
holds true regardless of an assessment's simplicity, user-friendliness, or theoretical orientation. Of course, the
benefits of superior measurement provided by IRT have HR and business applications far beyond the
construction of selection instruments. Modern test approaches can ensure reliability, validity, and statistical
fairness across many other business and HR tools, such as associate engagement surveys, customer
satisfaction surveys, and employee 360-degree performance reviews. In psychometrics, minimizing legal risks
and maximizing meaningful knowledge are ultimately two sides of the same coin.
Footnote
NOTES
1. American Educational Research Association, American Psychological Association, & National Council on Measurement, 2002.
2. Equal Employment Opportunity Commission, Department of Labor, Department of Justice, & the Civil Service Commission, 1978.
3. Wright & Stone, 1979, p. xi.
4. Cf. Houran & Lange, 2007.
5. Wright & Stone, 1979.
6. Cf. Houran & Lange, 2007.
References
REFERENCES
American Educational Research Association, American Psychological Association, & National Council on Measurement (2002). Standards for educational and psychological testing. Washington, DC: Author.
Bond, T. G., & Fox, C. M. (2001). Applying the Rasch model: Fundamental measurement in the human sciences. Mahwah, NJ: Lawrence Erlbaum.
Equal Employment Opportunity Commission, Department of Labor, Department of Justice, & the Civil Service Commission (1978). Section 60-3, Uniform Guidelines on Employee Selection Procedures (1978); 43 FR 38295 (August 25, 1978). Washington, DC: Author.
Houran, J., & Lange, R. (2007). State-of-the-art measurement in human resource assessment. International Journal of Tourism and Hospitality Systems, 1, 78-92.
Lange, R. (2007). Binary items and beyond: A simulation of computer adaptive testing using the Rasch partial credit model. In E. Smith & R. Smith (Eds.), Rasch measurement: Advanced and specialized applications (pp. 148-180). Maple Grove, MN: JAM Press.
Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.
Naglieri, J. A., Drasgow, F., Schmit, M., Handler, L., Prifitera, A., Margolis, A., & Velasquez, R. (2004). Psychological testing on the Internet: New problems, old issues. American Psychologist, 59, 150-162.
Rasch, G. (1960/1980). Probabilistic models for some intelligence and attainment tests. Chicago, IL: MESA Press.
Wright, B. D., & Stone, M. H. (1979). Best test design. Chicago, IL: MESA Press.
AuthorAffiliation
Rense Lange holds a doctorate in psychology and a master's in computer science from the University of Illinois
at Urbana-Champaign. He has authored over 100 refereed papers and books in artificial intelligence, education,
hospitality, medicine, and psychometrics. His experience includes the annual scaling of millions of students'
standardized assessment tests. James Houran, PhD., currently serves as a managing director at AETHOS
Consulting Group, which specializes in human capital solutions for companies with service-driven cultures,
including executive search and compensation consulting, culture creation and organizational effectiveness,
talent assessment, and customer and employee opinion surveys. The authors may be contacted at
rense.lange@gmail.com and jhouran@aethoscg.com, respectively.
Subject: Studies; Human resource management; Quality; Best practice; Due diligence;
Location: United States--US
Classification: 9130: Experiment/theoretical treatment; 6100: Human resource planning; 5320: Quality control;
9190: United States
Publication title: Employee Relations Law Journal
Volume: 40
Issue: 4
Pages: 50-64

Number of pages: 15
Publication year: 2015
Publication date: Spring 2015
Publisher: Aspen Publishers, Inc.
Place of publication: New York
Country of publication: United States
Publication subject: Law, Business And Economics--Labor And Industrial Relations
ISSN: 00988898
CODEN: ERLJDC
Source type: Trade Journals
Language of publication: English
Document type: Feature
Document feature: Equations Graphs Tables References
ProQuest document ID: 1648612842
Document URL: http://search.proquest.com/docview/1648612842?accountid=48290
Copyright: Copyright Aspen Publishers, Inc. Spring 2015
Last updated: 2015-01-29
Database: ProQuest Health Management, ABI/INFORM Complete, ProQuest Research Library


Bibliography
Citation style: APA 6th - American Psychological Association, 6th Edition
Lange, R., & Houran, J. (2015). "Quality of measurement": The implicit legal cornerstone of HR assessments.
Employee Relations Law Journal, 40(4), 50-64. Retrieved from
http://search.proquest.com/docview/1648612842?accountid=48290
