
From Concepts to Data:

Conceptualization,
Operationalization, and
Measurement
in Educational Research

Larry D. Gruppen, Ph.D.


University of Michigan

Objectives
Identify key research design issues
Wrestle with the complexities of educational measurement
Explain the concepts of reliability and validity in educational measurement
Apply criteria for measurement quality when conducting educational research

Agenda
A brief nod to design
From theory to measurement
Criteria for measurement quality
Reliability
Validity
Application: analyze an article

Guiding Principles for Scientific Research in Education
1. Question: pose a significant question that can be investigated empirically
2. Theory: link research to relevant theory
3. Methods: use methods that permit direct investigation of the question
4. Reasoning: provide a coherent, explicit chain of reasoning
5. Replicate and generalize across studies
6. Disclose research to encourage professional scrutiny and critique

Study design
Study design consists of:
Your measurement method(s)
The participants and how they are assigned
The intervention
The sequence and timing of measurements and interventions

Comparison Group
Pre-post design - compare intervention group to itself
Non-equivalent control group design - compare intervention group to an existing group
Randomized control group design - compare to equivalent controls

Overview of Study Designs

Symbols
Each line represents a group.
x = Intervention (e.g., treatment)
O1, O2, O3 = Observation (measurement) at Time 1, Time 2, Time 3, etc.
R = Random assignment

Non-Experimental Designs

One-Group Posttest
x O1

Quasi-Experimental Designs

Posttest-Only Control Group
x O1
  O1

One-Group Pretest-Posttest
O1 x O2

Control Group Pretest-Posttest
O1 x O2
O1   O2

Experimental Designs

Posttest-Only Randomized Control Group
R x O1
R   O1

Randomized Control Group Pretest-Posttest
R O1 x O2
R O1   O2
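To make the notation concrete, here is a minimal Python sketch (not part of the original slides; all scores are simulated) of the randomized control group pretest-posttest design, with the intervention effect estimated as a difference in differences:

```python
import random
import statistics

# Minimal simulation of the design:
#   R O1 x O2   (treatment group)
#   R O1   O2   (control group)
# All numbers below are invented for illustration.

random.seed(42)
participants = list(range(40))
random.shuffle(participants)                     # R: random assignment
treatment, control = participants[:20], participants[20:]

def observe(person, gained):
    # O: a noisy measurement of an underlying score
    base = 50 + (person % 10)                    # fake individual differences
    return base + (10 if gained else 0) + random.gauss(0, 3)

pre_t  = [observe(p, gained=False) for p in treatment]   # O1, treatment
pre_c  = [observe(p, gained=False) for p in control]     # O1, control
# x: intervention applied to the treatment group only
post_t = [observe(p, gained=True)  for p in treatment]   # O2, treatment
post_c = [observe(p, gained=False) for p in control]     # O2, control

# Difference-in-differences estimate of the intervention effect
effect = (statistics.mean(post_t) - statistics.mean(pre_t)) \
       - (statistics.mean(post_c) - statistics.mean(pre_c))
print(f"Estimated intervention effect: {effect:.1f} points")
```

Because both groups are measured twice, the control group's pre-post change absorbs practice effects and maturation, which is exactly what the weaker designs above cannot rule out.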

From Theory to Measurement

Theory → Constructs → Operational Definition → Measurement

Measurement
Measurement: assignment of numbers to objects or events according to rules
Quality: reliability and validity

The Challenge of Educational Measurement
Almost all of the constructs we are interested in are buried inside the individual
Measurement depends on transforming these internal states, events, capabilities, etc. into something observable
Making them observable may alter the thing we are measuring

Examples of Measurement Methods
Tests (knowledge, performance): defined response, constructed response, simulations
Questionnaires (attitudes, beliefs, preferences): rating scales, checklists, open-ended responses
Observations (performance, skills): tasks (varying degrees of authenticity), problems, real-world behaviors, records (documents)

Reliability
Dependability (consistency or stability) of measurement
A necessary condition for validity

Types of Reliability
Stability (produces the same results with repeated measurements over time):
  Test-retest
  Correlation between scores at 2 times
Equivalence/Internal Consistency (produces the same results with parallel items on alternate forms):
  Alternate forms; split-half; Kuder-Richardson; Cronbach's alpha
  Correlation between scores on different forms; calculate coefficient alpha (α)
Consistency (produces the same results with different observers or raters):
  Inter-rater agreement
  Correlation between scores from different raters; kappa coefficient
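As a concrete illustration (not in the original slides; all scores and ratings below are invented), each of these three reliability types can be computed with a few lines of Python:

```python
import statistics

# Stability: test-retest correlation (same test, two occasions)
def pearson(x, y):
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

time1 = [12, 15, 9, 18, 14, 11, 16, 13]
time2 = [13, 14, 10, 17, 15, 10, 16, 12]
print(f"Test-retest r = {pearson(time1, time2):.2f}")

# Internal consistency: Cronbach's alpha over k items
# (Kuder-Richardson KR-20 is the special case of 0/1-scored items)
items = [  # rows = examinees, columns = items
    [4, 3, 5, 4], [2, 2, 3, 3], [5, 4, 5, 5],
    [3, 3, 4, 2], [4, 4, 4, 5], [1, 2, 2, 2],
]
k = len(items[0])
item_vars = [statistics.pvariance([row[j] for row in items]) for j in range(k)]
total_var = statistics.pvariance([sum(row) for row in items])
alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")

# Consistency: Cohen's kappa for two raters' pass/fail judgments
rater1 = ["pass", "pass", "fail", "pass", "fail", "pass"]
rater2 = ["pass", "fail", "fail", "pass", "fail", "pass"]
n = len(rater1)
p_o = sum(a == b for a, b in zip(rater1, rater2)) / n      # observed agreement
p_e = sum((rater1.count(c) / n) * (rater2.count(c) / n)    # chance agreement
          for c in {"pass", "fail"})
kappa = (p_o - p_e) / (1 - p_e)
print(f"Cohen's kappa = {kappa:.2f}")
```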

Validity
Refers to the accuracy of inferences based on data obtained from measurement
Technically, measures aren't valid; inferences are
No such thing as validity in the abstract: the key issue is "valid for what inference?"
Want to reduce systematic, non-random error
Unreliability lowers correlations, reducing validity claims
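That last point can be stated precisely with the classical attenuation formula from test theory (a standard psychometric result, not spelled out on the original slide): the observed correlation between two measures X and Y is capped by their reliabilities $r_{XX}$ and $r_{YY}$:

$$ r_{XY}^{\mathrm{obs}} = r_{XY}^{\mathrm{true}} \sqrt{r_{XX}\, r_{YY}} $$

For example, if each measure has reliability 0.70, the observed correlation can reach at most 0.70 of the true correlation between the underlying constructs, no matter how strongly they are actually related.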

Conventional View of Validity
Face validity: logical link between items and purpose makes sense on the surface
Content validity: items cover the range of meaning included in the construct or domain; established by expert judgment
Criterion validity: relationship between performance on one measurement and performance on another (or actual behavior); concurrent and predictive forms; assessed with correlation coefficients
Construct validity: directly connects measurement with theory; allows interpretation of empirical evidence in terms of theoretical relationships; based on weight of evidence; convergent and discriminant evidence; Multitrait-Multimethod Analysis (MTMM)
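To show what convergent and discriminant evidence look like in an MTMM layout, here is a small hypothetical Python sketch (not from the original slides; the traits, methods, and correlations are all invented):

```python
# Each key pairs two (trait, method) measures; values are their correlations.
correlations = {
    (("anxiety", "survey"), ("anxiety", "interview")): 0.65,        # same trait, different methods
    (("motivation", "survey"), ("motivation", "interview")): 0.58,
    (("anxiety", "survey"), ("motivation", "survey")): 0.30,        # different traits, same method
    (("anxiety", "interview"), ("motivation", "interview")): 0.22,
}

convergent = [r for (a, b), r in correlations.items() if a[0] == b[0]]
discriminant = [r for (a, b), r in correlations.items() if a[0] != b[0]]

# Convergent evidence: same-trait correlations should be high;
# discriminant evidence: different-trait correlations should be lower.
print(f"mean convergent r   = {sum(convergent) / len(convergent):.2f}")
print(f"mean discriminant r = {sum(discriminant) / len(discriminant):.2f}")
```

The pattern, not any single coefficient, carries the construct validity argument: same-trait correlations that exceed different-trait correlations support the claim that the instruments measure the intended constructs.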

Unified View of Construct Validity
(Messick S, Amer Psych, 1995)
Validity is not a property of an instrument but rather of the meaning of the scores. It must be considered holistically.
6 Aspects of Construct Validity Evidence
Content: content relevance & representativeness
Substantive: theoretical rationale for observed consistencies in test responses
Structural: fidelity of the scoring structure to the structure of the construct domain
Generalizability: generalization to the population and across populations
External: convergent and discriminant evidence
Consequential: intended and unintended consequences of score interpretation; social consequences of assessment (fairness, justice)

Finding Measurement Instruments
Scan the engineering education literature (obviously)
Email engineering ed researchers (use the network)
Examine the literature for instruments used in prior studies
General education/social science instrument databases:
  Buros Institute of Mental Measurements (Mental Measurement Yearbook, Tests in Print) http://buros.unl.edu/buros/jsp/search.jsp
  ERIC databases http://www.eric.ed.gov/
  Educational Testing Service Test Collection http://www.ets.org/testcoll/index.html
Construct your own (last resort!)
Get some expert consultation (test writing, survey design, questionnaire construction, etc.)

Example
In your groups, analyze the Steif & Dantzler statics concept inventory article. Look for:
Theoretical framework
Constructs used in the study
How constructs were operationalized
Measurement process
Attention to reliability and validity

References
Campbell DT, Stanley JC. Experimental and quasi-experimental designs for research. Chicago: Rand McNally; 1969.
Cook TD, Campbell DT. Quasi-experimentation: design and analysis for field settings. Chicago: Rand McNally; 1979.
Messick S. Validity of psychological assessment: validation of inferences from persons' responses and performances as scientific inquiry into score meaning. American Psychologist. 1995;50:741-749.
Messick S. Validity. In: Linn RL, ed. Educational measurement. 3rd ed. New York: American Council on Education & Macmillan; 1989:13-103.
