Slide 1

Like objective testing, performance assessment has some strengths and some weaknesses. Both measurement approaches should be in a classroom teacher's set of measurement tools, available to be used in the right situation.

Slide 2

Performance assessment is one type of assessment in which students are involved in activities where they demonstrate skills and/or create products. Performance assessment differs from traditional assessment in the degree to which the assessment task matches the behavior domain to which you want to make inferences. Performance assessment is a very good way to directly measure learning. This type of assessment is also called “alternative assessment” or “authentic assessment.”

Slide 3

Recall module 5, where we compared different types of objective items. Now we will compare objective tests to performance assessments.

Slide 4

There are also some disadvantages to both objective tests and performance assessment. It can be time-consuming to write good objective test items. Without a well-designed test blueprint, the objective test may overemphasize the knowledge level. Objective items may have more than one defensible answer. The objective test can be an unfamiliar format. Oftentimes performance assessment has very few items with high task-specificity and, as a result, the assessment results will have low generalizability. Performance assessment often covers narrow domains, so choosing appropriate domains is crucial. The scoring of performance assessment is subjective in nature, which may decrease the consistency (or reliability) of test scores.

Slide 5

Now we focus on developing a performance assessment. Generally speaking, there are four steps for developing a performance assessment.

Slide 6

The first step of developing a performance assessment is deciding what to test. An easy way to do this is to create a list of instructional objectives that you would like to assess. This step is similar to developing a test plan for objective tests. Once you complete step 1, you will have identified important knowledge, skills, and habits of mind that will be the focus of the performance assessment. In addition to objectives in the cognitive domain, instructional objectives for the affective and social domains should also be taken into consideration. To determine which objectives to include from the cognitive domain, find out whether anything is missing from your traditional tests or whether there are any skills that require students to acquire, organize, and use information, such as investigating and problem solving.

Slide 7

Here are some examples of instructional objectives in the cognitive domain for performance assessment:
“Draw a physical map of North America from memory and locate 10 cities.”
“Construct an electrical circuit using wires, a switch, a bulb, resistors, and a battery.”

Slide 8

Objectives for the affective and social domains can include habits of mind, such as constructive criticism, respect for reason, and appreciation, and social skills, such as cooperation, sharing, and negotiation.

Slide 9

Examples of items for the affective and social domains could be:
• Cooperating in answering questions and solving problems, or working together to pool ideas, explanations, and solutions
• Appreciating that mathematics is a discipline that helps solve real-world problems
• Recognizing that there is more than one way to solve a problem

Slide 10

Step 2 is designing the assessment context, which means creating a task for learners to demonstrate their knowledge, skills, or attitudes. It may be as straightforward as asking students to complete an art project or write an essay on their favorite hobby. It is important to know that the tasks you create should focus on real-world issues, concepts, or problems. Questions you can ask include:
“What does the doing of (art, music, design) look like to professionals in the real world?”
“How can their real-world tasks be adapted to the school setting?”

Slide 11

Regarding the tasks in performance assessment, the following points should be noted. Make sure that the requirements for task mastery are clear without revealing the solution; for instance, learners should be able to tell when they are finished. The task should be a specific activity from which generalizations can be made about knowledge and skills. Tasks should be complex enough to elicit a wide range of behavior within a narrow skill domain. Tasks should also be complex enough to allow for multi-modal assessment, such as observations, oral reports, journals, exhibits, and so on.

Slide 12

Tasks should yield multiple solutions, each with costs and benefits, and should call for judgment and interpretation. Tasks in performance assessment should require “persistence and determination” as well as “the use of cognitive strategies,” rather than depending on coaching.

Slide 13

When performance tasks have been developed, the following criteria can be used to evaluate them.
Authenticity: How authentic is the task? In other words, is the task similar to a real-world activity?
Multiple foci: Have you included multiple foci? Does the task measure multiple outcomes?
Teachability: How teachable is the content? Is it likely that students will be proficient after instruction?
Fairness: Is the performance task fair and unbiased for every student? Does the task favor students of high socioeconomic status?
Feasibility: How feasible is the task? Does the school have the space and equipment? Do students have enough time to complete it, and how much will it cost?
Scorability: Is the performance task scorable? Can it be evaluated reliably and accurately?

Slide 14

Step 3 of performance assessment is creating a scoring rubric. When creating rubrics, do not limit the scoring criteria to those that are easiest to measure. Instead, carefully construct detailed scoring systems to help you minimize the arbitrariness of judgments. A scoring rubric holds learners to high standards of achievement.

Slide 15

It is important to know that rubrics should be developed for a variety of accomplishments. In general, performance assessment involves the following types of accomplishment: products, cognitive processes, and observable performances. Products for performance assessment could be essays, graphs, movies, or websites. Cognitive processes could be skills in acquiring, organizing, or using information. Observable performances could be dancing, dissecting frogs, or following recipes.

Slide 16

The second crucial consideration in developing rubrics is to choose a scoring system appropriate to the task you want to measure. There are three types of rubrics to use: checklists, rating scales, and holistic scoring.

Slide 17

Checklists contain a list of behaviors, traits, or characteristics that can be scored as either present or absent. They are best suited for tasks that can be broken down into clearly defined, specific actions. Typically, a task being present is marked as “1” or “yes” and not being present is marked as “0” or “no.” When using a checklist, you should also provide for cases in which there was no opportunity to observe a specific element. In such cases, a value of +1 represents present, 0 represents no opportunity to observe, and -1 represents absent.

Slide 18

Rating scales are typically used for more complex behaviors for which yes/no judgments are not enough. The use of rating scales usually involves assigning numbers to performance categories. Most numerical rating scales use an analytic scoring technique called “primary trait scoring.” Primary trait scoring requires the test developer to first identify the most important traits of the performance and then assign numbers to represent degrees of performance on each trait. This helps the scorer focus on the important criteria.

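A minimal sketch of primary trait scoring, assuming invented trait names, weights, and a 1-4 scale (none of these specifics come from the text): the developer identifies the most important traits in advance, rates each one numerically, and combines the ratings into a single score.

```python
# Hypothetical primary traits for an essay task, with weights
# reflecting their relative importance to the scorer.
TRAITS = {"organization": 2, "use of evidence": 3, "mechanics": 1}
SCALE = (1, 2, 3, 4)  # 1 = inadequate ... 4 = excellent

def weighted_score(ratings):
    """Combine per-trait ratings into a weighted percentage."""
    earned = sum(TRAITS[t] * r for t, r in ratings.items())
    maximum = sum(w * max(SCALE) for w in TRAITS.values())
    return 100 * earned / maximum

ratings = {"organization": 3, "use of evidence": 4, "mechanics": 2}
print(f"{weighted_score(ratings):.0f}%")  # (2*3 + 3*4 + 1*2) / 24 -> 83%
```
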
Slide 19

Holistic scoring is used when the rater is more interested in estimating the overall quality of the performance. It is typically used with essays, term papers, and dance or musical performances. It is important to have a model for each category to ensure similar quality within categories. See your text for more information: fig. 8.9 (8th ed.) or fig. 9.9 (9th ed.).

Slide 20

Each of the three scoring systems has its particular strengths and weaknesses. This table summarizes the comparison of checklists, rating scales, and holistic scoring in terms of ease of construction, scoring efficiency, reliability, defensibility, and quality of feedback. Checklists have the highest levels of reliability, defensibility, and feedback, while holistic scoring is the easiest to construct and has high scoring efficiency. Rating scales receive moderate ratings on all facets of the comparison.

Slide 21

Checklists, rating scales, and holistic judgments can be combined to determine a total assessment, and this strategy should be used when a variety of traits are assessed.

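One possible way to combine the three systems, sketched below with invented weights and subscores (the text does not prescribe any particular formula): rescale each subscore to the 0-1 range and take a weighted sum.

```python
# Hypothetical composite of a checklist fraction, a rating-scale
# fraction, and a holistic judgment, each rescaled to 0-1.
def combined_total(checklist_frac, rating_frac, holistic, holistic_max=5,
                   weights=(0.3, 0.5, 0.2)):
    """Weighted composite of three rescaled subscores."""
    parts = (checklist_frac, rating_frac, holistic / holistic_max)
    return sum(w * p for w, p in zip(weights, parts))

# e.g. 2/3 checklist elements present, 80% on the rating scale,
# holistic judgment of 4 out of 5:
print(f"{combined_total(2/3, 0.80, 4):.2f}")  # -> 0.76
```
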
Slide 22

Three sources of error may occur in a scoring system: the scoring instrument, the procedure, and the teacher. Common flaws in scoring instruments include a lack of descriptive rigor and ambiguity, which can lead to unreliability. Having too many grading criteria for a task, or having too many students to rate, can cause procedural flaws. Teachers themselves can be a source of scoring error, and there are multiple types of teacher bias. Generosity error is when a teacher grades too leniently. Severity error is when a teacher grades too harshly. Central-tendency error is when a teacher grades all students about the same. The halo effect in grading is when the teacher’s attitude toward a student influences the score the student receives.

Slide 23

Step 4 of developing a performance assessment is specifying constraints. Outside the classroom, professionals have constraints on their performance, such as deadlines, limited office space, and outmoded equipment. In the same way, teachers need to decide which conditions to impose on a performance task. Typical test constraints include:
• Time: How much time are students allowed to prepare, rethink, and finish a performance task?
• Reference material: Are students allowed to have reference materials?
• Other people: Are they allowed to consult with other people?
• Equipment: Can students use computers or calculators to help them solve problems?
• Prior knowledge of the task: How much will students know about the task on which they will be tested? Do they receive this information in advance?
• Scoring criteria: Do students know the standards (or criteria) for their performance task in advance?

Slide 24

To help decide what to do about these constraints, ask yourself the following question: What are authentic limits to place on the use of time, help from others, reference materials, etc.?

Slide 25

As with objective tests, validity and reliability need to be considered to determine the quality of a performance assessment. These are the two most critical criteria of test quality.

Slide 26

Validity is the extent to which the test actually measures what it is supposed to measure. To help ensure validity, teachers should go over all the elements of any performance assessment to look for possible problems with validity.

Slide 27

The following should also be considered: avoid common errors, such as “failure to use the entire rating scale,” “reliance on mental record-keeping,” and “influence of prior perceptions of the student.”

Slide 28

Reliability of a test refers to the consistency or stability of the test scores. Ideally, students should get the same score regardless of who the rater is. Reliability is more challenging to achieve with a performance assessment than with an objective test. One choice teachers can make to help increase the reliability of a performance assessment is to use several performance tasks that are relatively small in scope, rather than only one large task. Other ways to increase the reliability of assessments include being explicit about the purpose of the assessment and stating the performance criteria and rating categories clearly.

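As a rough way to check this kind of consistency, the sketch below (with invented scores for five students) computes the exact-agreement rate between two raters using the same rubric; a low rate would suggest the criteria or rating categories need to be stated more clearly.

```python
# Hypothetical inter-rater agreement check: two teachers score the
# same five performances with the same 1-4 rubric.
rater_a = [3, 4, 2, 4, 1]
rater_b = [3, 3, 2, 4, 1]

# Count performances where both raters assigned the same rating.
agreements = sum(a == b for a, b in zip(rater_a, rater_b))
print(f"Exact agreement: {100 * agreements / len(rater_a):.0f}%")  # -> 80%
```
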