
In this module, we will consider performance assessment.

Like objective testing, performance assessment has some strengths and some weaknesses. Both measurement approaches should be in a classroom teacher's set of measurement tools, available to be used in the right situation.

Performance assessment is one type of assessment in which students are involved in activities where they demonstrate skills and/or create products.

Performance assessment differs from traditional assessment in the degree to which the assessment task matches the behavior domain about which you want to make inferences.

Performance assessment is a very good way to directly measure learning.

This type of assessment is also called "alternative assessment" or "authentic assessment."

Recall module 5, where we compared different types of objective items. Now we will compare objective tests to performance assessments.

Objective tests have several advantages over performance assessment.

For instance, objective tests offer quick and objective scoring. They can sample a large amount of content and can measure learning outcomes from the knowledge level to the evaluation level. Objective items are also amenable to item analysis.

In contrast, performance assessment has some advantages over objective tests.

Performance assessment can provide a direct measure of student learning. It can assess the process of doing as well as the final product, and it can also measure greater depth of understanding.

There are also some disadvantages to both objective tests and performance assessment.

It can be time-consuming to write good objective test items. Without a well-designed test blueprint, an objective test may overemphasize the knowledge level. Objective items may have more than one defensible answer, and the objective-test format may be unfamiliar to some students.

Oftentimes performance assessment has very few items with high task-specificity and, as a result, the assessment results will have low generalizability. Performance assessment often covers narrow domains, so choosing appropriate domains is crucial. The scoring of performance assessment is subjective in nature, which may decrease the consistency (or reliability) of the scores.

Now we focus on developing a performance assessment. Generally speaking, there are four steps for developing a performance assessment.

Step 1 is to "decide what to test."
Step 2 is to "design the assessment context."
Step 3 is to "create scoring rubrics."
Step 4 is to "specify constraints."

The first step of developing a performance assessment is "deciding what to test."

The easy way to do this is to create a list of the instructional objectives that you would like to assess. This step is similar to developing a test plan for objective tests. Once you complete step 1, you will have identified the important knowledge, skills, and habits of mind that will be the focus of the performance assessment.

In addition to objectives in the cognitive domain, instructional objectives for the affective and social domains should also be taken into consideration.

To determine which objectives to include from the cognitive domain, find out whether anything is missing from your traditional tests or whether there are skills that require students to acquire, organize, and use information, such as investigating and problem solving.

Here are some examples of instructional objectives in the cognitive domain for performance assessment.

"Draw a physical map of North America from memory and locate 10 cities."

"Construct an electrical circuit using wires, a switch, a bulb, resistors, and a battery."

"Describe two alternative ways to solve a mathematics word problem."

"Program a calculator to solve an equation with one unknown."

Objectives for the affective and social domains can include habits of mind, such as constructive criticism, respect for reason, and appreciation, and social skills, such as cooperation, sharing, and negotiation.

Examples of items for the affective and social domains could be:

• Willingness to modify explanations

• Cooperating in answering questions and solving problems, or working together to pool ideas, explanations, and solutions

• Appreciating that mathematics is a discipline that helps solve real-world problems

• Recognizing that there is more than one way to solve a problem.

Step 2 is designing the assessment context, which means to "create a task for learners to demonstrate their knowledge, skills, or attitudes."

It may be as straightforward as asking students to complete an art project or write an essay on their favorite hobby.

It is important that the tasks you create focus on "real-world" issues, concepts, or problems. Questions you can ask include:

"What does the doing of (art, music, design) look like to professionals in the real world?"

"How can their real-world tasks be adapted to the school setting?"

Regarding the tasks in a performance assessment, the following things should be noted.

Make sure that the requirements for task mastery are clear without revealing the solution. For instance, learners should be able to tell when they are finished.

Choose a specific activity from which generalizations can be made about knowledge and skills. The task should be complex enough to elicit a wide range of behavior within a narrow skill domain.

Tasks should also be complex enough to allow for multi-modal assessment, such as observations, oral reports, journals, exhibits, and so on.

Tasks should yield multiple solutions, each with costs and benefits, and should call for judgment and interpretation.

Tasks should require mental effort and self-regulated learning.

Tasks in a performance assessment should require "persistence and determination" as well as "the use of cognitive strategies," rather than depending on coaching.

When performance tasks have been developed, the following criteria can be used to evaluate them.

Generalizability: Can performance on the task be generalized to comparable tasks?

Authenticity: How authentic is the task? In other words, is the task similar to a real-world activity?

Multiple foci: Have you included multiple foci? That is, does the task measure multiple outcomes?

Teachability: How teachable is the content? Is it likely that students will be proficient after instruction?

Fairness: Is the performance task fair and unbiased to every student, or does it favor students of high socioeconomic status?

Feasibility: How feasible is the task? Does the school have the space and equipment? Do students have enough time to complete it, and how much will it cost?

Scorability: Is the performance task scorable? Can it be evaluated reliably and accurately?

Step 3 of developing a performance assessment is "creating scoring rubrics."

When creating rubrics, do not limit scoring criteria to those that are easiest to measure.

Instead, carefully construct detailed scoring systems that help you minimize the arbitrariness of your judgments. A scoring rubric holds learners to high standards of achievement.

It is important to know that rubrics should be developed for a variety of accomplishments. In general, performance assessment involves the following types of accomplishments: products, cognitive processes, and observable performances.

Products for performance assessment could be essays, graphs, movies, or websites.

Cognitive processes could be skills in acquiring, organizing, or using information.

Observable performances could be dancing, dissecting frogs, or following recipes.

The second crucial consideration in developing rubrics is to choose a scoring system appropriate to the task you want to measure.

There are three types of rubrics to use: checklists, rating scales, and holistic scoring.

See your text for more information:
- pg 172-178 (8th ed)
- pg 195-202 (9th ed)

Checklists contain a list of behaviors, traits, or characteristics that can be scored as either present or absent.

They are best suited for tasks that can be broken down into clearly defined, specific actions.

When using a checklist, you should provide for cases in which there was no opportunity to observe a specific element. In such cases, a value of +1 represents the element being present, 0 represents no opportunity to observe, and -1 represents the element being absent.

Typically, an element being present is marked as "1" or "yes" and an element not being present is marked as "0" or "no."
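
As an illustration only (this sketch is not part of the module or the text), a checklist score can be tallied in a few lines of Python; the element names and the frog-dissection task below are hypothetical.

def score_checklist(observations):
    # Sum checklist marks: +1 = present, 0 = no opportunity to observe, -1 = absent.
    allowed = {1, 0, -1}
    if any(mark not in allowed for mark in observations.values()):
        raise ValueError("Each element must be marked +1, 0, or -1.")
    return sum(observations.values())

# Hypothetical frog-dissection checklist for one student.
frog_dissection = {
    "follows safety procedures": 1,   # present
    "identifies major organs": 1,     # present
    "labels diagram correctly": -1,   # absent
    "cleans up work area": 0,         # no opportunity to observe
}

print(score_checklist(frog_dissection))  # prints 1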

See your text for more information:
- fig 8.5 & 8.6 (8th ed)
- fig 9.5 & 9.6 (9th ed)

Rating scales are typically used for more complex behaviors for which yes/no judgments are not enough.

The use of rating scales usually involves assigning numbers to performance categories.

Most numerical rating scales use an analytic scoring technique called "primary trait scoring." Primary trait scoring requires the test developer to first identify the most important traits and then assign numbers to represent degrees of performance. This helps the scorer focus on the important criteria.
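
As another illustration only (again, not from the module or the text), primary trait scoring can be sketched as a short Python routine; the traits and the 1-to-5 scale below are hypothetical.

TRAITS = ["organization", "use of evidence", "mechanics"]  # the most important traits
SCALE = range(1, 6)                                        # 1 = poor ... 5 = excellent

def rate_performance(ratings):
    # Confirm that every trait received a rating on the scale, then total the ratings.
    for trait in TRAITS:
        if ratings.get(trait) not in SCALE:
            raise ValueError(f"'{trait}' needs a rating between 1 and 5.")
    return sum(ratings[trait] for trait in TRAITS)

essay = {"organization": 4, "use of evidence": 5, "mechanics": 3}
print(rate_performance(essay), "out of", len(TRAITS) * max(SCALE))  # prints 12 out of 15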

See your text for more information:
- fig 8.7 & 8.8 (8th ed)
- fig 9.7 & 9.8 (9th ed)

Holistic scoring is used when the rater is more interested in estimating the overall quality of the performance.

It is typically used with essays, term papers, and dance or musical performances.

It is important to have a model for each category to ensure similar quality within each category.

See your text for more information:
- fig 8.9 (8th ed)
- fig 9.9 (9th ed)

Each of the three scoring systems has its particular strengths and weaknesses.

This table summarizes the comparison of checklists, rating scales, and holistic scoring in terms of ease of construction, scoring efficiency, reliability, defensibility, and quality of feedback.

Checklists have the highest levels of reliability, defensibility, and feedback, while holistic scoring is the easiest to construct and has high scoring efficiency.

Rating scales receive a moderate rating on all facets of the comparison.

Checklists, rating scales, and holistic judgments can be combined to determine a total assessment, and this strategy should be used if a variety of traits are being assessed.
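
As an illustration only (the point values and weights below are hypothetical, not something the text prescribes), one simple way to combine the three systems is to convert each component to a proportion, weight it, and sum.

def combined_total(checklist_pts, checklist_max,
                   rating_pts, rating_max,
                   holistic_level, holistic_max,
                   weights=(0.3, 0.4, 0.3)):
    # Convert each component to a 0-1 proportion, weight it, and return a 0-100 total.
    parts = (
        checklist_pts / checklist_max,
        rating_pts / rating_max,
        holistic_level / holistic_max,
    )
    return round(100 * sum(w * p for w, p in zip(weights, parts)), 1)

# Example: 8 of 10 checklist elements, 12 of 15 rating-scale points, holistic level 3 of 4.
print(combined_total(8, 10, 12, 15, 3, 4))  # prints 78.5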

See your text for more information:
- fig 8.9 (8th ed)
- fig 9.9 (9th ed)

Three sources of error may occur in a scoring system: the scoring instrument, the procedure, and the teacher.

Common flaws in scoring instruments include a lack of descriptive rigor and ambiguity, which can lead to unreliability.

Having too many grading criteria for a task or having too many students to rate can cause procedural flaws.

Teachers can also be a source of scoring error, and there are multiple types of teacher bias. Generosity error is when a teacher grades too leniently. Severity error is when a teacher grades too harshly. Central-tendency error is when a teacher grades all students about the same. The halo effect in grading is when the teacher's attitude toward a student influences the score the student receives.

Step 4 of developing a performance assessment is specifying constraints.

Outside the classroom, professionals have constraints on their performance, such as deadlines, limited office space, and outmoded equipment. In the same way, teachers need to decide which conditions to impose on a performance task. Among the most typical test constraints are:

• Time: How much time are students allowed to prepare, rethink, and finish a performance task?
• Reference material: Are students allowed to have reference materials?
• Other people: Are they allowed to consult with other people?
• Equipment: Can students use computers or calculators to help them solve problems?
• Prior knowledge of the task: How much information do students receive about what will be tested? Do they receive the information in advance?
• Scoring criteria: Do students know the standards (or criteria) for their performance task in advance?

See your text for more information:
- pg 178 (8th ed)
- pg 201 (9th ed)

To help decide what to do about these constraints, ask yourself the following questions:

What constraints authentically replicate the real-world situation?

What constraints bring out the best performance in novices?

What are authentic limits to place on the use of time, help from others, reference materials, and so on?

As with objective tests, to determine the quality of a performance assessment, validity and reliability need to be considered.

These are the two most critical criteria of test quality.

Validity is the extent to which the test actually measures what it's supposed to measure. To help ensure validity, teachers should go over all the elements in any performance assessment to look for possible problems with validity.

The following things should be considered:

Be attentive to issues of task-specificity and domain sampling.

Recognize potential subjectivity in raters.

Inform students of the performance criteria.

Avoid common errors, such as "failure to use the entire rating scale," "reliance on mental record-keeping," and "influence of prior perception of the student."

Reliability of a test refers to the consistency or stability of the test scores. Ideally, students should get the same score regardless of who the rater is.

Reliability of a performance assessment is more challenging to achieve than reliability of an objective test. One choice that teachers can make to help increase the reliability of a performance assessment is to use several performance tasks that are relatively small in scope, rather than only one large task.

Other ways to increase the reliability of assessments are as follows:

When possible, obtain multiple observations.

When possible, use multiple raters.

Use smaller tasks.

Be explicit about the assessment purpose and state the performance criteria and rating categories clearly.
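
As a final illustration (a hypothetical sketch rather than material from the module), using multiple raters can be as simple as averaging each student's scores across raters and spot-checking how far apart the raters are; the student names and scores below are invented.

from statistics import mean

scores_by_rater = {
    "rater_1": {"Ana": 16, "Ben": 12, "Chris": 18},
    "rater_2": {"Ana": 14, "Ben": 13, "Chris": 17},
}

def averaged_scores(scores):
    # Average each student's score across all raters.
    students = next(iter(scores.values())).keys()
    return {s: mean(r[s] for r in scores.values()) for s in students}

def mean_rater_gap(scores):
    # Average absolute difference between two raters: a crude agreement check,
    # not a formal reliability coefficient.
    r1, r2 = scores.values()
    return mean(abs(r1[s] - r2[s]) for s in r1)

print(averaged_scores(scores_by_rater))           # {'Ana': 15, 'Ben': 12.5, 'Chris': 17.5}
print(round(mean_rater_gap(scores_by_rater), 2))  # 1.33, i.e. raters differ by about one point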

