
Assessment Concepts

The following section discusses key assessment concepts. Related concepts have been grouped together. Examples are provided to illustrate some of the concepts, but they are by no means representative of all applications of each concept.
Assessment is the process of gathering and analysing evidence about student learning in order to make decisions and enhance learning. It informs decisions about students, curricula, programmes and educational policy, with a view to providing information to students, teachers, schools, parents, other stakeholders and the education system [1]. Assessment is also meant to enhance learning, through activities undertaken by teachers, when the information serves as feedback to modify instructional practice. For example, it answers important questions such as whether students have learnt a concept or topic and how well they have mastered it.
Evaluation is the process of making a value judgment about the worth of a student's product or performance. Evaluation is the basis for decisions about which course of action should be taken.

Purpose of Assessment
Assessment for Learning (AfL) is assessment that supports teaching and learning [1]. For instance, teachers may identify gaps in student learning and provide quality feedback on how students can improve their work. AfL is used to redirect learning in ways that help learners master learning goals, and is primarily used to ensure that students achieve the intended learning outcomes. For these reasons, it is formative in nature and central to classroom instruction.
Languages: For effective language learning, teachers regularly obtain feedback on
learning and teaching and use the feedback to shape decisions on what to teach in
the next lesson or even in the next moment in the lesson. For example, teachers use
questions to elicit information on what pupils understand and can do, and to
encourage dialogue. Some modes of assessment that teachers employ in assessing
for learning are informal quizzes, portfolios and learning logs.
Assessment of Learning (AoL) ascertains what students have learnt and serves as a means to judge whether curricular outcomes have been met. It is used primarily for accountability purposes (grading, ranking and certification) to record and report what has been learned. For these reasons, it tends to be summative in nature and is usually carried out at the end of a unit, semester or year.
Formative Assessment is carried out during the instructional process to provide feedback for adjusting ongoing teaching and learning, so as to improve students' achievement of the intended instructional outcomes. It may involve informal methods such as observation and oral questioning, or the formative use of more formal measures such as traditional quizzes, portfolios, or performance assessments. Formative assessment entails observations which allow one to determine the degree to which students know or are able to do a given learning task, and which identify the parts of the task that the student does not know or is unable to do.
Summative Assessment is carried out at the end of an instructional unit or course of study for the purpose of giving information on students' mastery of content, knowledge and skills [2], assigning grades or certifying student proficiency [3]. It is designed primarily to serve the purpose of accountability, or of ranking, or of certifying competence or achievement at a particular point. However, it is possible to use summative assessment in a formative way. For example, teachers may analyse students' performance in a class test and provide clear, detailed and descriptive feedback to help students bridge the gap.

CONFIDENTIAL

Assessment Modes
School-based Assessment is assessment that is designed, conducted and graded by
schools. In the Singapore context, school-based assessment can also include assessment
that is managed but not designed by schools, and the results contribute to a component in
the national examination, e.g. Elements of Business Skills coursework, Project Work at A
Levels.
National Assessment is designed, conducted and graded at a national level, usually by an
examination authority. In the Singapore context, examinations like the Primary School
Leaving Examination (PSLE) and GCE O Levels are national assessments.
Alternative Assessment refers to meaningful assessment options that are not traditional, i.e. tasks other than pen-and-paper tests, standardised achievement tests and multiple-choice (true-false, matching, completion) item formats [1]. Modes of assessment can include
practical tasks such as Science Practical Assessment (SPA), project work such as Project
Work at A Levels, performance tasks such as drama, debates or oral presentations, etc.
Science: In an inquiry-based science classroom, assessment can take the form of
practicals, projects, model-making, journals, debates, drama and learning trails which
allow students to demonstrate their conceptual attainment and skill acquisition in
various ways.
Performance Assessment requires students to perform product- or behaviour-based tasks that directly reflect the range of knowledge and skills they have learnt. It is based on settings designed to emulate real-life contexts or conditions in which specific knowledge or skills are applied [4].
Geography: Working in groups, students conduct geographical investigations in the field to apply and extend what they have learnt in class. In the process they appreciate the real-world applications of geographical knowledge and skills and acquire 21st century competencies.
Music: Practical performances are an integral part of music lessons as they provide teachers with information on their students' learning. Teachers also use rubrics to provide students with information on their strengths and areas for improvement.
Peer Assessment is where students are involved in the assessment of the work of other students. Students must have a clear understanding of what they are to look for in their peers' work [5].
Self-Assessment is where students are involved in the assessment of their own work.
Students can use a checklist or rubric to assess their own work and in some situations,
students can decide on the criteria for assessment. Students must have a clear
understanding of what they are to look for in their own work.


Example of a self-assessment checklist

Sample Self-assessment Criteria on Writing
(Rate each statement: Almost never / Some of the time / Most of the time / All the time)

Spelling
1. I spell accurately.
2. I check spelling accuracy, using print and non-print resources.

Generation and Selection of Ideas for Writing and Representing
3. I identify the purpose, audience and context of the writing task.
4. I know how to generate and gather ideas for the writing task.
5. My ideas are relevant to the topic.

Elements of Grading
Criterion-referenced Assessment has set criteria to be achieved and describes assessed performance in terms of the kinds of tasks a person can do to achieve a certain standard [1]. A criterion-referenced test is a test in which the results obtained can be used to determine a student's progress toward mastery of a content area.
Norm-referenced Assessment describes assessed performance in terms of a person's position in a reference group that has been given the assessment task [1]. A norm-referenced test is a test in which a student's performance is compared to that of a norm group. The results are relative to the performance of an external group, with the norm group providing a performance standard. Norm-referenced assessment is often used to measure and compare students, schools, districts, and states on the basis of norm-established scales of performance.
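As an illustration of how a norm-referenced interpretation works, the sketch below reports a score as a percentile rank within a norm group. The function name and the norm data are hypothetical, not taken from this document.

```python
# Hypothetical sketch: a norm-referenced interpretation reports where a
# score falls relative to a norm group (all data here is invented).
def percentile_rank(score, norm_group_scores):
    # Percentage of the norm group scoring strictly below this score.
    below = sum(1 for s in norm_group_scores if s < score)
    return 100 * below / len(norm_group_scores)

norm_group = [45, 50, 55, 60, 65, 70, 75, 80, 85, 90]
print(percentile_rank(72, norm_group))  # 60.0: better than 60% of the norm group
```

The same raw score of 72 would yield a different percentile rank against a different norm group, which is exactly what makes the interpretation relative rather than criterion-based.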
Rubric refers to the description of established criteria and standards in relation to each
other. A scoring rubric usually refers to the descriptive scoring schemes developed to guide
the analysis of quality and is commonly used in the evaluation of open-ended written
answers or performance tasks. The judgment of the quality of the answer or performance
depends on the pre-defined criteria in the scoring scheme.
History and Social Studies: The Level of Response Marking (LORM) is an example of a
scoring rubric with each level describing the characteristics of a response that would receive
the respective score.
Analytic scoring is a method of scoring in which each criterion of performance is judged and scored separately before the resultant values are combined for an overall score [4]. It is used when the criteria for judging the performance can be considered separately.


Example of analytic scoring rubric for use of grammar and vocabulary in tasks:

Vocabulary
- Level 0: Errors in choice of words impede meaning.
- Level 1 (Below Expectations): Some errors in use of simple words and inappropriate to convey intended meaning. Frequent errors in use of idioms.
- Level 2 (Approaching Expectations): Accurate and appropriate use of simple words but repetitive and does not fully convey intended meaning. Some inaccuracies and inappropriate use of idioms.
- Level 3 (Meeting Expectations): Generally accurate, varied and appropriate use of words. Some inaccurate and inappropriate use of ambitious words and idioms.
- Level 4 (Exceeding Expectations): Accurate, varied and appropriate use of words across the full range. Accurate and appropriate use of idioms.

Grammatical items (e.g. tenses, subject-verb agreement)
- Level 0: There is little coherent or understandable sense due to density of linguistic error.
- Level 1 (Below Expectations): Frequent errors of various kinds that affect understanding though meaning is clear in some parts.
- Level 2 (Approaching Expectations): Accurate use of items in simple structures with frequent errors arising from attempts to use a combination of items.
- Level 3 (Meeting Expectations): Generally accurate with some errors arising from attempts to use a combination of items in complex structures.
- Level 4 (Exceeding Expectations): Accurate apart from very occasional slips when used in complex structures.
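The defining feature of analytic scoring, judging each criterion separately and then combining the values into an overall score, can be sketched as follows. The criterion names, levels and equal weighting are illustrative assumptions, not part of any rubric in this document.

```python
# Illustrative sketch: analytic scoring judges each criterion separately,
# then combines the resultant values for an overall score.
def overall_score(criterion_scores, weights=None):
    # Equal weighting is assumed unless weights are supplied.
    if weights is None:
        weights = {c: 1 for c in criterion_scores}
    return sum(score * weights[c] for c, score in criterion_scores.items())

# Hypothetical levels awarded to one piece of work, one per criterion.
scores = {"Vocabulary": 3, "Grammatical items": 4}
print(overall_score(scores))  # 7
```

Because each criterion is scored independently, the per-criterion values can also be reported back to the student as profile feedback before being combined.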

Holistic scoring is a method of obtaining a score on a test or test item based on a judgment of overall performance using specified criteria [4]. In contrast to analytic scoring, the criteria overlap, and hence it is not possible to separate the evaluation into independent factors.
Example of holistic scoring rubric for oral presentation:

Level 3: Excellent
- Fluent and confident in delivery
- Displays an in-depth knowledge of the presentation topic
- Provides well-elaborated responses to questions posed

Level 2: Good
- Generally fluent, with some pauses in delivery
- Displays a good knowledge of the presentation topic
- Provides responses to questions posed

Level 1: Average/Needs improvement
- Halting pace, with many pauses in delivery
- Displays gaps in knowledge of the presentation topic
- Provides short responses to questions posed

Accuracy in Assessment
Fairness is the principle that every test taker should be assessed in an equitable way [4]. Some questions to ask when evaluating fairness include:

- Are the learning targets and assessment criteria made clear to all students?
- Are there sources of bias in the test questions which could distort the results for certain subgroups of the population taking the test?
- Is the test administered under standard conditions such that no student is unfairly advantaged or disadvantaged?

Validity is the degree to which a test measures what it claims to measure. A test is valid to the extent that inferences made from it are appropriate, meaningful, and useful. Some questions to ask when evaluating validity include:

- Do the assessment items or tasks represent the learning outcomes? (e.g. in a Mathematics test, students should not be assessed on their English language proficiency)
- Is the test a representative sampling of the construct to be assessed?

Reliability is the degree to which test scores for a group of test takers are consistent over repeated applications of a measurement procedure, and hence can be inferred to be dependable and repeatable for an individual test taker; or the degree to which scores are free of errors of measurement for a given group [4]. Reliability describes the accuracy of the measurement resulting from an assessment, and how likely it is that the same result would be produced in slightly different circumstances. An assessment is reliable if a student gains the same result even if he/she repeats the assessment on different occasions, or if the assessment is graded by different markers.
Inter-rater agreement/reliability is the consistency with which two or more assessors rate the work or performance of test takers [4].
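One simple way to quantify inter-rater agreement is the percentage of pieces of work on which two raters award the same score, sketched below. This is only a basic measure (chance-corrected statistics such as Cohen's kappa are also widely used), and the ratings shown are invented for illustration.

```python
# Basic inter-rater agreement sketch: the percentage of cases where
# two raters award the same score (ratings below are invented).
def percent_agreement(rater_a, rater_b):
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return 100 * matches / len(rater_a)

rater_a = [3, 2, 4, 3]  # scores given by the first rater
rater_b = [3, 2, 3, 3]  # scores given by the second rater
print(percent_agreement(rater_a, rater_b))  # 75.0
```

Low agreement usually signals that the rubric descriptors are ambiguous or that raters need standardisation training before marking proceeds.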
Discrimination Index (D.I.) [6] of a test item refers to the degree to which the item differentiates between students with high and low achievement. The index is obtained by subtracting the number of students in the lower group (bottom quartile) who get the item right (RL) from the number of students in the upper group (top quartile) who get the item right (RU), and dividing by one half the total number of students (T) included in the item analysis:

Discrimination Index = (RU - RL) / (0.5 T)


The higher the D.I., the better the test item discriminates between students with high and low achievement.
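The formula can be sketched directly in code. The function name and the numbers are illustrative, and the example assumes that T counts only the students included in the item analysis, i.e. the upper and lower quartile groups combined.

```python
# Sketch of the Discrimination Index: D.I. = (RU - RL) / (0.5 T).
# Assumption: T is the total number of students included in the item
# analysis, i.e. the upper and lower quartile groups taken together.
def discrimination_index(ru, rl, t):
    return (ru - rl) / (0.5 * t)

# Hypothetical item analysis: 10 students in each quartile group (T = 20);
# 9 of the upper group and 3 of the lower group answered correctly.
print(discrimination_index(ru=9, rl=3, t=20))  # 0.6
```

An item answered correctly by more low achievers than high achievers would yield a negative D.I., flagging it for review.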
Facility Index (or Item Difficulty Index) (F.I.) [6] of a test item refers to the percentage of students who get the item right. It is a measure of the level of difficulty of the item. The facility index can be computed by means of the following formula, in which R equals the number of students who got the item right and T equals the total number of students who attempted the item:

Facility Index = 100R / T

The higher the F.I., the lower the difficulty of the test item.
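The Facility Index computation is a direct translation of the formula above; the function name and example counts are illustrative.

```python
# Sketch of the Facility Index: F.I. = 100R / T.
def facility_index(r, t):
    # r: students who got the item right; t: students who attempted it.
    return 100 * r / t

# Hypothetical item: 30 of 40 students answered correctly.
print(facility_index(r=30, t=40))  # 75.0, a relatively easy item
```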
Test specifications are a detailed description of a test, often called a test blueprint, which specifies the number or proportion of items that assess each content and process/skill area; the format of items, responses, and scoring rubrics and procedures; and the desired psychometric properties of the items and test, such as the distribution of facility and discrimination indices [4]. A test specification is sometimes referred to as a table of specifications (TOS).
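A minimal table of specifications can be sketched as a mapping from content areas to item counts per skill area, which also makes it easy to check the blueprint against the intended test length. The subjects, skill labels and counts below are invented for illustration.

```python
# Hypothetical table of specifications (TOS): item counts per content
# area and skill area (all names and numbers below are invented).
blueprint = {
    "Number and Algebra": {"knowledge": 10, "application": 8},
    "Geometry": {"knowledge": 6, "application": 6},
}

# Check that the blueprint adds up to the intended test length.
total_items = sum(sum(skills.values()) for skills in blueprint.values())
print(total_items)  # 30
```

In practice a fuller blueprint would also record item formats and target facility/discrimination ranges per cell, per the definition above.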
[1] Adapted from Nitko, A. J. (2001). Educational assessment of students (3rd ed.). New Jersey: Prentice-Hall.
[2] Adapted from Glossary of useful terms. (2008). Retrieved January 16, 2013, from http://www.sabes.org/assessment/glossary.htm
[3] Adapted from Shepard, L. A. (2006). Classroom assessment. In R. L. Brennan (Ed.), Educational measurement (4th ed.) (pp. 623-646). Westport: Praeger.
[4] Adapted from Standards for Educational and Psychological Testing (AERA, APA, & NCME), 1999.
[5] Adapted from Types of assessment. (2013). Retrieved January 16, 2013, from http://www.qub.ac.uk/directorates/AcademicStudentAffairs/CenterforEducationalDevelopment/AssessmentFeedback/AssessmentTypes
[6] Adapted from Linn, R. L., & Miller, M. D. (2005). Measurement and assessment in teaching (9th ed.). New Jersey: Merrill Prentice Hall.

