
APPROPRIATENESS AND ALIGNMENT OF ASSESSMENT METHODS TO LEARNING OUTCOMES
OVERVIEW
 What principles govern assessment of learning?
Chappuis, Chappuis & Stiggins (2009) delineated
five standards of quality assessment to inform
sound instructional decisions: [1] clear purpose; [2]
clear learning targets; [3] sound assessment design;
[4] effective communication of results; and [5] student
involvement in the assessment process.
 Classroom assessment begins with the question "Why
are you assessing?" The answer to this question
gives the purpose of assessment, which was discussed
in Section 1.
 The next question is "What do you want to assess?"
This pertains to the student learning outcomes: what
the teacher would like students to know and be able
to do at the end of a section or unit. Once targets or
outcomes are defined, the question becomes "How are
you going to assess?" This refers to the assessment
methods and tools that can measure the learning
outcomes.
IDENTIFYING LEARNING OUTCOME
 A learning outcome pertains to a particular level
of knowledge, skills, and values that a student has
acquired at the end of a unit or period of study as
a result of his/her engagement in a set of
appropriate and meaningful learning experiences.
An organized set of learning outcomes helps
teachers plan and deliver appropriate instruction
and design valid assessment tasks and strategies.
 Anderson et al. (2005) listed five steps in
student outcome assessment: [1] create learning
outcome statements; [2] design teaching/assessment to
achieve these outcome statements; [3] implement
teaching/assessment activities; [4] analyze data at
aggregate levels; and [5] reassess the process.
TAXONOMY OF LEARNING DOMAINS
 Learning outcomes are statements of performance
expectations in three broad domains of learning:
cognitive, affective, and psychomotor. Each domain is
characterized by change in a learner's behavior.
Within each domain are levels of expertise that drive
assessment, listed in order of increasing complexity.
Higher levels require more sophisticated methods of
assessment, but they facilitate retention and
transfer of learning (Anderson et al., 2005).
Importantly, all learning outcomes must be
capable of being assessed and measured.
A. COGNITIVE (KNOWLEDGE-BASED)
 Table 3.1 shows the levels of cognitive learning
originally devised by Bloom, Engelhart, Furst, Hill &
Krathwohl in 1956 and revised by Anderson,
Krathwohl et al. in 2001 to produce a two-dimensional
framework of knowledge and cognitive processes and
to account for twenty-first-century needs by including
metacognition. It is designed to help teachers
understand and implement a standards-based
curriculum. The cognitive domain involves the
development of knowledge and intellectual skills. It
answers the question "What do I want learners to
know?" The first three levels are lower-order, while
the next three levels promote higher-order thinking.
Krathwohl (2002) stressed that the revised Bloom's
taxonomy table is used not only to classify the
instructional and learning activities employed to achieve
the objectives, but also for the assessments used to
determine how well learners have attained and
mastered the objectives.
Marzano and Kendall (2007) came up with their
own taxonomy composed of three systems (the self
system, the metacognitive system, and the
cognitive system) and the knowledge domain.
The cognitive system has four levels: Knowledge
Retrieval, Comprehension, Analysis, and Knowledge
Utilization.
The Knowledge Retrieval level is the same as the
Remembering level in the revised Bloom's
taxonomy. Comprehension entails synthesis and
representation: relevant information is taken and
then organized into categories. Analysis involves the
processes of matching, classifying, error analysis,
generalizing, and specifying. The last level,
Knowledge Utilization, comprises decision making,
problem solving, experimental inquiry, and
investigation: processes essential in problem-based
and project-based learning.
TABLE 3.1 COGNITIVE LEVELS AND PROCESSES
(ANDERSON ET AL., 2001)

Level: Remembering – retrieving relevant knowledge from long-term memory.
Processes and action verbs: recognizing, recalling; define, describe, identify, label, list, match, name, outline, reproduce, select, state.
Sample competency: Define the four levels of mental processes in Marzano and Kendall's cognitive system.

Level: Understanding – constructing meaning from instructional messages, including oral, written, and graphic communication.
Processes and action verbs: interpreting, exemplifying, classifying, summarizing, inferring, comparing, explaining; explain, paraphrase, rewrite, summarize.
Sample competency: Explain the purpose of Marzano and Kendall's new taxonomy of educational objectives.

Level: Analyzing – breaking material into its constituent parts and determining how the parts relate to one another and to an overall structure or purpose.
Processes and action verbs: differentiating, organizing, attributing; analyze, arrange, associate, compare, contrast, infer, organize, solve, support (a thesis).
Sample competency: Compare and contrast the thinking levels of the revised Bloom's taxonomy and Marzano & Kendall's Cognitive System.

Level: Evaluating – making judgments based on criteria and standards.
Processes and action verbs: executing, monitoring, generating; appraise, compare, conclude, contrast, criticize, evaluate, judge, justify, support (a judgment), verify.
Sample competency: Judge the effectiveness of writing learning outcomes using Marzano and Kendall's taxonomy.

Level: Creating – putting elements together to form a coherent or functional whole; reorganizing elements into a new pattern or structure.
Processes and action verbs: planning, producing; classify (infer the classification system), construct, create, extend, formulate, generate, synthesize.
Sample competency: Design a classification scheme for writing learning outcomes using the levels of the cognitive system developed by Marzano & Kendall.
B. PSYCHOMOTOR (SKILLS-BASED)
The psychomotor domain focuses on physical and
mechanical skills involving coordination of the brain and
muscular activity. It answers the question "What actions do
I want learners to be able to perform?"
Dave (1970) identified five levels of behavior in the
psychomotor domain: Imitation, Manipulation, Precision,
Articulation, and Naturalization. In his taxonomy, Simpson
(1972) laid down seven progressive levels: Perception, Set,
Guided Response, Mechanism, Complex Overt Response,
Adaptation, and Origination.
Meanwhile, Harrow (1972) developed her own taxonomy
with six categories organized according to degree of
coordination: reflex movements, basic fundamental
movements, perceptual abilities, physical abilities, skilled
movements, and non-discursive communication.
TAXONOMY OF THE PSYCHOMOTOR DOMAIN

Level: Observing – actively attending mentally to a physical event.
Action verbs describing learning outcomes: describe, detect, distinguish, differentiate, relate, select.
Sample competency: Relate music to a particular dance.

Level: Imitating – attempted copying of a physical behavior.
Action verbs: begin, display, explain, move, proceed, react, show, state, volunteer.
Sample competency: Demonstrate a simple dance step.

Level: Practicing – trying a specific physical activity over and over.
Action verbs: bend, calibrate, construct, differentiate, dismantle, fasten, grasp, grind, handle, measure, mix, organize, operate, manipulate, mend.
Sample competency: Display several dance steps in sequence.

Level: Adapting – fine-tuning; making minor adjustments in the physical activity in order to perfect it.
Action verbs: arrange, combine, compose, construct, create, design, originate, rearrange, reorganize.
Sample competency: Perform a dance showing new combinations of steps.
C. AFFECTIVE (VALUES, ATTITUDES, AND INTERESTS)

The affective domain emphasizes emotional knowledge. It
tackles the question, "What actions do I want learners to
think or care about?"

Table 3.3 presents the classification scheme for the
affective domain developed by Krathwohl, Bloom, and
Masia in 1964. The affective domain includes factors such
as student motivation, attitudes, appreciation, and values.

TABLE 3.3 TAXONOMY OF THE AFFECTIVE DOMAIN
(KRATHWOHL ET AL., 1964)
 Receiving – being aware of or sensitive to the
existence of certain ideas, materials, or phenomena
and being willing to tolerate them.
Ex.: to differentiate, to accept, to listen (for), to respond to.
 Responding – being committed in some small measure
to the ideas, materials, or phenomena involved by
actively responding to them.
Ex.: to comply with, to follow, to commend, to volunteer,
to spend leisure time in, to acclaim.
 Valuing – showing some definite involvement or
commitment.
Ex.: attends optional matches.
 Organizing – integrating a new value into one's
general set of values, giving it some ranking among
one's general priorities.
Ex.: arranges his/her own volleyball practice.
 Internalizing values (characterization by a value or
value complex) – acting consistently with the new value.
Ex.: joins others to play volleyball twice a week.
TYPES OF ASSESSMENT

Assessment methods can be categorized according to the
nature and characteristics of each method. McMillan
(2007) identified four major categories: selected-response,
constructed-response, teacher observation,
and student self-assessment. These are like a
carpenter's tools: you need to choose which is apt for
a given task. It is not wise to stick to one method of
assessment. As the saying goes, "If the only tool you
have is a hammer, you tend to see every problem as a nail."
1. SELECTED-RESPONSE FORMAT

In a selected-response format, students select from a
given set of options to answer a question or a problem.
Because there is only one correct or best answer,
selected-response items are objective and efficient to score.

Teachers commonly assess students using questions and
items that are multiple-choice, alternate-response
(true or false), matching-type, and interpretive.

A multiple-choice question consists of a stem (a question
or incomplete statement) with four or five options (the
correct answer and several distracters). Matching-type
items consist of a column of descriptions (words, phrases,
or images) and a column of responses.
 Students review each description and match it with
a word, phrase, or image from the list of
responses. Alternate-response (true/false)
questions are binary-choice items. The reliability
of true/false items is generally not high because of
the possibility of guessing.
2. CONSTRUCTED-RESPONSE FORMAT

In a selected-response type, students need only to
recognize and select the correct answer.

A constructed-response (subjective) format demands
that students create or produce their own answer in
response to a question, problem, or task. Items of this
type may fall under any of the following categories:
brief-constructed-response items, performance tasks,
essay items, or oral questioning.
BRIEF-CONSTRUCTED-RESPONSE ITEMS

Brief-constructed-response items require only a short
response from students. Examples include sentence
completion, where students fill in a blank at the end of a
statement; short answers to open-ended questions;
labelling a diagram; or answering a mathematics problem
by showing the solution.
PERFORMANCE ASSESSMENTS

Performance assessments require students to perform a
task rather than select from a given set of options. Unlike
brief-constructed-response items, students have to come
up with an extensive and elaborate answer or response.
Performance tasks are also called authentic or
alternative assessments.

Essay assessment involves answering a question or
proposition in written form.

Oral questioning is a common assessment method
during instruction to check on student understanding.
3. TEACHER OBSERVATION

Teacher observation is a form of on-going
assessment, usually done in combination with oral
questioning. Teachers regularly observe students to
check on their understanding. By watching how
students respond to oral questions and behave during
individual and collaborative activities, the teacher can
gather information on whether learning is taking place in
the classroom. Non-verbal cues also communicate how
learners are doing.
4. STUDENT SELF-ASSESSMENT

Self-assessment is one of the standards of quality
assessment identified by Chappuis, Chappuis & Stiggins
(2009). It is a process in which students are given a
chance to reflect on and rate their own work and judge
how well they have performed in relation to a set of
assessment criteria. Students track and evaluate their
own progress or performance. Self-assessment
monitoring techniques include activity checklists,
diaries, and self-report inventories. The latter are
questionnaires or surveys that students fill out to
reveal their attitudes and beliefs about themselves and
others.
MATCHING LEARNING TARGETS
WITH ASSESSMENT METHODS

In an outcome-based approach, the teaching methods and
resources used to support learning, as well as the
assessment tasks and rubrics, are explicitly linked to the
program and course learning outcomes. Biggs and
Tang (2007) call this constructive alignment.
Constructive alignment provides the "how-to" by
verifying that the teaching-learning activities (TLAs)
and the assessment tasks (ATs) activate the same verbs
as the intended learning outcomes (ILOs). Using the
action verbs devised by Anderson, Krathwohl et al. (2001)
can increase the alignment of learning outcomes and
instruction (Airasian & Miranda, 2002).
A learning target is defined as a description of a
performance that includes what learners should know
and be able to do. It contains the criteria used to judge
student performance and is derived from national and
local standards. This definition is similar to that of a
learning outcome.
LEARNING TARGETS AND ASSESSMENT METHODS (MCMILLAN, 2007)

Ratings indicate how well each assessment method measures each
learning target (higher numbers denote a better match).

Knowledge and simple understanding: selected-response and
brief-constructed response (5); essay (4); performance task (3);
oral questioning (4); observation (3); student self-assessment (3)

Deep understanding and reasoning: selected-response and
brief-constructed response (2); essay (5); performance task (4);
oral questioning (4); observation (2); student self-assessment (3)

Skills: selected-response and brief-constructed response (1);
essay (3); performance task (5); oral questioning (2);
observation (5); student self-assessment (3)
Knowledge and simple understanding pertains
to mastery of substantive subject matter and
procedures. In the revised Bloom's taxonomy, this
covers the lower-order thinking skills of
remembering, understanding, and applying.
Selected-response and constructed-response items
are best for assessing low-level learning targets in
terms of coverage and efficiency.
Reasoning is the mental manipulation and use of
knowledge in critical and creative ways. Deep
understanding and reasoning involve the higher-order
thinking skills of analyzing, evaluating, and creating.
 To assess skills, performance assessment is
obviously the superior assessment method.
 As mentioned, products are most adequately
assessed through performance tasks.
 Student affect cannot be assessed simply by
selected-response or brief-constructed-response
tests. Affect pertains to the attitudes, interests, and
values students manifest. The best method for
this learning target is self-assessment, most
commonly in the form of student responses to
self-report affective inventories using rating scales.
In the study conducted by Stiggins & Popham
(2009), two affective variables are influenced
by teachers who employ assessment formatively in
their classes: academic efficacy (perceived ability
to succeed and a sense of control over one's academic
well-being) and eagerness to learn.
GUIDE FOR ASSESSING LEARNING OUTCOMES FOR GRADE 1

What to assess: Content of the curriculum – facts and information that learners acquire.
How to assess (suggested assessment tools/strategies): 1. Quizzes (multiple choice, true or false, matching type, constructed response); 2. Oral participation; 3. Periodical test.
How to score: Raw score (quizzes and periodical test); rubrics (oral participation).
How to utilize results: To identify individual learners with specific needs for academic interventions and individual instruction.

What to assess: Cognitive operations that learners perform on facts and information for constructing meanings.
How to assess: 1. Quizzes – outlining, organizing, analyzing, interpreting, translating, converting, or expressing information in another format; constructing graphs, flowcharts, maps, or graphic organizers; transforming a textual presentation into a diagram; drawing or painting pictures; other outputs; 2. Oral participation.
How to score: Raw score (quizzes); rubrics (oral participation).
How to utilize results: To identify learners with similar needs for academic interventions and small-group instruction; to assess the effectiveness of teaching and learning strategies.

What to assess: Explanation, interpretation, application.
How to assess: 1. Quizzes – explain/justify something based on facts, data, phenomena, or evidence; tell/retell stories; make connections between what was learned and real-life situations; 2. Oral discourse/recitation; 3. Open-ended tests.
How to score: Raw scores (quizzes); rubrics (oral discourse/recitation and open-ended tests).
How to utilize results: To evaluate the instructional materials used; to design instructional materials.

What to assess: Learners' authentic tasks as evidence of understanding; multiple intelligences.
How to assess: Participation, projects, homework, experiments, portfolio.
How to score: Rubrics.
How to utilize results: To assess and improve classroom instruction; to design in-service training programs for teachers in the core subjects of the curriculum.
VALIDITY AND RELIABILITY
Overview:
It is not unusual for teachers to receive complaints or
comments from students regarding tests and other
assessments. For one, there may be an issue
concerning the coverage of the test. Students may
have been tested on areas that were not part of the
content domain, or they may not have been given the
opportunity to study or learn the material. The
emphasis of the test may also be too complex or
inconsistent with the performance verbs in the
learning outcomes.
Validity alone does not make a high-quality assessment;
the reliability of test results should also be checked.
 Questions about reliability surface if there are
inconsistencies in the results when tests are
administered over different time periods, samples
of questions, or groups.
 Both validity and reliability are considered when
gathering information or evidence about student
achievement.
 This chapter discusses the distinctions between the two.
Validity
Validity is a term derived from the Latin word
validus, meaning strong. In the context of assessment,
an assessment is deemed valid if it measures what it is
supposed to measure. In contrast to what some teachers
believe, validity is not a property of the test itself. It
pertains to the accuracy of the inferences teachers make
about students based on the information gathered from
assessment (McMillan, 2007; Fives & DiDonato-Barnes,
2013). This implies that the conclusions teachers come up
with in their evaluation of student performance are valid
only if there is strong and sound evidence of the extent of
student learning. Decisions also include those about
instruction and classroom climate (Russell & Airasian, 2012).
An assessment is valid if it measures a student's actual
knowledge and performance with respect to the
intended outcomes, and not something else. It should be
representative of the area of learning or content of
the curricular aim being assessed (McMillan, 2007;
Popham, 2011). For instance, an assessment
purportedly measuring the arithmetic skills of Grade 4
pupils is invalid if used for Grade 1 pupils because of
issues with content (test-content evidence) and level
of performance (response-process evidence). Likewise, a
test that measures only recall when higher-level
performance is intended has validity problems,
particularly with content-related evidence.
A. CONTENT-RELATED EVIDENCE
 Content-related evidence of validity pertains to the
extent to which the test covers the entire domain of
content. If a summative test covers a unit with four
topics, then the assessment should contain items from
each topic. This is done through adequate sampling of
the content. A student's performance on the test may be
used as an indicator of his/her content knowledge. For
instance, if a Grade 4 pupil was able to correctly
answer 80% of the items in a science test about
matter, the teacher may infer that the pupil knows
about 80% of that content area.
In the previous chapter, we talked about the appropriateness
of assessment methods to learning outcomes.
A test that appears to adequately measure the learning
outcomes and content is said to possess face validity. As
the name suggests, face validity looks at the superficial face
value of the instrument. It is based on the subjective opinion
of the one reviewing it; hence, it is considered non-systematic
or non-scientific. A test prepared to assess the ability of
pupils to construct simple sentences with correct subject-verb
agreement has face validity if the test looks like an
adequate measure of that skill.
Another consideration related to content validity is
instructional validity: the extent to which an assessment
is systematically sensitive to the nature of the instruction
offered. This is closely related to instructional sensitivity,
which Popham (2006, p. 1) defines as the degree to which
students' performance on a test accurately reflects the
quality of the instruction provided on what is being
assessed. Yoon & Resnick (1998) asserted that an
instructionally valid test is one that registers differences
in the amount and kind of instruction to which students
have been exposed. They also described the degree of
overlap between the content tested and the content taught
as opportunity to learn, which has an impact on test
scores. Consider the Grade 10 curriculum in Araling
Panlipunan (Social Studies). In the first grading period,
the class covers three economic issues: unemployment,
globalization, and sustainable development. Suppose only
two were discussed in class but the assessment covered all
three issues. Although these were all identified in the
curriculum guide and may even be found in the textbooks,
the question remains as to whether the topics were all
taught or not.
Inclusion of items that were not taken up in class reduces
instructional validity because students had no opportunity
to learn the knowledge or skill being assessed.
To improve the validity of assessments, it is
recommended that the teacher construct a two-dimensional
grid called a Table of Specifications (TOS). The TOS is
prepared before developing the test. It is a blueprint that
identifies the content areas and describes the learning
outcomes at each level of the cognitive domain (Notar et
al., 2004). It is a tool used in conjunction with lesson and
unit planning to help teachers make genuine connections
between planning, instruction, and assessment (Fives &
DiDonato-Barnes, 2013). It assures the teacher that the
test covers student learning across a wide range of content
and readings, as well as cognitive processes requiring
higher-order thinking. Table 4.1 is an example of an
adapted TOS using the learning competencies found in
the Math curriculum guide. It is a two-way table with the
learning objectives or content matter on the vertical axis
and the intellectual processes on the other.
TABLE 4.1 SAMPLE TABLE OF SPECIFICATIONS (NOTAR ET AL., 2004)

Course title: Math
Grade level: V
Periods test is being used: 2
Date of test: August 8, 2014
Subject matter digest: Number and number sense
Type of test: Power, speed, partially speeded (circle one)
Test time: 45 minutes
Test value: 100 points
Base number of test questions: 75
Constraints: test time

Each learning objective is listed with its cognitive level, instructional time, questions/points (Q/P), item type, and the distribution of items across the revised Bloom's taxonomy. Entries such as 6(1) indicate the number of questions and the points per question.

Objective 1 (apply): instructional time 95 min (16%); Q/P 11/16; item type: matching; Apply 6(1), Analyze 5(2); total 11/16.
Objective 2 (understand): instructional time 55 min (9%); Q/P 7/10; item type: multiple choice; Understand 5(2); total 5/10.
... (objectives 3 to 9 omitted)
Objective 10 (evaluate): instructional time 40 min (7%); Q/P 5/7; item type: essay; Evaluate 1(7); total 1/7.
Total: instructional time 600 min (100%); Q/P 75/100; Remember 11/12, Understand 23/31, Apply 16/34, Analyze 4/10, Evaluate 3/6, Create 1/7; total 58/100.
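The way Table 4.1 distributes items (objectives that take more instructional time receive more questions and points) can be sketched programmatically. The sketch below is illustrative only: the topic labels, minutes, and total item count are hypothetical and are not taken from the table or from Notar et al. (2004).

    # Illustrative sketch: allocate test items in proportion to instructional
    # time, the same logic a Table of Specifications follows. The topics,
    # minutes, and total item count are hypothetical, not values from Table 4.1.

    instructional_minutes = {
        "Place value": 120,
        "Addition and subtraction": 180,
        "Word problems": 100,
    }
    total_items = 40  # planned length of the test (hypothetical)

    total_minutes = sum(instructional_minutes.values())
    for topic, minutes in instructional_minutes.items():
        share = minutes / total_minutes        # fraction of instructional time
        items = round(share * total_items)     # items allotted to this topic
        print(f"{topic}: {share:.0%} of time -> about {items} items")

Because of rounding, the allocated items may differ slightly from the planned total, so the final counts in a real TOS are usually adjusted by hand.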
SIX ELEMENTS SPECIFIED IN TOS DEVELOPMENT

1. Balance among the goals selected for the examination
2. Balance among the levels of learning
3. The test format
4. The total number of items
5. The number of items for each goal and level of learning
6. The enabling skills to be selected from each goal framework
 The first three elements were discussed in the
previous chapter. As to the number of items, that
would depend on the duration of the test, which is
contingent on the academic level and attention
span of the students. A six-year-old Grade 1 pupil
is not expected to accomplish a one-hour test;
pupils that young do not have the tolerance to sit in
an examination that long. The number of items is also
determined by the purpose of the test or its
proposed uses. Is it a power test or a speed test?
Power tests are intended to measure the range of
a student's capacity in a particular area, as
opposed to speed tests, which are characterized by
time pressure.
Time requirements for certain assessment tasks (Notar et al., 2004)

Type of question                                     Time required to answer
Alternative response (true-false)                    20-30 seconds
Modified true or false                               30-45 seconds
Sentence completion (one-word fill-in)               40-60 seconds
Multiple choice with four responses (lower level)    40-60 seconds
Multiple choice (higher level)                       70-90 seconds
Matching type (5 items, 6 choices)                   2-4 minutes
Short answer                                         2-4 minutes
Multiple choice (with calculations)                  2-5 minutes
Word problem (simple arithmetic)                     5-10 minutes
Short essays                                         15-20 minutes
Data analysis/graphing                               15-25 minutes
Drawing models/labelling                             20-30 minutes
Extended essays                                      35-50 minutes
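A quick arithmetic check against the "test time" constraint in a TOS can be made by multiplying the planned number of items of each type by a typical answer time from the table above. The sketch below assumes a hypothetical item mix; the per-item minutes are rough midpoints of the ranges listed in the table.

    # Minimal sketch: estimate total testing time from a planned item mix.
    # The item mix is hypothetical; per-item minutes are rough midpoints of
    # the time ranges in the table above.

    minutes_per_item = {
        "true-false": 0.4,                     # ~20-30 seconds
        "multiple choice (lower level)": 0.8,  # ~40-60 seconds
        "short answer": 3.0,                   # ~2-4 minutes
        "short essay": 17.5,                   # ~15-20 minutes
    }
    planned_mix = {
        "true-false": 10,
        "multiple choice (lower level)": 20,
        "short answer": 5,
        "short essay": 1,
    }

    total = sum(minutes_per_item[item] * count for item, count in planned_mix.items())
    print(f"Estimated testing time: {total:.0f} minutes")
    # If the estimate exceeds the time allotted for the test (e.g., the
    # 45 minutes in Table 4.1), the item mix must be trimmed or the limit extended.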
B. CRITERION-RELATED EVIDENCE
 Criterion-related evidence of validity refers to the
degree to which test scores agree with an external
criterion. As such, it is related to external validity. It
examines the relationship between an assessment and
another measure of the same trait (McMillan, 2007).
 There are three types of criteria (Nitko &
Brookhart, 2011):
1. Achievement test scores;
2. Ratings, grades, and other numerical judgments
made by the teacher; and
3. Career data.
Criterion-related evidence is of two types:
concurrent validity and predictive validity.
Concurrent validity provides an estimate of a
student's current performance in relation to a
previously validated or established measure.
Predictive validity pertains to the power or
usefulness of test scores to predict future performance.
The relationship is usually expressed with the Pearson
correlation coefficient (r) or Spearman's rank-order
correlation coefficient; the square of the correlation (r²)
is called the coefficient of determination.
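To make the computation concrete, the sketch below correlates scores from a classroom test with scores on an external criterion measure and reports both r and r². The score lists are invented example data; only the statistic itself reflects the discussion above.

    # Illustrative sketch: Pearson correlation between a classroom test and an
    # external criterion measure, plus the coefficient of determination (r squared).
    # The score lists are invented example data.
    from statistics import mean

    test_scores      = [78, 85, 62, 90, 70, 88, 95, 60]   # classroom assessment
    criterion_scores = [75, 82, 65, 93, 68, 85, 97, 58]   # established external measure

    def pearson_r(x, y):
        mx, my = mean(x), mean(y)
        num = sum((a - mx) * (b - my) for a, b in zip(x, y))
        den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
        return num / den

    r = pearson_r(test_scores, criterion_scores)
    print(f"r = {r:.2f}, coefficient of determination r^2 = {r**2:.2f}")

The same computation applies to concurrent validity (both measures taken at about the same time) and to predictive validity (the criterion measured later).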
C. CONSTRUCT-RELATED EVIDENCE
 A construct is an individual characteristic that
explains some aspect of behavior (Miller, Linn &
Gronlund, 2009). Construct-related evidence of
validity is an assessment of the quality of the
instrument used. It measures the extent to which
the assessment is a meaningful measure of an
unobservable trait or characteristic (McMillan, 2007).
There are three types of construct-related evidence:
theoretical, logical, and statistical (McMillan, 2007).
 A good construct has a theoretical basis. This means
that the construct must be operationally defined
or explained explicitly to differentiate it from
other constructs.
Two methods of establishing construct validity:
 Convergent validity occurs when measures of
constructs that are expected to be related are in fact
observed to be related.
 Divergent (or discriminant) validity, on the other
hand, occurs when constructs that are expected to be
unrelated are in reality observed not to be related.
UNIFIED CONCEPT OF VALIDITY
 In 1989, Messick proposed a unified concept of
validity based on an expanded theory of construct
validity which addresses score meaning and social
values in test interpretation and test use.
 His concept of unified validity "integrates
considerations of content, criteria, and
consequences into a construct framework for the
empirical testing of rational hypotheses about
score meaning and theoretically relevant
relationships."
VALIDITY OF ASSESSMENT METHODS

 In the previous sections, the validity of traditional
assessments was discussed. What about other
assessment methods? The same validity considerations apply.

1. The selected performance should reflect a valued activity.
2. The completion of performance assessments should
provide a valuable learning experience.
3. The statement of goals and objectives should be
clearly aligned with the measurable outcomes of the
performance activity.
4. The task should not examine extraneous or
unintended variables.
5. Performance assessments should be fair and free from bias.
THREATS TO VALIDITY
Miller, Linn & Gronlund (2009) identified ten factors,
or defects in the construction of assessment tasks,
that would render assessment inferences inaccurate.
The first four factors apply to both traditional tests
and performance assessments; the remaining factors
concern brief-constructed-response and
selected-response items.
1. Unclear test directions
2. Complicated vocabulary and sentence structure
3. Ambiguous statements
4. Inadequate time limits
5. Inappropriate level of difficulty of test items
6. Poorly constructed test items
7. Inappropriate test items for the outcomes being measured
8. Tests that are too short
9. Improper arrangement of items
10. Identifiable patterns of answers
MCMILLAN (2009) LAID DOWN SUGGESTIONS FOR
ENHANCING VALIDITY. THESE ARE AS FOLLOWS:
 Compare scores taken before instruction to those
taken after instruction.
 Compare predicted consequences to actual consequences.
 Compare scores on similar, but different, traits.
 Provide adequate time to complete the assessment.
 Ensure appropriate vocabulary, sentence structure,
and item difficulty.
 Ask others to judge the clarity of what you are assessing.
 Check to see if different ways of assessing the
same thing give the same results.
 Sample a sufficient number of examples of what is
being assessed.
 Prepare a detailed table of specifications.
 Ask others to judge the match between the
assessment items and the objectives of the assessment.
 Compare groups known to differ on what is being assessed.
 Ask easy questions first.
 Use different methods to assess the same thing.
 Use assessments only for their intended purpose.

RELIABILITY
Reliability refers to reproducibility and consistency in
methods and criteria. An assessment is said to be
reliable if it produces the same results when given to
the same examinees on two occasions. It is important,
then, to stress that reliability pertains to the obtained
assessment results and not to the test or any other
instrument itself. Another point is that reliability is
unlikely to be 100 percent: a group of students retested
after a day or two will show some differences, and there
are environmental factors such as lighting and noise
that affect reliability. Student error and the physical
well-being of examinees also affect the consistency of
assessment results.
 For a test to be valid, it has to be reliable.
 Reliability is expressed as a correlation coefficient. A
high reliability coefficient denotes that if a similar test
were re-administered to the same group of students, the
results from the first and second testing would be comparable.
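As a concrete illustration of that point, the sketch below estimates a test-retest reliability coefficient by correlating two administrations of the same test to the same group. The score lists are invented, and the built-in correlation function requires Python 3.10 or later.

    # Minimal sketch: test-retest reliability as the correlation between two
    # administrations of the same test to the same group (invented scores).
    from statistics import correlation  # Pearson's r, available in Python 3.10+

    first_testing  = [12, 18, 15, 20, 9, 17, 14, 19]
    second_testing = [13, 17, 16, 19, 10, 18, 13, 20]

    reliability = correlation(first_testing, second_testing)
    print(f"Test-retest reliability estimate: {reliability:.2f}")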
Types of Reliability Evidence
 Internal consistency evidence assesses the consistency
of results across items within a test, whereas
 external consistency evidence is based on scorer or
rater consistency and on decision consistency.
Sources of Reliability Evidence
A. Stability – test-retest reliability