ASSESSMENT OF LEARNING
Mr. Angelo Unay
*BEED, PNU-Manila (Cum Laude)
*PGDE-Math & English NTU-NIE, Singapore
BASIC CONCEPTS
Test
An instrument designed to measure any quality, ability, skill, or knowledge.
Composed of test items covering the area it is designed to measure.
Measurement
A process of quantifying the degree to which someone/something possesses a given trait (i.e., a quality, characteristic, or feature).
A process by which traits, characteristics, and behaviors are differentiated.
BASIC CONCEPTS
Assessment
It is a prerequisite to evaluation. It
provides the information which enables
evaluation to take place.
BASIC CONCEPTS
Evaluation
A process of making judgments, based on assessment information, about the quality of what has been assessed. It may be graded or not graded.
PRINCIPLES OF HIGH QUALITY
ASSESSMENT
1. Clarity of Learning Targets
Clear and appropriate learning targets
include (1) what students know and can do
and (2) the criteria for judging student
performance.
4. Reliability
This refers to the degree of consistency
when several items in a test measure the
same thing, and stability when the same
measures are given across time.
5. Fairness
Fair assessment is unbiased and provides
students with opportunities to demonstrate what
they have learned.
6. Positive Consequences
The overall quality of assessment is enhanced
when it has a positive effect on student
motivation and study habits. For the teachers,
high-quality assessments lead to better
information and decision-making about students.
TAXONOMY OF EDUCATIONAL OBJECTIVES
COGNITIVE DOMAIN (Bloom, 1956)
COMPREHENSION
Ability to grasp the meaning of material
Shown by translating material from one form to another, by interpreting material, and by estimating future trends
APPLICATION
Ability to use learned material in new and
concrete situations
Application of rules, methods, concepts,
principles, laws, and theories
ANALYSIS
Ability to break down material into its
component parts so that its organizational
structure may be understood
Includes identification of parts, analysis of the
relationships between parts, and recognition
of the organizational principles involved
SYNTHESIS
Ability to put parts together to form a new
whole
Stresses creative behaviors, with major
emphasis on the formulation of new patterns
or structures
EVALUATION
Ability to judge the value of material for a
given purpose
Judgments are to be based on definite
criteria [internal (organization) or external
(relevance to the purpose)]
READING
(K = Knowledge, U = Comprehension/Understanding, Ap = Application, An = Analysis, S = Synthesis, E = Evaluation)
K: Knows vocabulary
U: Reads with comprehension
Ap: Reads to obtain information to solve a problem
An: Analyzes text and outlines arguments
S: Integrates the main ideas across two or more passages
E: Critiques the conclusions in a text and offers alternatives
MATHEMATICS
K: Knows the number system and basic operations
U: Understands math concepts and processes
Ap: Uses mathematics to solve problems
An: Shows how to solve multistep problems
S: Derives proofs
E: Critiques proofs in geometry
SCIENCE
K: Knows terms and facts
U: Understands scientific principles
Ap: Applies principles to new situations
An: Analyzes chemical reactions
S: Conducts and reports experiments
E: Critiques scientific reports
Question:
With SMART lesson objectives at the synthesis level in mind, which one does NOT belong to the group?
a. Formulate
b. Judge
c. Organize
d. Build
Question:
Which test item is in the highest level of
Bloom’s taxonomy of objectives?
Verbal vs. Non-Verbal (Language Mode)
Verbal: Words are used by students in attaching meaning to or in responding to test items.
Non-Verbal: Students do not use words in attaching meaning to or in responding to test items (e.g., graphs, numbers, 3-D objects).
MAIN POINTS
FOR TYPES OF TESTS
COMPARISON
Standardized vs. Informal (Construction)
Standardized: Constructed by a professional item writer. Covers a broad range of content in a subject area. Uses mainly multiple choice. Items written are screened and the best items are chosen for the final instrument.
Informal: Constructed by a classroom teacher. Covers a narrow range of content. Various types of items are used. Teacher picks or writes items as needed for the test.
Standardized vs. Informal (Scoring and Interpretation)
Standardized: Can be scored by machine. Interpretation of results is usually norm-referenced.
Informal: Scored manually by the teacher. Interpretation is usually criterion-referenced.
Individual vs. Group (Manner of Administration)
Individual: Mostly given orally or requires actual demonstration of skill. One-on-one situations, thus many opportunities for clinical observation. Chance to follow up an examinee's response in order to clarify or comprehend it more clearly.
Group: This is a paper-and-pen test. Loss of rapport, insight, and knowledge about each examinee. Same amount of time needed to gather information from one student.
Objective vs. Subjective (Effect of Biases)
Objective: Scorer's personal judgment does not affect the scoring. Worded so that only one answer is acceptable. Little or no disagreement on what is the correct answer.
Subjective: Affected by the scorer's personal opinions, biases, and judgments. Several answers are possible. Possible to disagree on what is the correct answer.
Power vs. Speed (Time Limit and Level of Difficulty)
Power: Consists of a series of items arranged in ascending order of difficulty. Measures a student's ability to answer more and more difficult items.
Speed: Consists of items approximately equal in difficulty. Measures a student's speed or rate and accuracy in responding.
Norm-Referenced vs. Criterion-Referenced (Interpretation)
Norm-Referenced: Result is interpreted by comparing one student's performance with other students' performance. Emphasizes discrimination among individuals in terms of level of learning.
Criterion-Referenced: Result is interpreted by comparing a student's performance against a predefined standard/criteria. Emphasizes description of what learning tasks individuals can and cannot perform.
Norm-Referenced vs. Criterion-Referenced
Norm-Referenced: Favors items of average difficulty and typically omits very easy and very hard items. Interpretation requires a clearly defined group.
Criterion-Referenced: Matches item difficulty to learning tasks, without altering item difficulty or omitting easy or hard items. Interpretation requires a clearly defined and delimited achievement domain.
Similarities Between NRTs and CRTs
1. Both require specification of the
achievement domain to be measured.
Essay Test
c. Restricted Response – limits the content of the
response by restricting the scope of the topic
d. Extended Response – allows the students to select any factual information that they think is pertinent and to organize their answers in accordance with their best judgment
Question:
Which assessment tool will be most
authentic?
a. Short Answer
b. Completion
c. Multiple Choice
d. Restricted-response essay
ALTERNATIVE ASSESSMENT
PERFORMANCE & AUTHENTIC ASSESSMENTS
Time-consuming to administer,
develop, and score
Inconsistencies in performance on
alternative skills
ALTERNATIVE ASSESSMENT
PORTFOLIO ASSESSMENT
CHARACTERISTICS:
1) Adaptable to individualized instructional goals
2) Focus on assessment of products
3) Identify students’ strengths rather than
weaknesses
4) Actively involve students in the evaluation
process
5) Communicate student achievement to others
6) Time-consuming
7) Need of a scoring plan to increase reliability
ALTERNATIVE ASSESSMENT
RUBRICS – scoring guides, consisting of specific
pre-established performance criteria, used in
evaluating student work on performance
assessments
Types:
1) Holistic Rubric – requires the teacher to score
the overall process or product as a whole,
without judging the component parts separately
2) Analytic Rubric – requires the teacher to score individual components of the product or performance first, and then sum the individual scores to obtain a total score
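The analytic-rubric procedure above (score each component, then sum the parts) can be sketched as follows; the criterion names and point values are hypothetical, not from the source.

```python
# Illustrative sketch of analytic-rubric scoring: each component of the
# performance is scored separately, then the parts are summed for the total.
# The criteria and points below are invented for the example.

def analytic_total(component_scores: dict) -> int:
    """Sum the individual component scores to obtain a total score."""
    return sum(component_scores.values())

essay_scores = {"content": 4, "organization": 3, "mechanics": 5}
print(analytic_total(essay_scores))  # -> 12
```

A holistic rubric, by contrast, would record a single overall judgment instead of summing parts.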
Types of NON-COGNITIVE TEST
1. Closed-Item or Forced-choice Instruments – ask for one specific answer
Ex:
Math is
easy __ __ __ __ __ __ __ difficult
important __ __ __ __ __ __ __ trivial
useful __ __ __ __ __ __ __ useless
3) Likert Scale – measures the degree of one’s
agreement or disagreement on positive or
negative statements about objects and people
Ex:
Use the scale below to rate how much you agree or
disagree about the following statements.
5 – Strongly Agree
4 – Agree
3 – Undecided
2 – Disagree
1 – Strongly Disagree
1. Science is interesting.
2. Doing science experiments is a waste of time.
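As an illustration of scoring the two items above, here is a minimal sketch; it assumes the common practice of reverse-scoring negatively worded statements (such as item 2) so that a higher total always reflects a more positive attitude. Variable names are invented for the example.

```python
# Hypothetical scoring of the two Likert items above on the 5-point scale
# (5 = Strongly Agree ... 1 = Strongly Disagree). The negative statement
# "Doing science experiments is a waste of time" is reverse-scored.

def reverse(score: int) -> int:
    """Flip a 1-5 rating: 5 becomes 1, 4 becomes 2, and so on."""
    return 6 - score

responses = {"science_interesting": 4,        # positive statement
             "experiments_waste_of_time": 2}  # negative statement

total = responses["science_interesting"] + reverse(responses["experiments_waste_of_time"])
print(total)  # 4 + (6 - 2) = 8
```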
c. Alternative Response – measures students' preferences, hobbies, attitudes, feelings, beliefs, interests, etc. by choosing between two possible responses
Ex:
T F 1. Reading is the best way of spending leisure time.
a. Observation
b. Non-restricted essay test
c. Short answer test
d. Essay test
GENERAL SUGGESTIONS
IN WRITING TESTS
1. Use your test specifications as a guide to item writing.
2. Write more test items than needed.
3. Write the test items well in advance of the
testing date.
4. Write each test item so that the task to be
performed is clearly defined.
5. Write each test item at an appropriate reading level.
6. Write each test item so that it does not
provide help in answering other items in
the test.
7. Write each test item so that the answer is
one that would be agreed upon by experts.
8. Write each test item at the proper level of difficulty.
9. Whenever a test is revised, recheck its
relevance.
SPECIFIC SUGGESTIONS
Supply Type
1. Word the item/s so that the required
answer is both brief and specific.
2. Do not take statements directly from
textbooks to use as a basis for short
answer items.
3. A direct question is generally more
desirable than an incomplete statement.
4. If the item is to be expressed in numerical
units, indicate the type of answer needed.
5. Blanks should be equal in length.
a. Five
b. Three
c. Any
d. Four
SPECIFIC SUGGESTIONS
Essay Type
1. Restrict the use of essay questions to
those learning outcomes that cannot be
satisfactorily measured by objective items.
2. Formulate questions that will bring forth the
behavior specified in the learning outcome.
3. Phrase each question so that the pupils’
task is clearly defined.
4. Indicate an approximate time limit for each
question.
5. Avoid the use of optional questions.
Question:
What should a teacher do before
constructing items for a particular test?
a. Objectivity
b. Reliability
c. Validity
d. Usability
Question:
The same test is administered to different
groups at different places at different
times. This process is done in testing
the:
a. Objectivity
b. Validity
c. Reliability
d. Comprehensiveness
ITEM ANALYSIS
STEPS:
1. Score the test. Arrange the papers from lowest to highest score.
2. Get the top 27% (T27) and bottom 27% (B27) of the examinees.
3. Count how many examinees in the top group (PT) and in the bottom group (PB) answered each item correctly.
4. Compute the Difficulty Index:
Df = (PT + PB) / N, where N is the total number of examinees in the two groups.
5. Compute the Discrimination Index:
Ds = (PT - PB) / n, where n is the number of examinees in each group.
ITEM ANALYSIS
INTERPRETATION
Difficulty Index (Df)
0.76 – 1.00 = easy (revise)
0.25 – 0.75 = average (accept)
0.00 – 0.24 = very difficult (reject)
Example (30 students; * marks the correct answer):
Question   A     B    C     D    Df
1          0     3    24*   3    0.80
2          12*   13   3     2    0.40
ITEM ANALYSIS
Example (each group has n = 5 examinees, so N = 10 in the two groups combined):
Question   PT   PB   Df     Ds
1          4    4    0.80   0
2          0    3    0.30   -0.6
3          5    1    0.60   0.8
1. Which question was the easiest?
2. Which question was the most difficult?
3. Which item has the poorest discrimination?
4. Which question would you eliminate (if any)?
Why?
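The figures in the worked example above can be reproduced directly from the two formulas; this sketch assumes, as the numbers imply, that each 27% group contains n = 5 examinees (N = 10 in total).

```python
# Item analysis for the worked example. PT and PB are the numbers of
# examinees in the top and bottom 27% groups who answered the item correctly.

def difficulty_index(pt: int, pb: int, N: int) -> float:
    """Df = (PT + PB) / N -- proportion of both groups answering correctly."""
    return (pt + pb) / N

def discrimination_index(pt: int, pb: int, n: int) -> float:
    """Ds = (PT - PB) / n -- how well the item separates high from low scorers."""
    return (pt - pb) / n

items = {1: (4, 4), 2: (0, 3), 3: (5, 1)}
for q, (pt, pb) in items.items():
    print(q, difficulty_index(pt, pb, 10), discrimination_index(pt, pb, 5))
# Item 1: Df = 0.80, Ds = 0    (does not discriminate)
# Item 2: Df = 0.30, Ds = -0.6 (negative: more low scorers got it right)
# Item 3: Df = 0.60, Ds = 0.8
```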
Question:
A negative discrimination index means that:
SCALES OF MEASUREMENT
Nominal
Ordinal
Interval
Ratio
TYPES OF DISTRIBUTION
[Figure: frequency plotted against scores, from low to high]
Unimodal Distribution
Bimodal Distribution
Multimodal / Polymodal Distribution
Leptokurtic distributions are tall and peaked.
Mesokurtic distributions are the ideal example of the normal distribution, somewhere between the leptokurtic and platykurtic.
Platykurtic distributions are broad and flat.
Question:
Which statement applies when score
distribution is negatively skewed?
Example: Mean = 61, SD = 6, student's score X = 63
Mean + SD = 61 + 6 = 67
Mean - SD = 61 - 6 = 55
All scores between 55 and 67 are average.
All scores of 68 and above are above average.
All scores of 54 and below are below average.
A score of 63 is therefore:
a. Below Average
b. Average
c. Needs Improvement
d. Above Average
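The mean ± 1 SD rule applied above can be sketched as a small function; the category labels follow the example (mean = 61, SD = 6).

```python
# Classify a score using the mean +/- 1 SD rule from the example: scores
# within one SD of the mean are "Average", higher scores "Above Average",
# lower scores "Below Average".

def classify(score: float, mean: float, sd: float) -> str:
    if score > mean + sd:
        return "Above Average"
    if score < mean - sd:
        return "Below Average"
    return "Average"

print(classify(63, 61, 6))  # 63 lies between 55 and 67 -> Average
```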
Question:
The score distribution of Set A and Set B have
equal mean but with different SDs. Set A has an
SD of 1.7 while Set B has an SD of 3.2. Which
statement is TRUE of the score distributions?
a. The scores of Set B have less variability than the scores in Set A.
b. Scores in Set A are more widely scattered.
c. Majority of the scores in Set A are clustered
around the mean.
d. Majority of the scores in Set B are clustered around the mean.
INTERPRETING MEASURES OF VARIABILITY
QUARTILE DEVIATION (QD)
• The result will help you determine if the group is
homogeneous or not.
• The result will also help you determine the number of
students that fall below and above the average performance.
.81 – 1.00 = very high correlation
.61 – .80 = high correlation
.41 – .60 = moderate correlation
.21 – .40 = low correlation
.00 – .20 = negligible correlation
For Validity: the computed r should be at least 0.75 to be significant.
For Reliability: the computed r should be at least 0.85 to be significant.
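The interpretation scale above can be expressed as a simple lookup; the function name is illustrative, and taking the absolute value of r (so the scale also covers negative correlations) is an assumption, since the source lists only positive ranges.

```python
# Map a correlation coefficient to the verbal interpretation used above.
# The boundaries follow the listed ranges; abs() is an assumption so that
# negative correlations are judged by their strength.

def interpret_r(r: float) -> str:
    r = abs(r)
    if r >= 0.81:
        return "very high correlation"
    if r >= 0.61:
        return "high correlation"
    if r >= 0.41:
        return "moderate correlation"
    if r >= 0.21:
        return "low correlation"
    return "negligible correlation"

print(interpret_r(0.92))  # -> very high correlation
```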
Question:
The computed r for scores in Math and
Science is 0.92. What does this mean?
Z-Scores:    -3    -2   -1   0    +1   +2   +3
T-Scores:    20    30   40   50   60   70   80
Percentiles: 0.1   2    16   50   84   98   99.9
PERCENTILE
tells the percentage of examinees that lies below
one’s score
Example:
Jose's score in the LET is 70 and his percentile rank is 85. This means that 85% of the examinees scored below 70.
Formula:
Z = (X - X̄) / SD
Where:
X = individual's raw score
X̄ = mean of the normative group
SD = standard deviation of the normative group
Example:
Jenny got a score of 75 in a 100-item test. The mean
score of the class is 65 and SD is 5.
Z = (75 - 65) / 5 = 2
(Jenny is 2 standard deviations above the mean)
Example:
Mean of a group in a test: X̄ = 26, SD = 2
Joseph (X = 27): Z = (27 - 26) / 2 = 0.5
John (X = 25): Z = (25 - 26) / 2 = -0.5
T-Score
Refers to any set of normally distributed standard scores with a mean of 50 and a standard deviation of 10.
Formula:
T-score = 50 + 10(Z)
Example:
Joseph’s T-score = 50 + 10(0.5)
= 50 + 5
= 55
John’s T-score = 50 + 10(-0.5)
= 50 – 5
= 45
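The z-score and T-score formulas can be checked against the worked examples above (Jenny: mean 65, SD 5; Joseph and John: mean 26, SD 2):

```python
# Z-score and T-score as defined above:
#   Z = (X - mean) / SD        T-score = 50 + 10 * Z

def z_score(x: float, mean: float, sd: float) -> float:
    return (x - mean) / sd

def t_score(z: float) -> float:
    return 50 + 10 * z

print(z_score(75, 65, 5))           # Jenny:  Z = 2.0
print(t_score(z_score(27, 26, 2)))  # Joseph: T = 55.0
print(t_score(z_score(25, 26, 2)))  # John:   T = 45.0
```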
ASSIGNING GRADES / MARKS / RATINGS
Could be in:
1. percent such as 70%, 88% or 92%
2. letters such as A, B, C, D or F
3. numbers such as 1.0, 1.5, 2.75, 5
4. descriptive expressions such as Outstanding
(O), Very Satisfactory (VS), Satisfactory (S),
Moderately Satisfactory (MS), Needs Improvement (NI)
ASSIGNING GRADES / MARKS / RATINGS
Could represent:
1. how a student is performing in relation
to other students (norm-referenced
grading)
2. the extent to which a student has
mastered a particular body of knowledge
(criterion-referenced grading)
3. how a student is performing in relation
to a teacher’s judgment of his or her
potential
ASSIGNING GRADES / MARKS / RATINGS
Could be for: