
PAPER

NATURE OF STUDENT ASSESSMENT

CHEMISTRY EDUCATION DEPARTMENT

SCHOOL OF SCIENCE AND MATHEMATICS EDUCATION

YOGYAKARTA STATE UNIVERSITY

2010
CHAPTER I

INTRODUCTION

A. Background

From the first chapter, we already know about achievement, assessment and
instruction. Assessment is used as a broad category that includes all of the methods used to
determine the extent to which students are achieving the intended learning competences of
instruction, including both testing and performance assessment. Both are important. The
knowledge test tells how well the students know what to do, and the performance assessment
tells how skillfully they can do it.

Nowadays, most teachers use selection-type tests because the questions can be
answered in a relatively short time, they are easy to score, and the results can be expressed in
numbers that are easily recorded, compared and reported to others. Various studies have
shown that between 80 and 90% of teacher-made tests focus on knowledge outcomes.
However, education is best served by using both paper-and-pencil testing and the assessment
of actual performance, with both focusing on complex learning tasks.

B. Problems

To make the teaching and learning process run well and succeed, some important
things have to be considered. Here, we are going to discuss:

1. What are the major types of assessment method?

2. What are the characteristics of those major types of assessment methods?

3. What are the guidelines for effective student assessment?

4. What is the meaning of validity and reliability?

5. What are norm-referenced and criterion-referenced assessments?

C. Purposes

Based on the problems above, the purposes of this discussion are to:

1. Mention and describe the major types of assessment methods.

2. Distinguish the characteristics of each method.

3. List the guidelines for effective student assessment.

4. Describe the meaning of validity and reliability and the role they play in preparing
assessment procedures.

5. Distinguish between norm-referenced and criterion-referenced assessments.


CHAPTER II

NATURE OF STUDENT ASSESSMENT

A. Major Types of Assessment Methods

Teachers must choose the right assessment method to assess students. There are many
kinds of assessment methods, but they can be summarized in four main categories:

1. Selected-response tests require the student to choose the correct or best answer, as
in multiple-choice, true-false and matching tests.

2. Supply-response tests require the student to respond with a word, a short phrase or
a complete essay answer.

3. Restricted performance assessments focus on a limited task which is highly
structured.

4. Extended performance assessments require a less structured but more
comprehensive task, such as writing a short story or using a computer to solve a
problem.

The performance assessments require students to apply their knowledge and skills to
performance tasks in a realistic setting. Students can also review the task before submitting it
to the teacher, which makes it more realistic.

B. Characteristics of Assessment Methods

Based on the major types of assessment methods mentioned above, we can determine
the characteristics of each. No method is the best or the worst; each plays an important role
in assessing students. The methods can be compared on four characteristics: realism of tasks,
complexity of tasks, assessment time needed and judgment in scoring. In general, the more
demanding the task, the higher the method rates on each characteristic.

a. Realism of Tasks

Realism means how closely the tasks simulate performance in the real
world. Selected-response tests have the lowest realism because they emphasize
selecting a response from a given list of possible answers, and such highly
structured problems seldom occur in the real world. Extended performance
assessments have the highest realism because they involve simulating
performance in the real world. For example, doing lab work, giving a speech and
operating a machine require the kind of comprehensive response that often occurs
in the real world. Between those two extremes are the supply-response tests and
the restricted performance assessments. They provide some structure but allow
greater freedom of response, so their problems are more realistic than those of
selection-type tests.

However, there has also been a trend to make paper-and-pencil tests more
authentic, that is, more realistic. The tasks are based on real-world situations, so
that students select the most realistic solution from the alternative answers
provided. In some cases, the selection-type test is combined with a
supply-response test by asking students to explain why they chose the selected
answer.

b. Complexity of Tasks

Selected-response tests are commonly low in complexity of tasks. Although the
alternative answers are designed to measure students’ understanding and thinking
skills, they can be answered easily by the students. Extended performance tasks,
by contrast, typically involve multiple learning competences. They allow various
possible solutions to the problem, and several criteria have to be considered in
evaluating the result. Supply-response tests lie between these two extremes. For
example, an essay test can be designed to require students to select, integrate and
express ideas, but it is more limited than a performance assessment. Judging from
the difficulty of the tasks, we can conclude that performance assessment has the
highest complexity of tasks.

c. Assessment Time Needed

Selected-response items can be given to students in a relatively short time,
and the results can be scored and announced quickly, by hand or by machine.
This method is the most efficient because it takes so little time. Performance
assessment, in contrast, tends to be extremely time consuming. It may require
days or even weeks to complete, and evaluating it is difficult and slow. A
supply-response test (e.g. an essay test) takes more time to score than a
selected-response test but less than a performance assessment. The large amount
of time needed may also reduce content coverage, because the limited number of
problems in the task does not cover the whole instructional program.

d. Judgment in Scoring

Scoring in a selected-response test is completely objective because students only
choose a right or wrong answer, or the best answer. Supply-response tests give
students more freedom in answering, so scoring them involves more subjective
judgment. The scoring focuses on many criteria, such as completeness and
coherence. In performance assessment, the demands on the teacher become even
greater because the score depends heavily on the teacher’s judgment. Different
scorers may produce different scores, based on their knowledge of and opinion
about the performance. Moreover, a complex performance that integrates many
kinds of information may have multiple solutions, which makes it even harder to
score.

C. Guidelines for Effective Student Assessment

One of the goals of a classroom assessment program is to improve students’
motivation to learn. To achieve this, the teacher must use the right methods to assess
students. The following guidelines help in using student assessment effectively.

1. Effective assessment requires a clear conception of all intended learning
outcomes.

To assess students, the teacher must specify the learning outcomes so that the
students will be more focused on what they are going to perform.

2. Effective assessment requires that a variety of assessment procedures be used.

In assessing learning outcomes, a combination of methods may be most
suitable.

3. Effective assessment requires that the instructional relevance of the procedures be
considered.

Instructionally relevant assessment means that the intended outcomes of
instruction, the domain of learning tasks and the assessment procedures are all in
close agreement.

4. Effective assessment requires an adequate sample of student performance.

Sampling is needed in assessing learning outcomes. Because the time available
for assessment is limited, the tasks chosen should be a representative sample of
students’ performance.

5. Effective assessment requires that the procedures be fair to everyone.

Assessment results are used to improve students’ learning, and this works
better in an atmosphere of fairness. Including racial and gender stereotypes may
distort the results and create a feeling of unfairness, so we must avoid them as far
as we can.

6. Effective assessment requires the specification of criteria for judging successful
performance.

There must be some criteria for assessing students so that we can classify
their performance. If a student’s performance surpasses that of others, it is classed
as excellent performance. If the performance is lower than that of others, it is
classed as poor performance. Establishing criteria for assessing performance is
very difficult, but it becomes easier if the intended learning outcomes are clearly
stated first.

7. Effective assessment requires feedback to students that emphasizes strengths of
performance and weaknesses to be corrected.

Feedback of assessment results to students is an essential factor. To be
effective, feedback must meet the following criteria:

a. Should be given immediately following or during the assessment.

b. Should be detailed and understandable to students.

c. Should focus on successful elements of the performance and errors to be
corrected.

d. Should provide remedial suggestions for correcting errors.

e. Should be positive and provide a guide for improving both performance and
self-assessment.

8. Effective assessment must be supported by a comprehensive grading and reporting
system.

If the learning outcomes are assessed by tests and by performance
assessments, and a single grade is used, each type of assessment should receive
equal weight in the grade. It may be necessary to use both a letter grade and a more
elaborate report. The grading and reporting system should reflect and support the
assessment procedures, be made clear to students at the beginning of instruction,
and provide for periodic feedback to students concerning their learning progress.
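
The weighting idea in guideline 8 can be illustrated with a small sketch. The function name, the 0-100 score scale and the example scores are assumptions made only for illustration; the default of 0.5 corresponds to the equal weight described above.

```python
def combined_grade(test_score, performance_score, test_weight=0.5):
    """Combine a test score and a performance score into a single grade.

    Both scores are assumed to be on the same 0-100 scale (an assumption
    for this sketch). test_weight is the share given to the test; the
    remainder goes to the performance assessment. 0.5 means equal weight.
    """
    if not 0.0 <= test_weight <= 1.0:
        raise ValueError("test_weight must be between 0 and 1")
    return test_weight * test_score + (1.0 - test_weight) * performance_score


# Hypothetical student: 80 on the written test, 60 on the lab performance.
print(combined_grade(80, 60))  # equal weights give 70.0
```

Changing test_weight shows how strongly a single grade depends on the weighting chosen, which is one reason a more elaborate report may be needed alongside it.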

D. Validity and Reliability in Assessment Planning

The main characteristics of a good assessment procedure are validity and reliability.
Validity means the appropriateness and meaningfulness of the inferences we make from
assessment results. Performance assessment is sometimes viewed as more valid than
paper-and-pencil tests because it focuses more directly on the performance being taught.
For example, if we want to determine whether students can read, we ask them to read. If we
want to determine whether students can give a speech, we ask them to give a speech. Those
tasks have the appearance of being valid. However, each requires its own specifications and
scoring criteria, and the time-consuming nature of performance assessment also limits the
sample of tasks that can be used.

Reliability refers to the consistency of assessment results. It means that students are
expected to get about the same result under different situations and conditions. For example,
if students get 70 on a chemistry test, they should get about the same score when we test them
at a different time or with a different sample of equivalent items. This consistency of results
indicates that they are relatively free from error.
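
One common way to put a number on this kind of consistency is to correlate the scores from two administrations of a test. The paper does not name a particular statistic, so the use of the Pearson correlation here is an assumption, and the scores below are invented for illustration only.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 scores)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for the correlation to be defined")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores of five students on the same chemistry test,
# given on two different occasions.
first_test = [70, 85, 60, 90, 75]
second_test = [72, 83, 62, 88, 74]

r = pearson_r(first_test, second_test)
print(round(r, 2))  # values near 1 suggest consistent results
```

A value close to 1 means the students kept roughly the same relative standing on both occasions; a much lower value would suggest that the results are strongly affected by error.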

However, we cannot expect the results to be perfectly consistent across different
occasions or different samples. Many factors can introduce error into assessment results,
such as ambiguities, differences between samples of tasks, fluctuations in the motivation and
attention of students, and luck. The point is that we have to minimize these errors so that the
results of the assessment are as reliable as possible. The relation between validity and
reliability is that reliability provides the consistency of results that makes valid inferences
possible. Both the validity and the reliability of assessment results can be provided for during
the preparation of assessment procedures. When we include a sufficient number of tests and
avoid ambiguous procedures and irrelevant sources of difficulty, we are providing for both
reliability and validity.

Some important features can enhance the validity and reliability of assessment
results. Each desired feature is listed below with the procedure to follow.

Desired Features                            Procedures to Follow

1. Clearly specified set of learning        1. State intended learning outcomes in
   outcomes                                    performance terms

2. Representative sample of a clearly       2. Prepare a description of the
   defined domain of learning tasks            achievement domain to be assessed
                                               and the sample of the tasks to be
                                               used

3. Tasks that are relevant to the           3. Match assessment tasks to the
   learning outcomes to be measured            specified performance stated in the
                                               learning outcomes

4. Tasks that are at the proper level of    4. Match assessment task difficulty to
   difficulty                                  the learning task, students’
                                               abilities and the use to be made of
                                               the result

5. Tasks that function effectively in       5. Follow general guidelines and
   distinguishing between achievers            specific rules for preparing
   and nonachievers                            assessment procedures and be alert
                                               for factors that can distort the
                                               result

6. Sufficient number of tasks to            6. Where the students’ age or
   measure enough sample of                    available assessment time limit
   achievement, provide dependable             the number of tasks, make
   result, and allow for meaningful            tentative interpretations, assess
   interpretation of the result                more frequently and verify the
                                               result with other evidence

7. Procedures that contribute to            7. Write clear procedures and arrange
   efficient preparation and use               them for ease of administration,
                                               scoring and interpretation

E. Norm-referenced and Criterion-referenced Assessments

An achievement assessment is used to provide either a relative ranking of students or a
description of the learning tasks a student can or cannot do. Each type of reference produces a
different kind of result. The result of a norm-referenced assessment is interpreted in terms of
a student’s achievement relative to others. For example, student A got the 2nd highest score in
a class of 30 students. The result of a criterion-referenced assessment describes the specific
knowledge and skills a student can or cannot demonstrate. For example, student B can name
the parts of a microscope and demonstrate to others how to use it.
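
The two kinds of interpretation can be sketched in a few lines. The function names, the class scores and the third task in the checklist are hypothetical and serve only to illustrate the difference between ranking a student and describing what the student can do.

```python
def norm_referenced_rank(scores, student):
    """Rank of one student within the group; 1 is the highest score.

    Students with equal scores share the same rank.
    """
    own = scores[student]
    return 1 + sum(1 for s in scores.values() if s > own)


def criterion_referenced_report(task_results):
    """Split a checklist into tasks the student can and cannot do."""
    can_do = [task for task, ok in task_results.items() if ok]
    cannot_do = [task for task, ok in task_results.items() if not ok]
    return can_do, cannot_do


# Norm-referenced: the score only has meaning relative to classmates.
class_scores = {"A": 88, "B": 75, "C": 93, "D": 60}  # hypothetical class
print(norm_referenced_rank(class_scores, "A"))  # A has the 2nd highest score

# Criterion-referenced: each task is judged on its own, not against others.
student_b_tasks = {
    "name the parts of a microscope": True,
    "demonstrate how to use a microscope": True,
    "prepare a slide": False,  # hypothetical extra task
}
can_do, cannot_do = criterion_referenced_report(student_b_tasks)
print(can_do)
print(cannot_do)
```

The same performance can therefore be reported in two ways: as a position in the group, or as a list of tasks mastered and not yet mastered.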

Both methods are very useful in determining the capability of students. The first
shows an individual’s performance compared with that of others, and the second tells
whether an individual can do a given performance. Both are only ways of interpreting the
results of student performance. Norm-referenced interpretation mostly uses scores to rank
the students, while criterion-referenced interpretation is based on the details of student
performance. The differences between norm-referenced and criterion-referenced testing are
as follows:

                            Norm-referenced testing         Criterion-referenced testing

Principal use               Survey testing                  Mastery testing

Major emphasis              Measures individual             Describe tasks students
                            differences in achievement      can perform

Interpretation of result    Compare individual’s            Compare performance to a
                            performance to others           clearly specified
                                                            achievement domain

Content coverage            Typically covers a broad        Typically focuses on a
                            area of achievement             limited set of learning
                                                            tasks

Nature of test plan         Use table of specification      Detailed domain
                                                            specifications are favored

Item selection procedures   Items are selected that         Includes all items
                            provide maximum                 needed to adequately
                            discrimination among            describe performance
                            individuals

Performance standards       Level of performance is         Level of performance is
                            determined by relative          commonly determined
                            position in some known          by absolute standards.
                            group

Norm-referenced tests are typically used to survey achievement over a large range
of learning outcomes, while criterion-referenced tests are typically used for mastery testing,
where we can help students improve their learning by determining which tasks they cannot
do. Norm-referenced interpretation is most useful when we are concerned with the relative
ranking of students and need to classify them. Combining both types of test is the best way to
get the most useful result.
CHAPTER III

CONCLUSION

Based on the discussion above we can conclude that:

1. The major types of student assessment are selected-response tests, supply-response
tests, restricted-response performance assessments and extended-response performance
assessments.

2. Selected-response tests are lowest in realism and complexity of tasks assessed but
require little time to administer and can be quickly scored.

Supply-response tests are higher in realism and complexity of tasks than selected-response
tests, but they are more time consuming and more difficult to score.

Performance assessments, both restricted-response and extended-response, can be designed
with high degrees of realism and can focus on complex tasks, but they require a large amount
of time, and the scoring is highly difficult and subjective.

3. Effective assessment requires a clear conception of all intended learning outcomes, a
variety of assessment procedures that are relevant to the instruction, an adequate
sample of tasks, procedures that are fair to everyone, criteria for judging scores,
timely and detailed feedback to students and a grading-and-reporting system that is in
harmony with the assessment program.

4. Validity is the appropriateness and meaningfulness of the inferences we make from
assessment results. Reliability is the consistency of the assessment results.

5. A norm-referenced assessment is interpreted by comparing a student’s performance to
that of others. A criterion-referenced assessment is interpreted by describing the
student’s performance on a clearly defined set of tasks.
