The Types Of Reliability Test

There are four procedures in common use for computing the reliability
coefficient (sometimes called the self-correlation) of a test. These are:
1. Test-Retest (Repetition)
2. Alternate or Parallel Forms
3. Split-Half Technique
4. Rational Equivalence

1. Test-Retest Method:
To estimate reliability by means of the test-retest method, the same
test is administered twice to the same group of pupils with a given
time interval between the two administrations of the test.

The resulting test scores are correlated, and this correlation coefficient
indicates how stable the test results are over a period of time. For this
reason it is known as a measure of stability.

The estimate of reliability in this case varies according to the length of
the time interval allowed between the two administrations. The product-
moment method of correlation is the usual method for estimating
reliability from the two sets of scores.

Thus, a high correlation between the two sets of scores indicates that the
test is reliable. In other words, it shows that the scores obtained in the
first administration resemble the scores obtained in the second
administration of the same test.
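The calculation described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `pearson_r` and the two score lists are hypothetical and do not come from any real test.

```python
from math import sqrt

def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores of the same six pupils on the two administrations.
first_administration = [12, 15, 9, 20, 17, 11]
second_administration = [13, 14, 10, 19, 18, 12]

# The test-retest reliability is the correlation between the two sets.
test_retest_reliability = pearson_r(first_administration, second_administration)
print(round(test_retest_reliability, 2))
```

A value close to 1 means the pupils kept nearly the same relative standing on both occasions.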

In this method the time interval plays an important role. If it is too
short, say a day or two, the consistency of the results will be influenced
by the carry-over effect, i.e., the pupils will remember some of their
answers from the first administration when they take the second.

If the time interval is long, say a year, the results will be influenced
not only by differences in testing procedures and conditions, but
also by actual changes in the pupils over that period of time.

The time gap before the retest should not be more than six months. A gap
of a fortnight (2 weeks) gives an accurate index of reliability.

Advantages:
The self-correlation or test-retest method is the one generally used for
estimating the reliability coefficient. It can be applied conveniently in
many different situations. A test of adequate length can be used after an
interval of many days between successive testings.

Disadvantages:
1. If the test is repeated immediately, many subjects will recall their
first answers and spend their time on new material, thus tending to
increase their scores, sometimes by a good deal.

2. Besides immediate memory effects, practice and the confidence
induced by familiarity with the material will almost certainly affect
scores when the test is taken a second time.

3. The index of reliability so obtained is less accurate.

4. If the interval between tests is rather long (more than six months),
growth and maturation will affect the scores and tend to lower
the reliability index.

5. If the test is repeated immediately or after a short time gap,
carry-over, transfer, memory and practice effects are possible.

6. Repeating the same test with the same group can leave the students
uninterested, so that they do not take part wholeheartedly.

7. Sometimes uniform testing conditions are not maintained, which also
affects the test scores.

8. Students may discuss a few questions after the first administration,
which may increase the scores on the second administration and so affect
reliability.

2. Alternate or Parallel Forms Method:


Estimating reliability by means of the equivalent form method
involves the use of two different but equivalent forms of the test.
Parallel form reliability is also known as Alternative form reliability or
Equivalent form reliability or Comparable form reliability.

In this method two parallel or equivalent forms of a test are used. By
parallel forms we mean that the forms are equivalent so far as the
content, objectives, format, difficulty level and discriminating value of
items, length of the test, etc. are concerned.

Parallel tests have equal mean scores, equal variances and equal
intercorrelations among items. That is, two parallel forms must be
homogeneous or similar in all respects, but must not duplicate each
other's test items. Let the two forms be Form A and Form B.

The reliability coefficient may be looked upon as the coefficient of
correlation between the scores on two equivalent forms of the test. The
two equivalent forms should be as similar as possible in content, degree,
mental processes tested, difficulty level and other aspects.

One form of the test is administered to the students and, as soon as they
finish, the other form is given to the same group. The scores thus
obtained are correlated, which gives the estimate of reliability. The
reliability found in this way is called the coefficient of equivalence.

Gulliksen (1950) defined parallel tests as tests having equal means,
equal variances and equal intercorrelations.

Guilford: The alternative form method indicates both equivalence of
content and stability of performance.
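As a rough illustration of the conditions above, the following Python sketch compares the means and variances of two forms and correlates them to obtain the coefficient of equivalence. The helper names (`pearson_r`, `compare_forms`) and the scores for Form A and Form B are hypothetical, and the sketch does not check the equality of item intercorrelations.

```python
from math import sqrt
from statistics import mean, pvariance

def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def compare_forms(form_a, form_b):
    """Summarise how closely two forms agree and how strongly they correlate."""
    return {
        # Parallel forms should have (nearly) equal means ...
        "mean_difference": mean(form_a) - mean(form_b),
        # ... and (nearly) equal variances, i.e. a ratio close to 1.
        "variance_ratio": pvariance(form_a) / pvariance(form_b),
        # The correlation between the forms is the coefficient of equivalence.
        "coefficient_of_equivalence": pearson_r(form_a, form_b),
    }

# Hypothetical scores of the same five students on Form A and Form B.
form_a = [40, 35, 28, 45, 38]
form_b = [42, 33, 29, 44, 37]

summary = compare_forms(form_a, form_b)
for name, value in summary.items():
    print(name, round(value, 3))
```

How small the mean difference must be, and how close to 1 the variance ratio must be, is a judgment the sketch leaves to the test constructor.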

Advantages:
This procedure has certain advantages over the test-retest
method:
1. Here the same test is not repeated.

2. Memory, practice, carry-over effects and recall factors are minimised
and they do not affect the scores.

3. The reliability coefficient obtained by this method is a measure of
both temporal stability and consistency of response to different item
samples or test forms. Thus, this method combines two types of
reliability.

4. Useful for the reliability of achievement tests.

5. This method is one of the appropriate methods of determining the
reliability of educational and psychological tests.

Limitations:
1. It is difficult to construct two parallel forms of a test. In certain
situations (e.g., with the Rorschach test) it is almost impossible.

2. When the tests are not exactly equal in content, difficulty and
length, comparing the two sets of scores obtained from these
tests may lead to erroneous decisions.

3. Practice and carry-over factors cannot be completely controlled.

4. Moreover, administering two forms one after the other creates
boredom. That is why people prefer methods in which only one
administration of the test is required.

5. The testing conditions while administering Form B may not be
the same. Besides, the testees may not be in a similar physical, mental
or emotional state at both times of administration.

6. Scores on the second form of the test are generally higher.

Although difficult to construct, carefully and cautiously prepared
parallel forms give a reasonably satisfactory measure of reliability. For
well-made standardised tests, the parallel form method is usually the most
satisfactory way of determining the reliability.

3. Split-Half Method or Sub-divided Test Method:


The split-half method is an improvement over the earlier two methods,
and it involves the characteristics of both stability and equivalence.
The two methods of estimating reliability discussed above are sometimes
difficult to apply.

It may not be possible to use the same test twice or to obtain an
equivalent form of the test. Hence, to overcome these difficulties, to
reduce the memory effect and to economise on testing, it is desirable to
estimate reliability through a single administration of the test.

In this method the test is administered once to the sample, and it is
the most appropriate method for homogeneous tests. This method
provides a measure of the internal consistency of the test scores.

All the items of the test are generally arranged in increasing order of
difficulty and administered once to the sample. After administration, the
test is divided into two comparable, similar or equal halves.

The scores are arranged in two sets, obtained from the odd-numbered
items and the even-numbered items separately. For example, suppose a
test of 100 items is administered.

Each individual's score on the 50 odd-numbered items (1, 3, 5, …, 99)
and score on the 50 even-numbered items (2, 4, 6, …, 100) are arranged
separately. Part ‘A’ consists of the odd-numbered items and part ‘B’
consists of the even-numbered items.

After obtaining the two scores on the odd-numbered and even-numbered
items, the coefficient of correlation between them is calculated. It is
really a correlation between two equivalent half-test scores obtained in
one sitting. To estimate the reliability of the whole test, the
Spearman-Brown prophecy formula is used.

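The Spearman-Brown formula is given by:

$$r_{11} = \frac{2\, r_{\frac{1}{2}\frac{1}{2}}}{1 + r_{\frac{1}{2}\frac{1}{2}}}$$

in which $r_{11}$ = the reliability of the whole test, and
$r_{\frac{1}{2}\frac{1}{2}}$ = the coefficient of correlation between the two half tests.

Example 1: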
A test contains 100 items. All these items are arranged in order of
difficulty from the first to the hundredth. Students
answer the test and the test is scored.

The scores obtained by the students on the odd-numbered items and on the
even-numbered items are totalled separately. The coefficient of
correlation found between these two sets of scores is 0.8.

The reliability of the whole test is therefore:

$$r_{11} = \frac{2 \times 0.8}{1 + 0.8} = \frac{1.6}{1.8} \approx 0.89$$
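The odd-even procedure and the Spearman-Brown step-up can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names and the small response matrix (five students, six items, 1 = correct, 0 = incorrect) are hypothetical; only the half-test correlation of 0.8 is taken from Example 1.

```python
from math import sqrt

def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def odd_even_totals(item_scores):
    """Return one student's totals on the odd- and even-numbered items."""
    odd_total = sum(item_scores[0::2])   # items 1, 3, 5, ...
    even_total = sum(item_scores[1::2])  # items 2, 4, 6, ...
    return odd_total, even_total

def spearman_brown(half_test_r):
    """Step a half-test correlation up to the reliability of the whole test."""
    return 2 * half_test_r / (1 + half_test_r)

# Hypothetical item responses: one row per student, one column per item.
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
]

odd_totals, even_totals = zip(*(odd_even_totals(row) for row in responses))
half_test_r = pearson_r(odd_totals, even_totals)
whole_test_r = spearman_brown(half_test_r)
print(round(half_test_r, 3), round(whole_test_r, 3))

# Example 1: a half-test correlation of 0.8.
print(round(spearman_brown(0.8), 2))  # 0.89
```

The stepped-up value is always at least as large as the half-test correlation when that correlation is positive, which reflects the fact that the whole test is twice as long as either half.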

While using this formula, it should be kept in mind that the variances
of the odd and even halves should be equal, i.e.

$$\sigma_{\text{odd}}^2 = \sigma_{\text{even}}^2$$

If this is not possible, then Flanagan’s and Rulon’s formulae can be
employed. These formulae are simpler and do not involve computation
of the coefficient of correlation between the two halves.
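The text does not reproduce these two formulae. In their commonly cited forms, Rulon's formula is 1 minus the ratio of the variance of the half-score differences to the variance of the total scores, and Flanagan's formula is 2 × (1 − (sum of the two half-test variances) / (variance of the total scores)). The Python sketch below assumes those forms; the function names and half-test scores are hypothetical.

```python
from statistics import pvariance

def rulon(half_a, half_b):
    """Rulon: 1 - variance of half-score differences / variance of totals."""
    differences = [a - b for a, b in zip(half_a, half_b)]
    totals = [a + b for a, b in zip(half_a, half_b)]
    return 1 - pvariance(differences) / pvariance(totals)

def flanagan(half_a, half_b):
    """Flanagan: 2 * (1 - (var of half A + var of half B) / var of totals)."""
    totals = [a + b for a, b in zip(half_a, half_b)]
    return 2 * (1 - (pvariance(half_a) + pvariance(half_b)) / pvariance(totals))

# Hypothetical odd-half and even-half scores of five students.
odd_half = [3, 2, 1, 3, 1]
even_half = [2, 1, 1, 3, 0]

rulon_r = rulon(odd_half, even_half)
flanagan_r = flanagan(odd_half, even_half)
print(round(rulon_r, 4), round(flanagan_r, 4))
```

Both functions work directly from variances, so no correlation between the halves is computed, and the two forms are algebraically equivalent, which is why they print the same value.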

Advantages:
1. Here we are not repeating the test or using a parallel form of it,
and thus the testee is not tested twice. As such, there is no carry-over
effect or practice effect.

2. In this method, fluctuations in an individual’s ability caused by
environmental or physical conditions are minimised.

3. Because the test is administered only once, day-to-day functions and
problems do not interfere.

4. The difficulty of constructing parallel forms of the test is eliminated.

Limitations:
1. A test can be divided into two equal halves in a number of ways, and
the coefficient of correlation may be different in each case.

2. This method cannot be used for estimating the reliability of speed tests.

3. As the test is administered once, chance errors may affect the
scores on the two halves in the same way, thus tending to make the
reliability coefficient too high.

4. This method is not suitable for heterogeneous tests.

In spite of all these limitations, the split-half method is often
regarded as the best of the methods of measuring test reliability, as the
data for determining reliability are obtained on one occasion, which
reduces the time, labour and difficulties involved in a second or
repeated administration.

4. Method of Rational Equivalence:


This method is also known as ‘Kuder-Richardson Reliability’ or ‘Inter-
Item Consistency’. It is a method based on a single administration and
on the consistency of responses to all items.

The most common way of finding inter-item consistency is through
the formula developed by Kuder and Richardson (1937). This method
makes it possible to take account of the intercorrelation of the items of
the test and the correlation of each item with all the items of the test.
L. J. Cronbach called it the coefficient of internal consistency.

In this method, it is assumed that all items have the same difficulty
value, that the correlations between the items are equal, that all the
items measure essentially the same ability and that the test is
homogeneous in nature.

Like the split-half method, this method also provides a measure of
internal consistency.

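The most popular Kuder-Richardson formula is KR-20, which is given
below:

$$r_{11} = \frac{n}{n-1} \cdot \frac{\sigma_t^2 - \sum pq}{\sigma_t^2}$$

in which $r_{11}$ = the reliability of the whole test,
$n$ = the number of items in the test,
$\sigma_t^2$ = the variance of the total test scores,
$p$ = the proportion of students answering an item correctly,
$q$ = the proportion of students answering it incorrectly, so that

q = 1 – p

p = 1 – q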

An example will help us to calculate p and q.

Example 2:
60 students took a test, and 40 of them gave the correct response to a
particular item of the test.

p = 40/60 = 2/3

This means that two-thirds of the students gave the correct response to
that particular item of the test. The remaining 20 students gave an
incorrect response to that item.

Thus q = 20/60 = 1 – 40/60 = 1/3

For each item we find the values of p and q, multiply them, and then
sum the products pq over all items. This gives ∑pq.
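The computation of p, q and ∑pq, and their use in the Kuder-Richardson formula built on ∑pq (usually labelled KR-20), can be sketched in Python as below. The function name `kr20` and the response matrix are hypothetical, and the sketch assumes the population variance of the total scores; only the 40-out-of-60 figures come from Example 2.

```python
from statistics import pvariance

def kr20(responses):
    """KR-20 from a matrix of 1/0 item responses (rows = students)."""
    n_students = len(responses)
    n_items = len(responses[0])

    sum_pq = 0.0
    for item in range(n_items):
        # p = proportion answering this item correctly, q = 1 - p.
        p = sum(row[item] for row in responses) / n_students
        q = 1 - p
        sum_pq += p * q

    # Variance of the students' total scores on the whole test.
    total_variance = pvariance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (total_variance - sum_pq) / total_variance

# Example 2: 40 of 60 students answer one item correctly.
p = 40 / 60
q = 1 - p
print(round(p * q, 4))  # pq for that single item

# Hypothetical responses of five students to six items.
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
]
reliability = kr20(responses)
print(round(reliability, 3))
```

Because every item contributes its own pq term, no splitting of the test into halves is needed.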

Advantages:
1. This coefficient provides some indication of how internally
consistent or homogeneous the items of the test are.

2. Rational equivalence is superior to the split-half technique in
certain theoretical aspects, but the actual difference in reliability
coefficients found by the two methods is often negligible.

3. The split-half method simply measures equivalence, but the rational
equivalence method measures both equivalence and homogeneity.

4. It is an economical method, as the test is administered once.

5. It requires neither the administration of two equivalent forms of the
test nor the splitting of the test into two equal halves.

Limitations:
1. The coefficient obtained by this method is generally somewhat lower
than the coefficients obtained by other methods.

2. If the items of the test are not highly homogeneous, this method
will yield a lower reliability coefficient.

3. The Kuder-Richardson and split-half methods are not appropriate for
speed tests.

4. Different KR formulae yield different reliability indices.
