
What Is Validity?

Validity is the extent to which a test measures what it is supposed to measure. It is vital for a test to be valid in order for the results to be accurately applied and interpreted. There are three types of validity:

1. CONTENT VALIDITY
When a test has content validity, the items on the test represent the entire range of possible items the test should cover. Individual test questions may be drawn from a large pool of items that cover a broad range of topics. The content of the assessment must measure the stated objective, and each item must be logical and make sense. An expert chooses the most relevant items, appropriate to the students' level, to be included in the test.

2. CRITERION VALIDITY
A test is said to have criterion validity when it has demonstrated its effectiveness in predicting criteria, or indicators, of a construct. Students' scores are related to an outside reference or to future achievement. For example, do high scores on a Standard One spelling test accurately predict spelling and writing skill in future grades? There are two types of criterion validity: concurrent validity and predictive validity.

a. Concurrent validity occurs when the criterion measures are obtained at the same time as the test scores. This indicates the extent to which the test scores accurately estimate an individual's current state with regard to the criterion. For example, a test that measures levels of depression has concurrent validity if it reflects the level of depression the test taker is currently experiencing.

b. Predictive validity occurs when the criterion measures are obtained at a time after the test. Examples of tests with predictive validity are career or aptitude tests, which are helpful in determining who is likely to succeed or fail in certain subjects or occupations.
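In practice, predictive validity is often reported as a validity coefficient: the correlation between the test scores and the criterion obtained later. A minimal sketch, using made-up spelling scores and later writing grades purely for illustration:

    import numpy as np

    # Hypothetical data: Standard One spelling scores and the same pupils'
    # writing grades two years later (both out of 100).
    spelling_now = np.array([55, 62, 70, 48, 85, 90, 66, 74])
    writing_later = np.array([58, 60, 72, 50, 80, 88, 70, 71])

    # The validity coefficient is the Pearson correlation between the test
    # and the criterion; values near 1 suggest strong prediction.
    validity_coefficient = np.corrcoef(spelling_now, writing_later)[0, 1]
    print(round(validity_coefficient, 2))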

3. CONSTRUCT VALIDITY
A test has construct validity if it demonstrates an association between the test scores and the prediction of a theoretical trait. Intelligence tests are one example of measurement instruments that should have construct validity. The question is whether the assessment corresponds to other significant variables. For example, do the experiment questions in an examination indicate that the students understand and know how to conduct the experiment?

FACTORS INFLUENCING VALIDITY


History, maturation, instrumentation, and experimenter bias.

What is Reliability?
Reliability refers to the consistency of a measure. A test is considered reliable if we get the same result repeatedly.

Test-retest method


In the test-retest method, the same form of a test is given to the same group of examinees on two or more separate occasions. On many occasions this approach is not practical, because repeated measurements are likely to change the examinees. For example, examinees adapt to the test format and thus tend to score higher on later tests. Hence, careful implementation of the test-retest approach is strongly recommended.
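The test-retest reliability coefficient is simply the correlation between the two sets of scores. A minimal sketch with hypothetical scores for one group of examinees (statistics.correlation requires Python 3.10 or later):

    from statistics import correlation  # Python 3.10+

    # Hypothetical scores for the same eight examinees on two occasions.
    first_sitting = [40, 55, 63, 71, 48, 80, 66, 59]
    second_sitting = [44, 54, 65, 75, 50, 83, 64, 62]

    # The test-retest reliability coefficient is the correlation between
    # the two sets of scores; values close to 1 indicate stable results.
    print(round(correlation(first_sitting, second_sitting), 2))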

Equivalent Form


In the equivalent-form (alternate-form) method, two different forms of a test, based on the same content, are given to the same examinees. After alternate forms have been developed, they can also be used for different examinees. This approach is very common in high-stakes examinations for pre-empting cheating. By using this method, an examinee who took Form A earlier could not share the test items with another student who might take Form B later, because the two forms have different items.

Inter-rater Reliability


Inter-rater reliability refers to the concern that a student's score may vary from rater to rater. Students often criticize exams in which their score appears to be based on the subjective judgment of their instructor. For example, one way to mark an essay exam is to read through the students' responses and make judgments about the quality of their written products. Without set criteria to guide the rating process, two independent raters may not assign the same score to a given response, because each rater has his or her own evaluation criteria.

Scoring rubrics respond to this concern by formalizing the criteria at each score level. The descriptions of the score levels are used to guide the evaluation process. Although scoring rubrics do not completely eliminate variations between raters, a well-designed scoring rubric can reduce the occurrence of these discrepancies.
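One commonly used statistic for quantifying inter-rater reliability is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. A minimal sketch, using hypothetical rubric scores (0 to 4) that two raters might have assigned to the same ten essays:

    from collections import Counter

    def cohens_kappa(rater_a, rater_b):
        """Chance-corrected agreement between two raters."""
        n = len(rater_a)
        # Observed agreement: proportion of essays given the same score.
        p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
        # Agreement expected by chance, from each rater's score frequencies.
        freq_a, freq_b = Counter(rater_a), Counter(rater_b)
        p_expected = sum(freq_a[s] * freq_b[s] for s in freq_a) / n ** 2
        return (p_observed - p_expected) / (1 - p_expected)

    # Hypothetical scores two raters assigned to the same ten essays.
    rater_1 = [3, 2, 4, 1, 3, 2, 0, 4, 3, 2]
    rater_2 = [3, 2, 3, 1, 3, 1, 0, 4, 3, 2]
    print(round(cohens_kappa(rater_1, rater_2), 2))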

Internal Consistency


Internal consistency is a reliability coefficient obtained from a single administration of a test or survey. For instance, say respondents are asked to rate statements in an attitude survey about computer anxiety. One statement is "I feel very negative about computers in general." Another statement is "I enjoy using computers." People who strongly agree with the first statement should strongly disagree with the second statement, and vice versa. If several respondents rate both statements high, or both low, their responses are inconsistent and patternless. The same principle can be applied to a test: when no pattern is found in the students' responses, the test is probably too difficult and students are just guessing the answers randomly.
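Internal consistency is commonly summarized with Cronbach's alpha, which compares the variance of the individual item scores to the variance of the total score. A minimal sketch, using hypothetical 1-to-5 ratings on three computer-anxiety statements; the "I enjoy using computers" item is reverse-scored so that all items point in the same direction:

    import numpy as np

    def cronbach_alpha(items):
        """items: one row per respondent, one column per item."""
        items = np.asarray(items, dtype=float)
        k = items.shape[1]                          # number of items
        item_vars = items.var(axis=0, ddof=1)       # variance of each item
        total_var = items.sum(axis=1).var(ddof=1)   # variance of the total score
        return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

    # Hypothetical 1-5 ratings; the last column is "I enjoy using computers",
    # reverse-scored (6 - rating) before the analysis.
    ratings = np.array([
        [5, 4, 1],
        [4, 5, 2],
        [2, 1, 5],
        [1, 2, 4],
    ])
    ratings[:, 2] = 6 - ratings[:, 2]
    print(round(cronbach_alpha(ratings), 2))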

Split Half Method


Split-half reliability is another subtype of internal consistency reliability. To obtain it, the items of a test that are intended to probe the same area of knowledge are split in half to form two sets of items. The entire test is administered to a group of individuals, the total score for each set is computed, and the split-half reliability is obtained by correlating the two set totals.
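A minimal sketch of that procedure, using hypothetical right/wrong item scores and an odd/even split; the half-test correlation is commonly stepped up with the Spearman-Brown formula to estimate the reliability of the full-length test:

    import numpy as np

    def split_half_reliability(item_scores):
        """item_scores: one row per examinee, one column per item."""
        items = np.asarray(item_scores, dtype=float)
        half_a = items[:, 0::2].sum(axis=1)   # total on odd-numbered items
        half_b = items[:, 1::2].sum(axis=1)   # total on even-numbered items
        r_half = np.corrcoef(half_a, half_b)[0, 1]
        # Spearman-Brown correction: estimate reliability of the full test.
        return 2 * r_half / (1 + r_half)

    # Hypothetical right/wrong (1/0) scores for six examinees on eight items.
    scores = [
        [1, 1, 1, 1, 1, 0, 1, 1],
        [1, 0, 1, 1, 0, 1, 0, 1],
        [0, 1, 0, 1, 1, 0, 1, 0],
        [1, 1, 1, 0, 1, 1, 1, 1],
        [0, 0, 1, 0, 0, 1, 0, 0],
        [1, 1, 0, 1, 1, 1, 1, 0],
    ]
    print(round(split_half_reliability(scores), 2))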

Factors Affecting Reliability


1) When a test has to be repeated with the same examinees, it is recommended to change the form of the test; however, the two forms must actually evaluate the same thing.

2) Poor or unclear directions given during administration, or inaccurate scoring, can affect reliability.

3) The larger the number of items, the greater the chance of high reliability. For example, it makes sense that twenty questions about your leadership style are more likely to give a consistent result than four questions (see the sketch after this list).

4) Reliability is also affected by the condition of the examinees. For example, if you took an instrument in August when you had a terrible flu and then again in December when you were feeling quite well, we might see a difference in your response consistency. If you were under considerable stress of some sort, or if you were interrupted while answering the instrument questions, you might give different responses.
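The effect of test length in point 3 can be illustrated with the Spearman-Brown prophecy formula; the starting reliability of 0.60 below is a made-up value used purely for illustration:

    def spearman_brown(reliability, length_factor):
        """Predicted reliability when a test is lengthened by length_factor."""
        return length_factor * reliability / (1 + (length_factor - 1) * reliability)

    # A 4-item scale with reliability 0.60, lengthened to 20 items (factor 5).
    print(round(spearman_brown(0.60, 5), 2))   # about 0.88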

RELATIONSHIP BETWEEN VALIDITY AND RELIABILITY


A good assessment has both validity and reliability, plus the other quality attributes required for its specific context and purpose. In practice, an assessment is rarely totally valid or totally reliable. If a test is unreliable, it cannot be valid. For a test to be valid, it must be reliable. However, just because a test is reliable does not mean it will be valid. Reliability is a necessary but not sufficient condition for validity!
