ABSTRACT. The present study investigated whether gender differences were present in
the confidence judgments made by 8th grade Taiwanese students on the accuracy of their
responses to acid–base test items. A total of 147 (76 male, 71 female) students provided
item-specific confidence judgments during a test of their knowledge of acids and bases.
Using the correctness of the answer responses, a confidence rating score, an unweighted
rating score, and a relative confidence rating score were calculated for each respondent.
The correlations between the boys and girls for each score area showed that girls scored
higher than boys in their knowledge of acids and bases, were more confident in this
knowledge, and were more willing to express different levels of confidence among the test items.
could argue, therefore, that being generally more confident toward the
science of chemistry does not mean one is more confident and more
correct on chemistry tests.
The rationale for our study was spurred by a statement made by
Feng & Tuan (2005) that their Taiwanese high school students
“…did not have confidence as to their success in
chemistry learning.” This statement led us to ask the question
“Does this lack of confidence toward chemistry learning exist at the
middle school level when Taiwanese students are first introduced to
the complexities of chemistry?” In a country that typically tops the
charts in science and mathematics in assessments such as TIMSS and
PISA, we were initially puzzled by such a statement. After a
careful investigation of confidence assessments, we found that
respondents are typically not accurate in predicting their general
confidence (i.e., self-efficacy) in academic settings and that no studies
had been published assessing students’ confidence toward
item-specific content. We therefore decided to test the use of item-specific
confidence weighting among Taiwanese middle school students during their first
introduction to chemistry. Since general confidence was also used to
demonstrate that boys were more confident toward science than girls
in a recent study by Chen & Howard (2010), we also wanted to
determine whether this difference in confidence could be seen if
students used item-specific confidence weighting.
This present study sought to investigate whether boys were more
accurate and confident than girls when taking a test on acids and
bases. The teachers who assisted us in this study said that acids and
bases are typically difficult for Taiwanese middle school students to
understand. These teachers felt that the use of item-specific confidence
may help them better understand the confidence of their students
toward this important area of chemistry. These teachers were also
interested in knowing which gender would be more accurate in
expressing their confidence toward correct answers. To our knowledge,
no such studies have been conducted in an Asian culture. We believe
that the results of our study may encourage future research on
investigating item-specific confidence between boys and girls at other
grade levels and in other content areas.
We used Mangan’s (2001) concept of rightness as the theory
underlying the students’ use of item-specific confidence during testing.
He argued that feelings of “rightness” signal a person’s “feeling of
knowing” and correspond to their core feelings of positive evaluation,
coherence, meaningfulness, and knowledge. Item-specific confidence
892 BRADY MICHAEL JACK ET AL.
METHODOLOGY
Participants
The experiment involved administering a test on acids and bases after
10 weeks of instruction to 147 (76 male, 71 female) eighth grade
chemistry students from a large metropolitan public school in southern
Taiwan.
TABLE 1
System of weighted values and point values for multiple-choice test items
Weighted value | Significance of weighted value | Point value (Right) | Point value (Wrong)
TABLE 2
Example of a student’s weighted values and earned points
Test item
2. Which of the following statements is not correct?
(a) Electrolyte solution will be able to conduct electricity, so its liquid
can also be conductive;
(b) When electrolyte solution conducts electricity, a chemical reaction
must take place;
(c) The total charge carried by positive ions and negative ions within
an electrolyte solution equals the total power;
(d) When electrical current is induced into an electrolyte solution,
positive ions move to the cathode and negative ions move to
the anode.
Weighted value: 15
Earned points: 15
follow three rules: (1) a response must be selected for every test item; (2)
after each response, the student must write a weighted value next to the
selection; (3) before giving the completed test to the teacher, the student
must make sure all weighted values sum to 100 points. If any one of these
three rules is not followed, a score of 0 is automatically given to the
whole test, and any points that the student would have earned for right
responses would be lost.
TABLE 3
Example of generating two total test scores from one set of responses
                                                          Score
Response:         a   d   c   c   b   a   a   d   b   c
Weighted value:   10  15  5   10  10  15  15  5   10  5
Weighted:         10  15  0   10  10  15  15  0   0   5     80
Unweighted:       10  10  0   10  10  10  10  0   0   10    70
Note: A zero ("0") value indicates that the answer selected for an item was not correct.
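The scoring procedure illustrated in Table 3 can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' scoring tool: the function names are hypothetical, the correctness flags are read from the unweighted row of Table 3, and the sketch assumes that a wrong answer earns 0 points, that each item is worth 10 unweighted points, and that a rule violation zeroes both totals.

```python
def rules_followed(responses, weights, total=100):
    """Check the three test-taking rules: every item answered, every answer
    weighted, and all weighted values summing to `total` points."""
    answered = all(r is not None for r in responses)
    weighted = len(weights) == len(responses) and all(w is not None for w in weights)
    return answered and weighted and sum(weights) == total


def score_test(responses, weights, correct, item_points=10):
    """Return (weighted_score, unweighted_score) for one student.

    Assumption: a rule violation gives 0 for the whole test, and a wrong
    answer earns 0 points in both scoring schemes.
    """
    if not rules_followed(responses, weights):
        return 0, 0
    weighted_score = sum(w for w, ok in zip(weights, correct) if ok)
    unweighted_score = item_points * sum(1 for ok in correct if ok)
    return weighted_score, unweighted_score


# Responses and weighted values of the student in Table 3.
responses = ["a", "d", "c", "c", "b", "a", "a", "d", "b", "c"]
weights = [10, 15, 5, 10, 10, 15, 15, 5, 10, 5]
# Correctness inferred from the unweighted row (items 3, 8, and 9 are wrong).
correct = [True, True, False, True, True, True, True, False, False, True]

print(score_test(responses, weights, correct))
```

With these inputs the sketch reproduces the two totals reported in Table 3 (80 weighted, 70 unweighted).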
girls as superior in language and liberal arts. Based upon these studies, we
hypothesized that the boys in this study would be willing to show more risk
in varying weighting values toward answer selections than girls.
In order to test these assumptions, a relative confidence score (RCS)
index was constructed (see Table 4). The RCS index lists all possible
combinations of weighted values a respondent could make toward his or
her answer responses within the predefined limits of 100 points. The RCS
index in Table 4 shows six different RCS ratings from 100 to 0. An RCS
rating of 0 indicates an aversion to risk in using different levels of
weighted values among answer selections. An RCS rating of 100 shows a
high willingness to show risk using different levels of weighted values
among answer selections.
The risk inclination model (Jack, Hung, Liu & Chiu, 2009a) was
used to calculate the RCS ratings for each combination of confidence.
A detailed explanation of how we calculated each RCS rating is given
in the Appendix. After the RCS rating index was constructed, we
compared the combination of the confidence values each student used on his
or her test with those listed in the index. Once a combination match was made,
the RCS rating for that combination was assigned to the student. Looking back
at Table 3 and comparing the combination of weighted values used by that
student with the RCS index, one can see that her combination of weighted
values matches the one under the RCS rating of 84. This RCS rating of 84,
TABLE 4
RCS index
Possible combinations of weighted values assigned to answer selections of test items
(each column is one combination)
Item   15  15  15  15  15  10
Item   15  15  15  15  10  10
Item   15  15  15  10  10  10
Item   15  15  10  10  10  10
Item   15  10  10  10  10  10
Item    5  10  10  10  10  10
Item    5   5  10  10  10  10
Item    5   5   5  10  10  10
Item    5   5   5   5  10  10
Item    5   5   5   5   5  10
MEASURING CONFIDENCE IN KNOWLEDGE OF ACIDS AND BASES 897
therefore, was assigned to this student as her relative confidence score for this
test.
After an RCS rating was assigned to each student, we
used an independent samples t test to measure the difference between the
boys and girls. Cohen’s d effect size analysis was used to measure the
relative importance of this difference.
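The comparison described above can be sketched with the Python standard library. This is only an illustration under stated assumptions: the two lists of RCS ratings are invented for the example and are not data from this study, the function names are hypothetical, and the pooled-variance (Student) form of the t statistic is assumed because the text does not say which variant was used.

```python
from math import sqrt
from statistics import mean, variance


def pooled_sd(a, b):
    """Pooled standard deviation of two independent samples."""
    na, nb = len(a), len(b)
    return sqrt(((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2))


def independent_t(a, b):
    """Student's t statistic for two independent samples (equal variances assumed)."""
    return (mean(a) - mean(b)) / (pooled_sd(a, b) * sqrt(1 / len(a) + 1 / len(b)))


def cohens_d(a, b):
    """Cohen's d: the mean difference expressed in pooled standard deviations."""
    return (mean(a) - mean(b)) / pooled_sd(a, b)


# Hypothetical RCS ratings, made up purely for illustration.
girls = [84, 96, 64, 84, 100, 64]
boys = [36, 64, 0, 84, 36, 64]

t_value = independent_t(girls, boys)
d_value = cohens_d(girls, boys)
print(f"t = {t_value:.2f}, d = {d_value:.2f}")
```

The t statistic indicates whether the gender difference is statistically reliable, whereas Cohen's d expresses its size independently of the number of students, which is why the two are reported together.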
RESULTS
Scoring Accuracy
TABLE 5
Weighted versus unweighted score results between genders
Students
CONCLUSION
IMPLICATIONS
of these strategies during testing. As the old saying goes, “What you do in
practice, you will do in the game.” To prove the validity of this saying in the
use of item-specific confidence weighting, we encourage researchers to
consider the testing of two hypotheses: (1) Using item-specific confidence
weighting on classroom exercises that emphasize the practice of specific
content improves weighted and unweighted scores on formative tests. (2)
Using item-specific confidence weighting on classroom exercises and
formative tests over a period of time can predict students’ self-efficacy within
a specific domain.
ACKNOWLEDGMENTS
The risk equation in RIM (Jack et al. 2009a) was used to calculate each
RCS rating score (see Table 6).
TABLE 6
RCS index
where TPV is the total point value of the test and I represents an
inclination to use different weighted values throughout the combination
set, calculated as
\[ I = \left( \sum_{i=1}^{n} (W_0 - W_i) \times i \right) / \mathrm{TPV} \qquad (2) \]
TABLE 7
\[ I = \left( \sum_{i=1}^{n} (W_0 - W_i) \times i \right) / \mathrm{TPV} \]
(10 − 15) × 1 = −5
(10 − 15) × 2 = −10
(10 − 15) × 3 = −15
(10 − 10) × 4 = 0
(10 − 10) × 5 = 0
(10 − 10) × 6 = 0
(10 − 10) × 7 = 0
(10 − 5) × 8 = 40
(10 − 5) × 9 = 45
(10 − 5) × 10 = 50
I = 105/100 = 1.05
TABLE 8
\[ I_{\max} = \left( \sum_{i=1}^{n/2} (W_0 - W_{\max}) \times i + \sum_{i=n/2+1}^{n} (W_0 - W_{\min}) \times i \right) / \mathrm{TPV} \]
where Wmax is the maximum weighted value that could have been used and
Wmin is the minimum weighted value that could have been used. In this
study, Wmax is 15 and Wmin is 5.
The following weighted values (10, 15, 5, 10, 10, 15, 15, 5, 10, 5)
represent the weighted values of a student shown in Table 3 of the
manuscript.
These weighted (Wi) values are redistributed in descending order: 15,
15, 15, 10, 10, 10, 10, 5, 5, 5. Wmax (i.e., student’s maximum weighted
value) is 15 and the Wmin (i.e., student’s minimum weighted value) is 5.
Table 7 shows the use of Eq. 2 to calculate I for these redistributed
values. Using Eq. 2, I equals 1.05. Next, we use Eq. 3 to calculate Imax
(see Table 8).
Using Eq. 3, Imax equals 1.25. Finally, we use Eq. 1 to calculate R,
which is the RCS rating in this study (see Table 9).
TABLE 9
\[ R = \frac{I}{I_{\max}} \times \mathrm{TPV} \]
The RCS rating score for the student’s distribution of the weighting
values 15, 15, 15, 10, 10, 10, 10, 5, 5, and 5 is 84. This score of 84 is the
RCS rating score for this same distribution shown in Tables 4 and 6.
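The three calculation steps above can be checked with a short Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, W0 is taken to be the even-split value TPV / n (10 in this study, as used in Table 7), n is assumed to be even, and Wmax = 15 and Wmin = 5 are the limits stated for this study.

```python
def inclination(weights, w0, tpv):
    """Eq. 2: sum of (W0 - Wi) * i over the weights in descending order, divided by TPV."""
    ordered = sorted(weights, reverse=True)
    return sum((w0 - w) * i for i, w in enumerate(ordered, start=1)) / tpv


def max_inclination(n, w0, w_max, w_min, tpv):
    """Eq. 3: first half of the items at Wmax, second half at Wmin (n assumed even)."""
    half = n // 2
    upper = sum((w0 - w_max) * i for i in range(1, half + 1))
    lower = sum((w0 - w_min) * i for i in range(half + 1, n + 1))
    return (upper + lower) / tpv


def rcs_rating(weights, w_max=15, w_min=5):
    """Eq. 1: R = I / Imax * TPV, with W0 assumed to equal TPV / n."""
    tpv = sum(weights)
    n = len(weights)
    w0 = tpv / n
    i_value = inclination(weights, w0, tpv)
    i_max = max_inclination(n, w0, w_max, w_min, tpv)
    return i_value / i_max * tpv


# Weighted values of the student shown in Table 3.
student = [10, 15, 5, 10, 10, 15, 15, 5, 10, 5]
print(round(rcs_rating(student)))
```

For the student's weighted values the sketch gives I = 1.05, Imax = 1.25, and a rounded rating of 84, in agreement with the worked example. Under the same assumptions, five weights of 15 and five of 5 give a rating of 100 and ten weights of 10 give a rating of 0, which matches the stated range of the RCS index.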
REFERENCES
Chen, C.-H. & Howard, B. (2010). Effect of live simulation on middle school students'
attitudes and learning toward science. Educational Technology & Society, 13(1), 133–
139.
Ebel, R. L. (1965). Confidence weighting and test reliability. Journal of Educational
Measurement, 2(1), 49–57.
Echternacht, G. J. (1972). The use of confidence testing in objective tests. Review of
Educational Research, 42(2), 217–236.
Else-Quest, N. M., Hyde, J. S. & Linn, M. C. (2010). Cross-national patterns of gender
differences in mathematics: A meta-analysis. Psychological Bulletin, 136(1), 103–127.
Feng, S.-L. & Tuan, H.-L. (2005). Using ARCS model to promote 11th graders'
motivation and achievement in learning about acids and bases. International Journal of
Science and Mathematics Education, 3, 463–484.
Glenberg, A. M. & Epstein, W. (1987). Inexpert calibration of comprehension. Memory &
Cognition, 15, 84–93.
Halpern, D. F., Aronson, J., Reimer, N., Simpkins, S., Star, J. R. & Wentzel, K. (2007).
Encouraging girls in math and science—Institute of Education Sciences. Washington:
National Center for Education Research.
Hastedt, D. & Sibberns, H. (2005). Differences between multiple choice items and
constructed response items in the IEA TIMSS surveys. Studies in Educational
Evaluation, 31, 145–161.
Hong, Z.-R., Veach, P. M. & Lawrenz, F. (2003). An investigation of the gender
stereotyped thinking of Taiwanese secondary boys and girls. Sex Roles, 48(11/12), 495–
504.
Isaacson, R. M. & Fujita, F. (2006). Metacognitive knowledge monitoring and self-
regulated learning: academic success and reflections on learning. Journal of the
Scholarship of Teaching and Learning, 6(1), 39–55.
Jack, B. M., Hung, K. M., Liu, C. J. & Chiu, H. L. (2009a). Utilitarian Model of
Confidence Testing for Knowledge-Based Societies. Paper presented at the American
Education Research Association (AERA), San Diego, CA, 13–17 April 2009.
Jack, B. M., Liu, C. J., Chiu, H. L. & Shymansky, J. A. (2009b). Confidence Testing for
Knowledge-Based Global Communities. Paper presented at the American Education
Research Association (AERA), San Diego, CA, 13–17 April 2009.
Lundeberg, M. & Mohan, L. (2009). Context matters: Gender and cross-culture
differences in confidence. In D. J. Hacker, J. Dunlosky & A. C. Graesser (Eds.),
Handbook of metacognition in education (pp. 222–239). New York: Routledge.
Mangan, B. (2001). Sensation's ghost: The non-sensory "fringe" of consciousness. Psyche,
7(18). http://www.theassc.org/files/assc/2509.pdf.
Martin, M. O., Mullis, I. V. S., Foy, P., Olson, J. F., Erberber, E., Preuschoff, C., Galia, J.
(2008). TIMSS 2007 International Science Report: Findings from IEA’s Trends in
International Mathematics and Science Study at the Fourth and Eighth Grades.
Chestnut Hill, MA: TIMSS & PIRLS International Study Center, Boston College.
Odom, A. L. & Barrow, L. H. (2007). High school biology students' knowledge and
certainty about diffusion and osmosis concepts. School Science and Mathematics, 107
(3), 94–101.
Pajares, F. (1996). Self-efficacy beliefs in academic settings. Review of Educational
Research, 66(4), 543–578.
Planinic, M., Boone, W. J., Krsnik, R. & Beilfuss, M. L. (2006). Exploring alternative
conceptions from Newtonian dynamics and simple DC circuits: Links between item
difficulty and item confidence. Journal of Research in Science Teaching, 43(2), 150–
171.
Rahayu, S., Chandrasegaran, A. L., Treagust, D. F., Kita, M. & Ibnu, S. (2011).
Understanding acid–base concepts: Evaluating the efficacy of a senior high school
student-centered instructional program in Indonesia. International Journal of Science
and Mathematics Education. doi:10.1007/s10763-010-9272-x.
Schraw, G. (2008). A conceptual analysis of five measures of metacognitive monitoring.
Metacognition and Learning, 4, 33–45.
Soderquist, H. O. (1936). A new method of weighting scores in a true–false test. Journal
of Educational Research, 30, 290–292.
Tuan, H.-L., Chin, C.-C. & Shieh, S.-H. (2005). The development of a questionnaire to
measure students' motivation towards science learning. International Journal of Science
Education, 27(6), 639–654.
United States Congress (2009). Encouraging the participation of female students in STEM
fields: Hearing before the Subcommittee on Research and Science Education,
Committee on Science and Technology, House of Representatives, Serial No. 111-45,
111 CONG.
Ziller, R. C. (1957). A measure of the gambling response-set in objective tests.
Psychometrika, 22, 289–292.
Zoller, U., Fastow, M., Lubezky, A. & Tsaparlis, G. (1999). Students' self-assessment in
chemistry examinations requiring higher- and lower-order cognitive skills. Journal of
Chemical Education, 76, 112–113.
Houn-Lin Chiu
Chemistry Department
Kaohsiung Normal University
Kaohsiung, Taiwan, Republic of China
Chun-Yen Tsai
Department of Information Management
Cheng Shiu University
Kaohsiung, Taiwan, Republic of China