
BRADY MICHAEL JACK, CHIA-JU LIU, HOUN-LIN CHIU and CHUN-YEN TSAI
MEASURING THE CONFIDENCE OF 8TH GRADE TAIWANESE
STUDENTS’ KNOWLEDGE OF ACIDS AND BASES
Received: 3 March 2011; Accepted: 18 April 2011
ABSTRACT. The present study investigated whether gender differences were present in
the confidence judgments made by 8th grade Taiwanese students on the accuracy of their
responses to acid–base test items. A total of 147 (76 male, 71 female) students provided
item-specific confidence judgments during a test of their knowledge of acids and bases.
Using the correctness of the answer responses, a confidence rating score, an unweighted
rating score, and a relative confidence rating score were calculated for each respondent.
Comparisons between the boys and girls for each score area showed that girls scored
higher than boys in their knowledge of acids and bases, were more confident in this
knowledge, and were more willing to express different levels of confidence among the test items.
KEY WORDS: acid–base, chemistry education, item-specific confidence, knowledge,
science education
Studies investigating motivation and achievement (e.g., Feng & Tuan,
2005) and the perceptions of competence in carrying out inquiry activities
(e.g., Rahayu, Chandrasegaran, Treagust, Kita, & Ibnu, 2011) have used
confidence ratings as a way to measure students’ feelings of certainty
about what they had learned in the chemistry classroom [e.g., Students'
Motivation toward Science Learning (SMTSL): Tuan, Chin, & Shieh,
2005]. Researchers have found that engaging students in alternative
assessment activities that involve self-awareness of their test performance
and progress in learning enhances students’ strengths, reduces their
weaknesses, and develops their higher-order thinking skills, all of which
are important goals of chemistry and science education (Zoller, Fastow,
Lubezky, & Tsaparlis, 1999). The purpose of the present study was to
investigate eighth grade Taiwanese students’ assessments of their item-
specific confidence about their content knowledge of chemistry (acids and
bases) while taking a test.
International Journal of Science and Mathematics Education (2012) 10: 889–905
© National Science Council, Taiwan (2011)

Two types of confidence measurements exist: general confidence and
item-specific confidence. General confidence is a measure of one’s
beliefs, such as a grade prediction or an assessment of one’s potential
ability to accomplish a task, and is typically used to measure generalized
self-efficacy (Pajares, 1996). Pajares defined self-efficacy by quoting
Bandura as saying self-efficacy consists of “beliefs in one's capabilities to
organize and execute the courses of action required to manage
prospective situations.” An example of a statement that measures this
type of general confidence comes from the SMTSL, which states, “I am
not confident about understanding difficult science concepts.” Feng and
Tuan’s (2005) study used this statement and similar ones to measure
changes in the students’ self-efficacies among two high school chemistry
classes. The students’ responses on the items that focused on self-efficacy
allowed the researchers to measure the generalized feelings of confidence
between two groups of students before and after a 10-week period of
instruction. Although this previous study reported that both groups
comprised an equal number of boys and girls, the study did not focus on
the differences between the gender groups.
Item-specific confidence, in contrast to general confidence, measures
two variables: (1) what an examinee believes is the correct
answer and (2) how confident he or she is in this belief (Echternacht,
1972). Studies investigating students’ alternative conceptions of difficult
concepts in high school biology and physics (Odom & Barrow 2007;
Planinic, Boone, Krsnik, & Beilfuss, 2006) have used item-specific
confidence to measure how certain students were toward their alternative
conceptions. Neither of these studies was concerned with the differences
of item-specific confidence toward the content knowledge between boys
and girls. We were unable to locate any published studies that measured
item-specific confidence of middle school boys and girls toward their
content knowledge of chemistry in an Asian culture. The purpose of the
present study, therefore, was to investigate the differences between
Taiwanese boys and girls in their use of item-specific confidence toward
their content knowledge of chemistry during testing.
Studies on gender differences among middle school Taiwanese boys
and girls have shown boys to be more positive toward the notion of
science attitudes, such as discovering new things, as well as showing
more interest in careers in science than girls (Chen & Howard, 2010).
Similar studies in Western countries support the conclusion that girls
are generally less confident toward math and science than boys
(Lundeberg & Mohan, 2009). To our knowledge, no such conclusions
have been made between boys and girls in their item-specific
confidence on chemistry tests. Previous studies have shown that
students tend to be more overconfident when they are asked to make
predictions to hypothetical task situations (e.g., self-efficacy of one’s
performance on a test) than when faced with a specific performance of
a task (Glenberg & Epstein, 1987; Lundeberg & Mohan, 2009). One
could argue, therefore, that being generally more confident toward the
science of chemistry does not mean one is more confident and more
correct on chemistry tests.
The rationale for our study was spurred by Feng & Tuan’s (2005)
statement that their Taiwanese high school students
“…did not have confidence as to their success in
chemistry learning.” This statement led us to ask the question
“Does this lack of confidence toward chemistry learning exist at the
middle school level when Taiwanese students are first introduced to
the complexities of chemistry?” In a country that typically tops the
charts in science and mathematics in assessments such as TIMSS and
PISA, we were initially puzzled by such a statement. After doing a
careful investigation of confidence assessments and discovering that
respondents are typically not accurate in predicting their general
confidence (i.e., self-efficacy) in academic settings and that no studies
had been published in assessing students’ confidence toward item-
specific content, we decided to test the use of item-specific confidence
weighting among Taiwanese middle school students during their first
introduction to chemistry. Since general confidence was also used to
demonstrate that boys were more confident toward science than girls
in a recent study by Chen & Howard (2010), we also wanted to
determine whether this difference of confidence could be seen if
students used item-specific confidence weighting.
This present study sought to investigate whether boys were more
accurate and confident than girls when taking a test on acids and
bases. The teachers who assisted us in this study said that acids and
bases were typically difficult for middle school Taiwanese students to
understand. These teachers felt that the use of item-specific confidence
may help them better understand the confidence of their students
toward this important area of chemistry. These teachers were also
interested in knowing which gender would be more accurate in
expressing their confidence toward correct answers. To our knowledge,
no such studies have been conducted in an Asian culture. We believe
that the results of our study may encourage future research on
investigating item-specific confidence between boys and girls at other
grade levels and in other content areas.
We used Mangan’s (2001) concept of rightness as the theory
underlying the students’ use of item-specific confidence during testing.
He argued that feelings of “rightness” signal a person’s “feeling of
knowing” and correspond to their core feelings of positive evaluation,
coherence, meaningfulness, and knowledge. Item-specific confidence
can therefore reflect the students’ conscious feelings of rightness about
their content knowledge of chemistry. Once these feelings are
revealed, educators can determine the levels of accuracy toward this
feeling of knowing. In this study, we sought to answer three research
questions (RQ).
RQ 1. Are there any differences in grade scores between genders if
confidence weighting is used?
RQ 2. Are there any differences in grade scores between genders if
unweighted scoring is used?
RQ 3. Which gender was more willing to show risk by varying their
levels of confidence among the test items?
METHODOLOGY
Participants
The experiment herein involved giving a test of acids and bases after
10 weeks of instruction to 147 (76 male, 71 female) eighth grade
chemistry students from a large metropolitan public school in Southern
Taiwan.
Test Item Weighting Procedures

The test was composed of ten multiple-choice questions and constructed
so that students were required to write a weighted value after each
response to indicate his or her degree of item-specific confidence that the
answer response was correct. This method of using weighted values
toward specific test items, also called confidence weighting (e.g., Ebel,
1965; Soderquist, 1936), is intended to reduce the chance error
component of a test score. Ebel said that some of this error was a result
of an examinee’s good or bad luck in guessing.
Table 1 shows the strategy of assigning confidence weighted values to
specific test items. This method was used by Jack, Liu, Chiu &
Shymansky (2009b) during their pilot test of confidence weighting
among eighth grade science and mathematics Taiwanese students. During
a test, the students select a response, write a weighted value next to the
response that best represents their level of confidence that the response is
correct, and make sure the total of these weighted values equals the total
point value (TPV) of the test. Researchers of metacognition (e.g., Schraw,
2008) would term this item-specific weighted value toward a correct
selection as its absolute accuracy. Schraw stated: “Absolute accuracy
provides a measure of the precision of a judgment on a specific task.” In
our study, the weighted value provides an absolute measure of accuracy
of a respondent’s feeling of knowledge.

TABLE 1
System of weighted values and point values for multiple-choice test items
Weighted value | Significance of weighted value | Point value if right | Point value if wrong
15 | I am MORE confident this response is correct when compared to my confidence toward other responses. | 15 | 0
10 | I am confident this response is correct. | 10 | 0
5 | I am LESS confident this response is correct when compared to my confidence toward other responses. | 5 | 0

The middle weighted value is determined by the TPV divided by the
total number of test items. In this study, the TPV was 100 points, so the
middle weighted value was 10. Restricting the sum of all weighted values
to equal the TPV of the test was important for two reasons. First, this
restriction prevents a respondent from weighting all responses with a high
weighted value on the chance of being a lucky guesser. Second, this
restriction encourages the respondent to differentiate levels of confidence
toward their feeling of knowing among answer selections. To our
knowledge, no other study using confidence weighting has implemented
such a restriction.
Table 2 shows an example of a student’s weighted values and earned
points on two test items. The answer selection of test item 2 is right, so
the weighted value of 15 given by the student is allotted to her earned
points for the test. Test item 3, on the other hand, is wrong, so the
weighted value of 5 given by the student is not allotted to her earned
points for the test. Instead, an earned point score of 0 is given to this item.
Rules for Weighting Responses

To help students understand how to use weighting values, the teachers
used the analogy of playing a game of chance. In this analogy, the game
of chance is the test. To win points during this game, the students need to
follow three rules: (1) a response must be selected for every test item; (2)
after each response, the student must write a weighted value next to the
selection; (3) before giving the completed test to the teacher, the student
must make sure all weighted values equal 100 points. If any one of these
three rules is not followed, a score of 0 is automatically given to the
whole test, and any points that the student would have earned for right
responses would be lost.

TABLE 2
Example of a student’s weighted values and earned points
Test items | Weighted value | Earned points
2. Which of the following statements is not correct? (a) Electrolyte solution will be able to conduct electricity, so its liquid can also be conductive; (b) When electrolyte solution conducts electricity, a chemical reaction must take place; (c) The total charge carried by positive ions and negative ions within an electrolyte solution equals the total power; (d) When electrical current is induced into an electrolyte solution, positive ions move to the cathode and negative ions move to the anode. | 15 | 15
3. Within calcium chloride solution, what is the ratio of chloride and calcium? (a) 1:2 (b) 2:1 (c) 1:1 (d) 2:2 | 5 | 0
Note: [symbol] = correct answer; [symbol] = student’s selection
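The three rules for weighting responses can be sketched as a small check that a grader might run before scoring a paper. This is only an illustration under stated assumptions: the function name `follows_weighting_rules` is hypothetical, and restricting weighted values to 5, 10, and 15 is an assumption taken from the values listed in Table 1.

```python
# Assumption: only the weighted values listed in Table 1 may be used.
ALLOWED_WEIGHTS = {5, 10, 15}
TOTAL_POINT_VALUE = 100  # TPV of the test in this study


def follows_weighting_rules(responses, weights, n_items=10):
    """Return True only if a test paper satisfies all three rules.

    responses: one entry per item; None marks an unanswered item.
    weights:   one entry per item; None marks a missing weighted value.
    """
    # Rule 1: a response must be selected for every test item.
    if len(responses) != n_items or any(r is None for r in responses):
        return False
    # Rule 2: a weighted value must be written next to every response.
    if len(weights) != n_items or any(w not in ALLOWED_WEIGHTS for w in weights):
        return False
    # Rule 3: all weighted values together must equal the TPV.
    return sum(weights) == TOTAL_POINT_VALUE


# The responses and weighted values of the example student in Table 3.
table3_responses = list("adccbaadbc")
table3_weights = [10, 15, 5, 10, 10, 15, 15, 5, 10, 5]
print(follows_weighting_rules(table3_responses, table3_weights))  # True
```

A paper that fails this check would receive a score of 0 for the whole test, as described above; a paper weighting every item with 15 fails rule 3 because the values sum to 150.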
Test Scoring Procedures

Once a test paper showed the student as complying with the above
three rules, each answer response was checked for correctness. If a
response was right, the weighted value of the item was added to the
weighted score. Table 3 shows that the response for test item 2 was
correct, so the weighted value of 15 was added to the earned points
score. After all the responses were checked for correctness, the sum of
all earned points was written as the total test score for weighted
accuracy for the test. Table 3 shows the total test score using weighted
values was 80 points.
TABLE 3
Example of generating two total test scores from one set of responses
Test item | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | Total test score
Response | a | d | c | c | b | a | a | d | b | c |
Weighted value | 10 | 15 | 5 | 10 | 10 | 15 | 15 | 5 | 10 | 5 |
Weighted score | 10 | 15 | 0 | 10 | 10 | 15 | 15 | 0 | 0 | 5 | 80
Unweighted score | 10 | 10 | 0 | 10 | 10 | 10 | 10 | 0 | 0 | 10 | 70
Note: A zero "0" value indicates that an answer selection for an item was not correct
Next, an unweighted score was assigned to each test item. This
unweighted score represented how many points the student would have
earned if weighted values were not used.
Table 3 shows that if a weighted score was 5, 10, or 15, its unweighted
counterpart was 10. If a weighted score was 0, its unweighted counterpart
was 0. The sum of unweighted scores of all items was written as the total
test score for unweighted accuracy for the test. Table 3 shows that the
total test score using unweighted values was 70 points. These two total
test scores constituted the absolute measure of accuracy of a respondent’s
feeling of knowledge using weighted and unweighted scoring.
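The two scoring passes described above can be sketched in a few lines. The function name `score_test` is hypothetical. The correctness flags are an assumption read from the unweighted row of Table 3, where items 3, 8, and 9 score 0 and are therefore treated as incorrect.

```python
def score_test(weights, is_correct, item_point_value=10):
    """Return (weighted_total, unweighted_total) for one test paper."""
    # Weighted scoring: a right answer earns the weighted value the student
    # wrote next to it; a wrong answer earns 0.
    weighted = sum(w for w, ok in zip(weights, is_correct) if ok)
    # Unweighted scoring: every right answer earns the same fixed point value.
    unweighted = sum(item_point_value for ok in is_correct if ok)
    return weighted, unweighted


# Weighted values of the example student in Table 3.
weights = [10, 15, 5, 10, 10, 15, 15, 5, 10, 5]
# Assumption: items 3, 8, and 9 are the incorrect responses.
is_correct = [True, True, False, True, True, True, True, False, False, True]

print(score_test(weights, is_correct))  # (80, 70)
```

The totals of 80 and 70 match the weighted and unweighted total test scores reported for this student.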
After calculating the weighted and unweighted total test scores for each
student, we used an independent samples t test to measure differences
between boys and girls. We also used the Cohen’s d effect size analysis to
measure the relative importance of these differences.
Relative Confidence Score Index

After scoring responses for absolute accuracy, we next measured a
relative confidence score directly from the weighted values a respondent
made toward all test items. This relative confidence score represented the
level of risk a respondent was willing to take in assigning different
weighted values to answer responses. We were unable to find any studies
in science or chemistry education which measured differences in risk
among boys and girls during testing. Ziller (1957) stated that a high risk
taker was typically self-confident, physically and socially adequate,
competitive, self-expressive, secure, and strongly identified with the
masculine role. In Hong, Veach, & Lawrenz’s study (2003) of differences in
stereotypical thinking among Taiwanese secondary students, students
stereotypically thought of boys as superior in mathematics and science and
girls as superior in language and liberal arts. Based upon these studies, we
hypothesized that the boys in this study would be willing to show more risk
in varying weighting values toward answer selections than girls.
In order to test these assumptions, a relative confidence score (RCS)
index was constructed (see Table 4). The RCS index lists all possible
combinations of weighted values a respondent could make toward his or
her answer responses within the predefined limits of 100 points. The RCS
index in Table 4 shows six different RCS ratings from 100 to 0. An RCS
rating of 0 indicates an aversion to risk in using different levels of
weighted values among answer selections. An RCS rating of 100 shows a
high willingness to show risk using different levels of weighted values
among answer selections.
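One way to see why the index contains exactly six combinations is to enumerate them. The sketch below assumes, following Table 1, that only the weighted values 15, 10, and 5 may be used, and that the order in which values are assigned to items does not matter. The function name `possible_combinations` is hypothetical.

```python
from itertools import combinations_with_replacement

WEIGHTS = (15, 10, 5)  # assumption: the weighted values listed in Table 1
N_ITEMS = 10           # number of test items
TPV = 100              # total point value of the test


def possible_combinations():
    """List every multiset of ten weighted values that sums to the TPV."""
    return [
        combo
        for combo in combinations_with_replacement(WEIGHTS, N_ITEMS)
        if sum(combo) == TPV
    ]


for combo in possible_combinations():
    print(combo)
print(len(possible_combinations()))  # 6
```

Because the middle weighted value is 10, the sum stays at 100 only when every 15 is balanced by a 5, so a combination is determined by the number of 15s used (zero to five).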
The risk inclination model (Jack, Hung, Liu & Chiu, 2009a) was
used to calculate the RCS ratings for each combination of confidence.
A detailed explanation of how we calculated each RCS rating is given
in the Appendix. After the RCS rating index was constructed, we
compared the combination of the confidence values each student used on his
or her test with those listed in the index. Once a combination match was made,
the RCS rating for that combination was assigned to the student. Looking back
at Table 3 and comparing the combination of weighted values used by that
student with the RCS index, one can see that her combination of weighted
values matches those under the RCS rating of 84. This RCS rating of 84,
therefore, was assigned to this student as her relative confidence score for this
test.

TABLE 4
RCS index
Direction of risk: High <----------> Averse
RCS rating | 100 | 96 | 84 | 64 | 36 | 0
Possible combinations of weighted values assigned to answer selections of test items:
Item | 15 | 15 | 15 | 15 | 15 | 10
Item | 15 | 15 | 15 | 15 | 10 | 10
Item | 15 | 15 | 15 | 10 | 10 | 10
Item | 15 | 15 | 10 | 10 | 10 | 10
Item | 15 | 10 | 10 | 10 | 10 | 10
Item | 5 | 10 | 10 | 10 | 10 | 10
Item | 5 | 5 | 10 | 10 | 10 | 10
Item | 5 | 5 | 5 | 10 | 10 | 10
Item | 5 | 5 | 5 | 5 | 10 | 10
Item | 5 | 5 | 5 | 5 | 5 | 10

After an RCS rating was assigned to each student, we
used an independent samples t test to measure the difference between the
boys and girls. Cohen’s d effect size analysis was used to measure the
relative importance of this difference.
RESULTS
Scoring Accuracy
Confidence Accuracy Scores. The difference between boys’ and girls’
confidence accuracy scores was statistically significant in favor of
girls [mean = 69.44 (SD = 29.37) versus mean = 58.82 (SD = 28.31)]
(t = 2.23, p = 0.027). The girls demonstrated larger and more accurate
levels of confidence about their knowledge of acids and bases
compared with boys. Cohen’s d for the difference between genders
was 0.37.
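The reported t value and effect size can be checked from the summary statistics alone. The sketch below rests on assumptions that the text does not state: that the t test used the pooled-variance (equal variances) form, that Cohen's d used the pooled standard deviation, and that the first mean and SD belong to the girls (n = 71) and the second to the boys (n = 76). The function names are hypothetical.

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d: mean difference in units of the pooled SD."""
    return (mean1 - mean2) / pooled_sd(sd1, n1, sd2, n2)


def students_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent samples t statistic, assuming equal variances."""
    standard_error = pooled_sd(sd1, n1, sd2, n2) * math.sqrt(1 / n1 + 1 / n2)
    return (mean1 - mean2) / standard_error


# Confidence accuracy scores: girls first, boys second (assumed assignment).
girls = (69.44, 29.37, 71)
boys = (58.82, 28.31, 76)

d = cohens_d(*girls, *boys)
t = students_t(*girls, *boys)
print(round(d, 2), round(t, 2))  # 0.37 2.23
```

Under these assumptions the computed values agree with the reported t = 2.23 and d = 0.37.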
Unweighted Accuracy Scores. The difference between boys’ and girls’
unweighted accuracy scores was statistically significant in favor of
girls (t = 2.08, p = 0.03). These results demonstrate that girls had a larger level of
accuracy about their knowledge of acids and bases. The Cohen’s d effect size
of this finding was 0.34, which was slightly smaller than the magnitude
found in the confidence accuracy score results.
Choice of Weighted Versus Unweighted Scoring between Genders

After the tests were graded, two scores, a weighted score and an
unweighted score, were written on each test. Teachers passed back the
tests and allowed students to choose which score they wanted entered into
the grade book. Table 5 shows that only 14 students out of 147 students
wanted the unweighted score, 36 wanted the weighted score, and 97
students stated that either score was OK.
TABLE 5
Weighted versus unweighted score results between genders
Weighted versus unweighted scores | Boys | Girls
Weighted score higher than unweighted score | 14 | 22
Weighted score lower than unweighted score | 9 | 5
Weighted score equal to unweighted score | 53 | 44

RCSs Between the Genders

An independent samples t test revealed a significant difference in the
RCS ratings between the genders. This significance was in favor of
girls (t = 2.18, p = 0.031). The Cohen’s d effect size was calculated to be 0.36.
These results indicated that the girls were more willing than boys to take risks
in varying their levels of confidence toward their answer responses.
CONCLUSION
The purpose of this study was to measure the accuracy of self-regulated
levels of confidence toward the correctness of answer selections on a
chemistry test. Unlike previous studies that used self-efficacy to measure
confidence toward the learning of chemistry, the students in this study
used item-specific confidence weighting to indicate how confident they
were toward specific chemistry content that had been previously learned.
Based upon previous studies, we hypothesized that boys would be more
confident and more accurate in their test answer selections than girls. The
results of our analyses revealed the opposite.
Our first analysis sought to answer this question: Are there any
differences in grade scores between genders if confidence weighting is
used? The results showed girls as scoring significantly higher than boys
when confidence weighting was used. These results indicate that girls had
more confidence toward correct answers than boys. These results appear to
differ from the results of a TIMSS 2007 report (Martin, Mullis, Foy, Olson,
Erberber, Preuschoff & Galia, 2008) on the self-confidence of Taiwanese
eighth grade students toward science learning. This report stated that boys
were significantly higher in self-confidence in learning science than girls. To
assess students’ confidence toward science learning, TIMSS researchers
had students respond to four statements: (1) I usually do well in science; (2)
Science is harder for me than for many of my classmates; (3) I am just not
good at science; (4) I learn things quickly in science. Since these items were
not related to any specific science content, we did not understand how
objective conclusions regarding the self-confidence in learning science could
be drawn. The use of item-specific confidence weighting, on the other hand,
asks students to assess their confidence toward specific content taught to them
in class. By matching confidence to correctness on specific content, a clearer,
more objective, and more accurate assessment of self-confidence toward
science learning could be achieved.
A second research question followed and asked: Are there any
differences in grade scores between genders if unweighted scoring is
used? The results showed that if the typical method of unweighted
scoring (i.e., all items had a predefined point value) were used, girls were
significantly more accurate in selecting the correct answer than boys. The
TIMSS 2007 report showed boys as being slightly but not significantly
more accurate than girls. Recent studies have shown that males and
females demonstrate relatively equal levels of achievement in science and
mathematics during elementary, middle, and even high school (e.g., Else-
Quest, Hyde & Linn, 2010). The current problem seen by many educators
is not the differences in academic scores but the differences in confidence.
The issue of prime concern has been encouraging and nurturing young
girls to be actively involved in mathematics, chemistry, and physics at the
primary and secondary levels of education (United States Congress,
2009). The Institute of Education Sciences stated that girls who have low
confidence toward mathematics and science in early adolescence show
less interest in mathematics or science careers (Halpern, Aronson, Reimer,
Simpkins, Star & Wentzel, 2007). The use of item-specific confidence
weighting during formative classroom assessments is an easier and a more
objective way to measure self-confidence toward learning while measuring
content accuracy.
Since tests are typically composed of items which assess different
yet interrelated aspects of content, we wanted to measure the overall or
relative confidence of the respondent toward the test as a whole. This
assessment was not concerned with accuracy; it was concerned with
testing whether a respondent was willing to use confidence weighting
to indicate different levels of perceived difficulty among test items.
Since the respondent knew that applying a low or a high confidence to
a correct or incorrect answer selection held the potential risk of loss or
gain, we sought to answer this third research question: Which gender
was more willing to show risk by varying their levels of confidence
among the test items? The results of our third assessment showed girls
as being more willing to show risk by varying their levels of
confidence among test items. One possible reason for this is that girls
are known to perform better when they are given the opportunity to
reflect upon a test item from a broader or open-ended context (Hastedt
& Sibberns, 2005). As a result, girls may see more possible variations
than boys who are typically less competent than girls in a broader or
open-ended context.
IMPLICATIONS
The educational value of using item-specific confidence weighting during
testing is that it provides an educator a way to measure the accuracy of a
student’s content-specific knowledge and the confidence underlying this
knowledge. Instead of using self-efficacy to measure the general
confidence a student has toward the test content as a whole, we used
the concept of risk inclination to measure the general confidence the
student has toward varying his or her confidence among test items.
Researchers have stated that the characteristics of effective and self-
regulated learners are their ability to adjust their efforts based on
awareness of their own understanding and the level of difficulty of an
upcoming task (Isaacson & Fujita, 2006). Zoller et al. (1999) stated: “If
we engage students as partners in activities involving self-awareness and
self-evaluation of their test performance and progress in learning, they can
not only enhance their strengths and reduce their weaknesses but also
learn in greater depth and develop their higher-order cognitive skills-
requiring capabilities.” Involving students in the use of item-specific
confidence weighting during testing engages them in being aware of how
confident they are toward their knowledge of specific content. Having
students assign point values to answer selections as a way of predicting
their level of confidence toward the accuracy of the selection motivates
them to carefully evaluate the correctness of the selection. The results of
these predicted evaluations of correctness provide the teacher and the
students with an objective way to evaluate test performance and progress
in learning. As both parties identify where confidence is placed toward
correct answers, the knowledge confidence of the student is strengthened
and the teaching self-efficacy and outcome expectations of the teacher are
improved. Just as important, as both parties identify where confidence is
placed toward incorrect answers, the teacher is provided with an
opportunity to involve students as partners in identifying reasons
underlying their confidence in these incorrect answers and in exploring
possible solutions to correct this situation.
If we want students to use self-awareness and self-assessment strategies in
learning, we need to show them that their test grade is dependent upon the use
of these strategies during testing. As the old saying goes, “What you do in
practice, you will do in the game.” To prove the validity of this saying in the
use of item-specific confidence weighting, we encourage researchers to
consider the testing of two hypotheses: (1) Using item-specific confidence
weighting on classroom exercises that emphasize the practice of specific
content improves weighted and unweighted scores on formative tests. (2)
Using item-specific confidence weighting on classroom exercises and
formative tests over a period of time can predict students’ self-efficacy within
a specific domain.
ACKNOWLEDGMENTS
We would like to thank Mr. Marvin G. Connatser, Dr. Chun-Ming Shih,
Professor Yuh-Yih Wu, Professor Kuan-Ming Hung, and the teachers and
students who assisted us in this project. Without the assistance of these
friends, this work would not have been possible.
APPENDIX. RCS INDEX
The risk equation in RIM (Jack et al. 2009a) was used to calculate each
RCS rating score (see Table 6).
TABLE 6
RCS index
Direction of risk: High <----------> Averse
RCS rating score | 100 | 96 | 84 | 64 | 36 | 0
ith item of weighted values:
1st item | 15 | 15 | 15 | 15 | 15 | 10
2nd item | 15 | 15 | 15 | 15 | 10 | 10
3rd item | 15 | 15 | 15 | 10 | 10 | 10
4th item | 15 | 15 | 10 | 10 | 10 | 10
5th item | 15 | 10 | 10 | 10 | 10 | 10
6th item | 5 | 10 | 10 | 10 | 10 | 10
7th item | 5 | 5 | 10 | 10 | 10 | 10
8th item | 5 | 5 | 5 | 10 | 10 | 10
9th item | 5 | 5 | 5 | 5 | 10 | 10
10th item | 5 | 5 | 5 | 5 | 5 | 10
FORMAL DERIVATION OF THE RISK EQUATION
In order to calculate each RCS for each combination of possible weighted
values, we used the following risk equation:

$$R = \frac{I}{I_{\max}} \times \mathrm{TPV} \tag{1}$$

where TPV is the total point value of the test and I represents an
inclination to use different weighted values throughout the combination
set, calculated as

$$I = \left\{ \sum_{i=1}^{n} (W_0 - W_i) \times i \right\} / \mathrm{TPV} \tag{2}$$

where W0 is the middle weighted value (10 in this study), Wi is the weighted
value (W) that the student assigned to the ith item in the combination set,
i is the position of the item in the combination set, and n is the number
of test items.
Imax represents the maximum inclination to use different weighted
values throughout the combination set and is calculated as

$$I_{\max} = \left\{ \sum_{i=1}^{n/2} (W_0 - W_{\max}) \times i + \sum_{i=n/2+1}^{n} (W_0 - W_{\min}) \times i \right\} / \mathrm{TPV} \tag{3}$$

where Wmax is the maximum weighted value that could have been used
(15 in this study) and Wmin is the minimum weighted value that could
have been used (5 in this study).

TABLE 7
$I = \left\{ \sum_{i=1}^{n} (W_0 - W_i) \times i \right\} / \mathrm{TPV}$
(W0 − Wi) × ith item
(10 − 15) × 1 = −5
(10 − 15) × 2 = −10
(10 − 15) × 3 = −15
(10 − 10) × 4 = 0
(10 − 10) × 5 = 0
(10 − 10) × 6 = 0
(10 − 10) × 7 = 0
(10 − 5) × 8 = 40
(10 − 5) × 9 = 45
(10 − 5) × 10 = 50
I = 105/100

TABLE 8
$I_{\max} = \left\{ \sum_{i=1}^{n/2} (W_0 - W_{\max}) \times i + \sum_{i=n/2+1}^{n} (W_0 - W_{\min}) \times i \right\} / \mathrm{TPV}$
((W0 − Wmax) × ith item) + ((W0 − Wmin) × ith item)
((10 − 15) × 1) + ((10 − 5) × 6) = 25
((10 − 15) × 2) + ((10 − 5) × 7) = 25
((10 − 15) × 3) + ((10 − 5) × 8) = 25
((10 − 15) × 4) + ((10 − 5) × 9) = 25
((10 − 15) × 5) + ((10 − 5) × 10) = 25
Imax = 125/100
APPLICATION OF RISK EQUATION IN CALCULATING RCS RATING SCORE
The following weighted values (10, 15, 5, 10, 10, 15, 15, 5, 10, 5)
represent the weighted values of a student shown in Table 3 of the
manuscript.
These weighted (Wi) values are redistributed in descending order: 15,
15, 15, 10, 10, 10, 10, 5, 5, 5. Wmax (i.e., student’s maximum weighted
value) is 15 and the Wmin (i.e., student’s minimum weighted value) is 5.
Table 7 shows the use of Eq. 2 to calculate I for these redistributed
values. Using Eq. 2, I equals 1.05. Next, we use Eq. 3 to calculate Imax
(see Table 8).
Using Eq. 3, Imax equals 1.25. Finally, we use Eq. 1 to calculate R,
which is the RCS rating in this study (see Table 9).
TABLE 9
$R = \frac{I}{I_{\max}} \times \mathrm{TPV}$
(I / Imax) × TPV = RCS
(1.05 / 1.25) × 100 = 84

The RCS rating score for the student’s distribution of the weighting
values 15, 15, 15, 10, 10, 10, 10, 5, 5, and 5 is 84. This score of 84 is the
RCS rating score for this same distribution shown in Tables 4 and 6.
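The worked example in Tables 7 to 9 can be reproduced with a short sketch of Eqs. 1 to 3. The function names are hypothetical, and two points are assumptions drawn from the worked tables rather than stated rules: the weighted values are sorted in descending order before applying Eq. 2, and Imax is divided by the TPV, as the result in Table 8 (125/100) implies. Wmax and Wmin are taken as the largest and smallest values a student could have used (15 and 5).

```python
def inclination(weights, w0, tpv):
    """Eq. 2: inclination I for one combination of weighted values."""
    # Redistribute the weighted values in descending order, as in Table 7.
    ordered = sorted(weights, reverse=True)
    return sum((w0 - w) * i for i, w in enumerate(ordered, start=1)) / tpv


def max_inclination(n_items, w0, w_max, w_min, tpv):
    """Eq. 3: maximum inclination Imax (first half at w_max, rest at w_min)."""
    half = n_items // 2
    first_half = sum((w0 - w_max) * i for i in range(1, half + 1))
    second_half = sum((w0 - w_min) * i for i in range(half + 1, n_items + 1))
    return (first_half + second_half) / tpv


def rcs_rating(weights, w_max=15, w_min=5, tpv=100):
    """Eq. 1: R = (I / Imax) x TPV, the RCS rating."""
    n_items = len(weights)
    w0 = tpv / n_items  # middle weighted value: TPV divided by number of items
    i_value = inclination(weights, w0, tpv)
    i_max = max_inclination(n_items, w0, w_max, w_min, tpv)
    return i_value / i_max * tpv


# Weighted values of the example student in Table 3.
student = [10, 15, 5, 10, 10, 15, 15, 5, 10, 5]
print(round(rcs_rating(student)))  # 84
```

Applying `rcs_rating` to the six combinations in the RCS index, from five 15s and five 5s down to ten 10s, gives 100, 96, 84, 64, 36, and 0 after rounding, which matches the ratings listed in the index.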
REFERENCES
Chen, C.-H. & Howard, B. (2010). Effect of live simulation on middle school students'
attitudes and learning toward science. Educational Technology & Society, 13(1), 133–
139.
Ebel, R. L. (1965). Confidence weighting and test reliability. Journal of Educational
Measurement, 2(1), 49–57.
Echternacht, G. J. (1972). The use of confidence testing in objective tests. Review of
Educational Research, 42(2), 217–236.
Else-Quest, N. M., Hyde, J. S. & Linn, M. C. (2010). Cross-national patterns of gender
differences in mathematics: A meta-analysis. Psychological Bulletin, 136(1), 103–127.
Feng, S.-L. & Tuan, H.-L. (2005). Using ARCS model to promote 11th graders'
motivation and achievement in learning about acids and bases. International Journal of
Science and Mathematics Education, 3, 463–484.
Glenberg, A. M. & Epstein, W. (1987). Inexpert calibration of comprehension. Memory &
Cognition, 15, 84–93.
Halpern, D. F., Aronson, J., Reimer, N., Simpkins, S., Star, J. R. & Wentzel, K. (2007).
Encouraging girls in math and science—Institute of Education Sciences. Washington:
National Center for Education Research.
Hastedt, D. & Sibberns, H. (2005). Differences between multiple choice items and
constructed response items in the IEA TIMSS surveys. Studies in Educational
Evaluation, 31, 145–161.
Hong, Z.-R., Veach, P. M. & Lawrenz, F. (2003). An investigation of the gender
stereotyped thinking of Taiwanese secondary boys and girls. Sex Roles, 48(11/12), 495–
504.
Isaacson, R. M. & Fujita, F. (2006). Metacognitive knowledge monitoring and self-
regulated learning: academic success and reflections on learning. Journal of the
Scholarship of Teaching and Learning, 6(1), 39–55.
Jack, B. M., Hung, K. M., Liu, C. J. & Chiu, H. L. (2009a). Utilitarian Model of
Confidence Testing for Knowledge-Based Societies. Paper presented at the American
Educational Research Association (AERA), San Diego, CA, 13–17 April 2009.
Jack, B. M., Liu, C. J., Chiu, H. L. & Shymansky, J. A. (2009b). Confidence Testing for
Knowledge-Based Global Communities. Paper presented at the American Educational
Research Association (AERA), San Diego, CA, 13–17 April 2009.
Lundeberg, M. & Mohan, L. (2009). Context matters: Gender and cross-culture
differences in confidence. In D. J. Hacker, J. Dunlosky & A. C. Graesser (Eds.),
Handbook of metacognition in education (pp. 222–239). New York: Routledge.
Mangan, B. (2001). Sensation's ghost: The non-sensory "fringe" of consciousness. Psyche,
7(18). http://www.theassc.org/files/assc/2509.pdf.
Martin, M. O., Mullis, I. V. S., Foy, P., Olson, J. F., Erberber, E., Preuschoff, C. & Galia, J.
(2008). TIMSS 2007 International Science Report: Findings from IEA’s Trends in
International Mathematics and Science Study at the Fourth and Eighth Grades.
Chestnut Hill, MA: TIMSS & PIRLS International Study Center, Boston College.
Odom, A. L. & Barrow, L. H. (2007). High school biology students' knowledge and
certainty about diffusion and osmosis concepts. School Science and Mathematics,
107(3), 94–101.
Pajares, F. (1996). Self-efficacy beliefs in academic settings. Review of Educational
Research, 66(4), 543–578.
Planinic, M., Boone, W. J., Krsnik, R. & Beilfuss, M. L. (2006). Exploring alternative
conceptions from Newtonian dynamics and simple DC circuits: Links between item
difficulty and item confidence. Journal of Research in Science Teaching, 43(2), 150–
171.
Rahayu, S., Chandrasegaran, A. L., Treagust, D. F., Kita, M. & Ibnu, S. (2011).
Understanding acid–base concepts: Evaluating the efficacy of a senior high school
student-centered instructional program in Indonesia. International Journal of Science
and Mathematics Education. doi:10.1007/s10763-010-9272-x.
Schraw, G. (2008). A conceptual analysis of five measures of metacognitive monitoring.
Metacognition and Learning, 4, 33–45.
Soderquist, H. O. (1936). A new method of weighting scores in a true–false test. Journal
of Educational Research, 30, 290–292.
Tuan, H.-L., Chin, C.-C. & Shieh, S.-H. (2005). The development of a questionnaire to
measure students' motivation towards science learning. International Journal of Science
Education, 27(6), 639–654.
United States Congress (2009). Encouraging the participation of female students in STEM
fields: Hearing before the Subcommittee on Research and Science Education,
Committee on Science and Technology, House of Representatives, Serial No. 111-45,
111 CONG.
Ziller, R. C. (1957). A measure of the gambling response-set in objective tests.
Psychometrika, 22, 289–292.
Zoller, U., Fastow, M., Lubezky, A. & Tsaparlis, G. (1999). Students' self-assessment in
chemistry examinations requiring higher- and lower-order cognitive skills. Journal of
Chemical Education, 76, 112–113.
Brady Michael Jack and Chia-Ju Liu

Graduate Institute of Science Education
National Kaohsiung Normal University
Kaohsiung, Taiwan, Republic of China
E-mail: bradyjack@gmail.com
Houn-Lin Chiu
Chemistry Department
Kaohsiung Normal University
Chun-Yen Tsai
Department of Information Management
Cheng Shiu University

International Journal of Science and Mathematics Education (2012) 10: 889–905
