
Response time & accuracy in two successive levels

of education
Aiden McCall
September 6, 2015

Abstract
This experiment explores how different factors affect mathematical abil-
ity, specifically focusing on the comparison between groups of A-Level stu-
dents and groups of students/academics in higher education. Participants
were asked to answer a series of simple questions on the topic of proba-
bility, the content of which should have been familiar from KS2 or KS3
work. We measured both the mathematical reaction time (the amount of
time it takes someone to solve a mathematical problem) and the accuracy
of the participants and then analysed these results, demonstrating clear
differences between the two aforementioned groups.

1 Introduction
In preparation for this research I read a number of research papers, in particular: Experimental Research about Effect of Mathematics Anxiety, Working Memory Capacity on Students’ Mathematical Performance With Three Different Type of Learning Methods1; Enhancing primary mathematics teaching and learning2; and A questionnaire for surveying mathematics self-efficacy expectations of future teachers3.
In the paper by Saeed Daneshamooz, Saeed Darvishian and Hassan
Alamolhodaei, it was found that those who learn from an internet based en-
vironment had a higher Mathematical Anxiety (feelings of tension and anxiety
that interfere with the manipulation of mathematical problems in a wide variety
of ordinary life and academic situations) compared with other forms of learning.
This was achieved by giving university students an exam after they had been
subject to one of the forms of learning. This method was good because it high-
lighted a difference between the forms of learning and showed a psychological
effect on the students based on how the same material was taught. However, it did not include a conclusive way of measuring the students’ anxiety, making the results imprecise.
In the paper by the CfBT Education Trust, it was found that there
was a need to find a balance between numeracy, notation, invented strategies,
knowledge, abstract reasoning, symbolism and practical, everyday knowledge
and the skills and problem-solving abilities required for everyday life and to
develop these into teaching methods for primary education. By studying dif-
ferent countries forms of teaching, they designed a guideline for primary school
teachers based on their findings. This method was effective because of the wide
variety of teaching methods incorporated into the design. However, it did not
include any data to show that their method had improved learning.
In the paper by Marc Zimmermann, Christine Bescherer and Chris-
tian Spannagel, it was found that out of their respondent sample, the students
performed better when given a real-world mathematical problem compared to
standard mathematical problems. This method was significant because it sug-
gests that, no matter the mathematical ability of those tested, the real-world
problem on average would get a better score. However, it used a method of self-
evaluation using a 5-point Likert scale (ranging from 1 I am not at all confident,
to 5 I am totally confident.) meaning the data was not objective.
Initially, I developed three experiments to test mathematical reaction
time. The first was a probability experiment using coloured counters in a bag.
For each question the participant would be told how many counters are in a
bag, and how many are subsequently removed, then asked which is the most
1 Experimental Research about Effect of Mathematics Anxiety, Working Memory Capacity on Students’ Mathematical Performance With Three Different Type of Learning Method: Saeed Daneshamooz, Saeed Darvishian and Hassan Alamolhodaei
2 Enhancing primary mathematics teaching and learning: CfBT Education Trust
3 A questionnaire for surveying mathematics self-efficacy expectations of future teachers: Marc Zimmermann, Christine Bescherer and Christian Spannagel

likely colour to be pulled from the bag. Each question was timed, and whether it was answered correctly was recorded without the participant’s knowledge. The questions were not specifically referred to as probability problems. This
was designed to test mathematical reaction time and accuracy, while avoiding
the mathematical anxiety attributed to formal mathematical problems and the
effect of time-pressure on accuracy. Because the experiment was an abstract idea, it was fairly straightforward to compare the results with many different socio-economic factors. The main drawback of the experiment was the difficulty of obtaining a large sample size while basing it on an ’Android platform’ (an electronic device that runs Android OS; a phone or a tablet could be an Android platform).
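The logic behind a single counter question can be sketched as follows; the colour names and counts here are hypothetical, as the report does not list the actual question set:

```python
from collections import Counter

def most_likely_colour(counts):
    """Return the colour a participant should choose: the one with the
    most counters remaining in the bag."""
    return max(counts, key=counts.get)

def apply_removals(counts, removed):
    """Update the bag after counters are removed without replacement,
    as in the questions from question 6 onwards."""
    updated = Counter(counts)
    updated.subtract(removed)
    return dict(updated)

# Hypothetical question: 5 red, 3 blue, 2 green; three reds are removed.
bag = {"red": 5, "blue": 3, "green": 2}
bag = apply_removals(bag, {"red": 3})
print(most_likely_colour(bag))  # blue
```

In the experiment the timing and correctness of each answer were logged server-side, so the participant only ever saw the question and the colour options.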
The second was a problem solving experiment using two different capac-
ity jugs to exactly fill another. The same problem would also be given to the
participant, as a pure mathematical question. The responses would be timed
and the outcome of each experiment recorded, without the knowledge of the
participants. The purpose of this experiment was, by collating data, to observe a difference between those who work predominantly in ’labour-intensive work’ (a manual-labour-based job, e.g. a builder or a welder) and those who work in ’theoretical labour work’ (an office-based job, e.g. an office worker or a teacher); our hypothesis was that manual workers would find the problem-solving experiment more accessible, and office workers would more easily complete the pure mathematical problem. This was beneficial as we could test abstract thinking and problem solving without needing any form of mathematics. We could then compare each participant’s success in the two experiments, and cross-reference
this with their profession to test our hypothesis. However, after discussion with
Phil Adey4 about designing such an experiment, it was apparent that we could
not easily store the data inputs for the pure mathematical question. To fix this we would have had to make the question more complex, adding brackets for the participant to use so that the mathematics written could be computed following BODMAS.
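As an illustration of the problem type (the report does not give the jug capacities that would have been used), the two-jug measuring puzzle can be solved by a breadth-first search over jug states; this is a sketch of the puzzle itself, not the experiment’s implementation:

```python
from collections import deque

def jug_steps(cap_a, cap_b, target):
    """Breadth-first search over jug states (a, b): fill a jug, empty a
    jug, or pour one into the other, until either jug holds the target."""
    start = (0, 0)
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        (a, b), steps = queue.popleft()
        if a == target or b == target:
            return steps
        pour_ab = min(a, cap_b - b)   # amount pourable from A into B
        pour_ba = min(b, cap_a - a)   # amount pourable from B into A
        for state in (
            (cap_a, b), (a, cap_b),          # fill a jug
            (0, b), (a, 0),                  # empty a jug
            (a - pour_ab, b + pour_ab),      # pour A -> B
            (a + pour_ba, b - pour_ba),      # pour B -> A
        ):
            if state not in seen:
                seen.add(state)
                queue.append((state, steps + 1))
    return None  # target amount is unreachable

# Classic instance: measure exactly 4 litres with a 3- and a 5-litre jug.
print(jug_steps(3, 5, 4))  # 6
```

The classic 3- and 5-litre instance takes six fill/empty/pour operations, which gives a sense of how much working a participant would have to do by hand.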
The third experiment was a pure mathematics problem with certain
groups being given a stimulus to aid with the experiment. The intention of the
experiment was to see whether a stimulus would improve mathematical reaction time. This was advantageous as the data, showing specific trends or no correlation, would be easy to interpret. However, we felt that it would be easier
for those with higher mathematical insight (those with a greater knowledge of
mathematics) to complete. Therefore, the stimulus would be more likely to help
those with a lower mathematical insight.

After discussing all three options, we decided to use the counter ques-
tion. We felt this was the most appropriate, as it did not discriminate against
those with less mathematical ability, and hence did not require us to focus on
a narrow subset of participants. The initial drawback of the experiment, not being able to acquire a large enough sample size, was removed by changing the Android application into a web-based application.
4 Phil was enlisted to program the designs into applications so they could be used on different devices.
2 ’Counter Experiment’ Development
I drew up an initial paper design for the counter experiment. The first draft included 4 colour options that could be selected, with a question displayed above. Simplicity was crucial in the design process so that all ages would be able to participate.

Figure 1: Initial paper design

However, before proceeding further with this experiment, we wanted to ensure the concept of the experiment would produce useful data, and to find any clear weaknesses. To accomplish this, the experiment was introduced to a small group of university students (7) studying mathematics; this was not carried out under strictly controlled conditions as it was predominantly an indicator experiment.

The results collected are presented and briefly explained below:

Figure 2: Graph representing data from testing Ambassador group

The graph demonstrates tight vertical clustering for all questions bar question 1, meaning these questions took the respondents a similar time to complete. The first question took many respondents significantly longer to
answer compared to the other two questions of a similar nature (questions 2
and 3). We attribute the larger spread for question 1 to the differing reading
and comprehension time needed. This is an important observation as it made
us aware that reading and comprehension time for initial questions would need
to be taken into account when analysing future data.

3 Web-based Design
Using this paper design as a basis, I then approached Phil Adey to develop a
web-based application. He advised me that in order to distribute this experiment
we would need a server to host the application on. Phil created an initial version
of the web-based application, based on my paper design.

Figure 3: The start of the experiment had 4 colours to select with an example
question above it.

Following this initial design phase, a number of changes were made to improve
user experience and remove any confusion in the questions.
We created an animation to show how the application functioned; this was
achieved by creating a series of instructions appearing on screen guiding the
user through a practice question.
With further internal testing it became apparent that, when clicking on a colour, there was no way of knowing which colour had been selected. To remedy this, Phil programmed the colour to turn purple when pressed, indicating to the participant that they had chosen this colour. Subsequently, the question changes as a final indication of their selection. The final improvement was emboldening the text from question 6 onwards, highlighting to the participant that the counters were not replaced, unlike in the previous questions. All of
these changes were to reduce reading and comprehension time, a limiting factor
recognised in the indicator experiment.

4 Personal data collection
After sending Phil an initial list of the data I intended to collect on each respondent, such as the county lived in and the country of residence, he advised that the sample sizes for certain options (e.g. individual counties) would be very small. For example, we may have had only 5 people from the East Midlands and none from Cornwall; this data would be incomparable. Hence, we decided to consolidate these variables into larger regions of the United Kingdom (North England, Republic of Ireland etc.). After this adjustment, the experiment was
ready to distribute.

Figure 4: A small section of the data collection page showing the drop-down
options in region
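The consolidation step amounts to a simple lookup from fine-grained options to broader buckets; the county names and region labels below are hypothetical examples, not the form’s actual drop-down list:

```python
# Hypothetical county-to-region mapping; the real form offered the
# larger regions directly as drop-down options.
COUNTY_TO_REGION = {
    "Kent": "South East England",
    "Cornwall": "South West England",
    "Derbyshire": "East Midlands",
    "Antrim": "Northern Ireland",
}

def region_for(county):
    """Consolidate a county into its larger region, with a catch-all
    bucket so that unmapped entries are not silently lost."""
    return COUNTY_TO_REGION.get(county, "Other / unspecified")

print(region_for("Cornwall"))  # South West England
print(region_for("Rutland"))   # Other / unspecified
```

Grouping this way trades geographic precision for sample sizes large enough to compare.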

5 Distribution of experiment
Our main aim when distributing was to create a large, varied sample consisting of different ages, education levels and social backgrounds. Asking people we knew would give us a moderately sized sample, mainly comprising 16-25 and 26-40 year olds from the south of England. Because of this, we felt it was important
to distribute through other online avenues to reach a wider audience. The experiment was shared on websites such as Facebook and TheStudentRoom; using social media allowed us to reach a large number of potential participants,
without geographical restrictions. However, because of time constraints, we could not adapt the test for international participants; hence only the U.K. audience reached on the online forums was able to take part. We expected the
participants engaged via Facebook to largely consist of friends/family of the
distributors, and therefore mainly geographically close. We hoped TheStudentRoom would reach those outside our location; however, we were aware that TheStudentRoom consists mainly of 16-25 year olds and stereotypically high-performing students.
Another method we employed to attempt to reach a more varied audience was to encourage all participants to redistribute the experiment themselves, but we were realistic in our expectation of how many participants would take the time to do so. Therefore, despite our efforts to reach a varied
audience, the majority of people we expected to collect data on would reside
in the south of England, and mainly fall into the age category 16-25. In future
experiments, we would like to get a greater sample size from all over the United
Kingdom.

6 Analysing data
After collecting all of the data, I initially grouped the data to compare different
variables. I then ran a number of comparisons.
The first comparison was between different aspects of the data collected on those who completed every question correctly. The purpose of this was to see if
those who had achieved full marks all took a similar length of time for each
question; this could suggest that there is an optimal length of time to spend in
order to achieve the correct result. The data was also compared to observe whether these respondents had similarities such as level of education or household income. However, none of these comparisons were conclusive and the data did not show any correlation between these factors.
My next comparison was between different aspects of the data collected on all the participants who were 17 years of age. The aim of this comparison was to
see if there was a general trend for time taken per question of that age group
and then compare that with household income or any previous employment. I
found that there was no correlation for these factors in this data set.

I then carried out a comparison of those with AS/A-level education and those with higher education, with the hypothesis that those in the AS/A group would have a faster mathematical reaction time, but that those from higher education would perform better. Having compared the two groups, their averages were calculated. This showed an interesting result.

Table 1: Comparison of average time taken for each question (in seconds)
Question Average time taken (AS/A) Average time taken (Higher education)
1 23.667 16.412
2 19.000 15.598
3 12.690 9.706
4 12.317 9.149
5 11.882 12.938
6 31.856 21.719
7 23.561 18.502
8 22.989 13.964
9 12.404 8.299
Total Average 18.930 14.032

I found that, while higher education had, on average, a faster mathematical
reaction time per question, their average score was lower, as seen in the tables
and graph below. This disproved our initial hypothesis that those in higher
education would perform better.

Table 2: Comparison of level of education for average question time (in seconds)
and final score
Level of education Average time taken per question Average final score
AS/A 18.930 8.36
Higher education 14.032 7.72
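The grouped averages in Tables 1 and 2 can be reproduced with a short grouping-and-averaging sketch; the response records below are made up for illustration, as the raw data set is not reproduced in this report:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical response records: (education level, question number, seconds).
responses = [
    ("AS/A", 1, 25.0), ("AS/A", 1, 22.3),
    ("Higher", 1, 17.1), ("Higher", 1, 15.7),
]

def average_times(records):
    """Group response times by (education level, question) and average
    them, mirroring the per-question averages reported in Table 1."""
    grouped = defaultdict(list)
    for level, question, seconds in records:
        grouped[(level, question)].append(seconds)
    return {key: round(mean(times), 3) for key, times in grouped.items()}

print(average_times(responses))
```

The same grouping, keyed only on education level, yields the overall averages in Table 2.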

Figure 5: Graph comparing the results of the two education groups.

From this data, we also recognised that it bore a similarity to the indicator experiment. On average, question 1 took significantly longer than the following questions (2 to 5). This supports our hypothesis that reading and comprehension time would affect the initial question. Similarly, the results support our expectation (following from the same hypothesis) that it would take longer for participants to complete question 6, as they were told at question 6 that the subsequent questions would take a different form.
A second significant comparison was between the two main gender sets (male and female). We saw no significant difference in mathematical reaction time or average score, contrary to stereotypical belief.
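The report does not state which test, if any, was used to judge this difference; one conventional check is Welch’s t statistic, sketched here with made-up samples (a value near 0 is consistent with no difference between the groups):

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t statistic for the difference in means of two independent
    samples with possibly unequal variances."""
    se = sqrt(variance(sample_a) / len(sample_a)
              + variance(sample_b) / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / se

# Made-up per-participant average times (seconds) for the two gender sets.
male_times = [14.2, 18.9, 16.4, 15.1, 17.3]
female_times = [15.0, 17.8, 16.9, 14.6, 16.2]
print(round(welch_t(male_times, female_times), 3))
```

A formal conclusion would also require the degrees of freedom and a p-value, but even the raw statistic gives a quick sense of whether a difference is worth pursuing.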

7 Discussion
After collecting and analysing the data, the group met to discuss the outcome
of the experiment, points of improvement, and where we could progress going
forward.

A shortfall we considered after completing this experiment was the absence of time-pressure on the participants. Our decision to time the experiment without the participants’ knowledge was taken to alleviate time-pressured errors. However, when a question is timed with the participants’ knowledge, this pressure discourages respondents from getting distracted or writing out their workings during the experiment, meaning one can more reasonably assume when analysing the data that all participants were putting in an equal amount of effort, with similar resources. In contrast, we cannot necessarily assume that those who took longer did so because of a lack of ability. They may not have been giving the experiment their full attention, or could, for example, have been writing out the questions and working out the probabilities formally. In future experiments we would need to weigh up the benefits against these drawbacks and decide whether it would be beneficial to make participants aware, at the beginning of the experiment, that their response times are being recorded.
Another factor that may have affected our results is the different platforms on which the experiment was taken. It was suggested that those who used a smartphone may not spend as long on the application as those sitting at a desktop. In future we would like to record which device the participant used to complete the experiment, so that we would be able to compare those who conducted the experiment on the two platforms.

8 Future work
On the final day of the project, the working group discussed how we could expand the reach and rigour of our experiment. The main ambition was to create a social media platform where people could participate in our experiments. We wanted to achieve this through a blog system where we would give feedback on each experiment we had completed, so those who participated could see where their input was going. To make it fun and engaging, we thought of having a ranking system for those who participated in the experiment, as well as the ability to create or respond to blogs. However, after further discussion we realised we would need to build a wide following before this method would be effective. Therefore, we decided that this may not be the best option to pursue, as it would take a large amount of time to create such a social media platform, and if it was not a success, money and time would have been wasted.
After lengthy discussion, we settled on the idea of trying to run the
experiments in local primary and secondary schools. Through Joe Watkin’s5 outreach work, we already have a contact list containing the details of many local teachers, which will give us the potential to reach thousands of students within the local area. Rachael6 will also be doing work experience at local schools in the upcoming academic year, which allows us to publicise any future advances. We have noted that any publicity carried out will still be limited geographically until we can release an invitation to participate to schools across the country.
Using an idea of Phil’s, we decided to call this project Mathscrowd, or Maths to the crowd. There are two clear benefits to running these experiments initially with local school students. Firstly, the experiments we run can be curriculum-focused, giving teachers a selection of lesson-starter activities which students can complete on their phones and tablets. Promoting this as a starter activity presents teachers with a familiar concept and a common lesson-planning tool, meaning it will be relatively easy to incorporate the activity. We hope this will increase uptake by staff, as it requires minimal work for teachers to implement.
The second benefit of our proposed project is that we can send the experiment results to the teachers. We intend to offer some standard data analysis with these results, giving teachers results to discuss as a short statistics activity and an example of how the data can be analysed (allowing the teacher and students to follow the methods to carry out further tests). Alternatively, teachers can opt not to receive this standard analysis and instead use the data as the core of a statistics-based project for the students, possibly comparing the class data to the overall data set. We expect that the personalisation of the data will increase the students’ enthusiasm for this project, another benefit for teaching staff, as mathematics projects are stereotypically not enjoyed by the majority of students.
5 University of Kent (SMSAS Outreach Officer, Lecturer in Mathematics)
6 University of Kent (Final-year Mathematics Undergraduate Student, Senior Outreach Manager)

We realise that personalised data has the potential to cause data-protection issues; to address this concern, we will not identify any student based on their input into the experiment, and all the input data will be anonymous. We will also advise the teacher to choose the comparisons with the students in mind. For example, if girls performed significantly lower than boys in a class, it would be unwise to compare the two. The anonymity of the experiment also makes it easier for us to gain permission to use schools’ data as part of our wider data collection set. Delivering this experiment to school classes allows us to collect a vast amount of data in a short time frame; this would be much more effective than our earlier idea of gaining participants via social media. The final way in which we hope to collect more data is through the Outreach programmes that the University of Kent mathematics department holds regularly.

