
Mathematical Exploration

The Galton Board

IB SL Mathematics

Nidhi Rameshan

Contents

Introduction
Rationale
Aim and Methodology
Aim
Part I: Modelling the curve for SAT scores based on Galton Board
Part II: Analysis by comparison of actual and modelled SAT scores
Conclusion
Works cited


Introduction

The bean machine, also known as the Galton Board or quincunx, is a 7.5" by

4.5" desktop machine, invented by Sir Francis Galton to demonstrate probability. It

consists of an upright board with evenly spaced nails or pegs, driven into its upper

half, where the nails are arranged in staggered order (a quincunx), and the lower half

divided into a number of evenly-spaced rectangular slots. The upper edge has a funnel

with balls, each of diameter less than the distance between the pegs. The funnel is

located precisely above the central peg of the second row, so that each ball would fall

vertically on the uppermost point of the nail's surface. When the board is rotated on its
axis, the balls are released from the funnel at the top of the board and have a
50% chance of going either way every time they hit a pin, and eventually fall
into one of the slots. Each slot in the board has an associated probability, and
looking at all the slots together gives a distribution. The probability that a ball
ends up in an outer slot is very small. Thus, the balls tend to go towards the
centre and, as they accumulate in the slots, approximate what is called a normal
distribution, or bell-shaped curve.

The Galton board shows the emergence of an orderly bell curve from the chaos

of numerous falling balls bouncing off pegs. This shows the distribution of balls into

the slots as they fall either left or right of the pegs.

Rationale

The Galton Board, or bean machine as I like to call it, played the role of a toy

during my childhood. It was situated on the table in my father's office, and I used to

play with it whenever I could. I was fascinated by the way the tiny metal balls would

hit the pegs and change their course, to finally fall into one of the slots, and form a
curve. I used to wonder why the pegs were placed in the shape of a triangle, and

if that affected the way the balls moved and fell into the slots. My father tried to explain

the rules of probability to me, but I was too young to understand, and one day, the

bean machine broke. When we started learning probability in the lower classes, I
vaguely remembered the term, but could not associate it with anything I could
remember. Also, as a high school student, I am required to take the SAT, and
after understanding the principle of the Galton Board, I realised that the distribution
of SAT scores offered a good way to investigate the board and how it can be used to
analyse real-life situations.

Aim and Methodology

Aim

The aim of the exploration is to use the principle of the Galton Board to analyse
and predict real-life situations that are governed by random chance and follow a normal
distribution. In this case, the SAT scores for October 2017 will be analysed, because
the large number of test takers allows the outcomes to be treated as random, to estimate
the predicted scores and compare them with the actual scores obtained by the test takers.

The exploration will be divided into two parts, Part I and Part II. Part I is the
modelling of the curve, which uses the concept of the Galton Board in
tandem with regression to make a statistical model of the SAT, with the beads in the
Galton Board representing the test takers and the pegs representing the questions in
the test. Part II is the analysis, where the cumulative distribution function and related
error function will be used to analyse the differences between the predicted model and
its real-life equivalent, using calculus to compare the actual scores with the predicted
scores and so show whether the test was harder, easier or on par with expectations.

Part I: Modelling the curve for SAT scores based on Galton Board

The Galton Board works on the principle of the central limit theorem, using a
binomial distribution to approximate a normal distribution. According to the
central limit theorem (CLT), the sum of a large number of independent random variables is
approximately normally distributed, and the mean of the sample means will be
approximately equal to the mean of the entire data set. The Galton Board is similar to
Pascal's Triangle, and can be used to model this situation. Just as with the flipping of a
coin, where the more flips are made, the closer the proportion of each outcome gets to
50%, in the Galton Board, the higher the number of balls and pegs on the board,
the more closely the slots form a bell curve. Every time a ball hits a peg, it bounces
to the right with probability p and to the left with probability q = 1 - p. The nails are
symmetrically placed in the form of a quincunx, so the balls bounce either way with equal
probability, i.e. p = q = 1/2. When a ball reaches the bottom row of a board with N rows
of pegs, it hits the nth peg from the left exactly when it has taken n right turns. This
occurs with probability

\[ P(n) = \binom{N}{n} p^{n} q^{N-n} \]

and this gives rise to a binomial distribution.
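
The binomial distribution described above can be tabulated directly. The following Python sketch lists the probability of each slot for a board with p = 1/2. The row count of 11 is an assumption made here for illustration (it is the number of rows that produces the 12 slots used later in the exploration), and the function name `slot_probabilities` is hypothetical.

```python
from math import comb


def slot_probabilities(rows, p=0.5):
    """Probability of taking n right turns in `rows` bounces, for n = 0..rows."""
    q = 1.0 - p
    return [comb(rows, n) * p**n * q**(rows - n) for n in range(rows + 1)]


# Assumption: 11 rows of pegs, which gives 12 slots.
probs = slot_probabilities(11)

for n, pr in enumerate(probs):
    print(f"slot {n:2d}: {100 * pr:6.3f} %")
```

The probabilities are symmetric about the centre and are the entries of the corresponding row of Pascal's Triangle divided by 2 to the power of the number of rows.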

If the number of balls is sufficiently large and p = q = 1/2, then according to the weak
law of large numbers, the distribution will approximate a normal distribution,
where

\[ \lim_{x \to \infty} P(x) = 0.5 \]
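
To see how close the binomial slot probabilities already are to a normal curve, the sketch below compares them with a normal density that has the same mean Np and variance Npq. The value N = 11 is an assumption chosen for illustration, and `binomial_pmf` and `normal_pdf` are helper names introduced here rather than anything from the original exploration.

```python
from math import comb, exp, pi, sqrt


def binomial_pmf(n, N, p=0.5):
    """Probability of exactly n right turns in N bounces."""
    return comb(N, n) * p**n * (1 - p) ** (N - n)


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    return exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * sqrt(2 * pi))


N, p = 11, 0.5                 # assumption: 11 rows, fair bounces
mean = N * p                   # mean of the binomial distribution
sd = sqrt(N * p * (1 - p))     # standard deviation of the binomial distribution

# Largest absolute difference between the two curves over all slots.
max_gap = max(
    abs(binomial_pmf(n, N, p) - normal_pdf(n, mean, sd)) for n in range(N + 1)
)
print(f"mean = {mean}, sd = {sd:.3f}, largest gap = {max_gap:.4f}")
```

Increasing N in this sketch shrinks the gap further, which is the behaviour the bell curve on the board relies on.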
A normal distribution has the property that the mean, median and mode are all the
same. The standard normal distribution equation is

\[ \frac{1}{\sqrt{2\pi}} \times e^{-\frac{1}{2}x^{2}} \]

and for a random variable X, the probability that X will take on a value which is less
than or equal to x is called the cumulative distribution function,

\[ F(x) = \frac{1}{\sqrt{2\pi}} \times \int_{-\infty}^{x} e^{-\frac{t^{2}}{2}} \, dt \]
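
The cumulative distribution function of the standard normal distribution has no closed form, but it can be written in terms of the error function as Φ(x) = ½[1 + erf(x/√2)]. A minimal Python sketch using the standard library's `math.erf` follows; the function name `standard_normal_cdf` is chosen here for illustration.

```python
from math import erf, sqrt


def standard_normal_cdf(x):
    """P(X <= x) for a standard normal variable, via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


for x in (-1.96, -1.0, 0.0, 1.0, 1.96):
    print(f"P(X <= {x:5.2f}) = {standard_normal_cdf(x):.4f}")
```

Because the distribution is symmetric about zero, exactly half of the probability lies below the mean.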

While modelling the SAT scores, the test takers are considered to be the
balls in the board, and the pegs represent the questions in the test. In the
simulation of the Galton Board, the number of columns is set to 12
and the number of beads to 100000. Twelve columns are used
because SAT scores range from 400 to 1600, so each column
spans 100 points. The simulation uses 100000 beads because the mean
number of test takers per year is 1.7 million, so each bead represents 17 test takers.
As the total number of questions in the SAT is 154, each row of pegs is assumed to
represent 14 questions (154 questions spread over the 11 rows of pegs that feed 12
columns). While modelling the graph, it is also assumed that each student had a 50%
chance of getting each question right, considering the difficulty of each question. To
find the points on the curve to be used in the regression, I chose the midpoint of each
column as its average score.
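
The set-up described above can be imitated with a short Monte Carlo sketch. It assumes 11 rows of pegs (so that 12 columns result), a fixed random seed for repeatability, and column midpoints of 450, 550, ..., 1550. None of these implementation details come from the simulator actually used for Figure 1, so the percentages will differ slightly from those in the figure.

```python
import random

ROWS = 11                    # assumption: 11 rows of pegs -> 12 columns
BEADS = 100_000              # number of beads used in the exploration
rng = random.Random(2017)    # arbitrary fixed seed, for repeatability only

counts = [0] * (ROWS + 1)
for _ in range(BEADS):
    # Each of the ROWS random bits is one left/right bounce;
    # the number of 1-bits is the number of right turns.
    right_turns = bin(rng.getrandbits(ROWS)).count("1")
    counts[right_turns] += 1

# Map each column to the midpoint of its 100-point score band.
midpoints = [450 + 100 * k for k in range(ROWS + 1)]
percentages = [100 * c / BEADS for c in counts]

for x, y in zip(midpoints, percentages):
    print(f"{x:5d}  {y:6.3f}")
```

Each run with a different seed gives slightly different percentages, but the two central columns stay close to the binomial value of about 22.6%.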


Figure 1 Simulation for Normal Distribution Curve

Thus, the twelve points that were used to model the curve are given below.

x (score)    y (% of test takers)
 450          0.043
 550          0.530
 650          2.614
 750          8.008
 850         16.366
 950         22.527
1050         22.412
1150         16.142
1250          8.033
1350          2.727
1450          0.540
1550          0.049

In order to create a function to model the graph, polynomial regression was used by
inputting the values into an algorithm. In the regression equation

\[ y = \beta_0 + \beta_1 x + \beta_2 x^{2} + \dots + \beta_n x^{n} + \varepsilon \]

y refers to the percentage of test takers, x refers to the scores gained, the β terms
are the coefficients of the equation and ε represents the error.

Applying polynomial regression to the inputted values gave the following function.

\[ y = (-6.870986513 \times 10^{-5} \times x^{2}) + (1.374296953 \times 10^{-1} \times x) + (-52.19932124) \]
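
The exploration does not say which algorithm produced this function. As one possible route, the sketch below fits a quadratic to the twelve simulated points by ordinary least squares, solving the normal equations in pure Python. Centring the scores around 1000 (to keep the arithmetic well conditioned) and the helper name `solve3` are choices made here, not taken from the original.

```python
xs = [450, 550, 650, 750, 850, 950, 1050, 1150, 1250, 1350, 1450, 1550]
ys = [0.043, 0.530, 2.614, 8.008, 16.366, 22.527,
      22.412, 16.142, 8.033, 2.727, 0.540, 0.049]


def solve3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            factor = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= factor * a[col][c]
    sol = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        rest = sum(a[r][c] * sol[c] for c in range(r + 1, 3))
        sol[r] = (a[r][3] - rest) / a[r][r]
    return sol


# Work in the scaled variable u = (x - 1000) / 100.
us = [(x - 1000) / 100 for x in xs]
S = [sum(u**k for u in us) for k in range(5)]                   # sums of u^k
T = [sum(y * u**k for u, y in zip(us, ys)) for k in range(3)]   # sums of y*u^k

# Normal equations for y = a0 + a1*u + a2*u^2.
a0, a1, a2 = solve3(
    [[S[0], S[1], S[2]], [S[1], S[2], S[3]], [S[2], S[3], S[4]]], T
)

# Convert back to y = A*x^2 + B*x + C.
A = a2 / 1e4
B = a1 / 100 - 0.2 * a2
C = a0 - 10 * a1 + 100 * a2
print(f"y = ({A:.9e}) x^2 + ({B:.9e}) x + ({C:.8f})")
```

Any least-squares routine for a degree-2 polynomial should give the same coefficients up to rounding.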

The model formed from the function is reasonably accurate only for scores between 750
and 1450; outside this range the error becomes large.
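
One way to see where the quadratic tracks the simulated data is to evaluate it at the twelve column midpoints and compare the results with the simulated percentages. The sketch below does this with the coefficients of the fitted function; the function name `model` is illustrative.

```python
xs = [450, 550, 650, 750, 850, 950, 1050, 1150, 1250, 1350, 1450, 1550]
ys = [0.043, 0.530, 2.614, 8.008, 16.366, 22.527,
      22.412, 16.142, 8.033, 2.727, 0.540, 0.049]


def model(x):
    """Fitted quadratic from the polynomial regression."""
    return -6.870986513e-5 * x**2 + 1.374296953e-1 * x - 52.19932124


# Residual = modelled percentage minus simulated percentage.
residuals = [model(x) - y for x, y in zip(xs, ys)]

for x, y, r in zip(xs, ys, residuals):
    print(f"{x:5d}  simulated {y:7.3f}  model {model(x):7.3f}  residual {r:+7.3f}")
```

At the outermost midpoints (450 and 1550) the quadratic returns negative percentages, which a real distribution cannot have, so the tails are where the model is least usable.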


Figure 2 Model of SAT scores

Part II: Analysis by comparison of actual and modelled SAT scores

From the cumulative distribution function, at 1000,

\[ F(x = 1000) = \frac{1}{2} \]

Considering the area under the graph as the number of people attempting the test,

\[ \int_{750}^{1450} f(x)\,dx = 9120 \text{ people} \]

\[ \int_{750}^{a} f(x)\,dx = \frac{9120}{2} = 4560 \]

\[ F(a) - F(750) = 4560 \]

\[ a \approx 1050 \]

Mean score ≈ 1050, which is similar to the mean score in the model, i.e. 1000. This
suggests that the modelled function is mostly accurate when compared to the actual
values, also accounting for error.
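
The half-area calculation can be reproduced numerically. The sketch below integrates the quadratic model exactly using its antiderivative and then finds, by bisection, the score a at which the area from 750 reaches half of the area from 750 to 1450. It assumes the coefficients of the fitted function from Part I; `antiderivative`, `area` and `half_area_point` are names introduced here.

```python
# Coefficients of the fitted quadratic y = A*x^2 + B*x + C from Part I.
A, B, C = -6.870986513e-5, 1.374296953e-1, -52.19932124


def antiderivative(x):
    """An antiderivative of the quadratic model (constant of integration 0)."""
    return A * x**3 / 3 + B * x**2 / 2 + C * x


def area(lo, hi):
    """Area under the model between two scores."""
    return antiderivative(hi) - antiderivative(lo)


def half_area_point(lo, hi, tol=1e-6):
    """Score at which the area from `lo` reaches half of area(lo, hi).

    Bisection is valid here because the model is positive on [lo, hi],
    so the accumulated area increases steadily with the upper limit.
    """
    target = area(lo, hi) / 2
    left, right = lo, hi
    while right - left > tol:
        mid = (left + right) / 2
        if area(lo, mid) < target:
            left = mid
        else:
            right = mid
    return (left + right) / 2


total = area(750, 1450)
a = half_area_point(750, 1450)
print(f"total area = {total:.2f}, half-area point a = {a:.1f}")
```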

Figure 3 Actual SAT scores 2017

Figure 4 Actual SAT Scores 2018


In order to compare the values of the actual scores and the modelled scores, the ratio
of the value at the 50th percentile and the entire value of the area under the curve
must be compared for both situations. Let a/b be the ratio for the actual scores, and
c/d be the ratio for the modelled scores.

\[ F(x) = \int f(x)\,dx \]

\[ \text{Actual Scores (2017)} = \frac{F(x = 1000)}{F(x = 1600)} = 0.567456 \]

\[ \text{Actual Scores (2018)} = \frac{F(x = 1000)}{F(x = 1600)} = 0.5553 \]

\[ \text{Estimated Scores (2017 and 2018)} = \frac{F(x = 1000)}{F(x = 1600)} = 0.413585 \]

\[ \frac{a}{b} > \frac{c}{d} > \frac{e}{f} \]

\[ 0.56 > 0.55 > 0.41 \]
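
The modelled ratio can be checked against the fitted quadratic. The sketch below takes both areas over 750 to 1450, the interval on which the model is treated as accurate. That choice of limits is an assumption made here about how the ratio was evaluated, although it does give a value close to the 0.413585 quoted for the estimated scores.

```python
# Coefficients of the fitted quadratic y = A*x^2 + B*x + C from Part I.
A, B, C = -6.870986513e-5, 1.374296953e-1, -52.19932124


def antiderivative(x):
    """An antiderivative of the quadratic model (constant of integration 0)."""
    return A * x**3 / 3 + B * x**2 / 2 + C * x


# Assumption: both areas start at 750 and the whole area ends at 1450.
below_1000 = antiderivative(1000) - antiderivative(750)
whole = antiderivative(1450) - antiderivative(750)
ratio = below_1000 / whole

print(f"modelled ratio = {ratio:.6f}")
```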

Conclusion

Through the exploration, it was seen that the SAT scores form a normal
distribution curve, and the curve modelled on the values in the simulation was
similar to the curve formed by the actual scores gained in October 2017, with slight
error, which was expected. While comparing the values of both the model and the
actual curve, it was seen through calculus that a/b > c/d, meaning that the actual scores
were higher than the scores estimated based on the simulation and model created.
Since the simulation represented the average scores each year and the actual scores
were higher than the estimated scores, it can be said that, based on previous SATs, the
October 2017 SAT was easier than expected. If the estimated scores had been higher than
the actual scores, then the test would have been harder than other tests and the level
of difficulty would be higher than expected.

Several difficulties were faced while carrying out this exploration, such
as the excessive error in the model between 400 and 750 points, and between 1450 and
1600 points, which could lead to considerable inaccuracy when comparing the model with
the actual values. Also, the values to be inputted in the polynomial regression function
were extremely small, and thus could not be calculated directly and had to be
calculated with the assistance of an algorithm. The values were established based on
several assumptions, such as assuming each student had a 50% chance of getting
each question correct. However, the data set of the actual October 2017 SAT scores
was not hard to obtain, giving me an accurate, comparable data set, thus strengthening
my comparison and ensuring accuracy for the most part.

The findings of my exploration show that any real-life situation involving random
chance and a large enough sample size can be modelled and predicted or
estimated based on the Galton Board and its principle, and will form a normal
distribution curve. This can be applied to the prediction of scores for any other exams
or tests, and also the prediction of height, strength and other abilities in humans and
animals.

Works cited

1. “Central Limit Theorem.” Exponential Distribution | Definition | Memoryless Random Variable, www.probabilitycourse.com/chapter7/7_1_2_central_limit_theorem.php.
2. “Galton Board.” From Wolfram MathWorld, mathworld.wolfram.com/GaltonBoard.html
3. “Galton Board.” Pascaline – Interactive Simulations – EduMedia, www.edumedia-sciences.com/en/media/905-galton-board
4. “Galton Board.” PhysLab, www.physlab.org/class-demo/galton-board/.
5. Kozlov, V. V. and Mitrofanova, M. Yu. "Galton Board." Regular Chaotic Dynamics 8, 431-439, 2002.
6. Learner.org. (2018). Mathematics Illuminated | Unit 7 | 7.5 The Galton Board Revisited. [online] Available at: https://www.learner.org/courses/mathilluminated/units/7/textbook/05.php [Accessed 27 May 2018].
7. Weisstein, Eric W. "Distribution Function." From MathWorld--A Wolfram Web Resource. http://mathworld.wolfram.com/DistributionFunction.html
