
In this project, each student in our class bought a 2.17-ounce bag of Original Skittles and counted how many candies of each color were in the bag. We combined the data from the whole class, which we treat as the population, and then created charts to better understand and analyze the data. The project helps us apply what we are learning in our statistics course, gives us a tangible experiment, and lets us explore some of the best ways to analyze data.
The following pie chart and Pareto chart contain the results of the data obtained from the experiment.

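[Figure: Pareto chart, "Skittles". Bars show the proportion of each color in the order Purple, Red, Yellow, Green, Orange (left axis from 0.18 to 0.22), with a cumulative percentage line (right axis from 0% to 120%).]

[Figure: Pie chart, "Skittles". Slices for Red, Orange, Yellow, Green, and Purple, each labeled with a proportion between 0.19 and 0.22.]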

We can see that purple is the most common color, but the difference is not large: every color rounds to about 0.2, or 20%, of the population. I did expect purple to be slightly more common, simply because when I think of eating Skittles I remember the purple and green ones being more prevalent than the others on almost every occasion. The cumulative percentage line on the Pareto chart rises steadily because each color has about the same number of Skittles.
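The steady rise of the cumulative percentage line can be reproduced from the class counts. The sketch below is only an illustration, not the tool used to draw the chart: it takes the per-color population counts from the comparison table later in this report, sorts them from largest to smallest, and accumulates the percentages. The function name `pareto_table` is made up for this example.

```python
# Class population counts per color, taken from the comparison table in this report.
counts = {"Purple": 268, "Red": 247, "Yellow": 245, "Green": 244, "Orange": 234}


def pareto_table(counts):
    """Return (color, proportion, cumulative percentage) rows, largest count first."""
    total = sum(counts.values())
    rows = []
    running = 0
    for color, n in sorted(counts.items(), key=lambda item: item[1], reverse=True):
        running += n
        rows.append((color, n / total, 100 * running / total))
    return rows


for color, proportion, cumulative in pareto_table(counts):
    print(f"{color:<7} {proportion:.4f} {cumulative:6.1f}%")
```

Because every color contributes close to 20%, each step of the cumulative line is about the same height, which is why the line looks almost straight.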
The histogram shows that more than 15 bags contained between 55 and 65 Skittles. There was one apparent outlier with fewer than 35, which could indicate that the bag was the wrong size. The right side of the graph appears to follow a normal distribution.

The boxplot also shows that most of the bags contained around 60 Skittles. The plot starts just under 30 at the lowest count and ends just over 70 at the highest. The summary statistics below contain the exact numbers. My own bag falls within this range at exactly 60 Skittles, which was to be expected given the weight of the bags. What does not make much sense is the outlier at 27. It seems very unlikely that a 2.17 oz. bag, which normally contains around 60 Skittles, would come in at less than half of that.
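One rough way to see how unusual the 27-Skittle bag is would be to express it in standard deviations from the mean (a z-score). This is a different technique from the quartile-based rule a boxplot normally uses, and it is shown here only as a sketch. The numbers are the ones reported in the summary statistics for this report; the first value in that row, 21, is assumed to be the number of bags. The helper `z_score` is a name chosen for this example.

```python
import math

# Values as reported in the summary statistics for total Skittles per bag.
n = 21  # assumed to be the number of bags (first value in the reported row)
mean = 58.952381
variance = 71.047619
std_dev = 8.428975
std_err = 1.8393531
minimum, maximum, value_range = 27, 73, 46

# The reported figures agree with the standard definitions.
assert math.isclose(math.sqrt(variance), std_dev, rel_tol=1e-5)
assert math.isclose(std_dev / math.sqrt(n), std_err, rel_tol=1e-5)
assert maximum - minimum == value_range


def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev


z_low = z_score(minimum, mean, std_dev)
print(f"z-score of the 27-Skittle bag: {z_low:.2f}")
```

The smallest bag sits roughly 3.8 standard deviations below the mean, which supports treating it as an outlier.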

Summary statistics for Total per Bag:

n          21
Mean       58.952381
Variance   71.047619
Std. dev.  8.428975
Std. err.  1.8393531
Median     61
Range      46
Min        27
Max        73
Q1         59

The following table contains a raw data comparison from my personal results and the population results. Taking into consideration that we are dealing with low numbers, we can see that the sample data does not necessarily reflect the population data.
Color    Personal  Class       Sample       Population   Sample      Population
         Sample    Population  Probability  Probability  Percentage  Percentage
         Amount    Amount
Purple   11        268         0.1833       0.2165       18.30%      21.60%
Red      15        247         0.25         0.1995       25%         20%
Yellow   9         245         0.15         0.1979       15%         19.80%
Green    16        244         0.2667       0.197        26.70%      19.70%
Orange   9         234         0.15         0.189        15%         18.90%
Total    60        1238        1            1            100%        100%
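The probabilities in the table are each color's count divided by the total count. The following sketch recomputes them from the amounts listed above and prints the gap between my bag and the class. The function name `proportions` is just a label for this example.

```python
# Counts copied from the comparison table above.
sample = {"Purple": 11, "Red": 15, "Yellow": 9, "Green": 16, "Orange": 9}
population = {"Purple": 268, "Red": 247, "Yellow": 245, "Green": 244, "Orange": 234}


def proportions(counts):
    """Divide each color's count by the total to get its share of the whole."""
    total = sum(counts.values())
    return {color: n / total for color, n in counts.items()}


sample_p = proportions(sample)
population_p = proportions(population)

for color in sample:
    diff = sample_p[color] - population_p[color]
    print(
        f"{color:<7} sample={sample_p[color]:.4f} "
        f"population={population_p[color]:.4f} diff={diff:+.4f}"
    )
```

With only 60 candies in one bag, a difference of four or five candies in a single color already moves the sample proportion by several percentage points, which is why one bag can look quite different from the class totals.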

Categorical and quantitative data differ in what they record. For instance, the color of a Skittle is categorical because it falls into a category, while the number of Skittles of that color is quantitative. Graphs that make sense for categorical data are a bar graph or a pie chart, because both help us compare the percentage or size of each category, such as how many blue candies versus how many red ones. Good graphs for quantitative data are a histogram or a stemplot, because they show the shape of the distribution and help us identify outliers. They also organize the data in ascending or descending order. A boxplot is also helpful because it divides the data into four sections of 25% each (the quartiles) and shows which values fall in each section. This is great for illustrating what is typical and for showing outliers. Quantitative graphs would be best for studies on test scores or when comparing measurements of anything in general.
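To illustrate how a stemplot orders quantitative data and exposes an outlier, here is a minimal text-based sketch. The bag totals in `example_totals` are hypothetical values invented for this illustration; they are not the class data, which is not listed bag by bag in this report. The function `stemplot` is also written just for this example and assumes non-negative whole numbers.

```python
from collections import defaultdict


def stemplot(values):
    """Return stem-and-leaf lines: tens digit as stem, ones digits as leaves, ascending."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    lines = []
    # Include empty stems so that gaps in the data stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        digits = "".join(str(d) for d in leaves.get(stem, []))
        lines.append(f"{stem} | {digits}")
    return lines


# Hypothetical bag totals for illustration only; NOT the class data.
example_totals = [27, 55, 57, 58, 59, 60, 61, 61, 62, 63, 64, 66, 73]
print("\n".join(stemplot(example_totals)))
```

The empty stems between the lowest value and the main cluster make the gap easy to see, which is how a stemplot helps identify outliers.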
