To conduct our experiment, we first need to complete step one: obtaining a sample. A sample
is a set of data collected or selected from the population being studied, so it is a subset of
that population. To collect the data, our professor asked each student to purchase one regular
2.17 oz bag of Skittles. We then compiled the data, which became our sample. But how do we
know this sample is okay to use? Samples are at their best when they are selected at random.
Here the randomization came from the purchases themselves: every student bought a bag of
Skittles, but not every student went to the same store and not every student lives in the same
area. Therefore, the sample can be viewed as random. Below are the expected color proportions
alongside the observed proportions:
                 Red     Orange  Yellow  Green   Purple
Expected Prop.   0.2     0.2     0.2     0.2     0.2
Observed Prop.   0.204   0.191   0.213   0.195   0.198
After collecting our own bags of Skittles, we were asked what proportion or percentage of
each color we expected to see in our bag. I answered the following: “I expect that there should
be an equal proportion of all Skittle colors. I believe that there should be an equal amount of
Skittles for each color available. A rainbow should have equal representation of colors, and as
their slogan says, ‘Taste the rainbow,’ I believe there should be equal representation for the
Skittles of each color.” This statement will be considered our hypothesis: once we have gathered
enough data to test our guess, we expect to find that Skittles are produced in equal amounts of
each color, that is, a proportion of 0.2 per color.
Now that we have collected our personal data, we must complete the next step: organizing
and compiling the data along with that of our classmates. To make the data easier to interpret,
we will display it in graphs, which make the general trends in the sample data easier to see and
understand. Below are summaries of the Skittle proportions gathered by the entire class
(total: 2291 Skittles).
My Bag Proportions and Counts vs. the Class Proportions and Counts

                  Red     Orange  Yellow  Green   Purple  Total
My Bag (prop.)    0.203   0.186   0.254   0.153   0.203   1.00
My Bag (count)    12      11      15      9       12      59
Class (prop.)     0.204   0.191   0.213   0.195   0.198   1.00
Class (count)     467     437     487     447     453     2291
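As a minimal sketch, the proportions in the table can be recomputed from the counts. The variable and function names below are my own and are only for illustration; the counts are copied from the table.

```python
# Color counts copied from the table above.
my_bag_counts = {"Red": 12, "Orange": 11, "Yellow": 15, "Green": 9, "Purple": 12}
class_counts = {"Red": 467, "Orange": 437, "Yellow": 487, "Green": 447, "Purple": 453}

def proportions(counts):
    """Return each color's share of the total number of candies."""
    total = sum(counts.values())
    return {color: n / total for color, n in counts.items()}

my_bag_props = proportions(my_bag_counts)
class_props = proportions(class_counts)

# Print the two sets of proportions side by side, rounded to 3 decimals.
for color in class_counts:
    print(f"{color:<7} bag={my_bag_props[color]:.3f}  class={class_props[color]:.3f}")
```

Dividing each count by its own total keeps the single bag and the class on the same scale, even though one has 59 candies and the other 2291.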
Once again, how do we know this sample is random, and exactly what population is being
studied? The data the class collected can reasonably be treated as random. Each student had the
task of going to a store and purchasing a bag of Skittles. Given this task, it is very likely that
not all students purchased their bag in the same store. Likewise, it is likely that not all
students live in the same city or county. At the store, the students may have chosen their bags
in different ways: one student may have taken the bag at the front, another may have grabbed one
in the middle, and another may have taken the bag all the way in the back. These bags of Skittles
were therefore randomly selected for this sampling project. The population is all of the bags of
Skittles that exist in the world, because we are studying the color proportions contained in
Skittles bags in general.
Now, as I view the graphs, they seem to represent the data close to what I had expected.
I had expected equal representation of all the colors within a single bag of Skittles, and in
the compiled data the proportions are in fact very close to that. Although the graphs match my
expectation, we need to make sure there are no outliers in the data collected. Outliers are
extreme values that may affect the study's outcome. On average, however, there seems to be a
roughly equal proportion of each color throughout the data: the class proportions range only
from 0.191 to 0.213, so none of them differs greatly from the others, and there do not appear
to be any outliers. Lastly, we must compare the compiled data with that of our own bag. The
distribution of colors in the class total does not fully match that of my own bag, although the
two are similar. The largest differences are in yellow (0.254 in my bag versus 0.213 for the
class) and green (0.153 versus 0.195). This happens because data collected from a single bag
may appear skewed or uneven, whereas when data are collected from many bags, the observed
proportions tend to be closer to the true proportions.
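To illustrate why a single bag is noisier than the pooled class data, here is a rough sketch that compares the standard error of a sample proportion for 59 candies and for 2291 candies. It assumes each candy behaves like an independent draw with a true proportion of 0.2, which is a simplification and not something the class data can confirm. The function name is hypothetical.

```python
from math import sqrt

def standard_error(p, n):
    """Standard error of a sample proportion p based on n candies."""
    return sqrt(p * (1 - p) / n)

p_expected = 0.2                             # equal-proportion guess from the text
se_bag = standard_error(p_expected, 59)      # my single bag of 59 candies
se_class = standard_error(p_expected, 2291)  # all 2291 candies from the class

print(f"single bag:  {se_bag:.3f}")
print(f"whole class: {se_class:.3f}")
```

Under these assumptions the typical wobble in one bag is several times larger than in the class total, which fits the larger deviations seen in my own bag.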
Once we have compiled and organized our data, it is time for step three: data
summarization and analysis. The data we have compiled can be summarized in five simple
numbers: the minimum, Q1, the median, Q3, and the maximum. The minimum (Min) is the smallest
value in our results. Q1, the first quartile, is the middle value between the minimum and the
median. The median is the middle value of all our data. Q3, the third quartile, is the middle
value between the median and the maximum. The maximum (Max) is the greatest value in our
results. Two other important summary statistics are the mean, which is the average of all our
data, and the standard deviation, which measures the variation within our sample.
Using the total number of candies in each bag in our class sample, I will compute
the following: the mean number of candies per bag, the standard deviation of the number of
candies per bag, and the five-number summary (all rounded to one decimal place). When
summarizing the number of Skittles per 2.17 oz bag, the following summary statistics arose.
To compute the mean, we add all of the totals and divide by the number of bags:
(59+61+57+61+59+59+60+60+60+59+59+59+63+59+57+59+57+37+63+57+58+61+53+58+60+58+60+56+59+60+59+58+93+61+64+78+72+58) / 38
= 2291 / 38 ≈ 60.3
The mean number of candies per bag in our class sample is 60.3.
To calculate the standard deviation, we will use technology.
The resulting standard deviation is 7.8.
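For reference, and as an assumption about what the technology computes, the sample standard deviation is usually defined as follows. Here x_i is the number of candies in bag i, \bar{x} is the sample mean, and n = 38 is the number of bags.

```latex
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^{2}}
```

With the 38 bag totals listed above, dividing by n - 1 gives roughly 7.8, while dividing by n would give roughly 7.7, so the reported value is consistent with this sample form.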
Finally, we will again use technology to get our five-number summary:
- Min: 37, Q₁: 58, Med: 59, Q₃: 61, Max: 93
- Statistics
o Min = 37
o Q1 = 58
o Med = 59
o Q3 = 61
o Max = 93
o Mean = 60.3
o Standard Dev. = 7.8
- Proportions
o Yellow = 20.4%
o Red = 19.1%
o Purple = 21.2%
o Green = 19.5%
o Orange = 19.8%
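The candies-per-bag statistics can be reproduced with a short sketch using only Python's standard library. The quartile rule used here (the median of the lower half and of the upper half of the sorted data) is an assumption about how the calculator works; other software may interpolate differently and give slightly different quartiles.

```python
from statistics import mean, median, stdev

# Candies per bag for the 38 bags, copied from the sum in the text.
bag_totals = [59, 61, 57, 61, 59, 59, 60, 60, 60, 59, 59, 59, 63, 59, 57, 59,
              57, 37, 63, 57, 58, 61, 53, 58, 60, 58, 60, 56, 59, 60, 59, 58,
              93, 61, 64, 78, 72, 58]

ordered = sorted(bag_totals)
half = len(ordered) // 2

# Assumed quartile convention: medians of the lower and upper halves.
q1 = median(ordered[:half])
q3 = median(ordered[-half:])

five_number = (ordered[0], q1, median(ordered), q3, ordered[-1])
avg = mean(bag_totals)
sd = stdev(bag_totals)  # sample standard deviation (divides by n - 1)

print("Min, Q1, Med, Q3, Max:", five_number)
print(f"Mean = {avg:.1f}, Standard Dev. = {sd:.1f}")
```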
Construct a 95% Confidence Interval Estimate for the population proportion of yellow
candies.
o The formula used to compute a confidence interval for a proportion is the
following, where \hat{p} is the sample proportion and n is the sample size:
\[ \hat{p} \pm z^{*}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]
o z* represents the critical value of the standard normal distribution. To find this
critical value we will use the inverse normal function on our calculator.
invNorm(0.975, 0, 1)
1.959963987 or 1.96
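Construct a 95% Confidence Interval Estimate for the population mean number of candies
per bag.
o The formula used to compute a confidence interval for a mean is the following,
where \bar{x} is the sample mean, s is the sample standard deviation, and n is the
number of bags:
\[ \bar{x} \pm t^{*}\frac{s}{\sqrt{n}} \]
o t* represents the critical value of a t-distribution with n - 1 = 37 degrees of
freedom. To find this value we will use the inverse t function on our calculator.
invT(0.975, 37)
2.026192447 or 2.026
o Now we will insert our information into our formula.
The second part of the formula, the margin of error, is 2.026 × 7.8 / √38 ≈ 2.56
o We will now add and subtract the result from our sample mean:
60.3 - 2.56 = 57.74
60.3 + 2.56 = 62.86
o The interval estimate is (57.74, 62.86)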
- Now we will interpret the results:
o I am 95% confident that the true mean number of Skittles per bag lies between
57.74 and 62.86 candies.
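The interval for the mean can be checked with a small sketch. The inputs are the rounded values quoted in the text, and the t critical value is hard-coded from the calculator result invT(0.975, 37) because Python's standard library has no inverse t function; the variable names are only illustrative.

```python
from math import sqrt

# Values taken from the text (rounded as reported there).
n = 38          # number of bags
x_bar = 60.3    # sample mean of candies per bag
s = 7.8         # sample standard deviation
t_star = 2.026  # calculator result invT(0.975, 37), assumed rather than recomputed

# Margin of error and the two interval endpoints.
margin = t_star * s / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"margin of error = {margin:.2f}")
print(f"95% CI for the mean = ({lower:.2f}, {upper:.2f})")
```

Using the unrounded mean and standard deviation instead would shift the endpoints slightly in the second decimal place.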
We can expect the proportion of yellow Skittles to be between 13.9% and 26.9% of all
Skittles, and we can expect the mean number of candies per bag to be between about 57.7 and
62.9. Confidence intervals, however, are only confident to an extent. Because we are only 95%
confident in our calculations, there remains a possibility that the true values fall outside
our intervals.
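As a hedged sketch of the proportion formula, the function below computes a normal-approximation interval for a population proportion. The function name is hypothetical, and the inputs are an assumption: it takes the yellow count from the class table (487 of 2291). The interval it prints depends on which count and sample size are used, so it need not match the figures quoted in the text.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation interval for a population proportion."""
    p_hat = successes / n
    # Two-sided critical value, about 1.96 for 95% confidence.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Assumption: 487 yellow candies out of 2291, as in the class table.
low, high = proportion_ci(487, 2291)
print(f"95% CI for yellow: ({low:.3f}, {high:.3f})")
```

The same function can be called with another color's count, or with a single bag's counts, to see how the width of the interval changes with the sample size.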
All in all, this project has summarized some of the basic concepts and methods needed to begin
a small path into the study of statistics. Everywhere we go and everywhere we look, we are
surrounded by statistics: the billboards about drinking and driving, the news about certain
foods and diabetes, and much more. Statistics are the foundation of major research projects
going on each and every day. It is possible to conduct a statistical experiment on pretty much
anything you want. Just go out there and look for your next statistical encounter or curiosity.