
Austin Mineer

Math 1040
August 4, 2016
Professor Damon McCafferty
Term Project
In this term project, I apply many of the concepts that we learned this semester. I organize
data, draw conclusions using confidence intervals, and practice hypothesis testing. Overall, I
analyze Skittles and the counts of each color, using a sample of 18 bags of 2.17 ounces each.

[Figure: Pareto chart, "Sum of Each Color of Skittles". Vertical axis: count, 0 to 250. Categories: Purple, Orange, Green, Red, Yellow.]

[Figure: pie chart, "Proportion of Each Color of Skittles". Slice labels: 205, 235, 208, 224, 209. Legend: Purple, Orange, Green, Red, Yellow.]

                                             Purple   Orange   Green   Red    Yellow
Proportions of my bag of Skittles            .224     .241     .121    .224   .190
Proportions of everybody's bags of Skittles  .217     .207     .193    .192   .190
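
As a rough check, the class proportions can be recomputed from the five slice labels on the pie chart (205, 235, 208, 224, 209). The pairing of counts to colors below is an assumption: the chart labels lost their order, so each count is matched to the color whose reported proportion it reproduces. A minimal sketch in Python:

```python
# Class-wide counts per color. The count-to-color pairing is an assumption,
# inferred from the reported proportions rather than stated in the project.
class_counts = {
    "Purple": 235,
    "Orange": 224,
    "Green": 209,
    "Red": 208,
    "Yellow": 205,
}

# Total number of candies across all bags in the class sample
total = sum(class_counts.values())

# Proportion of each color, rounded to three decimals like the table
class_proportions = {
    color: round(count / total, 3) for color, count in class_counts.items()
}

for color, proportion in class_proportions.items():
    print(f"{color:<8}{proportion:.3f}")
```

Under that pairing, the rounded values agree with the class row of the table.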

Total Number of Skittles in Each Bag
Mean                  60.1
Standard Deviation    2.36
Minimum               57.0
Q1                    58.0
Median                60.0
Q3                    61.8
Maximum               65.0

The pie chart and Pareto chart of the class data reflect what I expected to see: roughly even
proportions among all colors, with some variation. Purple had a somewhat high proportion, but
nothing too extreme. My personal sample, however, showed more variation between colors. This
is likely because my single 2.17 ounce bag of Skittles is a small sample. Generally, larger
samples produce more accurate estimates.

[Figure: "Frequency Histogram". Horizontal axis: Number of Skittles per Bag, 55 to 65 plus a "More" bin. Vertical axis: Frequency, 0 to 5.]

[Figure: "Boxplot for Number of Skittles per Bag". Axis: 55 to 65.]

I expected the histogram to be approximately normally distributed. Instead, it appears to be
slightly skewed to the right, and even then it is somewhat irregular. I did expect some
irregularity, though, because the sample size is small (only 18 bags). My bag of Skittles had 58
candies in it. A count of 58 was one of the most frequent values in the histogram, so my bag is
typical of the class sample.
Categorical data consists of names or labels that are not numbers representing counts or
measurements. Some examples are eye color, name, and political party affiliation. Quantitative
data consists strictly of numbers that do represent counts or measurements. Some examples are
age, height, and shoe size. Quantitative data is usually displayed with charts like boxplots and
histograms, because they show the distribution of numerical values well. Categorical data is
usually displayed with Pareto charts or pie charts, because they show each category alongside
its count or proportion. Calculations such as the five-number summary and the mean do not make
sense for categorical data, but they are generally very useful for quantitative data. This is
because values like eye color cannot be placed in any meaningful numerical order, so the mean
and median are undefined for them.
A confidence interval is a range of values used to estimate the true value of a population
parameter. Its width tells us how precise the estimate is. It is always associated with a
confidence level, which gives the success rate of the procedure used to construct the
interval. A confidence level of 95% is the most commonly used.

In the first calculation, I found that we are 99% confident that the true population
proportion of yellow candies is between .159 and .221. In the second calculation, I found that
we are 95% confident that the true mean number of Skittles per 2.17 ounce bag is between 58.926
and 61.274. In the third calculation, I found that we are 98% confident that the true standard
deviation of the number of Skittles per 2.17 ounce bag is between 1.683 and 3.844.
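
The calculations themselves are not shown here, so the following is only a sketch of how these three intervals could be reproduced from the reported summary statistics. It assumes the usual textbook methods (a z interval for the proportion, a t interval for the mean, and a chi-square interval for the standard deviation), takes the total candy count of 1081 from the sum of the pie-chart labels, and hard-codes the critical values for 17 degrees of freedom as they would be read from standard tables. All variable names are illustrative.

```python
import math
from statistics import NormalDist

n_candies = 1081     # assumed total: sum of the pie-chart slice labels
n_bags = 18          # bags in the class sample
p_yellow = 0.190     # reported class proportion of yellow candies
mean_per_bag = 60.1  # reported sample mean of candies per bag
sd_per_bag = 2.36    # reported sample standard deviation

# 1) 99% z interval for the proportion of yellow candies
z_crit = NormalDist().inv_cdf(1 - 0.01 / 2)
margin_p = z_crit * math.sqrt(p_yellow * (1 - p_yellow) / n_candies)
ci_proportion = (p_yellow - margin_p, p_yellow + margin_p)

# 2) 95% t interval for the mean number of candies per bag
T_CRIT_95_DF17 = 2.110  # table value: t, 17 degrees of freedom, 0.025 per tail
margin_mean = T_CRIT_95_DF17 * sd_per_bag / math.sqrt(n_bags)
ci_mean = (mean_per_bag - margin_mean, mean_per_bag + margin_mean)

# 3) 98% chi-square interval for the standard deviation
CHI2_RIGHT_DF17 = 33.409  # table value: 0.01 area to the right
CHI2_LEFT_DF17 = 6.408    # table value: 0.99 area to the right
sum_of_squares = (n_bags - 1) * sd_per_bag ** 2
ci_sd = (
    math.sqrt(sum_of_squares / CHI2_RIGHT_DF17),
    math.sqrt(sum_of_squares / CHI2_LEFT_DF17),
)

for name, (low, high) in [
    ("proportion of yellow", ci_proportion),
    ("mean per bag", ci_mean),
    ("standard deviation", ci_sd),
]:
    print(f"{name}: ({low:.3f}, {high:.3f})")
```

Under these assumptions the endpoints round to the same values reported above. Hard-coding the table values keeps the sketch free of third-party dependencies; a statistics library could supply the t and chi-square critical values instead.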

In the first hypothesis test, we fail to reject the null hypothesis that 20% of all Skittles
candies are red, since the test statistic did not fall in the critical region. In the second
hypothesis test, we reject the null hypothesis that the mean number of candies in a bag of
Skittles is 55, since the test statistic fell in the critical region.
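
The two decisions can be sketched from the reported summary statistics. The significance level is not stated in this write-up, so the sketch below assumes a two-tailed test at 0.05 for both hypotheses. It also assumes a total of 1081 candies (the sum of the pie-chart labels) and uses the reported class proportion of red candies, .192. The t critical value for 17 degrees of freedom is hard-coded as read from a standard table, and all names are illustrative.

```python
import math
from statistics import NormalDist

ALPHA = 0.05      # assumed significance level (not stated in the project)
n_candies = 1081  # assumed total: sum of the pie-chart slice labels
n_bags = 18       # bags in the class sample

# Test 1: H0: p = 0.20 (proportion of red candies), two-tailed z test
p_hat_red = 0.192  # reported class proportion of red candies
p_null = 0.20
z_stat = (p_hat_red - p_null) / math.sqrt(p_null * (1 - p_null) / n_candies)
z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)
reject_red_claim = abs(z_stat) > z_crit

# Test 2: H0: mu = 55 (mean candies per bag), two-tailed t test
mean_per_bag = 60.1
sd_per_bag = 2.36
mu_null = 55
t_stat = (mean_per_bag - mu_null) / (sd_per_bag / math.sqrt(n_bags))
T_CRIT_DF17 = 2.110  # table value: t, 17 degrees of freedom, 0.025 per tail
reject_mean_claim = abs(t_stat) > T_CRIT_DF17

print(f"z = {z_stat:.2f}, reject H0: {reject_red_claim}")
print(f"t = {t_stat:.2f}, reject H0: {reject_mean_claim}")
```

Under these assumptions the sketch reaches the same decisions as above: the z statistic stays well inside the critical values, and the t statistic falls far beyond them.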

In general, a confidence interval requires a population parameter to estimate, a confidence
level, information from a relevant sample, and a margin of error. From this information, we can
determine a good estimate of the population parameter. A hypothesis test requires a claim about
a population, relevant information from a sample, a significance level, and a test statistic.
From this information, we can either reject the claim or fail to reject it. Our samples met all
of these conditions for both the hypothesis tests and the confidence intervals. Possible sources
of error include calculation mistakes and misreading the tables. The sampling method could be
improved by increasing the sample size. I stated the conclusions in the earlier paragraphs.

Reflective Writing for ePortfolio


This project helped me practice the skills that I gained from this class. I was able to
practice using Microsoft Excel to compute important statistics such as the five-number summary.
I was also able to create charts such as histograms and boxplots in Excel. Using that data, I
was then able to practice confidence intervals and hypothesis testing, and to analyze the results.
As a pre-med student, this project helped me learn how to do important things like
hypothesis testing and confidence intervals. This will be important for me in the future with any
research I perform. I was also able to maintain basic mathematical skills that I will need for
future pre-med classes. This project also helped me become more comfortable with using
Microsoft Excel, and this skill will be useful in both my schooling and career.
In future classes such as genetics and organic chemistry, tests and research will need to be
performed. Hypotheses will need to be made, and statistical analysis will be important. For
example, this project has helped me become better at hypothesis testing, which will have
applications in my future classes.
This project is a representation of the course in the sense that it requires you to think
differently than you do in your other classes. You have to approach the problem the correct
way, and you need to know how to identify which way is correct. For instance, in this project,
we needed to know how to determine whether the test statistic is a z-score or a t-score, so
that we could conclude whether to reject or fail to reject the null hypothesis.
Real-world math problems are always going to have variation and many aspects to
consider. This project was the same way. We had to consider all aspects before coming to
conclusions. This project required a different kind of thinking, just like real-world problems.
