How to Calculate the Standard Deviation

Standard deviation is a widely used measure of variability in statistics and probability theory. It shows how much variation, or "dispersion", exists around the average (the mean, or expected value). A low standard deviation indicates that the data points tend to lie very close to the mean, whereas a high standard deviation indicates that the data points are spread out over a large range of values.

The standard deviation of a random variable, statistical population, data set, or probability distribution is the square root of its variance. It is algebraically simpler, though in practice less robust, than the average absolute deviation. A useful property of the standard deviation is that, unlike the variance, it is expressed in the same units as the data.

First, determine the mean. The mean of a list of numbers is the sum of those numbers divided by the number of items in the list (read: add all the numbers up and divide by how many there are). Then subtract the mean from every number to get the list of deviations. It's OK to get negative numbers here. Next, square each deviation (read: multiply it by itself).

Math.Tutorvista.com

Add up all of the resulting squares to get their total. Divide that total by one less than the number of items in the list; this gives the sample variance. To get the standard deviation, take the square root of the resulting number.

This may sound confusing, so here is an example:

Your list of numbers: 1, 3, 4, 6, 9, 19
Mean: (1+3+4+6+9+19) / 6 = 42 / 6 = 7
List of deviations: -6, -4, -3, -1, 2, 12
Squares of deviations: 36, 16, 9, 1, 4, 144
Sum of squared deviations: 36+16+9+1+4+144 = 210
Divided by one less than the number of items in the list: 210 / 5 = 42
Square root of this number: square root (42) = about 6.48

The standard deviation obtained by sampling a distribution is itself only an estimate of the distribution's actual standard deviation. This is especially true if the number of samples is very low. The effect can be described by a confidence interval (CI). For example, for normally distributed data with N=2, the 95% CI of the SD runs from 0.45*SD to 31.9*SD. In other words, in 95% of cases the standard deviation of the distribution can be up to a factor of almost 32 larger, or up to a factor of about 2 smaller, than the sampled SD. For N=10 the interval is 0.69*SD to 1.83*SD, so the actual SD can still be almost a factor of 2 higher than the sampled SD. For N=100 this is down to 0.88*SD to 1.16*SD. So, to be sure the sampled SD is close to the actual SD, we need to sample a large number of points.
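The calculation steps from the worked example earlier in this section can be sketched in Python. This is only a minimal illustration using the standard library; the function name sample_standard_deviation is a hypothetical helper chosen here, not something from the original text.

```python
import math


def sample_standard_deviation(values):
    """Sample standard deviation, following the steps described in the text.

    The function name is illustrative only (an assumption of this sketch).
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to divide by n - 1")
    mean = sum(values) / n                      # step 1: the mean
    deviations = [x - mean for x in values]     # step 2: subtract the mean
    squares = [d * d for d in deviations]       # step 3: square each deviation
    variance = sum(squares) / (n - 1)           # steps 4-5: sum, divide by n - 1
    return math.sqrt(variance)                  # step 6: square root


data = [1, 3, 4, 6, 9, 19]
sd = sample_standard_deviation(data)
print(round(sd, 2))  # 6.48, matching the worked example
```

Python's built-in statistics.stdev(data) uses the same n - 1 divisor and should agree with this result, so in practice the library function is usually the simpler choice.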

Box Whisker Plot


Statistics assumes that your data points (the numbers in your list) are clustered around some central value. The "box" in the box-and-whisker plot contains, and thereby highlights, the middle half of these data points.

To create a box-and-whisker plot, you start by ordering your data (putting the values in numerical order), if they aren't ordered already. Then you find the median of your data. The median divides the data into two halves. To divide the data into quarters, you then find the medians of these two halves.

Note: If you have an even number of values, so that the first median was the average of the two middle values, then you include those middle values in your sub-median computations. If you have an odd number of values, so that the first median was an actual data point, then you do not include that value in your sub-median computations. That is, to find the sub-medians, you look only at the values that haven't yet been used.

You now have three points: the first middle point (the median), and the middle points of the two halves (what I call the "sub-medians").

These three points divide the entire data set into quarters, called "quartiles". The top point of each quartile has a name, being a "Q" followed by the number of the quarter. So the top point of the first quarter of the data points is "Q1", and so forth. Note that Q1 is also the middle number for the first half of the list, Q2 is also the middle number for the whole list, Q3 is the middle number for the second half of the list, and Q4 is the largest value in the list.

Once you have these three points, Q1, Q2, and Q3, you have all you need in order to draw a simple box-and-whisker plot. Here's an example of how it works.

Draw a box-and-whisker plot for the following data set:

4.3, 5.1, 3.9, 4.5, 4.4, 4.9, 5.0, 4.7, 4.1, 4.6, 4.4, 4.3, 4.8, 4.4, 4.2, 4.5, 4.4

My first step is to order the set. This gives me:

3.9, 4.1, 4.2, 4.3, 4.3, 4.4, 4.4, 4.4, 4.4, 4.5, 4.5, 4.6, 4.7, 4.8, 4.9, 5.0, 5.1

The first number I need is the median of the entire set. Since there are seventeen values in this list, I need the ninth value, which is 4.4. The median is Q2 = 4.4.

The next two numbers I need are the medians of the two halves. Since I used the "4.4" in the middle of the list, I can't re-use it, so my two remaining data sets are:

3.9, 4.1, 4.2, 4.3, 4.3, 4.4, 4.4, 4.4

and

4.5, 4.5, 4.6, 4.7, 4.8, 4.9, 5.0, 5.1
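The quartile procedure described above (order the data, take the median, then take the medians of the two halves) can be sketched in Python as follows. This is an illustrative sketch only: the function name quartiles is a hypothetical helper, and it applies the rule stated in the text, where the overall median is left out of both halves when the number of values is odd. Other quartile conventions exist and can give slightly different values.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves rule from the text.

    Hypothetical helper for illustration; not a library function.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two values to form two halves")
    half = n // 2
    q2 = median(ordered)
    lower = ordered[:half]
    # Odd count: the median is an actual data point, so skip it.
    # Even count: the two middle values fall one into each half.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), q2, median(upper)


data = [4.3, 5.1, 3.9, 4.5, 4.4, 4.9, 5.0, 4.7, 4.1,
        4.6, 4.4, 4.3, 4.8, 4.4, 4.2, 4.5, 4.4]
q1, q2, q3 = quartiles(data)
print(q1, q2, q3)
```

For the seventeen-value data set in the example, Q2 comes out as 4.4, as in the text. Applying the same rule to the two eight-value halves gives Q1 as the average of 4.3 and 4.3, and Q3 as the average of 4.7 and 4.8.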
