
Calculating Central Tendency

Let's say that we recruited a sample of 16 children and measured their weight. The data are
shown below.
ID #   Weight (lbs)
  1       82.0
  2       70.8
  3      123.5
  4       66.8
  5       66.8
  6       67.0
  7       58.0
  8       67.0
  9       56.0
 10       52.0
 11       77.5
 12      115.8
 13       74.8
 14       85.5
 15      119.0
 16       51.0

Mean: $\bar{x} = \frac{\sum x}{n}$

where $\sum x$ is the sum of the x's (in this case each x is one child's weight) and n is the
number of observations (usually the number of people).
Mean = 1233.3 / 16 = 77.1
Mode = the most frequently occurring score. In this case, the
distribution is bimodal with modes of 66.8 and 67.0.
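
The mean and mode above can be checked with a short Python sketch. One assumption is made here: the d and x² columns later in this handout are consistent with 70.8, 66.8, 115.8 and 74.8 being rounded displays of 70.75, 66.75, 115.75 and 74.75, so the list below uses those unrounded values. The variable names are illustrative only.

```python
from statistics import multimode

# Weights in ID order (ID #1 to ID #16).
# ASSUMPTION: 70.75, 66.75, 115.75 and 74.75 are the unrounded values behind
# the displayed 70.8, 66.8, 115.8 and 74.8.
weights = [82.0, 70.75, 123.5, 66.75, 66.75, 67.0, 58.0, 67.0,
           56.0, 52.0, 77.5, 115.75, 74.75, 85.5, 119.0, 51.0]

total = sum(weights)        # sum of the x's
n = len(weights)            # number of observations
x_bar = total / n           # mean = (sum of x) / n
modes = multimode(weights)  # every value tied for most frequent

print(f"sum = {total}, n = {n}, mean = {x_bar:.1f}")
print("modes:", modes)
```

Under that assumption the two modes come out as 66.75 and 67.0, which the table displays as 66.8 and 67.0.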
Median (sometimes called the 50th percentile or P50) is the score that
holds the middle position. To determine the median, you need to
arrange the scores in ascending or descending order.

Now that the data are arranged in order, we need to find the middle position. The dashed line
below indicates where the middle is.

ID #   Weight (lbs)
  3      123.5
 15      119.0
 12      115.8
 14       85.5
  1       82.0
 11       77.5
 13       74.8
  2       70.8
--------------------
  6       67.0
  8       67.0
  4       66.8
  5       66.8
  7       58.0
  9       56.0
 10       52.0
 16       51.0

With 16 scores, no single score sits exactly in the middle. If we choose 70.8 as the median,
then 7 numbers fall above it and 8 numbers fall below it. By definition, the P50 represents the
position that is in the middle, so 70.8 does not qualify. If we choose 67.0, this again does not
satisfy the definition of the middle.

The median in this case is the point that sits between 70.8 and 67.0. If we calculate the
average of these 2 middle numbers, we get 68.9. Therefore, the median of this set of data
is 68.9. Notice that if there were an odd number of scores, there would be a number in the
dataset that sits in the middle. For example, pretend that ID #16 was not included, so that the
dataset ranged from 52.0 to 123.5 with only 15 children. The median in this case would be 70.8
(with 7 scores falling above it and 7 scores falling below it).
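
As a rough check of the even and odd cases just described, here is a minimal Python sketch. It assumes, as the d and x² columns later in this handout suggest, that 70.8, 66.8, 115.8 and 74.8 are rounded displays of 70.75, 66.75, 115.75 and 74.75. The names are illustrative only.

```python
from statistics import median

# Weights in ID order (ID #1 to ID #16).
# ASSUMPTION: the .75 values are the unrounded versions of the displayed
# 70.8, 66.8, 115.8 and 74.8.
weights = [82.0, 70.75, 123.5, 66.75, 66.75, 67.0, 58.0, 67.0,
           56.0, 52.0, 77.5, 115.75, 74.75, 85.5, 119.0, 51.0]

ordered = sorted(weights, reverse=True)   # descending, as in the table
n = len(ordered)

# Even n: average the two scores on either side of the middle position.
upper_middle = ordered[n // 2 - 1]        # 8th score from the top
lower_middle = ordered[n // 2]            # 9th score from the top
median_even = (upper_middle + lower_middle) / 2

# Odd n: drop ID #16 (the last entry) and take the single middle score.
without_16 = weights[:-1]
median_odd = median(without_16)

print(f"median of 16 scores = {median_even:.1f}")
print(f"median of 15 scores = {median_odd}")
```

The library function `statistics.median` applies the same averaging rule for an even number of scores, so it can replace the manual indexing.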

Calculating Measures of Variability


ID #   Weight (lbs)
  3      123.5
 15      119.0
 12      115.8
 14       85.5
  1       82.0
 11       77.5
 13       74.8
  2       70.8
  6       67.0
  8       67.0
  4       66.8
  5       66.8
  7       58.0
  9       56.0
 10       52.0
 16       51.0

The Range = the highest score minus the lowest score. In this case the range is
123.5 - 51.0 = 72.5. This is a crude measure of variability that depends on the value of only
2 numbers. A better measure of variability is called the standard deviation.
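
The range calculation is small enough to sketch directly in Python, using the weights as displayed in the table. The last lines illustrate why the range is crude: the 14 scores between the extremes have no influence on it. The names are illustrative only.

```python
# Weights in ID order, as displayed in the table.
weights = [82.0, 70.8, 123.5, 66.8, 66.8, 67.0, 58.0, 67.0,
           56.0, 52.0, 77.5, 115.8, 74.8, 85.5, 119.0, 51.0]

highest = max(weights)
lowest = min(weights)
score_range = highest - lowest    # range = highest score - lowest score

# Keeping only the two extreme scores gives the same range.
extremes_only = [lowest, highest]
range_of_extremes = max(extremes_only) - min(extremes_only)

print(f"range = {highest} - {lowest} = {score_range}")
```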

The Standard Deviation: $s = \sqrt{\frac{\sum d^2}{N-1}}$

Notice that the (nasty looking) equation involves the deviation (d), which is how far each x
varies from the mean. Below I have calculated the d for each child's score. Also, notice that
when you do this, you get some positives and some negatives. If you were to add up the column
of d's, you would get zero. I got very nearly zero because I rounded my d's to two digits, but
if I hadn't done that, the sum of d would be zero.

where $d = x - \bar{x}$

ID #   Weight (lbs)       d
  3      123.5          46.42
 15      119.0          41.92
 12      115.8          38.67
 14       85.5           8.42
  1       82.0           4.92
 11       77.5           0.42
 13       74.8          -2.33
  2       70.8          -6.33
  6       67.0         -10.08
  8       67.0         -10.08
  4       66.8         -10.33
  5       66.8         -10.33
  7       58.0         -19.08
  9       56.0         -21.08
 10       52.0         -25.08
 16       51.0         -26.08
sum=    1233.3          -0.03
mean=     77.1
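
The claim that the d's add up to zero, and to very nearly zero after rounding, can be checked with a short Python sketch. It assumes that 70.8, 66.8, 115.8 and 74.8 are rounded displays of 70.75, 66.75, 115.75 and 74.75, which is what the d column implies (for example, 115.75 - 77.08 = 38.67). The names are illustrative only.

```python
# Weights in ID order (ID #1 to ID #16).
# ASSUMPTION: the .75 values are the unrounded versions of the displayed
# 70.8, 66.8, 115.8 and 74.8.
weights = [82.0, 70.75, 123.5, 66.75, 66.75, 67.0, 58.0, 67.0,
           56.0, 52.0, 77.5, 115.75, 74.75, 85.5, 119.0, 51.0]

x_bar = sum(weights) / len(weights)          # mean of the weights

deviations = [x - x_bar for x in weights]    # d = x - x_bar for each child
rounded = [round(d, 2) for d in deviations]  # d rounded to two digits

sum_exact = sum(deviations)     # zero when nothing is rounded
sum_rounded = sum(rounded)      # small rounding leftover

print(f"sum of unrounded d's = {sum_exact:.2f}")
print(f"sum of rounded d's   = {sum_rounded:.2f}")
```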

The important thing to recognize is that this equation basically gives you the average
deviation from the mean (looks similar to the equation for the mean, doesn't it?). But the
problem is that when you add up the d's you get zero! So, some clever person realized that if
you square each d, you keep the idea of a deviation intact but you eliminate the negative sign.
Clearly, this person had too much time on their hands!
So, the 46.42 gets squared (multiplied by itself) to give 2154.82. Repeat this for each d and
then add the column. This is $\sum d^2$, which is the numerator. (Note that this is different
from $(\sum d)^2$, where you would add the d's and then square...)
The denominator of the equation is N-1, similar to the denominator for the mean.
Now, notice that the whole equation gets square-rooted (to un-do the squaring of the d's that
we did earlier).
To avoid the most common mistakes: square each d and then add the squared d's (don't try to
add the d's and then square), and don't forget to take the square root of the result of the
division.

ID #   Weight (lbs)       d          d²
  3      123.5          46.42     2154.82
 15      119.0          41.92     1757.29
 12      115.8          38.67     1495.37
 14       85.5           8.42       70.90
  1       82.0           4.92       24.21
 11       77.5           0.42        0.18
 13       74.8          -2.33        5.43
  2       70.8          -6.33       40.07
  6       67.0         -10.08      101.61
  8       67.0         -10.08      101.61
  4       66.8         -10.33      106.71
  5       66.8         -10.33      106.71
  7       58.0         -19.08      364.05
  9       56.0         -21.08      444.37
 10       52.0         -25.08      629.01
 16       51.0         -26.08      680.17
sum=    1233.3          -0.03     8082.46
mean=     77.1

ID #   Weight (lbs)        x²
  3      123.5          15252.25
 15      119.0          14161.00
 12      115.8          13398.06
 14       85.5           7310.25
  1       82.0           6724.00
 11       77.5           6006.25
 13       74.8           5587.56
  2       70.8           5005.56
  6       67.0           4489.00
  8       67.0           4489.00
  4       66.8           4455.56
  5       66.8           4455.56
  7       58.0           3364.00
  9       56.0           3136.00
 10       52.0           2704.00
 16       51.0           2601.00
sum=    1233.3         103139.1
mean=     77.1

$s = \sqrt{\frac{\sum d^2}{N-1}} = \sqrt{\frac{8082.46}{15}} = \sqrt{538.83} = 23.2$ lbs
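
The same steps (square each d, add the squares, divide by N-1, take the square root) can be sketched in Python and compared with the standard library's sample standard deviation. The sketch assumes that 70.8, 66.8, 115.8 and 74.8 are rounded displays of 70.75, 66.75, 115.75 and 74.75, as the x² column implies (115.75 squared is 13398.06). The names are illustrative only.

```python
from math import sqrt
from statistics import stdev

# Weights in ID order (ID #1 to ID #16).
# ASSUMPTION: the .75 values are the unrounded versions of the displayed
# 70.8, 66.8, 115.8 and 74.8.
weights = [82.0, 70.75, 123.5, 66.75, 66.75, 67.0, 58.0, 67.0,
           56.0, 52.0, 77.5, 115.75, 74.75, 85.5, 119.0, 51.0]

n = len(weights)
x_bar = sum(weights) / n

# Numerator: square each deviation first, then add the squares.
sum_d2 = sum((x - x_bar) ** 2 for x in weights)

variance = sum_d2 / (n - 1)   # divide by N - 1
s = sqrt(variance)            # un-do the squaring

print(f"sum of d squared = {sum_d2:.2f}")
print(f"sum / (N - 1)    = {variance:.2f}")
print(f"s                = {s:.1f} lbs")
print(f"statistics.stdev = {stdev(weights):.1f} lbs")
```

`statistics.stdev` also divides by N-1, so it should agree with the hand calculation.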

Interpretation: This group of children weighs 77.1 ± 23.2 lbs. Assuming the weights are
normally distributed, this means that about 68% of the scores for children in this population
fall between 53.9 lbs ($\bar{x} - 1s$) and 100.3 lbs ($\bar{x} + 1s$). About 95% of children in
this population fall between 30.7 lbs ($\bar{x} - 2s$) and 123.5 lbs ($\bar{x} + 2s$), with
only 5% falling outside of this range (2.5% above 123.5 lbs and 2.5% below 30.7 lbs).
See the figure below for this example applied to a normal curve.
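
The interval endpoints follow directly from the rounded mean and standard deviation. The sketch below recomputes them and, as an informal check that is not part of the original handout, counts how many of the 16 displayed sample weights fall inside the 1s band. The names are illustrative only.

```python
x_bar = 77.1   # mean, as reported above
s = 23.2       # standard deviation, as reported above

# Bands of 1 and 2 standard deviations around the mean, rounded to 0.1 lb.
bands = {}
for k in (1, 2):
    low = round(x_bar - k * s, 1)
    high = round(x_bar + k * s, 1)
    bands[k] = (low, high)
    print(f"mean +/- {k}s: {low} to {high} lbs")

# Informal check against the sample itself (weights as displayed in the table).
weights = [82.0, 70.8, 123.5, 66.8, 66.8, 67.0, 58.0, 67.0,
           56.0, 52.0, 77.5, 115.8, 74.8, 85.5, 119.0, 51.0]
low1, high1 = bands[1]
inside_1s = [w for w in weights if low1 <= w <= high1]
share_1s = len(inside_1s) / len(weights)

print(f"{len(inside_1s)} of {len(weights)} sample weights are within 1s")
```

With only 16 children the sample share will not match 68% exactly, but it gives a feel for what the band means.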

There is another formula for standard deviation that you should know.

$s = \sqrt{\frac{N\sum x^2 - (\sum x)^2}{N(N-1)}}$

This formula is equivalent to the one given above. This will be hard to believe by the looks
of this formula, but back when people did calculations by hand, this formula was easier to use,
which is why someone developed it! There are no deviations and negative numbers to deal with,
and that makes it easier to use... Anyway, this will give you the same result as above.


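$N\sum x^2$ means that you square each x, sum the squared x's, and multiply by N, the number
of observations.
$(\sum x)^2$ means you add the x's and then square. Do these two expressions first and then
subtract to get the numerator. Don't forget to square root at the end. Here's what the
calculation looks like:

$\sum x^2 = 103139.1$ (see table above)
$(\sum x)^2 = (1233.3)^2 = 1521028.9$
The numerator = $(16 \times 103139.1) - (1233.3)^2 = 129196.7$
The denominator = $16(16-1) = 16 \times 15 = 240$
$s = \sqrt{129196.7 / 240} = \sqrt{538.32} = 23.2$ lbs

(The value under the square root is 538.32 here rather than 538.83 only because the sums were
rounded to one decimal before being plugged in. Without that rounding the two formulas agree.)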
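
To see that the two formulas agree, here is a minimal Python sketch that computes s both ways from the same list. It assumes that 70.8, 66.8, 115.8 and 74.8 are rounded displays of 70.75, 66.75, 115.75 and 74.75, as the x² column implies. Because it works from unrounded sums, its intermediate numerator differs slightly from the hand calculation, which plugs in sums rounded to one decimal. The names are illustrative only.

```python
from math import sqrt

# Weights in ID order (ID #1 to ID #16).
# ASSUMPTION: the .75 values are the unrounded versions of the displayed
# 70.8, 66.8, 115.8 and 74.8.
weights = [82.0, 70.75, 123.5, 66.75, 66.75, 67.0, 58.0, 67.0,
           56.0, 52.0, 77.5, 115.75, 74.75, 85.5, 119.0, 51.0]

n = len(weights)
sum_x = sum(weights)                   # add the x's
sum_x2 = sum(x * x for x in weights)   # square each x, then add

# Computational formula: no deviations needed.
numerator = n * sum_x2 - sum_x ** 2
denominator = n * (n - 1)
s_computational = sqrt(numerator / denominator)

# Definitional formula, for comparison.
x_bar = sum_x / n
s_definitional = sqrt(sum((x - x_bar) ** 2 for x in weights) / (n - 1))

print(f"sum of x squared = {sum_x2:.1f}")
print(f"computational s  = {s_computational:.1f} lbs")
print(f"definitional s   = {s_definitional:.1f} lbs")
```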

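[Figure: normal curve of Score Frequency against Weight Scores (lbs). The horizontal axis is
marked at 7.5 (-3s), 30.7 (-2s), 53.9 (-1s), 77.1 (mean), 100.3 (+1s), 123.5 (+2s), and
146.7 (+3s).]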


By definition, plus or minus 1 standard deviation (± 1s) around the mean of a normal curve
includes 68% of the population (if the sample that was measured is a representative sample).
If the standard deviation were a smaller number, the distance between the mean and 1s would be
narrower and the curve more leptokurtic (or peaked). It would take a smaller range to include
68% of the scores. If the s were a larger number, then the distance between the mean and 1s
would be wider and the curve more platykurtic (or a flatter peak).
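
The relationship between s and the width of the 68% band can be illustrated with Python's `statistics.NormalDist`, assuming a normal curve centred on the mean of 77.1 lbs. The values 10.0 and 40.0 are hypothetical standard deviations chosen only for comparison; 23.2 is the value calculated above.

```python
from statistics import NormalDist

x_bar = 77.1   # mean, as reported above

results = []
# 23.2 is the handout's s; 10.0 and 40.0 are HYPOTHETICAL values for contrast.
for s in (10.0, 23.2, 40.0):
    curve = NormalDist(mu=x_bar, sigma=s)
    # Share of the population between mean - 1s and mean + 1s.
    share = curve.cdf(x_bar + s) - curve.cdf(x_bar - s)
    width = 2 * s   # lbs needed to cover that share
    results.append((s, width, share))
    print(f"s = {s:5.1f}: band is {width:5.1f} lbs wide, covers {share:.1%}")
```

The share inside 1s stays at roughly 68% whatever the value of s; only the width of the band in pounds changes.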

[Figure: a leptokurtic (peaked) curve and a platykurtic (flatter) curve.]
