Measure # 1. Range:
Range is the interval between the highest and the lowest score. Range
is a measure of variability or scatteredness of the variates or
observations among themselves and does not give an idea about the
spread of the observations around some central value.
Example 1:
The scores of ten boys in a test are:
17, 23, 30, 36, 45, 51, 58, 66, 72, 77.
Example 2:
The scores of ten girls in a test are:
48, 49, 51, 52, 55, 57, 50, 59, 61, 62.
In Example 1 the highest score is 77 and the lowest score is 17.
So the range is the difference between these two scores:
Range = 77 – 17 = 60
In a similar way, in Example 2:
Range = 62 – 48 = 14
Here we find that the scores of the boys are widely scattered, whereas
the scores of the girls vary much less. Thus the variability of the
scores of the boys is greater than the variability of the scores of the
girls.
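The comparison above can be checked with a few lines of Python. This is only a minimal sketch; the function name `score_range` is chosen here for illustration.

```python
def score_range(scores):
    """Range = highest score minus lowest score."""
    return max(scores) - min(scores)

boys = [17, 23, 30, 36, 45, 51, 58, 66, 72, 77]
girls = [48, 49, 51, 52, 55, 57, 50, 59, 61, 62]

boys_range = score_range(boys)    # 77 - 17
girls_range = score_range(girls)  # 62 - 48
print(boys_range, girls_range)    # prints: 60 14
```

Note that only two scores in each list influence the result, which is why the range reacts so strongly to a single extreme value.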
Solution:
In this case, the upper true limit of the highest class 70-79 is Hs = 79.5
and the lower true limit of the lowest class 20-29 is Ls = 19.5.
Therefore, Range R = Hs – Ls = 79.5 – 19.5 = 60
Advantages:
1. Range can be calculated quite easily.
Limitations:
1. Range is not based on all the observations of the series. It takes into
account only the most extreme cases.
3. The range takes into account the two extreme scores in a series.
Thus when N is small or when there are large gaps in the frequency
distribution, range as a measure of variability is quite unreliable.
Example 4:
Here range = 33 – 3 = 30
Measure # 2. Quartile Deviation:
Range is the interval or distance on the scale of measurement which
includes 100 percent cases. The limitations of the range are due to its
dependence on the two extreme values only.
For example, suppose a test yields 20 scores and these scores are
arranged in descending order. Let us divide the distribution of scores
into four equal parts. Each part represents a ‘quarter’. In each quarter
there will be 25% (or 1/4th of N) of the cases.
The value of the item which divides the first half of a series (with
values less than the value of the median) into two equal parts is called
the First Quartile (Q1) or the Lower Quartile. In other words, Q1 is a
point below which 25% of cases lie. Q1 is the 25th percentile.
The Second Quartile (Mdn) or the Middle Quartile is the median. In
other words, it is a point below which 50% of the scores lie. The median
is the 50th percentile.
The value of the item which divides the latter half of the series (with
values more than the value of the median) into two equal parts is
called the Third Quartile (Q3) or the Upper Quartile. In other words,
Q3 is a point below which 75% of the scores lie. Q3 is the 75th
percentile.
Note:
A student must clearly distinguish between a quarter and a quartile.
Quarter is a range; but quartile is a point on the scale. Quarters are
numbered from top to bottom (or from highest score to lowest score),
but quartiles are numbered from the bottom to the top.
L = Lower limit of the c.i. where Q1 lies,
N/4 = One fourth (or 25%) of N,
Thus, S.I.R. = (Q3 – Q1)/2
Q or Quartile Deviation is otherwise known as the semi-interquartile
range (or S.I.R.).
Thus, Q = (Q3 – Q1)/2
If we compare the formulas of Q3 and Q1 with the formula of the
median, the following observations will be clear:
i. In case of Median we use N/2 whereas for Q1 we use N/4 and for
Q3 we use 3N/4.
ii. In case of median we use fm to denote the frequency of c.i., upon
which median lies; but in case of Q1 and Q3 we use fq to denote the
frequency of the c.i. upon which Q1 or Q3 lies.
Computation of Q (Ungrouped Data):
In order to calculate Q we are required to calculate Q3 and Q1 first.
Q1 and Q3 are calculated in the same manner as the median.
The only differences are:
(i) in case of median we were counting 50% cases (N/2) from the
bottom, but
(ii) in case of Q1 we have to count 25% of cases (or N/4) from the
bottom and
(iii) in case of Q3 we have to count 75% of cases (or 3N/4) from the
bottom.
Example 5:
Find Q for the following scores: 20, 21, 22, 23, 24, 25, 26, 27, 28, 29,
30, 31, 32, 33, 34, 35, 36, 37, 38, 39.
25% of N = 20/4 = 5
Thus Q3 = 34.5.
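The counting procedure for ungrouped data can be sketched in Python as follows. The sketch assumes that N is a multiple of 4, so that each quartile falls exactly on the boundary between two adjacent scores; the function name `quartile_deviation` is illustrative only.

```python
def quartile_deviation(scores):
    """Return (Q1, Q3, Q) by counting N/4 and 3N/4 cases from the bottom.

    Assumption: N is divisible by 4, so each quartile is the midpoint
    between two adjacent ordered scores (the shared true limit).
    """
    ordered = sorted(scores)
    n = len(ordered)
    if n % 4 != 0:
        raise ValueError("this sketch assumes N is divisible by 4")
    k1 = n // 4        # number of cases lying below Q1
    k3 = 3 * n // 4    # number of cases lying below Q3
    q1 = (ordered[k1 - 1] + ordered[k1]) / 2
    q3 = (ordered[k3 - 1] + ordered[k3]) / 2
    return q1, q3, (q3 - q1) / 2

scores = list(range(20, 40))   # the 20 scores of Example 5: 20, 21, ..., 39
q1, q3, q = quartile_deviation(scores)
print(q1, q3, q)               # prints: 24.5 34.5 5.0
```

Under this assumption Q1 comes out as 24.5 and Q3 as 34.5, so Q = (34.5 – 24.5)/2 = 5.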
In a symmetrical distribution, the median lies halfway on the scale
between Q1 and Q3. Therefore, the value Q1 + Q or Q3 – Q gives the value
of the median. But, generally, distributions are not symmetrical and so
Q1 + Q or Q3 – Q would not give the value of the median.
Computation of Q (Grouped Data):
Example 6:
The scores obtained by 36 students in a test are shown in the table.
Find the quartile deviation of the scores.
Here N = 36, so for Q1 we have to take N/4 = 36/4 = 9 cases and for
Q3 we have to take 3N/4 = 3 × 36/4 = 27 cases. Looking at column 3,
cf = 9 is included in c.i. 55 – 59, whose actual limits are 54.5 –
59.5. So Q1 lies in the interval 54.5 – 59.5.
The value of Q1 is to be computed as follows:
For calculating Q3, cf = 27 is included in c.i. 65 – 69,
whose actual limits are 64.5 – 69.5. So Q3 lies in the
interval 64.5 – 69.5 and its value is to be computed as
follows:
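The interpolation for grouped data can be sketched in Python. The sketch uses the usual form L + ((target – F)/fq) × i, where F (cumulative frequency below the class) and i (class width) are notation assumed here. The frequency table is hypothetical; it is not the table of Example 6 and was only chosen so that N = 36 and the quartiles fall in the same class intervals.

```python
def grouped_quantile(classes, target):
    """Interpolate the point below which `target` cases lie.

    classes: list of (lower_true_limit, width, frequency), lowest class first.
    Applies L + ((target - F) / fq) * i, where F is the cumulative
    frequency below the class that contains the point.
    """
    cum = 0
    for lower, width, freq in classes:
        if freq > 0 and cum + freq >= target:
            return lower + (target - cum) / freq * width
        cum += freq
    raise ValueError("target exceeds N")

# Hypothetical table (NOT the one from Example 6):
# (true lower limit, class width, frequency)
classes = [(44.5, 5, 2), (49.5, 5, 4), (54.5, 5, 8),
           (59.5, 5, 9), (64.5, 5, 7), (69.5, 5, 6)]

n = sum(f for _, _, f in classes)            # N = 36
q1 = grouped_quantile(classes, n / 4)        # 9 cases from the bottom
q3 = grouped_quantile(classes, 3 * n / 4)    # 27 cases from the bottom
q = (q3 - q1) / 2
print(round(q1, 2), round(q3, 2), round(q, 2))
```

With these made-up frequencies, Q1 = 54.5 + (9 – 6)/8 × 5 and Q3 = 64.5 + (27 – 23)/7 × 5; a different table would of course give different values.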
Sometimes the extreme cases or values are not known, in which case
the only alternative available to us is to compute the median and the
quartile deviation as the measures of central tendency and dispersion.
Through the median and quartiles we can draw inferences about the
symmetry or skewness of the distribution. Let us, therefore, get some
idea of symmetrical and skewed distributions.
Example 7:
Find whether the given distribution is symmetrical or not.
The skewness is said to be positive if the longer tail is on the right side
and it is said to be negative if the longer tail is on the left side.
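One common way to judge symmetry from the quartiles is to compare the distances Q3 – Mdn and Mdn – Q1. The sketch below assumes that rule; the function name and the sample values are illustrative and do not come from Example 7.

```python
def quartile_skew(q1, median, q3):
    """Classify a distribution by the distances of Q1 and Q3 from the median."""
    upper = q3 - median   # distance from the median up to Q3
    lower = median - q1   # distance from Q1 up to the median
    if upper == lower:
        return "symmetrical"
    return "positively skewed" if upper > lower else "negatively skewed"

print(quartile_skew(24.5, 29.5, 34.5))  # equal distances
print(quartile_skew(10, 12, 20))        # longer stretch above the median
print(quartile_skew(10, 18, 20))        # longer stretch below the median
```

In practice the two distances are rarely exactly equal, so a small difference is usually read as approximate symmetry.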
Limitations of Q:
1. Like the median, quartile deviation is not amenable to algebraic
treatment, as it does not take into consideration all the values of
the distribution.
2. It only calculates the third and the first quartile and tells us about
the range between them. From ‘Q’ we cannot get a true picture of how the
scores are dispersed from the central value. That is, ‘Q’ does not give
us any idea about the composition of the scores. The ‘Q’ of two series
may be equal, yet the series may be quite dissimilar in composition.
4. It ignores the scores above the third quartile and the scores below
the first quartile. It simply tells us about the middle 50% of the
distribution.
Uses of Q:
1. When the median is a measure of a central tendency;
It is given by:
Coefficient of quartile deviation = (Q3 – Q1)/(Q3 + Q1)
Where Q3 and Q1 refer to upper and lower quartiles respectively.
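As a small illustration, the coefficient can be computed in Python. Q3 = 34.5 is the value from Example 5; Q1 = 24.5 is an assumption here, obtained by counting 25% of the cases from the bottom in the same way.

```python
def coefficient_of_quartile_deviation(q1, q3):
    """(Q3 - Q1) / (Q3 + Q1): a unit-free, relative measure of dispersion."""
    return (q3 - q1) / (q3 + q1)

coeff = coefficient_of_quartile_deviation(24.5, 34.5)
print(round(coeff, 4))
```

Because the numerator and denominator carry the same units, the result can be compared across distributions measured on different scales.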
Measure # 3. Average Deviation (A.D.) or Mean Deviation (M.D.):
As we have already discussed, the range and ‘Q’ give us only a rough
idea of variability. The range of two series may be the same, or the
quartile deviation of two series may be the same, yet the two series
may be dissimilar. Neither the range nor ‘Q’ tells us about the
composition of the series. These two measures do not take into
consideration the individual scores.
Where ∑ means ‘the sum of’;
And ‘d’ means the deviation of individual scores from the mean.
Example 9:
Find the mean deviation for the scores given below:
25, 36, 18, 29, 30, 41, 49, 26, 16, 27
The mean of the above scores was found to be 29.7.
Note:
If you apply some algebra, you can see that ∑ (X – M) is zero.
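The computation for Example 9 can be sketched in Python. It takes the mean deviation as the sum of the absolute deviations divided by N, and also checks that the signed deviations cancel out; the variable names are illustrative.

```python
scores = [25, 36, 18, 29, 30, 41, 49, 26, 16, 27]

n = len(scores)
mean = sum(scores) / n                        # M = 29.7
signed = [x - mean for x in scores]           # d = X - M, signs kept
mean_deviation = sum(abs(d) for d in signed) / n   # sum of |d| divided by N

print(round(mean, 1), round(mean_deviation, 2))
print(abs(sum(signed)) < 1e-9)   # the signed deviations sum to (about) zero
```

Because the signed deviations always sum to zero, the signs must be dropped (or the deviations squared) before averaging; otherwise every distribution would show zero dispersion.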
Here, in column 1, we write the c.i.’s; in column 2, the corresponding
frequencies; in column 3, the mid-points of the c.i.’s, denoted by ‘X’;
in column 4, the product of frequencies and mid-points, denoted by fX;
in column 5, the absolute deviations of the mid-points from the mean,
denoted by |d|; and in column 6, the product of absolute deviations and
frequencies, denoted by f|d|.
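The six-column layout can be reproduced in a short Python sketch. The three class intervals and their frequencies are hypothetical and serve only to show how each column is built.

```python
# Hypothetical grouped data: (class interval, frequency f, mid-point X)
table = [("10-14", 2, 12), ("15-19", 3, 17), ("20-24", 5, 22)]

n = sum(f for _, f, _ in table)                  # N = total of column 2
mean = sum(f * x for _, f, x in table) / n       # M = sum(fX) / N

rows = []
for ci, f, x in table:
    abs_d = abs(x - mean)                        # column 5: |d|
    rows.append((ci, f, x, f * x, abs_d, f * abs_d))   # columns 1 to 6

mean_deviation = sum(r[5] for r in rows) / n     # sum(f|d|) / N
for r in rows:
    print(r)
print(mean, mean_deviation)                      # prints: 18.5 3.5
```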
4. It is the average of the deviations of individual scores from the
mean.
Limitations:
1. Mean deviation ignores the algebraic signs of the deviations and as
such it is not capable of further mathematical treatment. So, it is used
only as a descriptive measure of variability.
Uses of M.D:
1. When it is desired to weigh all the deviations according to their size.
depends on it. For a smaller number of cases, the measure is likely to be
larger.
In the first case, the mean deviation is almost 25% of the mean, while in
the second case it is less. But the mean deviation may be larger in the
first case because of the smaller number of cases. So the two mean
deviations computed above indicate almost similar dispersion.
Here also, the deviations of all the values from the mean of the
distribution are considered. This measure suffers from the least
drawbacks and provides accurate results.
i. In computing AD or MD, we disregard signs, whereas in finding SD
we avoid the difficulty of signs by squaring the separate deviations;
So S.D. is also called the ‘root mean square deviation from the mean’ and
is generally denoted by the small Greek letter σ (sigma).
(Some authors use ‘x’ as the deviation of individual scores from the
mean)
The mean of the squared deviations is referred to as the variance.
In simple words, the square of the standard deviation is called the
Second Moment of Dispersion or Variance.
Computation of S.D. (Ungrouped data):
There are two ways of computing S.D. for ungrouped data:
(a) Direct method.
Step 1:
Calculate arithmetic mean of the given data:
Step 2:
Write the value of the deviation d i.e. X – M against each score in
column 2. Here the deviations of scores are to be taken from 12. Now
you will find that ∑d or ∑ (X – M) is equal to zero. Think, why is it so?
Check it. If this is not so, find out the error in computation and rectify
it.
Step 3:
Square the deviations and write the value of d² against each score in
column 3. Find the sum of squared deviations: ∑d² = 84.
Table 4.5 Computation of S.D:
Step 4:
Calculate the mean of the squared deviations and then find out the
positive square root for getting the value of standard deviation i.e. σ.
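The four steps of the direct method can be sketched in Python. The scores used here are illustrative and are not those of Table 4.5; following the steps above, the sum of squared deviations is divided by N.

```python
import math

def sd_direct(scores):
    """Standard deviation by the direct method (divisor N)."""
    n = len(scores)
    mean = sum(scores) / n                  # Step 1: arithmetic mean M
    d = [x - mean for x in scores]          # Step 2: deviations d = X - M
    if abs(sum(d)) > 1e-9:                  # check: the deviations sum to zero
        raise ArithmeticError("deviations do not sum to zero")
    sum_d2 = sum(v * v for v in d)          # Step 3: sum of squared deviations
    return math.sqrt(sum_d2 / n)            # Step 4: positive square root

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative scores, mean 5
sigma = sd_direct(scores)
print(sigma)                        # prints: 2.0
```

Here the squared deviations sum to 32, their mean is 4, and the positive square root gives σ = 2.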
To facilitate computation in such situations, the deviations may be
taken from an assumed mean. The adjusted short-cut formula for
calculating S.D. will then be,
where,
Solution:
Let us take assumed mean AM = 11.
Putting the values from table in formula, the S.D.
Here also, the first step is to find the mean M, for which we have to
take the mid-points of the c.i.’s, denoted by X’, and find the products
fX’. The mean is given by ∑fX’/N. The second step is to find the
deviations of the mid-points of class intervals X’ from the mean, i.e.
X’ – M, denoted by d.
The third step is to square the deviations and find the product of the
squared deviations and the corresponding frequency.
Sometimes, in the direct method, the deviations from the actual mean
result in decimals, and the values of d² and fd² are difficult to
calculate. In order to avoid this problem we follow a short-cut method
for calculating the standard deviation.
In this method, instead of taking the deviations from the actual mean,
we take deviations from a suitably chosen assumed mean, say A.M.
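A minimal Python sketch of the short-cut idea follows. It assumes the usual correction form σ = √[∑fd²/N – (∑fd/N)²] with d = X – AM, and compares the result with the direct method. The mid-points and frequencies are hypothetical, not those of Table 4.7.

```python
import math

def sd_assumed_mean(midpoints, freqs, assumed_mean):
    """sigma = sqrt(sum(f*d^2)/N - (sum(f*d)/N)^2), with d = X - AM."""
    n = sum(freqs)
    d = [x - assumed_mean for x in midpoints]
    sum_fd = sum(f * v for f, v in zip(freqs, d))
    sum_fd2 = sum(f * v * v for f, v in zip(freqs, d))
    return math.sqrt(sum_fd2 / n - (sum_fd / n) ** 2)

def sd_from_actual_mean(midpoints, freqs):
    """Direct method on grouped data, for comparison."""
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, midpoints)) / n
    return math.sqrt(sum(f * (x - mean) ** 2
                         for f, x in zip(freqs, midpoints)) / n)

midpoints = [4, 7, 10, 13, 16]   # hypothetical mid-points
freqs = [1, 3, 4, 1, 1]          # hypothetical frequencies, N = 10
shortcut = sd_assumed_mean(midpoints, freqs, assumed_mean=10)
direct = sd_from_actual_mean(midpoints, freqs)
print(round(shortcut, 4), round(direct, 4))
```

Both routes give the same σ. The second term under the root corrects for the fact that the assumed mean (10) differs from the actual mean (9.4 in this made-up table).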
Example 13:
Using short-cut method find S.D. of the data in table 4.7.
Solution:
Let us take assumed mean AM = 10. Other calculations needed for
calculating S.D. are given in table 4.8.
Putting values from table
Here, the Assumed Mean is the mid-point of the c.i. 9-11, i.e. 10, so
the deviations d have been taken from 10 and divided by 3, the length of
the c.i. The formula for S.D. in the step-deviation method is
σ = i × √[∑fd²/N – (∑fd/N)²]
where
f = frequency;
d = deviations of the mid-points of the c.i.’s from the assumed
mean (AM) in class interval (i) units, which can be stated as
d = (X – AM)/i.
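The step-deviation method can be sketched in Python as below, assuming the usual form σ = i × √[∑fd²/N – (∑fd/N)²] with d measured in class-interval units. The assumed mean of 10 and the interval length of 3 follow the text; the frequencies are hypothetical.

```python
import math

def sd_step_deviation(midpoints, freqs, assumed_mean, width):
    """S.D. with deviations expressed in units of the class interval."""
    n = sum(freqs)
    d = [(x - assumed_mean) / width for x in midpoints]   # step deviations
    sum_fd = sum(f * v for f, v in zip(freqs, d))
    sum_fd2 = sum(f * v * v for f, v in zip(freqs, d))
    return width * math.sqrt(sum_fd2 / n - (sum_fd / n) ** 2)

midpoints = [4, 7, 10, 13, 16]   # mid-points of 3-5, 6-8, 9-11, 12-14, 15-17
freqs = [1, 3, 4, 1, 1]          # hypothetical frequencies, N = 10
sigma = sd_step_deviation(midpoints, freqs, assumed_mean=10, width=3)
print(round(sigma, 4))
```

Dividing by the interval length turns the deviations into small whole numbers (here –2, –1, 0, 1, 2), and multiplying the root by i at the end restores the original score units.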
Combined Standard Deviation (σcomb):
When two sets of scores have been combined into a single lot, it is
possible to calculate the σ of the total distribution from the σ’s of
the two component distributions.
The formula is:
σcomb = √[ (N1(σ1² + d1²) + N2(σ2² + d2²)) / (N1 + N2) ]
where σ1 = SD of distribution 1
σ2 = SD of distribution 2
d1 = (M1 – Mcomb)
d2 = (M2 – Mcomb)
N1 = No. of cases in distribution 1.
N2 = No. of cases in distribution 2.
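A short Python sketch of the combined S.D. follows. It assumes that Mcomb is the weighted mean of M1 and M2 and that the group σ's were computed with divisor N. The class sizes, means and SDs passed in are illustrative values only.

```python
import math

def combined_sd(n1, m1, s1, n2, m2, s2):
    """Return (Mcomb, sigma_comb) for two groups combined into one lot."""
    m_comb = (n1 * m1 + n2 * m2) / (n1 + n2)   # weighted mean of the two means
    d1 = m1 - m_comb                           # d1 = M1 - Mcomb
    d2 = m2 - m_comb                           # d2 = M2 - Mcomb
    variance = (n1 * (s1 ** 2 + d1 ** 2)
                + n2 * (s2 ** 2 + d2 ** 2)) / (n1 + n2)
    return m_comb, math.sqrt(variance)

# Illustrative values: group 1 has N=25, M=80, SD=15; group 2 has N=75, M=70, SD=25
m_comb, s_comb = combined_sd(25, 80, 15, 75, 70, 25)
print(m_comb, round(s_comb, 2))
```

The d² terms add the spread between the two group means to the spread within each group, which is why the combined σ can exceed a simple weighted average of σ1 and σ2.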
An example will illustrate the use of the formula.
Example 14:
Suppose we are given the means and SD’s on an Achievement Test for
two classes differing in size, and are asked to find the σ of the
combined group.
Data are as follows:
First, we find that
Properties of S.D:
1. If each variate value is increased by the same constant
value, the value of S.D. of the distribution remains
unchanged:
We will discuss this effect upon S.D. by considering an illustration.
The table (4.10) shows original scores of 5 students in a test with an
arithmetic mean score of 20.
New scores (X’) are also given in the same table which we obtain by
adding a constant 5 to each original score. Using formula for
ungrouped data, we observe that S.D. of the scores remains the same
in both the situations.
Thus, the S.D. of the new distribution will be multiplied by the same
constant (here, it is 5).
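Both properties can be checked numerically with a short Python sketch. The five scores are hypothetical (chosen so that their mean is 20); they are not the scores of Table 4.10.

```python
import math

def sd(scores):
    """Standard deviation with divisor N."""
    mean = sum(scores) / len(scores)
    return math.sqrt(sum((x - mean) ** 2 for x in scores) / len(scores))

original = [12, 18, 20, 22, 28]            # hypothetical scores, mean 20
shifted = [x + 5 for x in original]        # constant 5 added to each score
scaled = [x * 5 for x in original]         # each score multiplied by 5

print(math.isclose(sd(shifted), sd(original)))       # S.D. unchanged by adding
print(math.isclose(sd(scaled), 5 * sd(original)))    # S.D. multiplied by 5
```

Adding a constant moves every score and the mean by the same amount, so the deviations stay the same; multiplying stretches every deviation by the constant.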
Suppose, for example, that on a test in a class, boys have a mean score
M1 = 60 with S.D. σ1 = 15 and girls have a mean score M2 = 60 with S.D.
σ2 = 10. Clearly, the girls, who have the smaller S.D., are more
consistent in scoring around their average score than the boys.
We have situations when two or more distributions having unequal
means or different units of measurement are to be compared in
respect of their scatteredness or variability. For making such
comparisons we use the coefficient of relative dispersion or
coefficient of variation (C.V.).
(2) when M’s are unequal, the units of the scale being the same.
A group of 10-year-old boys has a mean height of 137 cm with a σ of
6.2 cm. The same group of boys has a mean weight of 30 kg with a σ of
3.5 kg. In which trait is the group more variable?
Solution:
Obviously, we cannot compare centimetres and kilograms directly, but
we can compare the relative variability of the two distributions in
terms of V.
In the present example, the two distributions differ not only in respect
of mean but also in units of measurement, which are cm in the first case
and kg in the second. The coefficient of variation may be used to compare
variability in such a situation.
Thus, from the above calculation it appears that these boys are more than
twice as variable (11.67/4.53 = 2.58) in weight as in height.
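The comparison can be reproduced in Python, taking the coefficient of variation as V = 100σ/M. This is a minimal sketch and the function name is illustrative.

```python
def coefficient_of_variation(sd, mean):
    """V = 100 * sigma / M: a percentage free of the original units."""
    return 100 * sd / mean

v_height = coefficient_of_variation(6.2, 137)   # height, measured in cm
v_weight = coefficient_of_variation(3.5, 30)    # weight, measured in kg

print(round(v_height, 2), round(v_weight, 2))   # prints: 4.53 11.67
print(round(v_weight / v_height, 2))            # prints: 2.58
```

Because V is a pure percentage, the centimetres and kilograms cancel out and the two traits can be compared directly.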
2. When means are unequal, but scale units are the same:
Suppose we have the following data on a test for a group of
boys and a group of men:
Then, compare:
(i) The performance of the two groups on the test.
(ii) The variability of scores in the two groups.
Solution:
(i) Since the mean score of the group of boys is greater than that of the
men, the boys' group has given a better performance on the test.
Interpretation of Standard Deviation:
The standard deviation characterises the nature of distribution of
scores. When the scores are more widely spread, S.D. is larger, and
when scores are less scattered, S.D. is smaller. For interpreting the
value of the measure of dispersion, we must understand that the greater
the value of ‘σ’, the more scattered are the scores from the mean.
As in the case of mean deviation, the interpretation of standard
deviation requires the value of M and N for consideration.
Merits of SD:
1. SD is rigidly defined and its value is always definite.
4. Here, the signs of deviations are not disregarded, instead they are
eliminated by squaring each of the deviations.
5. It is the master measure of variability as it is amenable to algebraic
treatment and is used in correlational work and in further statistical
analysis.
Limitations:
1. It is not easy to calculate and it is not easily understood.
2. It gives more weight to extreme items and less to those which are
near the mean. When the deviation of an extreme score is squared, it
gives rise to a bigger value.
Uses of S.D:
Standard deviation is used:
(i) When the most accurate, reliable and stable measure of variability
is wanted.
(iii) When coefficient of correlation and other statistics are
subsequently computed.