
There are four commonly used measures to indicate the variability (or
dispersion) within a set of measures. They are: 1. Range 2. Quartile
Deviation 3. Average Deviation 4. Standard Deviation.

Measure # 1. Range:
Range is the interval between the highest and the lowest score. Range
is a measure of variability or scatteredness of the variates or
observations among themselves and does not give an idea about the
spread of the observations around some central value.

Symbolically, R = Hs – Ls, where R is the range, Hs is the ‘Highest
score’ and Ls is the ‘Lowest score’.

Computation of Range (Ungrouped data):


Example 1:
The scores of ten boys in a test are:
17, 23, 30, 36, 45, 51, 58, 66, 72, 77.

Example 2:
The scores of ten girls in a test are:
48, 49, 51, 52, 55, 57, 50, 59, 61, 62.

In Example 1 the highest score is 77 and the lowest score is 17.

So the range is the difference between these two scores:
∴ Range = 77 – 17 = 60
In a similar way, in Example 2,

Range = 62 – 48 = 14

Here we find that the scores of the boys are widely scattered, whereas
the scores of the girls vary much less. Thus the variability of the
scores of boys is greater than the variability of the scores of girls.
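The two computations above can be sketched in Python as follows. The function name `score_range` is an illustrative choice, not part of the original text; the data are the scores from Examples 1 and 2.

```python
def score_range(scores):
    """Range R = Hs - Ls: the highest score minus the lowest score."""
    return max(scores) - min(scores)


boys = [17, 23, 30, 36, 45, 51, 58, 66, 72, 77]   # Example 1
girls = [48, 49, 51, 52, 55, 57, 50, 59, 61, 62]  # Example 2

print(score_range(boys))   # 60
print(score_range(girls))  # 14
```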

Computation of Range (Grouped data):


Example 3:
Find the range of data in following distribution:

Solution:
In this case, the upper true limit of the highest class 70-79 is Hs = 79.5
and the lower true limit of the lowest class 20-29 is Ls = 19.5

Therefore, Range R = Hs – Ls

= 79.5 – 19.5 = 60.00

Range is an index of variability. The larger the range, the more
variable the group; the smaller the range, the more homogeneous the
group. Range is the most general measure of ‘spread’ or ‘scatter’ of
scores (or measures). When we wish to make a rough comparison of the
variability of two or more groups we may compute the range.

Range as computed above is in a crude form: it is an absolute measure
of dispersion and is unfit for the purposes of comparison, especially
when the series are in two different units. For the purpose of
comparison, the coefficient of range is calculated by dividing the
range by the sum of the largest and the smallest items, i.e.
Coefficient of range = (Hs – Ls)/(Hs + Ls).
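As an illustration, the coefficient of range (the range divided by the sum of the largest and smallest items) can be computed for the scores of Examples 1 and 2. This is a minimal sketch; the function name is an assumption, and it presumes the largest and smallest scores do not sum to zero.

```python
def coefficient_of_range(scores):
    """(Hs - Ls) / (Hs + Ls); assumes Hs + Ls is not zero."""
    hs, ls = max(scores), min(scores)
    return (hs - ls) / (hs + ls)


boys = [17, 23, 30, 36, 45, 51, 58, 66, 72, 77]   # Example 1
girls = [48, 49, 51, 52, 55, 57, 50, 59, 61, 62]  # Example 2

print(round(coefficient_of_range(boys), 3))   # 0.638
print(round(coefficient_of_range(girls), 3))  # 0.127
```

Being a ratio, the coefficient has no unit, which is what makes it usable across series measured in different units.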

Advantages:
1. Range can be calculated quite easily.

2. It is the simplest measure of dispersion.

3. It is computed when we want to make a rough comparison of the
variability of two or more groups.

Limitations:
1. Range is not based on all the observations of the series. It takes into
account only the most extreme cases.

2. It helps us to make only a rough comparison of the variability of
two or more groups.

3. The range takes into account only the two extreme scores in a
series. Thus when N is small or when there are large gaps in the
frequency distribution, range as a measure of variability is quite
unreliable.

Example 4:

Scores of Group A – 3, 5, 8, 11, 20, 22, 27, 33

Here range = 33 – 3 = 30

Scores of Group B – 3, 5, 8, 11, 20, 22, 27, 93

Here range = 93 – 3 = 90.

Just compare the series of scores in group A and group B. In group A,
if a single score 33 (the last score) is changed to 93, the range
changes greatly. Thus a single high score may increase the range from
low to high. This is why range is not a reliable measure of variability.

4. It is affected very greatly by fluctuations in sampling. Its value is
never stable. In a class where the height of students normally ranges
from 150 cm to 180 cm (a range of 30 cm), if a student whose height is
90 cm is admitted, the range would shoot up to 90 cm (from 90 cm to
180 cm).

5. Range does not present the series and dispersion truly.


Asymmetrical and symmetrical distribution can have the same range
but not the same dispersion. It is of limited accuracy and should be
used with caution.

However, we should not overlook the fact that range is a crude


measure of dispersion and is entirely unsuitable for precise and
accurate studies.

Measure # 2. Quartile Deviation:
Range is the interval or distance on the scale of measurement which
includes 100 percent cases. The limitations of the range are due to its
dependence on the two extreme values only.

There are some measures of dispersion which are independent of


these two extreme values. Most common of these is the quartile
deviation which is based upon the interval containing the middle 50
percent of cases in a given distribution.

Quartile deviation is one-half the scale distance between the third
quartile and the first quartile. It is the semi-interquartile range of
a distribution.
Before taking up the quartile deviation, we must know the meaning of
quarters and quartiles.

For example, suppose a test yields 20 scores and these scores are
arranged in descending order. Let us divide the distribution of scores
into four equal parts. Each part will represent a ‘quarter’. In each
quarter there will be 25% (or 1/4th of N) of the cases.

As scores are arranged in descending order,

The top 5 scores will be in the 1st quarter,

The next 5 scores will be in the 2nd quarter,

The next 5 scores will be in the 3rd quarter, and

The lowest 5 scores will be in the 4th quarter.

With a view to having a better study of the composition of a series, it


may be necessary to divide it in three, four, six, seven, eight, nine, ten
or hundred parts.

Usually, a series is divided in four, ten or hundred parts. One item


divides the series in two parts, three items in four parts (quartiles),
nine items in ten parts (deciles), and ninety-nine items in hundred
parts (percentiles).

There are, thus, three quartiles, nine deciles and ninety-nine


percentiles in a series. The second quartile, or 5th decile or the 50th
percentile is the median (see Figure).

The value of the item which divides the first half of a series (with
values less than the value of the median) into two equal parts is called
the First Quartile (Q1) or the Lower Quartile. In other words, Q1 is a
point below which 25% of cases lie. Q1 is the 25th percentile.

The Second Quartile (Mdn) or the Middle Quartile is the median. In
other words, it is a point below which 50% of the scores lie. A median
is the 50th percentile.

The value of the item which divides the latter half of the series (with
values more than the value of the median) into two equal parts is
called the Third Quartile (Q3) or the Upper Quartile. In other words,
Q3 is a point below which 75% of the scores lie. Q3 is the 75th
percentile.
Note:
A student must clearly distinguish between a quarter and a quartile.
Quarter is a range; but quartile is a point on the scale. Quarters are
numbered from top to bottom (or from highest score to lowest score),
but quartiles are numbered from the bottom to the top.

The Quartile Deviation (Q) is one half the scale distance
between the Third Quartile (Q3) and the First Quartile (Q1):

Q = (Q3 – Q1)/2

Q3 = L + ((3N/4 – F)/fq) × i, where

L = Lower limit of the c.i. where Q3 lies,

3N/4 = 3/4 of N or 75% of N,

F = total of all frequencies below ‘L’,

fq = frequency of the c.i. upon which Q3 lies and i = size or length of
the c.i.

Q1 = L + ((N/4 – F)/fq) × i, where

L = Lower limit of the c.i. where Q1 lies,
N/4 = One fourth (or 25%) of N,

F = total of all frequencies below ‘L’,

fq = frequency of the c.i. upon which Q1 lies,

and i = size or length of c.i.
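The interpolation described by these symbols, Q1 = L + ((N/4 – F)/fq) × i and Q3 = L + ((3N/4 – F)/fq) × i, can be sketched in Python. The function name and the frequency distribution below are hypothetical, chosen only to illustrate the procedure; they are not taken from the text.

```python
def grouped_point(classes, cases):
    """Scale point below which `cases` cases lie.

    `classes` lists (lower_true_limit, width, frequency) from the lowest
    class interval upwards. Implements L + ((cases - F) / fq) * i.
    """
    cum = 0  # F: total of all frequencies below the current interval
    for lower, width, freq in classes:
        if freq > 0 and cum + freq >= cases:
            return lower + (cases - cum) / freq * width
        cum += freq
    raise ValueError("cases exceeds the total frequency N")


# Hypothetical distribution: c.i. 40-44, 45-49, ..., 70-74 (true limits used)
classes = [(39.5, 5, 2), (44.5, 5, 4), (49.5, 5, 6), (54.5, 5, 8),
           (59.5, 5, 6), (64.5, 5, 4), (69.5, 5, 2)]
n = sum(freq for _, _, freq in classes)   # N = 32

q1 = grouped_point(classes, n / 4)        # N/4 = 8 cases
q3 = grouped_point(classes, 3 * n / 4)    # 3N/4 = 24 cases
q = (q3 - q1) / 2                         # quartile deviation

print(round(q1, 2), round(q3, 2), round(q, 2))  # 51.17 62.83 5.83
```

Calling the same function with `n / 2` gives the median, which mirrors the remark that the three formulas differ only in the number of cases counted from the bottom.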
Inter-Quartile Range:
The range between the third quartile and the first quartile is known as
the inter-quartile range. Symbolically inter-quartile range = Q3 – Q1.
Semi-Interquartile Range:
It is half the distance between the third quartile and the first quartile.

Thus, S.I.R. = (Q3 – Q1)/2
Q or Quartile Deviation is otherwise known as semi-interquartile
range (or S.I.R.)

Thus, Q = (Q3 – Q1)/2
If we compare the formulas of Q3 and Q1 with the formula
of the median, the following observations will be clear:
i. In case of Median we use N/2 whereas for Q1 we use N/4 and for
Q3 we use 3N/4.
ii. In case of median we use fm to denote the frequency of c.i., upon
which median lies; but in case of Q1 and Q3 we use fq to denote the
frequency of the c.i. upon which Q1 or Q3 lies.
Computation of Q (Ungrouped Data):

In order to calculate Q we are required to calculate Q3 and Q1 first.
Q1 and Q3 are calculated in the same manner as we were computing the
median.
The only differences are:
(i) in case of median we were counting 50% cases (N/2) from the
bottom, but

(ii) in case of Q1 we have to count 25% of cases (or N/4) from the
bottom and
(iii) in case of Q3 we have to count 75% of cases (or 3N/4) from the
bottom.
Example 5:
Find out Q of the following scores 20, 21, 22, 23, 24, 25, 26, 27, 28, 29,
30, 31, 32, 33, 34, 35, 36, 37, 38, 39.

There are 20 scores.

25% of N = 20/4 = 5

Q1 is a point below which 25% of cases lie. In this example, Q1 is a
point below which 5 cases lie. From mere inspection of the ordered data
it is found that below 24.5 there are 5 cases. Thus Q1 = 24.5.
Likewise Q3 is a point below which 75% of cases lie.
75% of N = 3/4 × 20 = 15

We find that below 34.5, 15 cases lie.

Thus Q3 = 34.5.

Therefore Q = (Q3 – Q1)/2 = (34.5 – 24.5)/2 = 5.
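The counting procedure of Example 5 can be sketched in Python. This sketch assumes, as the example does, that each score stands for an interval of one unit (a score of 24 covers 23.5 to 24.5); the function name `point_below` is illustrative.

```python
from collections import Counter


def point_below(scores, cases, unit=1.0):
    """Scale point below which `cases` cases lie, counting from the bottom.

    Each score s is treated as the interval s - unit/2 to s + unit/2.
    """
    freq = Counter(scores)
    cum = 0  # cases lying wholly below the current score's interval
    for value in sorted(freq):
        f = freq[value]
        if cum + f >= cases:
            return (value - unit / 2) + (cases - cum) / f * unit
        cum += f
    raise ValueError("cases must not exceed the number of scores")


scores = list(range(20, 40))          # 20, 21, ..., 39 (Example 5)
n = len(scores)                       # N = 20

q1 = point_below(scores, n / 4)       # 5 cases from the bottom
q3 = point_below(scores, 3 * n / 4)   # 15 cases from the bottom
q = (q3 - q1) / 2

print(q1, q3, q)  # 24.5 34.5 5.0
```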

In a symmetrical distribution, the median lies halfway on the scale
between Q1 and Q3. Therefore, the value Q1 + Q or Q3 – Q gives the value
of the median. But, generally, distributions are not symmetrical and so
Q1 + Q or Q3 – Q would not give the value of the median.
Computation of Q (Grouped Data):
Example 6:
The scores obtained by 36 students in a test are shown in the table.
Find the quartile deviation of the scores.

In column 1, we have taken class Interval, in column 2, we have taken


the frequency, and in column 3, cumulative frequencies starting from
the bottom have been written.

Here N = 36, so for Q1 we have to take N/4 = 36/4 = 9 cases and for
Q3 we have to take 3N/4 = 3 × 36/4 = 27 cases. By looking into column
3, cf = 9 will be included in c.i. 55 – 59, whose actual limits are
54.5 – 59.5. Q1 would lie in the interval 54.5 – 59.5.
The value of Q1 is to be computed as follows:

For calculating Q3, cf = 27 will be included in c.i. 65 – 69,
whose actual limits are 64.5 – 69.5. So Q3 would lie in the
interval 64.5 – 69.5 and its value is to be computed as
follows:

Interpretation of Quartile Deviation:


While interpreting the value of quartile deviation it is better to have
the values of Median, Q1 and Q3, along with Q. If the value of Q is
more, then the dispersion will be more, but again the value depends
on the scale of measurement. Two values of Q are to be compared only
if scale used is the same. Q measured for scores out of 20 cannot be
compared directly with Q for scores out of 50.
If median and Q are known, we can say that, in a symmetrical
distribution, 50% of the cases lie between ‘Median – Q’ and ‘Median +
Q’ (in a skewed distribution this holds only approximately). These are
the middle 50% of cases. Here, we come to know about the range of only
the middle 50% of the cases. How the lower 25% of the cases and the
upper 25% of the cases are distributed is not known through this
measure.

Sometimes, the extreme cases or values are not known, in which case
the only alternative available to us is to compute the median and
quartile deviation as the measures of central tendency and dispersion.
Through median and quartiles we can infer about the symmetry or
skewness of the distribution. Let us, therefore, get some idea of
symmetrical and skewed distributions.

Symmetrical and Skewed Distributions:


A distribution is said to be symmetrical when the frequencies are
symmetrically distributed around the measure of central tendency. In
other words, we can say that the distribution is symmetrical if the
values at equal distance on the two sides of the measure of central
tendency have equal frequencies.

Example 7:
Find whether the given distribution is symmetrical or not.

Here the measure of central tendency, mean as well as median, is 5. If


we start comparing the frequencies of the values on the two sides of 5,
we find that the values 4 and 6, 3 and 7, 2 and 8, 1 and 9, 0 and 10
have the same number of frequencies. So the distribution is perfectly
symmetrical.

In a symmetrical distribution, mean and median are equal and the median
lies at an equal distance from the two quartiles, i.e. Q3 – Median =
Median – Q1.
If a distribution is not symmetric, then the departure from the
symmetry refers to its skewness. Skewness indicates that the curve is
turned more towards one side than the other. So the curve will have a
longer tail on one side.

The skewness is said to be positive if the longer tail is on the right
side and it is said to be negative if the longer tail is on the left
side.

The following figures show the appearance of a positively


skewed and negatively skewed curve:

Q3 – Mdn > Mdn – Q1 indicates positive skewness
Q3 – Mdn < Mdn – Q1 indicates negative skewness
Q3 – Mdn = Mdn – Q1 indicates zero skewness
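The three rules comparing Q3 – Mdn with Mdn – Q1 can be expressed as a small Python function. The function name is illustrative; the first call uses the quartiles and median of Example 5 (Q1 = 24.5, Mdn = 29.5, Q3 = 34.5), and the other two calls use hypothetical values.

```python
def quartile_skew(q1, median, q3):
    """Direction of skewness from the distances of the quartiles to the median."""
    upper = q3 - median   # Q3 - Mdn
    lower = median - q1   # Mdn - Q1
    if upper > lower:
        return "positive"
    if upper < lower:
        return "negative"
    return "zero"


print(quartile_skew(24.5, 29.5, 34.5))  # zero (Example 5 is symmetrical)
print(quartile_skew(10, 12, 20))        # positive (hypothetical values)
print(quartile_skew(10, 18, 20))        # negative (hypothetical values)
```

With quartiles obtained by interpolation, exact equality is rare, so in practice a small tolerance may be preferable to the strict comparison used here.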
Merits of Q:
1. It is a more representative and trustworthy measure of variability
than the overall range.

2. It is a good index of score density at the middle of the distribution.

3. Quartiles are useful in indicating the skewness of a distribution.

4. Like the median, Q is applicable to open-end distributions.

5. Wherever median is preferred as a measure of central tendency,


quartile deviation is preferred as measure of dispersion.

Limitations of Q:

1. Like the median, quartile deviation is not amenable to algebraic
treatment, as it does not take into consideration all the values of
the distribution.

2. It is based only on the third and the first quartiles and tells us
only about the range between them. From ‘Q’ we cannot get a true
picture of how the scores are dispersed from the central value. That
is, ‘Q’ does not give us any idea about the composition of scores. ‘Q’
of two series may be equal, yet the series may be quite dissimilar in
composition.

3. It gives only a rough idea of dispersion.

4. It ignores the scores above the third quartile and the scores below
the first quartile. It tells us only about the middle 50% of the
distribution.

Uses of Q:
1. When the median is a measure of a central tendency;

2. When the distribution is incomplete at either end;

3. When there are scattered or extreme scores which would
disproportionately influence the SD;

4. When the concentration around the median — the middle 50% of


cases is of primary interest.

Coefficient of Quartile Deviation:


The Quartile Deviation is an absolute measure of dispersion and in
order to make it relative, we calculate the ‘coefficient of quartile
deviation’. The coefficient is calculated by dividing the quartile
deviation by the average of the quartiles.

It is given by:
Coefficient of quartile deviation = (Q3 – Q1)/(Q3 + Q1)
Where Q3 and Q1 refer to upper and lower quartiles respectively.
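A short Python sketch of this coefficient, applied to the quartiles found in Example 5 (Q1 = 24.5, Q3 = 34.5). The function name is an assumption made for illustration. Dividing Q = (Q3 – Q1)/2 by the average of the quartiles (Q3 + Q1)/2 cancels the halves, which is why the formula reduces to (Q3 – Q1)/(Q3 + Q1).

```python
def coefficient_of_quartile_deviation(q1, q3):
    """(Q3 - Q1) / (Q3 + Q1); assumes Q3 + Q1 is not zero."""
    return (q3 - q1) / (q3 + q1)


coeff = coefficient_of_quartile_deviation(24.5, 34.5)
print(round(coeff, 3))  # 0.169

# Same value as Q divided by the average of the quartiles
q = (34.5 - 24.5) / 2
average_of_quartiles = (34.5 + 24.5) / 2
print(round(q / average_of_quartiles, 3))  # 0.169
```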
Measure # 3. Average Deviation (A.D.) or Mean Deviation
(M.D.):
As we have already discussed, the range and ‘Q’ give us only a rough
idea of variability. The range of two series may be the same or
the quartile deviation of two series may be the same, yet the two
series may be dissimilar. Neither the range nor ‘Q’ speaks of the
composition of the series. These two measures do not take into
consideration the individual scores.

The method of average deviation or ‘the mean deviation’, as it is called


sometimes, tends to remove a serious shortcoming of both methods
(Range and ‘Q’). The average deviation is also called the first moment
of dispersion and is based on all the items in a series.

Average deviation is the arithmetic mean of the deviations of a series
computed from some measure of central tendency (mean, median or
mode), all the deviations being considered positive. In other words,
the average of the absolute deviations of all the values from the
arithmetic mean is known as mean deviation or average deviation.
(Usually, the deviation is taken from the mean of the distribution.)

A.D. = ∑|X – M|/N = ∑|d|/N

Where ∑ is sum total of;

X is the score; M is the mean; N is the total number of scores.

And ‘d’ means the deviation of individual scores from the mean.

Computation of Mean Deviation (Ungrouped data):


Example 8:
Find mean deviation for the following set of variates:
X = 55, 45, 39, 41, 40, 48, 42, 53, 41, 56
Solution:
In order to find mean deviation we first calculate mean for the given
set of observations.

The deviations and the absolute deviations are given in


Table 4.2:

Example 9:
Find the mean deviation for the scores given below:
25, 36, 18, 29, 30, 41, 49, 26, 16, 27
The mean of the above scores was found to be 29.7.

For calculating the mean deviation:

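Note:
If you apply some algebra, you can see that ∑(X – M) is zero:
∑(X – M) = ∑X – NM = NM – NM = 0, because M = ∑X/N. This is why the
signs of the deviations are disregarded in computing the mean deviation.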
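The mean deviation of Examples 8 and 9 can be checked with a few lines of Python. This is a sketch of the definition (the sum of the absolute deviations from the mean, divided by N); the function name is illustrative.

```python
def mean_deviation(scores):
    """Average of the absolute deviations |X - M| from the arithmetic mean M."""
    n = len(scores)
    mean = sum(scores) / n
    return sum(abs(x - mean) for x in scores) / n


example_8 = [55, 45, 39, 41, 40, 48, 42, 53, 41, 56]   # mean = 46
example_9 = [25, 36, 18, 29, 30, 41, 49, 26, 16, 27]   # mean = 29.7

print(round(mean_deviation(example_8), 2))  # 5.6
print(round(mean_deviation(example_9), 2))  # 7.44

# The signed deviations cancel out, which is why absolute values are used
mean_8 = sum(example_8) / len(example_8)
signed_sum = sum(x - mean_8 for x in example_8)
print(signed_sum)  # 0.0
```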

Computation of Mean Deviation (grouped data):


Example 10:
Find the mean deviation for the following frequency
distribution:

Here, in column 1, we write the c.i.’s; in column 2, we write the
corresponding frequencies; in column 3, we write the mid-points of
the c.i.’s, denoted by ‘X’; in column 4, we write the product of
frequencies and mid-points of the c.i.’s, denoted by fX; in column 5,
we write the absolute deviations of the mid-points of the c.i.’s from
the mean, denoted by |d|; and in column 6, we write the product of
absolute deviations and frequencies, denoted by f|d|. The mean
deviation is then ∑f|d|/N.
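The six-column procedure for grouped data amounts to M = ∑fX/N followed by M.D. = ∑f|d|/N. A minimal Python sketch follows; the mid-points and frequencies are hypothetical values chosen for easy checking and are not the distribution of Example 10.

```python
def grouped_mean_deviation(midpoints, freqs):
    """Mean deviation for grouped data: sum(f * |X - M|) / N, with M = sum(f * X) / N."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
    return sum(f * abs(x - mean) for x, f in zip(midpoints, freqs)) / n


# Hypothetical mid-points (column 3) and frequencies (column 2)
midpoints = [2, 7, 12, 17, 22]
freqs = [1, 3, 4, 1, 1]

# sum(fX) = 110, N = 10, so M = 11; sum(f|d|) = 42
print(round(grouped_mean_deviation(midpoints, freqs), 2))  # 4.2
```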

Merits of Mean Deviation:


1. Mean deviation is the simplest measure of dispersion that takes into
account all the values in a given distribution.

2. It is easily comprehensible even by a person not well versed in


statistics.

3. It is not very much affected by the value of extreme items.

4. It is the average of the deviations of individual scores from the
mean.

Limitations:
1. Mean deviation ignores the algebraic signs of the deviations and as
such it is not capable of further mathematical treatment. So, it is used
only as a descriptive measure of variability.

2. In fact, M.D. is not in common use. It is rarely used in modern


statistics and generally dispersion is studied by standard deviation.

Uses of M.D:
1. When it is desired to weigh all the deviations according to their size.

2. When it is required to know the extent to which the measures are


spread out on either side of the mean.

3. When extreme deviations unduly influence the standard deviation.

Interpretation of Mean Deviation:


For interpreting the mean deviation, it is always better to look into it
along with the mean and the number of cases. Mean is required
because the mean and the mean deviation are respectively the point
and the distance on the same scale of measurement.

Without the mean, the mean deviation cannot be interpreted, as there is
no clue to the scale of measurement or the unit of measurement. The
number of cases is important because the measure of dispersion
depends on it. With fewer cases, the measure is likely to be larger.

In the two examples, we have:

In the first case, the mean deviation is almost 25% of the mean, while
in the second case it is less. But the mean deviation may be larger in
the first case because of the smaller number of cases. So the two mean
deviations computed above indicate almost similar dispersion.

Measure # 4. Standard Deviation or S.D. and Variance:


Out of several measures of dispersion, the most frequently used
measure is ‘standard deviation’. It is also the most important because
of being the only measure of dispersion amenable to algebraic
treatment.

Here also, the deviations of all the values from the mean of the
distribution are considered. This measure suffers from the fewest
drawbacks and provides accurate results.

It removes the drawback of ignoring the algebraic signs while


calculating deviations of the items from the average. Instead of
neglecting the signs, we square the deviations, thereby making all of
them positive.

It differs from the AD in several respects:

i. In computing AD or MD, we disregard signs, whereas in finding SD
we avoid the difficulty of signs by squaring the separate deviations;

ii. The squared deviations used in computing SD are always taken


from the mean, never from the median or mode.

“Standard deviation or S.D. is the square root of the mean of


the squared deviations of the individual scores from the
mean of the distribution.”
To be more clear, we should note here that in computing the S.D., we
square all the deviations separately, find their sum, divide the sum by
the total number of scores and then find the square root of the mean of
the squared deviations.

So S.D. is also called the ‘Root mean square deviations from mean’ and
is generally denoted by the small Greek letter σ (sigma).

Symbolically, the standard deviation for ungrouped data is
defined as:

σ = √(∑d²/N)    (18)

Where d = deviation of individual scores from the mean;

(Some authors use ‘x’ as the deviation of individual scores from the
mean)

∑ = sum total of; N = total number of cases.

The mean of the squared deviations is referred to as variance. Or, in
simple words, the square of the standard deviation is called the
Second Moment of Dispersion or Variance:

σ² = ∑d²/N    (19)
Computation of S.D. (Ungrouped data):
There are two ways of computing S.D. for ungrouped data:
(a) Direct method.

(b) Short-cut method.

(a) Direct Method:


Find the standard deviation for the scores given below:
X = 12, 15, 10, 8, 11, 13, 18, 10, 14, 9

This method uses formula (18) for finding S.D. which


involves the following steps:

Step 1:
Calculate arithmetic mean of the given data:

Step 2:
Write the value of the deviation d i.e. X – M against each score in
column 2. Here the deviations of scores are to be taken from 12. Now
you will find that ∑d or ∑ (X – M) is equal to zero. Think, why is it so?
Check it. If this is not so, find out the error in computation and rectify
it.

Step 3:
Square the deviations and write the value of d² against each score in
column 3. Find the sum of squared deviations. ∑d² = 84.
Table 4.5 Computation of S.D:

Step 4:
Calculate the mean of the squared deviations and then find out the
positive square root for getting the value of standard deviation i.e. σ.

σ = √(∑d²/N) = √(84/10) = √8.4 = 2.9

The required standard deviation is 2.9.

Using formula (19), the Variance will be σ² = ∑d²/N = 84/10 = 8.4
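The four steps of the direct method can be sketched in Python using the ten scores of this example. The function name is illustrative. The sketch divides by N, as the text does, so it computes the S.D. of the given scores themselves (the same definition as `statistics.pstdev` in the Python standard library).

```python
import math


def sd_direct(scores):
    """Direct method: returns (sigma, variance, sum of squared deviations)."""
    n = len(scores)
    mean = sum(scores) / n                          # Step 1: arithmetic mean
    deviations = [x - mean for x in scores]         # Step 2: d = X - M
    sum_d2 = sum(d * d for d in deviations)         # Step 3: sum of d squared
    variance = sum_d2 / n                           # Step 4: mean of d squared ...
    return math.sqrt(variance), variance, sum_d2    # ... and its square root


scores = [12, 15, 10, 8, 11, 13, 18, 10, 14, 9]
sigma, variance, sum_d2 = sd_direct(scores)

print(sum_d2, variance, round(sigma, 1))  # 84.0 8.4 2.9
```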


(b) Short-cut Method:
In most cases the arithmetic mean of the given data happens to
be a fractional value and then the process of taking deviations and
squaring them becomes tedious and time-consuming in the computation
of S.D.

To facilitate computation in such situations, the deviations may be
taken from an assumed mean. The adjusted short-cut formula for
calculating S.D. will then be,

σ = √(∑d²/N – (∑d/N)²)

where,

d = Deviation of the score from an assumed mean, say AM; i.e. d = (X
– AM).

d² = The square of the deviation.

∑d = The sum of the deviations.

∑d² = The sum of the squared deviations.

N = No. of the scores or variates.

The computation procedure is clarified in the following


example:
Example 11:
Find S.D. for the scores given in table 4.5 of X = 12, 15, 10, 8, 11, 13, 18,
10, 14, 9. Use short-cut method.

Solution:
Let us take assumed mean AM = 11.

The deviations and squares of deviations needed in formula


are given in the following table:

Putting the values from the table (∑d = 10, ∑d² = 94, N = 10) in the
formula, the S.D. is

σ = √(94/10 – (10/10)²) = √(9.4 – 1) = √8.4 = 2.9

The short-cut method gives the same result as we obtained by using the
direct method in the previous example. But the short-cut method tends
to reduce the calculation work in situations where the arithmetic mean
is not a whole number.
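The short-cut computation of Example 11, σ = √(∑d²/N – (∑d/N)²) with deviations taken from an assumed mean, can be sketched in Python as below. The function name is illustrative; the last line shows that the choice of assumed mean does not affect the result.

```python
import math


def sd_short_cut(scores, assumed_mean):
    """Short-cut method: deviations d = X - AM are taken from an assumed mean AM."""
    n = len(scores)
    d = [x - assumed_mean for x in scores]
    sum_d = sum(d)                       # sum of the deviations
    sum_d2 = sum(v * v for v in d)       # sum of the squared deviations
    return math.sqrt(sum_d2 / n - (sum_d / n) ** 2)


scores = [12, 15, 10, 8, 11, 13, 18, 10, 14, 9]

print(round(sd_short_cut(scores, 11), 3))  # 2.898 (AM = 11, as in Example 11)
print(round(sd_short_cut(scores, 15), 3))  # 2.898 (a different assumed mean)
```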

Computation of S.D. (Grouped data):


(a) Long Method/Direct Method:
Example 12:
Find the S.D. for the following distribution:

Here also, the first step is to find the mean M, for which we have to
take the mid-points of the c.i.’s, denoted by X’, and find the products
fX’. The mean is given by ∑fX’/N. The second step is to find the
deviations of the mid-points of class intervals X’ from the mean, i.e.
X’ – M, denoted by d.
The third step is to square the deviations and find the product of the
squared deviations and the corresponding frequency. The S.D. is then
σ = √(∑fd²/N).

To solve the above problem, c.i.’s are written in column 1, frequencies


are written in column 2, mid-points of c.i’s i.e. X’ are written in
column 3, the product of fX’ is written in column 4, the deviation of X’
from the mean is written in column 5, the squared deviation d2 is
written in column 6, and the product fd2 is written in column 7,
As shown below:

So, the deviations of the midpoints are to be taken from 11.1.

Thus, the required standard deviation is 4.74.

(b) Short-cut Method:

Sometimes, in the direct method, it is observed that the deviations
from the actual mean result in decimals and the values of d² and fd²
are difficult to calculate. In order to avoid this problem we follow a
short-cut method for calculating standard deviation.
In this method, instead of taking the deviations from the actual mean,
we take deviations from a suitably chosen assumed mean, say A.M.

The following formula is then used for calculating S.D:

σ = √(∑fd²/N – (∑fd/N)²)    (22)

where d is deviation from assumed mean.


The following steps are then involved in the computation of
standard deviation:
(i) Obtain deviations of the variates from assumed mean A.M. as d =
(X – AM)

(ii) Multiply these deviations by corresponding frequencies to get the
column fd. The sum of this column gives ∑fd.

(iii) Multiply fd with the corresponding deviation (d) to get the column
fd². The sum of this column will be ∑fd².

(iv) Use formula (22) to find S.D.

Example 13:
Using short-cut method find S.D. of the data in table 4.7.

Solution:
Let us take assumed mean AM = 10. Other calculations needed for
calculating S.D. are given in table 4.8.
Putting values from table

Using the formula (19), the variance

(c) Step-Deviation Method:


In this method, in column 1 we write c.i.’s; in column 2 we
write the frequencies; in column 3 we write the values of d,
where d = (X’ – AM)/i; in column 4 we write the product fd,
and in column 5, we write the values of fd², as shown below:

Here, the Assumed Mean is the mid-point of the c.i. 9-11 i.e. 10, so the
deviations d have been taken from 10 and divided by 3, the length of
the c.i. The formula for S.D. in the step-deviation method is

σ = i × √(∑fd²/N – (∑fd/N)²)

where i = length of the c.i.’s,

f = frequency;
d = deviations of the mid-points of c.i.’s from the assumed
mean (AM) in class interval (i) units, which can be stated:

Putting values from the table
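The three grouped-data procedures (direct, short-cut and step-deviation) can be compared in a short Python sketch. The frequency distribution below is hypothetical: it only borrows the class length of 3 and the assumed mean of 10 mentioned in the text, and it is not the distribution of Table 4.7. The function names are illustrative.

```python
import math


def sd_grouped_direct(midpoints, freqs):
    """sigma = sqrt(sum(f * d^2) / N), with d = X' - M taken from the actual mean."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
    return math.sqrt(sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs)) / n)


def sd_grouped_short_cut(midpoints, freqs, am):
    """sigma = sqrt(sum(f*d^2)/N - (sum(f*d)/N)^2), with d = X' - AM."""
    n = sum(freqs)
    sum_fd = sum(f * (x - am) for x, f in zip(midpoints, freqs))
    sum_fd2 = sum(f * (x - am) ** 2 for x, f in zip(midpoints, freqs))
    return math.sqrt(sum_fd2 / n - (sum_fd / n) ** 2)


def sd_grouped_step(midpoints, freqs, am, i):
    """sigma = i * sqrt(sum(f*d^2)/N - (sum(f*d)/N)^2), with d = (X' - AM) / i."""
    n = sum(freqs)
    d = [(x - am) / i for x in midpoints]
    sum_fd = sum(f * v for v, f in zip(d, freqs))
    sum_fd2 = sum(f * v * v for v, f in zip(d, freqs))
    return i * math.sqrt(sum_fd2 / n - (sum_fd / n) ** 2)


# Hypothetical distribution: c.i. 0-2, 3-5, ..., 18-20 (length 3)
midpoints = [1, 4, 7, 10, 13, 16, 19]
freqs = [2, 3, 5, 8, 6, 4, 2]

direct = sd_grouped_direct(midpoints, freqs)
short_cut = sd_grouped_short_cut(midpoints, freqs, am=10)
step = sd_grouped_step(midpoints, freqs, am=10, i=3)

# All three methods agree up to floating-point rounding
print(math.isclose(direct, short_cut), math.isclose(direct, step))  # True True
```

The step-deviation version works with small whole numbers (here -3 to 3) in the d column, which is the point of dividing by the class length when computing by hand.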

The procedures of calculation can also be stated in following


manner:

Combined Standard Deviation (σcomb):
When two sets of scores have been combined into a single lot, it is
possible to calculate the σ of the total distribution from the σ’ s of the
two component distributions.
The formula is:

σcomb = √[(N1(σ1² + d1²) + N2(σ2² + d2²))/(N1 + N2)]    (24)

where σ1 = SD of distribution 1
σ2 = SD of distribution 2
d1 = (M1 – Mcomb)
d2 = (M2 – Mcomb)
N1 = No. of cases in distribution 1.
N2 = No. of cases in distribution 2.
An example will illustrate the use of the formula.

Example 14:
Suppose we are given the means and SD’s on an Achievement Test for
two classes differing in size, and are asked to find the σ of the
combined group.
Data are as follows:

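First, we find the combined mean, Mcomb = (N1M1 + N2M2)/(N1 + N2).

The formula (24) can be extended to any number of distributions. For
example, in the case of three distributions, it will be

σcomb = √[(N1(σ1² + d1²) + N2(σ2² + d2²) + N3(σ3² + d3²))/(N1 + N2 + N3)]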
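The combined S.D. for two distributions, σcomb = √[(N1(σ1² + d1²) + N2(σ2² + d2²))/(N1 + N2)] with d1 = M1 – Mcomb and d2 = M2 – Mcomb, can be checked against the S.D. of the pooled scores. The two groups of scores below are hypothetical and the function names are illustrative; this is a sketch, not the worked example of the text.

```python
import math


def mean(scores):
    return sum(scores) / len(scores)


def sd(scores):
    """S.D. of the given scores (divides by N, as in the text)."""
    m = mean(scores)
    return math.sqrt(sum((x - m) ** 2 for x in scores) / len(scores))


def combined_sd(n1, m1, s1, n2, m2, s2):
    """Combined S.D. of two distributions from their N, mean and S.D."""
    m_comb = (n1 * m1 + n2 * m2) / (n1 + n2)   # combined mean
    d1 = m1 - m_comb
    d2 = m2 - m_comb
    return math.sqrt((n1 * (s1 ** 2 + d1 ** 2) + n2 * (s2 ** 2 + d2 ** 2)) / (n1 + n2))


# Hypothetical scores for two classes of different size
class_1 = [10, 12, 14, 16]
class_2 = [20, 22, 24, 26, 28, 30]

from_summaries = combined_sd(len(class_1), mean(class_1), sd(class_1),
                             len(class_2), mean(class_2), sd(class_2))
from_pooled_scores = sd(class_1 + class_2)

print(math.isclose(from_summaries, from_pooled_scores))  # True
```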

Properties of S.D:
1. If each variate value is increased by the same constant
value, the value of S.D. of the distribution remains
unchanged:
We will discuss this effect upon S.D. by considering an illustration.
The table (4.10) shows original scores of 5 students in a test with an
arithmetic mean score of 20.

New scores (X’) are also given in the same table which we obtain by
adding a constant 5 to each original score. Using the formula for
ungrouped data, we observe that the S.D. of the scores remains the same
in both situations.

Thus, the value of S.D. in both situations remains the same.

2. When a constant value is subtracted from each variate,


the value of S.D. of the new distribution remains unchanged:
The students can also examine that when we subtract a constant from
each score, the mean is decreased by the constant, but S.D. is the
same. It is due to the reason that ‘d‘ remains unchanged.
3. If each observed value is multiplied by a constant value,
S.D. of the new observations will also be multiplied by the
same constant:
Let us multiply each score of the original distribution (Table 4.10) by
5.

Thus, the S.D. of the new distribution will be multiplied by the same
constant (here, it is 5).

4. If each observed value is divided by a constant value, S.D.


of the new observations will also be divided by the same
constant. The students can examine with an example:
Thus, to conclude, SD is independent of change of origin (addition,
subtraction) but dependent on change of scale (multiplication,
division).
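The four properties can be verified numerically with a short Python sketch. The five scores below are hypothetical (chosen to have a mean of 20); they are not the scores of Table 4.10, and the function name is illustrative.

```python
import math


def sd(scores):
    """S.D. of the given scores (divides by N, as in the text)."""
    n = len(scores)
    m = sum(scores) / n
    return math.sqrt(sum((x - m) ** 2 for x in scores) / n)


scores = [10, 15, 20, 25, 30]   # hypothetical scores, mean = 20
base = sd(scores)

added = sd([x + 5 for x in scores])        # change of origin
subtracted = sd([x - 5 for x in scores])   # change of origin
multiplied = sd([x * 5 for x in scores])   # change of scale
divided = sd([x / 5 for x in scores])      # change of scale

print(math.isclose(added, base), math.isclose(subtracted, base))          # True True
print(math.isclose(multiplied, 5 * base), math.isclose(divided, base / 5))  # True True
```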

Measurements of Relative Dispersion (Coefficient of


Variation):
The measures of dispersion give us an idea about the extent to which
scores are scattered around their central value. Therefore, two
frequency distributions having the same central values can be
compared directly with the help of various measures of dispersion.

Suppose, for example, that on a test in a class the boys have mean
score M1 = 60 with S.D. σ1 = 15 and the girls have mean score M2 = 60
with S.D. σ2 = 10. Clearly, the girls, who have the smaller S.D., are
more consistent in scoring around their average score than the boys.
We also have situations when two or more distributions having unequal
means or different units of measurement are to be compared in
respect of their scatteredness or variability. For making such
comparisons we use the coefficient of relative dispersion or
coefficient of variation (C.V.).

The formula is:

V = 100 × σ/M

(Coefficient of variation or coefficient of relative variability)

V gives the percentage which σ is of the test mean. It is thus a ratio


which is independent of the units of measurement.

V is restricted in its use owing to certain ambiguities in its


interpretation. It is defensible when used with ratio scales—scales in
which the units are equal and there is a true zero or reference point.

For example, V may be used without hesitation with physical scales—


those concerned with linear magnitudes, weight and time.

Two cases arise in the use of V with ratio scales:


(1) When units are dissimilar, and

(2) when M’s are unequal, the units of the scale being the same.

1. When units are unlike:


Example 15:

A group of 10-year-old boys has a mean height of 137 cm with a σ of
6.2 cm. The same group of boys has a mean weight of 30 kg with a σ of
3.5 kg. In which trait is the group more variable?
Solution:
Obviously, we cannot compare centimetres and kilograms directly, but
we can compare the relative variability of the two distributions in
terms of V.

In the present example, the two distributions differ not only in
respect of mean but also in units of measurement, which is cm in the
first case and kg in the second. Coefficient of variation may be used
to compare the variability in such a situation.

We thus calculate:

V for height = 100 × 6.2/137 = 4.53
V for weight = 100 × 3.5/30 = 11.67

Thus, from the above calculation it appears that these boys are about
two and a half times as variable (11.67/4.53 = 2.58) in weight as in
height.
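The comparison in Example 15 can be reproduced with a short Python sketch of V = 100 × σ/M. The function name is illustrative; the means and S.D.s are those given in the example.

```python
def coefficient_of_variation(sd, mean):
    """V = 100 * sigma / M: the S.D. expressed as a percentage of the mean."""
    return 100 * sd / mean


v_height = coefficient_of_variation(6.2, 137)   # height in cm
v_weight = coefficient_of_variation(3.5, 30)    # weight in kg

print(round(v_height, 2), round(v_weight, 2))   # 4.53 11.67
print(round(v_weight / v_height, 2))            # 2.58
```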

2. When means are unequal, but scale units are the same:
Suppose we have the following data on a test for a group of
boys and a group of men:

Then, compare:
(i) The performance of the two groups on the test.

(ii) The variability of scores in the two groups.

Solution:
(i) Since the mean score of the group of boys is greater than that of
the men, the boys’ group has given a better performance on the test.

(ii) For comparing the two groups in respect of variability among
scores, coefficients of variation are calculated: V of boys = 26.67 and
V of men = 38.46.

Therefore, the variability of scores is greater in the group of men.
The students in the boys’ group, having a smaller C.V., are more
consistent in scoring around their average score as compared to the
men’s group.

S.D. and the spread of observations:


In a symmetrical (normal) distribution,

(i) Mean ± 1 SD covers 68.26% of the scores.

Mean ± 2 SD covers 95.44% of the scores.

Mean ± 3 SD covers 99.73% of the scores.

(ii) In large samples (N = 500), the Range is about 6 times SD.

If N is about 100, the Range is about 5 times the SD.

If N is about 50, the Range is about 4.5 times the SD.

If N is about 20, the Range is about 3.7 times the S.D.
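The coverage figures for the normal distribution in (i) can be checked with the error function from the Python standard library, since the proportion of a normal distribution lying within k S.D.s of the mean is erf(k/√2). The function name below is illustrative. The figures 68.26 and 95.44 quoted above are the values obtained from four-place normal tables; the more exact values round to 68.27 and 95.45.

```python
import math


def normal_coverage(k):
    """Proportion of a normal distribution lying within Mean ± k SD."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(k, round(100 * normal_coverage(k), 2))
# 1 68.27
# 2 95.45
# 3 99.73
```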

Interpretation of Standard Deviation:
The standard deviation characterises the nature of distribution of
scores. When the scores are more widely spread S.D. is more and
when scores are less scattered S.D. is less. For interpreting the value of
the measure of dispersion, we must understand that greater the value
of ‘σ‘ the more scattered are the scores from the mean.
As in the case of mean deviation, the interpretation of standard
deviation requires the value of M and N for consideration.

In following examples, the required values of σ, mean and N


are given like:

Here, the dispersion is more in example 2 as compared to example 1.


It means the values are more scattered in example 2, as compared to
the values of example 1.

Merits of SD:
1. SD is rigidly defined and its value is always definite.

2. It is the most widely used and important measure of dispersion. It


occupies a central position in statistics.

3. Like mean deviation, it is based on all the values of the distribution.

4. Here, the signs of deviations are not disregarded, instead they are
eliminated by squaring each of the deviations.

5. It is the master measure of variability as it is amenable to algebraic
treatment and is used in correlational work and in further statistical
analysis.

6. It is less affected by fluctuations of sampling.

7. It is the reliable and most accurate measure of variability. S.D.


always goes with the mean which is the most reliable measure of
central tendency.

8. It provides a standard unit of measure that possesses comparable


meaning from one test to another. Moreover, the normal curve is
directly related to S.D.

Limitations:
1. It is not easy to calculate and it is not easily understood.

2. It gives more weight to extreme items and less to those which are
near the mean. When the deviation of an extreme score is squared it
gives rise to a bigger value.

Uses of S.D:
Standard deviation is used:
(i) When the most accurate, reliable and stable measure of variability
is wanted.

(ii) When more weight is to be given to extreme deviations from the


mean.

(iii) When coefficient of correlation and other statistics are
subsequently computed.

(iv) When measures of reliability are computed.

(v) When scores are to be properly interpreted with reference to the


normal curve.

(vi) When standard scores are to be computed.

(vii) When we want to test the significance of the difference between


two statistics.

(viii) When coefficient of variation, variance, etc. are calculated.
