
CHAPTER 3: DATA ANALYSIS - MEASURES OF DISPERSION

3.3 Introduction

3.4 Ungrouped Data
    3.4.1 Range
    3.4.2 Inter-Quartile Range
    3.4.3 Semi Inter-Quartile Range
    3.4.4 Variance and Standard Deviation

3.5 Grouped Data
    3.5.1 Range
    3.5.2 Inter-Quartile Range
    3.5.3 Semi Inter-Quartile Range
    3.5.4 Variance and Standard Deviation

3.6 Relative Dispersion

3.7 Skewness

CHAPTER 3 (contd.): MEASURES OF DISPERSION

3.3 INTRODUCTION

Measures of central value give us a single figure that represents the entire data set.
However, averages alone cannot adequately describe a set of observations; it is also
necessary to describe the variability, or dispersion, of the observations.
Two sets of data might have the same mean value but not necessarily the same
spread. For instance, the number sets 6, 7, 8, 9, 6 and 2, 7, 9, 13, 5, 6 have almost the
same mean (7.2 and 7 respectively), but the numbers in the first set lie close to the mean,
whereas those in the second set are spread further from it. This difference in spread is
quantified by a measure of dispersion.
There are five measures of dispersion:
1) Range
2) Inter-quartile range
3) Semi inter-quartile range
4) Variance
5) Standard deviation

The range is, however, not a good measure of dispersion because it is influenced by
extreme values and its calculation does not cover all the observations. Variance and
standard deviation are the most useful and widely used measures of dispersion because,
although they are also influenced by extreme values, their calculation covers all the
observations.
The standard deviation is the square root of the variance. The deviation in the term
refers to the difference between an observed value and the mean. It indicates how
closely the data values cluster around the mean: in general, the larger the standard
deviation of a data set, the greater the spread of the observations around the mean.
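
As a quick illustration of why an average alone is not enough, the sketch below computes the mean and the sample standard deviation of the two number sets from the introduction. It relies only on Python's standard `statistics` module; the variable names are chosen for illustration.

```python
from statistics import mean, stdev

# The two number sets quoted in the introduction
set_1 = [6, 7, 8, 9, 6]
set_2 = [2, 7, 9, 13, 5, 6]

for name, data in (("first set", set_1), ("second set", set_2)):
    # stdev() is the sample standard deviation (divisor n - 1)
    print(f"{name}: mean = {mean(data):.2f}, standard deviation = {stdev(data):.2f}")
```

The means are close to each other, while the standard deviation of the second set is clearly larger, which is what "more spread away from the mean" expresses numerically.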

3.4 UNGROUPED DATA

3.4.1 RANGE

The range is the difference between the highest and lowest values in the
distribution.
Range = Highest value - Lowest value

Example 1:
Calculate the range of the prices of shares of ABS Co. Ltd over a seven-day week:
Prices of shares (RM'00): 21, 20, 28, 16, 22, 25

Range = 28 - 16 = 12 (RM 1,200)

3.4.2 INTER-QUARTILE RANGE

The inter-quartile range is the difference between the upper and lower quartiles,
Q3 - Q1. It covers the middle 50% of the observations.
Inter-Quartile Range = Q3 - Q1
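For example, for the data in Example 1, arranged in ascending order (n = 6):
16, 20, 21, 22, 25, 28

Q1 = (n + 1)/4 th observation = 7/4 = 1.75th observation,
     taken here as the midpoint of the 1st and 2nd observations = (16 + 20)/2 = 18
Q3 = 3(n + 1)/4 th observation = 21/4 = 5.25th observation,
     taken here as the midpoint of the 5th and 6th observations = (25 + 28)/2 = 26.5

Therefore, inter-quartile range = 26.5 - 18 = 8.5 (RM 850)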
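
The range and quartile calculations for Example 1 can be sketched in Python as follows. The helper `quartile` is a hypothetical name; it assumes the convention used in the worked example, namely locating the quartile at position k(n + 1)/4 and, when that position is fractional, taking the midpoint of the two neighbouring observations. Other textbooks and software interpolate differently, so their quartiles may not match.

```python
import math

def quartile(sorted_data, k):
    """k-th quartile (k = 1, 2, 3) at position k(n + 1)/4, counted from 1.

    Assumption: a fractional position is resolved by averaging the two
    neighbouring observations, as in the worked example.
    """
    n = len(sorted_data)
    position = k * (n + 1) / 4
    lower = max(math.floor(position), 1)   # clamp to the first observation
    upper = min(math.ceil(position), n)    # clamp to the last observation
    return (sorted_data[lower - 1] + sorted_data[upper - 1]) / 2

prices = sorted([21, 20, 28, 16, 22, 25])  # share prices in RM'00

data_range = prices[-1] - prices[0]
q1 = quartile(prices, 1)
q3 = quartile(prices, 3)
iqr = q3 - q1

print(f"range = {data_range}, Q1 = {q1}, Q3 = {q3}, inter-quartile range = {iqr}")
```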

3.4.3 SEMI INTER-QUARTILE RANGE

The semi inter-quartile range (quartile deviation) is the average of the
differences of the quartiles from the median.
Semi Inter-Quartile Range = (Q3 - Q1) / 2

(Same example)
If Q3 = 26.5 and Q1 = 18,

Semi Inter-Quartile Range = (26.5 - 18) / 2 = 4.25 (RM 425)

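3.4.4 VARIANCE AND STANDARD DEVIATION

For ungrouped data, the variance is given as

$$ s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} $$

The standard deviation is

$$ s = \sqrt{s^2} $$

or simply

$$ s = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}} $$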
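
The variance formula for ungrouped data can be related to the idea of deviations from the mean. The short derivation below is a sketch that assumes the sample mean is $\bar{x} = \sum x / n$ and that the divisor is $n - 1$, as in the worked examples of this section.

```latex
\begin{aligned}
\sum (x - \bar{x})^2
  &= \sum x^2 - 2\bar{x}\sum x + n\bar{x}^2 \\
  &= \sum x^2 - 2\,\frac{(\sum x)^2}{n} + \frac{(\sum x)^2}{n} \\
  &= \sum x^2 - \frac{(\sum x)^2}{n}, \\[4pt]
\text{so}\quad
s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}
  &= \frac{\sum x^2 - \dfrac{(\sum x)^2}{n}}{n - 1}.
\end{aligned}
```

The second form needs only the totals $\sum x$ and $\sum x^2$, which is why it is convenient for hand calculation.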

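Example 2:
1. Find the variance and standard deviation for the sample data:
5, 2, 3, 4, 5, 6, 3
Solution:

$$ \sum x = 5 + 2 + 3 + 4 + 5 + 6 + 3 = 28 $$
$$ \sum x^2 = 5^2 + 2^2 + 3^2 + \dots + 3^2 = 124 $$
$$ \frac{(\sum x)^2}{n} = \frac{28^2}{7} = 112 $$

Variance, $$ s^2 = \frac{124 - 112}{7 - 1} = 2 $$

Standard deviation, $$ s = \sqrt{2} = 1.414 $$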
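
The calculation in part 1 can be checked with a few lines of Python. This is a minimal sketch: `sample_variance` is a hypothetical helper that applies the sum-of-squares form with divisor n - 1, and the result is cross-checked against `statistics.variance` from the standard library.

```python
import math
from statistics import variance

def sample_variance(data):
    """Sample variance via (sum of x^2 - (sum of x)^2 / n) / (n - 1)."""
    n = len(data)
    sum_x = sum(data)
    sum_x2 = sum(x * x for x in data)
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)

data = [5, 2, 3, 4, 5, 6, 3]

s2 = sample_variance(data)
s = math.sqrt(s2)

print(f"variance = {s2:.3f}, standard deviation = {s:.3f}")
print("matches statistics.variance:", math.isclose(s2, variance(data)))
```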

2. Calculate the standard deviations for the following sets of sample data. Hence,
determine which set is more dispersed about its mean.
Data A: 2, 7, 10, 9, 2, 5, 16
Data B: 10, 8, 14, 20, 40, 32, 1, 4, 8, 36, 12, 32
Solution:
For data A:

$$ \sum x = 51, \qquad \sum x^2 = 519, \qquad n = 7 $$

$$ s = \sqrt{\frac{519 - \frac{51^2}{7}}{7 - 1}} = \sqrt{\frac{147.43}{6}} = 4.957 $$

For data B:

$$ \sum x = 217, \qquad \sum x^2 = 5929, \qquad n = 12 $$

$$ s = \sqrt{\frac{5929 - \frac{217^2}{12}}{12 - 1}} = \sqrt{\frac{2004.92}{11}} = 13.50 $$

Therefore, data B is more dispersed than data A.
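
The comparison of the two data sets can be reproduced with the standard library. The sketch below uses `statistics.stdev`, which computes the sample standard deviation with divisor n - 1; the variable names are illustrative.

```python
from statistics import stdev

data_a = [2, 7, 10, 9, 2, 5, 16]
data_b = [10, 8, 14, 20, 40, 32, 1, 4, 8, 36, 12, 32]

s_a = stdev(data_a)
s_b = stdev(data_b)

# The set with the larger standard deviation is the more dispersed one
more_dispersed = "B" if s_b > s_a else "A"

print(f"s_A = {s_a:.3f}, s_B = {s_b:.2f}, more dispersed: data {more_dispersed}")
```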



3.5 GROUPED DATA

3.5.1 RANGE
Range for grouped data:

Range = Upper boundary of last class - Lower boundary of first class

3.5.2 INTER-QUARTILE RANGE
Inter-Quartile Range = Q3 - Q1
The values of Q3 and Q1 are obtained either from the cumulative
frequency curve (graphical method) or by using a formula.

3.5.3 SEMI INTER-QUARTILE RANGE

The semi inter-quartile range (quartile deviation) is the average of the
differences of the quartiles from the median.
Semi Inter-Quartile Range = (Q3 - Q1) / 2

3.5.4 VARIANCE AND STANDARD DEVIATION


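For grouped data, the variance is given as

$$ s^2 = \frac{\sum f(x - \bar{x})^2}{\sum f - 1} $$

or

$$ s^2 = \frac{\sum fx^2 - \frac{(\sum fx)^2}{\sum f}}{\sum f - 1} $$

And the standard deviation is

$$ s = \sqrt{\frac{\sum f(x - \bar{x})^2}{\sum f - 1}} $$

or

$$ s = \sqrt{\frac{\sum fx^2 - \frac{(\sum fx)^2}{\sum f}}{\sum f - 1}} $$

where x is the class midpoint and f is the class frequency.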

Example 3:
The marks obtained by 50 students in a certain college are as follows.

Marks               No. of students
10 but under 20      3
20 but under 30      7
30 but under 40     10
40 but under 50     20
50 but under 60      7
60 but under 70      3

Find the inter-quartile range, variance and standard deviation by using both the
graphical method (semi inter-quartile range only) and the calculation method.
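
The notes do not give a worked solution here, so the following is only a sketch of the calculation method under stated assumptions: each class is represented by its midpoint, the variance uses the divisor (sum of f) - 1 to stay consistent with the ungrouped examples, and the quartiles are interpolated linearly inside the class that contains position k·n/4. The function name `grouped_quartile` and the data layout are hypothetical.

```python
import math

# (lower boundary, upper boundary, frequency) for each class in Example 3
classes = [(10, 20, 3), (20, 30, 7), (30, 40, 10),
           (40, 50, 20), (50, 60, 7), (60, 70, 3)]

n = sum(f for _, _, f in classes)
sum_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)          # sum of f*x
sum_fx2 = sum(f * ((lo + hi) / 2) ** 2 for lo, hi, f in classes)  # sum of f*x^2

mean = sum_fx / n
variance = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)  # assumed divisor: n - 1
std_dev = math.sqrt(variance)

def grouped_quartile(classes, k):
    """Interpolate the k-th quartile within the class containing position k*n/4."""
    total = sum(f for _, _, f in classes)
    target = k * total / 4
    cumulative = 0
    for lo, hi, f in classes:
        if cumulative + f >= target:
            return lo + (target - cumulative) / f * (hi - lo)
        cumulative += f
    raise ValueError("quartile position exceeds the total frequency")

q1 = grouped_quartile(classes, 1)
q3 = grouped_quartile(classes, 3)
iqr = q3 - q1

print(f"mean = {mean:.2f}, variance = {variance:.2f}, standard deviation = {std_dev:.2f}")
print(f"Q1 = {q1}, Q3 = {q3}, inter-quartile range = {iqr}")
```

If the divisor is taken as the total frequency instead of the total frequency minus one, the variance comes out slightly smaller; the quartile values are unaffected by that choice.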

3.6 RELATIVE DISPERSION
The standard deviation, on its own, tells us very little about the amount of
dispersion in the data. To compare the dispersion between different sets of data, we
need a measure of relative dispersion that expresses the magnitude of the
standard deviation relative to the mean. For this, we use the coefficient of variation.

Coefficient of variation = (standard deviation / mean) x 100

$$ C.V. = \frac{s}{\bar{x}} \times 100 $$

Larger percentage: greater variation
Larger variation: less consistency
Smaller variation: more consistency

Example 4:
An analysis of monthly wages paid to workers in two firms, A and B, belonging
to the same industry gives the following results.

                                                   Firm A    Firm B
Average monthly wages                              RM 105    RM 95
Standard deviation of the distribution of wages    RM 20     RM 22

Calculate the coefficient of variation and comment on your finding.

Solution:
Firm A: C.V. = (20 / 105) x 100 = 19.05%
Firm B: C.V. = (22 / 95) x 100 = 23.16%

The value of the coefficient of variation for firm B is higher than that of firm A.
Therefore, wages in firm B are less consistent than those in firm A.
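
The same comparison can be written as a small Python sketch. The function name `coefficient_of_variation` is chosen for illustration; it simply applies the formula (standard deviation / mean) x 100 to the figures given for the two firms.

```python
def coefficient_of_variation(std_dev, mean):
    """Relative dispersion as a percentage of the mean."""
    return std_dev / mean * 100

cv_a = coefficient_of_variation(20, 105)  # firm A
cv_b = coefficient_of_variation(22, 95)   # firm B

# The firm with the smaller coefficient of variation has the more consistent wages
more_consistent = "A" if cv_a < cv_b else "B"

print(f"C.V. of A = {cv_a:.2f}%, C.V. of B = {cv_b:.2f}%, more consistent: firm {more_consistent}")
```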

3.7 SKEWNESS
Although frequency curves can take any shape, certain shapes are often encountered.
The most common is the bell-shaped distribution.
i) Symmetrical

[Figure: symmetrical bell-shaped curve; mean, median and mode coincide at the centre]

This distribution is perfectly symmetrical. The graph of the symmetrical
distribution is the normal curve. The mean, median and mode all have the same
value.
ii) Asymmetrical
Skewness is a term used to describe the asymmetry of a frequency
curve. Skewness can be either positive or negative.
- Positive skewness
When the frequency distribution has a tail stretching out to the right, it is
said to be positively skewed.

[Figure: positively skewed curve; from left to right: mode, median, mean]

- Negative skewness
When the frequency distribution has a tail stretching out to the left, it is
said to be negatively skewed.

[Figure: negatively skewed curve; from left to right: mean, median, mode]

iii) Measuring skewness by calculation
To measure the amount of skewness in a set of data, a number of measures
have been suggested. The most common is the Pearson
coefficient of skewness, which is given by

a) Skewness = (Mean - Mode) / Standard Deviation

or

b) Skewness = 3 (Mean - Median) / Standard Deviation
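
As an illustration of the median-based Pearson coefficient, the sketch below applies it to Data A from Example 2. The function name `pearson_skewness_median` is hypothetical, and the sample standard deviation (divisor n - 1) from Python's `statistics` module is assumed.

```python
from statistics import mean, median, stdev

def pearson_skewness_median(data):
    """Pearson coefficient of skewness: 3 (mean - median) / standard deviation."""
    return 3 * (mean(data) - median(data)) / stdev(data)

data_a = [2, 7, 10, 9, 2, 5, 16]  # Data A from Example 2

skewness = pearson_skewness_median(data_a)
shape = "positively skewed" if skewness > 0 else "negatively skewed or symmetrical"

print(f"skewness = {skewness:.3f} ({shape})")
```

A positive coefficient means the mean lies to the right of the median, which agrees with the description of a tail stretching out to the right.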

EXERCISE
1. The table below gives an analysis of the debtors' balances of Fuel
Suppliers Bhd.

Balance Outstanding (RM'00)    No. of accounts
20 but under 39.9               1
40 but under 59.9               3
60 but under 79.9               6
80 but under 99.9              10
100 but under 119.9             5
120 but under 139.9             3
140 but under 159.9             2

a) Calculate the median and quartile deviation.
b) Estimate the mean and standard deviation.
c) Calculate the coefficient of variation.
d) Calculate the measure of skewness of these data.

2. The sales orders (RM) achieved by 10 companies in Shah Alam.

Area          A     B     C     D     E     F     G     H     I     J
Sales (RM)    150   130   140   150   140   300   110   120   140   120

For these sales data, calculate:
i) Arithmetic mean
ii) Mode
iii) Median
iv) Mean deviation
v) Standard deviation

3. A long-term investor is considering selling one of three stocks, X, Y or
Z. He decides to hold on to the more consistent stocks. Using the
information in the following table, find:

Stock    Average Price    Standard Deviation
X        RM 76            RM 12
Y        RM 45            RM 6
Z        RM 50            s

i) If stocks Y and Z have the same relative dispersion, what is the
standard deviation for stock Z?
ii) Which stock should he sell?

4. The following frequency polygon shows the profit (in RM'00) of a random
sample of 48 shops in 2006.

[Figure: frequency polygon; horizontal axis: Profit (RM'000), from 4.5 to 11.5; vertical axis: No. of shops, from 0 to 16]

i) Construct a frequency table using the above polygon.
ii) From the frequency table, calculate the median, mean and standard
deviation.

5. The times taken (in minutes) to take orders in 10 randomly selected telephone
calls by telephone company A were:

2.82   2.45   5.42   6.24   4.24   3.70   4.75   3.25   4.02   3.71

i) Calculate the range of the data.
ii) Estimate the mean and standard deviation of the time taken.
iii) Calculate the coefficient of variation.
iv) The mean and standard deviation of the time taken per order by company B
were 3.77 and 2.12 minutes respectively. Calculate the coefficient of
variation for these data and compare your answer with that obtained in
part (iii).

6. The following data show the results (CGPA) of 80 students in class A.

CGPA         Number of students
2.5 - 2.6     4
2.7 - 2.8    13
2.9 - 3.0    24
3.1 - 3.2    10
3.3 - 3.4    18
3.5 - 3.6    11

a) Draw a histogram to represent these data.
b) By using the histogram, state the modal class.
c) Calculate the mean, median and standard deviation of the results.
d) The mean and standard deviation of the results for students in class
B are 3.12 and 0.366 respectively. Using the appropriate
measurement, compare the performance consistency of the two
classes.

7. FLY company employed 50 salespersons to sell its products. The
following information is gathered from its travelling claims records.

Distance travelled per month (in km)    No. of salespersons
300 and less than 400                    3
400 and less than 500                    8
500 and less than 600                   18
600 and less than 700                   11
700 and less than 800                   10

(i) Draw an ogive for the data.
(ii) Using the ogive in (i), estimate
a) the median, lower quartile and upper quartile.
b) the percentage of salespersons who travelled less than 550 km in a
month.
c) the minimum monthly distance travelled by the most active 30% of
travelling salespersons.
(iii) Find the mean and standard deviation for the above data.
(iv) Determine the shape of skewness using appropriate calculation.

8. A research physician wants to estimate the average age of people with
diabetes. She takes a random sample of 27 diabetics and obtains the
following ages.

54   48   61   38   23   79   70   82   56
38    7   79   83   57   41   10   75   60
68   76   61   77   65   55   21   53   83

(i) Construct a frequency table by using all the steps required.
(ii) Find the mean and median of the data.
(iii) Given that the standard deviation of the ages of people with
diabetes is 22.068, determine
a) the shape of the distribution, and
b) the coefficient of variation for the above data.
