
Correlation Analysis

When variables are found to be related, we often want to know how strong the relationship is. For
example, we may be interested in measuring the relationship between:

(a) Amount of fertilizer used and wheat production

(b) Ages of husbands and their wives

(c) Volume of sales and years of experience of salespersons

(d) Heights and weights of a group of people

(e) Income earned and income saved.

The study of such relationships is called correlation analysis.

Correlation Coefficient: The correlation coefficient 𝑟 is a statistical measure that quantifies the strength
and direction of the linear relationship between a pair of variables. Its value always lies between −1 and +1.

For 𝑛 pairs of sample observations (𝑥1, 𝑦1), (𝑥2, 𝑦2), …, (𝑥𝑛, 𝑦𝑛), the correlation coefficient 𝑟 can
be computed using the following formula:

$$ r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2}\,\sqrt{\sum (y_i - \bar{y})^2}} $$

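For computational purposes, either of the following two formulae for 𝑟 may be used:

$$ r = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{\sqrt{n \sum x_i^2 - \left(\sum x_i\right)^2}\,\sqrt{n \sum y_i^2 - \left(\sum y_i\right)^2}} $$

Or,

$$ r = \frac{\sum x_i y_i - \dfrac{\sum x_i \sum y_i}{n}}{\sqrt{\sum x_i^2 - \dfrac{\left(\sum x_i\right)^2}{n}}\,\sqrt{\sum y_i^2 - \dfrac{\left(\sum y_i\right)^2}{n}}} $$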
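
As a rough check that the defining formula and the first computational formula give the same value, the following Python sketch evaluates both on a small made-up data set. The function names `pearson_r_definition` and `pearson_r_computational` and the sample values are illustrative assumptions, not part of the original notes.

```python
import math


def pearson_r_definition(xs, ys):
    """r from deviations about the sample means (the defining formula)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / (math.sqrt(sxx) * math.sqrt(syy))


def pearson_r_computational(xs, ys):
    """r from raw sums (the first computational formula)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Hypothetical illustrative data (not taken from the notes).
sample_x = [1, 2, 3, 4, 5]
sample_y = [2, 4, 5, 4, 5]

print(pearson_r_definition(sample_x, sample_y))
print(pearson_r_computational(sample_x, sample_y))
```

The two printed values should agree up to floating-point rounding. The raw-sum form is convenient for hand calculation because it needs only the column totals of a table.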

Problem 01: Calculate the correlation coefficient between the years of experience of the salespersons (𝑥)
and the annual sales volume (𝑦) from the following data:

Salesperson 1 2 3 4 5 6 7 8 9 10
𝑥𝑖 1 3 4 4 6 8 10 10 11 13
𝑦𝑖 80 97 92 102 103 111 119 123 117 136

Solution: The correlation coefficient between 𝑥 and 𝑦 is given by

$$ r = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{\sqrt{n \sum x_i^2 - \left(\sum x_i\right)^2}\,\sqrt{n \sum y_i^2 - \left(\sum y_i\right)^2}} \qquad \text{(a)} $$

The calculations required to compute 𝑟 are shown in the accompanying table:

Salesperson    𝑥𝑖     𝑦𝑖    𝑥𝑖𝑦𝑖    𝑥𝑖²      𝑦𝑖²
1               1     80      80      1     6400
2               3     97     291      9     9409
3               4     92     368     16     8464
4               4    102     408     16    10404
5               6    103     618     36    10609
6               8    111     888     64    12321
7              10    119    1190    100    14161
8              10    123    1230    100    15129
9              11    117    1287    121    13689
10             13    136    1768    169    18496
Total          70   1080    8128    632   119082

Using the summary values of the above table in (a), we get

$$ r = \frac{10(8128) - 70(1080)}{\sqrt{10(632) - (70)^2}\,\sqrt{10(119082) - (1080)^2}} = \frac{5680}{\sqrt{1420}\,\sqrt{24420}} \approx 0.96 $$
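
The column totals and the final value for Problem 01 can be reproduced with a short Python sketch. The variable names are assumptions chosen for this illustration. The data are those given in the problem.

```python
import math

# Data from Problem 01: years of experience (x) and annual sales volume (y).
x = [1, 3, 4, 4, 6, 8, 10, 10, 11, 13]
y = [80, 97, 92, 102, 103, 111, 119, 123, 117, 136]
n = len(x)

# Column totals, matching the columns of the working table.
sum_x = sum(x)
sum_y = sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)
sum_y2 = sum(yi ** 2 for yi in y)

# Formula (a): the raw-sum form of the correlation coefficient.
numerator = n * sum_xy - sum_x * sum_y
denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
               * math.sqrt(n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(sum_x, sum_y, sum_xy, sum_x2, sum_y2)  # 70 1080 8128 632 119082
print(round(r, 2))                           # 0.96
```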

Problem 02: Calculate the correlation coefficient from the following data:

𝑥 40 45 50 55 60 65 70 75 80 85
𝑦 31 39 42 48 50 64 77 70 82 80

Problem 03: Calculate the correlation coefficient between the ages of brothers and their sisters from the
following data:

Age of brother 12 18 17 21 26 23 30 37
Age of sister 13 17 16 19 18 28 31 34

Problem 04: Ten students sitting an examination were each marked out of 100 by two independent
examiners. The table below shows the marks assigned:

Student       1    2    3    4    5    6    7    8    9   10
Examiner 1   65   70   76   75   80   78   83   84   85   90
Examiner 2   30   25   35   40   38   42   48   50   55   45

Rank the data and hence compute the rank correlation coefficient.

Solution: To facilitate the computation, we rank each examiner's marks from highest (rank 1) to lowest
(rank 10) and construct the following table:


           Examiner 1      Examiner 2
Student    Mark   Rank     Mark   Rank     𝑑𝑖    𝑑𝑖²
1           65     10       30      9      +1     1
2           70      9       25     10      -1     1
3           76      7       35      8      -1     1
4           75      8       40      6      +2     4
5           80      5       38      7      -2     4
6           78      6       42      5      +1     1
7           83      4       48      3      +1     1
8           84      3       50      2      +1     1
9           85      2       55      1      +1     1
10          90      1       45      4      -3     9
Total                                             24

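Computation of rank correlation:

Here 𝑑𝑖 is the difference between the two ranks assigned to student 𝑖. From the table, ∑𝑑𝑖² = 24 and 𝑛 = 10, so that

$$ r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)} = 1 - \frac{6(24)}{10(100 - 1)} \approx 0.85 $$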
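
The ranking step and the rank correlation formula can be sketched in Python as follows. The helper names `ranks_descending` and `spearman_rs` are hypothetical. The sketch assumes that no two marks from the same examiner are tied, which holds for this data set; tied values would need averaged ranks, which this sketch does not handle.

```python
def ranks_descending(values):
    """Assign rank 1 to the largest value; assumes all values are distinct."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


def spearman_rs(a, b):
    """Rank correlation r_s = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), no ties."""
    ranks_a = ranks_descending(a)
    ranks_b = ranks_descending(b)
    n = len(a)
    sum_d2 = sum((ra - rb) ** 2 for ra, rb in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Marks from Problem 04.
examiner_1 = [65, 70, 76, 75, 80, 78, 83, 84, 85, 90]
examiner_2 = [30, 25, 35, 40, 38, 42, 48, 50, 55, 45]

print(ranks_descending(examiner_1))  # [10, 9, 7, 8, 5, 6, 4, 3, 2, 1]
print(ranks_descending(examiner_2))  # [9, 10, 8, 6, 7, 5, 3, 2, 1, 4]
print(round(spearman_rs(examiner_1, examiner_2), 2))  # 0.85
```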
