
We often come across situations where more than one random variable occurs together and we are interested in figuring out whether any relationship exists among them.

For example, one might be interested in knowing how weight varies with height in a particular age group. Here height and weight are both random quantities occurring together.

Covariance and correlation are two parameters that help us study the relationship between these variables.
MSF ©
Joint Probability
When more than one quantity appears together, the probabilities associated with the occurrence of their joint values are known as their joint probabilities.

For example, suppose we throw two unbiased dice together, where
X – number appearing on the 1st die
Y – number appearing on the 2nd die
Then the joint probability that X = 1 and Y = 1 is
P(X = 1 and Y = 1) = 1/36
Example:
Experiment: Flip a coin 3 times.

Sample space, S = {HHH, HHT, HTH, HTT, TTT, TTH, THT, THH}

Random variable X records the number of heads on the last throw and Y records the total number of heads in the 3 flips.

Possible values of X are 0, 1.
Possible values of Y are 0, 1, 2, 3.
Example (cont…)

The joint probability distribution of X and Y can be given in a table as shown below:

         Y = 0   Y = 1   Y = 2   Y = 3   P(x)
X = 0    1/8     1/4     1/8     0       1/2
X = 1    0       1/8     1/4     1/8     1/2
P(y)     1/8     3/8     3/8     1/8

For example, P(X = 0 and Y = 1) = 1/4.
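As a minimal sketch (Python, standard library only), the joint table for the coin example can be reproduced by enumerating the eight equally likely outcomes and counting. The variable names here are illustrative and not part of the original slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate the 8 equally likely outcomes of three coin flips.
outcomes = ["".join(flips) for flips in product("HT", repeat=3)]

# X = number of heads on the last throw (0 or 1), Y = total number of heads.
counts = Counter((int(o[-1] == "H"), o.count("H")) for o in outcomes)
joint = {xy: Fraction(c, len(outcomes)) for xy, c in counts.items()}

# Marginal distributions are obtained by summing the joint table.
p_x = {x: sum(p for (xi, _), p in joint.items() if xi == x) for x in (0, 1)}
p_y = {y: sum(p for (_, yj), p in joint.items() if yj == y) for y in range(4)}

for x in (0, 1):
    row = [str(joint.get((x, y), Fraction(0))) for y in range(4)]
    print(f"X = {x}:", row, "P(x) =", p_x[x])
print("P(y):", [str(p_y[y]) for y in range(4)])
```

Using `Fraction` keeps the probabilities exact, so the printed values can be compared directly with the table.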
Covariance
Covariance is a number that describes how two random quantities or variables are related to each other.
1. If the variables tend to move up or down together (in the same direction), their covariance is positive.
2. If the variables tend to move in opposite directions, their covariance is negative.
3. If the variables are independent of each other, their covariance is zero. The converse need not hold: variables with zero covariance are called uncorrelated, and they may still be dependent.
[Figure: three plots of Y against X illustrating Cov > 0, Cov < 0 and Cov = 0.]
Covariance of random variables X, Y

$$\mathrm{Cov}(X, Y) = \sum_{i,j} p_{ij}\,(x_i - \mu_x)(y_j - \mu_y)$$

where $p_{ij}$ is the probability that $X = x_i$ and $Y = y_j$, $\mu_x$ is the expectation of X and $\mu_y$ is the expectation of Y.
Example:
Let us calculate the covariance in the last example, where a coin was flipped 3 times.

The joint distribution was given by:

         Y = 0   Y = 1   Y = 2   Y = 3   P(x)
X = 0    1/8     1/4     1/8     0       1/2
X = 1    0       1/8     1/4     1/8     1/2
P(y)     1/8     3/8     3/8     1/8
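Example (cont.)

$$E(X) = 0 \cdot \tfrac{1}{2} + 1 \cdot \tfrac{1}{2} = 0.5$$

$$E(Y) = 0 \cdot \tfrac{1}{8} + 1 \cdot \tfrac{3}{8} + 2 \cdot \tfrac{3}{8} + 3 \cdot \tfrac{1}{8} = 1.5$$

$$\mathrm{Cov}(X, Y) = (0 - 0.5)(0 - 1.5)\,\tfrac{1}{8} + (0 - 0.5)(1 - 1.5)\,\tfrac{1}{4} + \cdots + (1 - 0.5)(3 - 1.5)\,\tfrac{1}{8} = 0.25$$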
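The calculation for the coin example can be checked with a short Python sketch that applies the definition of covariance to the joint table. The dictionary layout and the function name `covariance` are choices made for this illustration; cells with probability zero are left out because they contribute nothing to the sum.

```python
from fractions import Fraction as F

# Joint distribution from the coin example: keys are (x, y), values are p_ij.
joint = {
    (0, 0): F(1, 8), (0, 1): F(1, 4), (0, 2): F(1, 8),
    (1, 1): F(1, 8), (1, 2): F(1, 4), (1, 3): F(1, 8),
}

def covariance(joint):
    """Cov(X, Y) = sum over i, j of p_ij * (x_i - mu_x) * (y_j - mu_y)."""
    mu_x = sum(p * x for (x, _), p in joint.items())
    mu_y = sum(p * y for (_, y), p in joint.items())
    return sum(p * (x - mu_x) * (y - mu_y) for (x, y), p in joint.items())

cov_xy = covariance(joint)
print(cov_xy)  # 1/4
```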
• Alternate formula for covariance:

$$\mathrm{Cov}(X, Y) = \sum_{i,j} p_{ij}\, x_i y_j - \mu_x \mu_y$$

• Cov(X, X) = Var(X)
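The alternate formula follows from expanding the product in the definition of covariance and using the facts that the probabilities $p_{ij}$ sum to 1 and that $\sum_{i,j} p_{ij} x_i = \mu_x$, $\sum_{i,j} p_{ij} y_j = \mu_y$. A short derivation in the same notation:

```latex
\begin{aligned}
\mathrm{Cov}(X, Y)
  &= \sum_{i,j} p_{ij}\,(x_i - \mu_x)(y_j - \mu_y) \\
  &= \sum_{i,j} p_{ij}\, x_i y_j
     - \mu_y \sum_{i,j} p_{ij}\, x_i
     - \mu_x \sum_{i,j} p_{ij}\, y_j
     + \mu_x \mu_y \sum_{i,j} p_{ij} \\
  &= \sum_{i,j} p_{ij}\, x_i y_j - \mu_x \mu_y - \mu_x \mu_y + \mu_x \mu_y \\
  &= \sum_{i,j} p_{ij}\, x_i y_j - \mu_x \mu_y
\end{aligned}
```

For the coin example, the sum of $p_{ij} x_i y_j$ is 1 and $\mu_x \mu_y = 0.5 \times 1.5 = 0.75$, which again gives a covariance of 0.25.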
Correlation

If we have two variables A and B with non-zero standard deviations $\sigma_A$ and $\sigma_B$, and covariance Cov(A, B), then we define the correlation coefficient of A and B by

$$\rho_{AB} = \frac{\mathrm{Cov}(A, B)}{\sigma_A \sigma_B}$$

The correlation coefficient of A and B is generally denoted by $r_{AB}$ or $\rho_{AB}$.
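To make the definition concrete, the sketch below computes the correlation coefficient for the earlier coin example (X = heads on the last throw, Y = total heads). The helper `expectation` is a name introduced for this illustration only.

```python
import math

# Joint distribution from the coin example: keys are (x, y), values are p_ij.
joint = {
    (0, 0): 1/8, (0, 1): 1/4, (0, 2): 1/8,
    (1, 1): 1/8, (1, 2): 1/4, (1, 3): 1/8,
}

def expectation(joint, g):
    """E[g(X, Y)] under the joint distribution."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

mu_x = expectation(joint, lambda x, y: x)
mu_y = expectation(joint, lambda x, y: y)
var_x = expectation(joint, lambda x, y: (x - mu_x) ** 2)
var_y = expectation(joint, lambda x, y: (y - mu_y) ** 2)
cov_xy = expectation(joint, lambda x, y: (x - mu_x) * (y - mu_y))

# Correlation = covariance divided by the product of standard deviations.
rho = cov_xy / (math.sqrt(var_x) * math.sqrt(var_y))
print(round(rho, 4))  # 0.5774
```

Dividing by the standard deviations removes the units, which is why the result can be compared across different pairs of variables.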
Correlation
The correlation coefficient measures the
strength of a linear relationship between two
variables.

Properties of r:
• r ranges between −1 and +1.
• r > 0, termed positive correlation, indicates that the two variables tend to move up or down together, i.e. if one moves up then the other is also expected to move up. As a rule of thumb, the positive linear relationship is considered strong when r > 0.7.
• r = 1 (perfect positive correlation) indicates that the two variables move up or down together perfectly.
Properties of r (cont..):
• r < 0, termed negative correlation, indicates that the two variables tend to move in opposite directions, i.e. if one moves up then the other is expected to move down. As a rule of thumb, the negative linear relationship is considered strong when r < −0.7.
• r = −1 (perfect negative correlation) implies that the two variables move perfectly in opposite directions.
• r = 0 (uncorrelated) indicates no linear relationship. The variables may still be related in a nonlinear way.
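A correlation of zero rules out only a linear relationship. The following sketch uses a made-up example (X uniform on −1, 0, 1 and Y = X²) in which Y is completely determined by X, yet the covariance, and hence r, is zero.

```python
from fractions import Fraction as F

# Hypothetical example: X takes the values -1, 0, 1 with equal probability
# and Y = X**2, so Y is completely determined by X.
values = [-1, 0, 1]
p = F(1, 3)

mu_x = sum(p * x for x in values)      # expectation of X
mu_y = sum(p * x**2 for x in values)   # expectation of Y = X**2
cov_xy = sum(p * (x - mu_x) * (x**2 - mu_y) for x in values)

print(cov_xy)  # 0
```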
Variance of Sum
Recall:

E(aX + bY) = aE(X) + bE(Y)

i.e. expectation is linear. But what about variance?

It turns out:

$$\mathrm{Var}(aX + bY) = a^2 \sigma_X^2 + b^2 \sigma_Y^2 + 2ab\,\mathrm{Cov}(X, Y)$$

(Note the resemblance to the binomial expansion $(ax + by)^2 = a^2 x^2 + b^2 y^2 + 2abxy$.)
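The formula for the variance of aX + bY can be checked numerically. The sketch below reuses the coin example's joint distribution and compares the variance computed directly with the value given by the formula. The coefficients a = 2 and b = 3 are arbitrary choices for illustration.

```python
from fractions import Fraction as F

# Joint distribution from the coin example: keys are (x, y), values are p_ij.
joint = {
    (0, 0): F(1, 8), (0, 1): F(1, 4), (0, 2): F(1, 8),
    (1, 1): F(1, 8), (1, 2): F(1, 4), (1, 3): F(1, 8),
}

def expect(g):
    """E[g(X, Y)] under the joint distribution."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

a, b = 2, 3  # arbitrary illustrative coefficients

mu_x, mu_y = expect(lambda x, y: x), expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - mu_x) ** 2)
var_y = expect(lambda x, y: (y - mu_y) ** 2)
cov_xy = expect(lambda x, y: (x - mu_x) * (y - mu_y))

# Left-hand side: variance of Z = aX + bY computed directly.
mu_z = expect(lambda x, y: a * x + b * y)
var_direct = expect(lambda x, y: (a * x + b * y - mu_z) ** 2)

# Right-hand side: the formula for the variance of a sum.
var_formula = a**2 * var_x + b**2 * var_y + 2 * a * b * cov_xy

print(var_direct, var_formula)  # 43/4 43/4
```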
The general formula for the variance of a sum is as follows:

$$\mathrm{Var}\Big(\sum_i w_i X_i\Big) = \sum_i w_i^2 \sigma_{X_i}^2 + 2 \sum_{i<j} w_i w_j\,\mathrm{Cov}(X_i, X_j)$$
Sampling

There are times when one has to use observed data to draw conclusions about the properties of the distribution in a population.

In such cases it is often not feasible to reach every member of the population. Instead, one can draw a sample from the population and base the conclusions on that sample.
Sample
A sample is a subset of elements from the population of interest.

[Figure: a sample shown as a subset of the population.]
Sample Mean
Suppose we pick a random sample $x_1, x_2, \ldots, x_n$ from a population. Then the sample mean is defined as

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

We can estimate the unknown population mean μ with this sample mean $\bar{x}$.
Sample Variance and Standard Deviation
If a random sample $x_1, x_2, \ldots, x_n$ is picked from a population, then the sample variance is defined as

$$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2, \quad \text{where } \bar{x} \text{ is the sample mean,}$$

and the sample standard deviation is defined as

$$S = \sqrt{S^2} = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$
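As a small illustration of these definitions, the sketch below computes the sample mean, sample variance and sample standard deviation by hand and compares them with Python's `statistics` module. The sample values are made up for the example.

```python
import math
import statistics

# Hypothetical sample; the values are made up for illustration.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(sample)

# Sample mean: sum of the values divided by n.
x_bar = sum(sample) / n

# Sample variance: squared deviations from the mean, divided by n - 1.
s2 = sum((x - x_bar) ** 2 for x in sample) / (n - 1)
s = math.sqrt(s2)

# The standard library functions use the same n - 1 definitions.
assert math.isclose(x_bar, statistics.mean(sample))
assert math.isclose(s2, statistics.variance(sample))
assert math.isclose(s, statistics.stdev(sample))

print(x_bar, round(s2, 3), round(s, 3))  # 5.0 4.571 2.138
```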
Note:
The sample mean, variance and standard deviation depend on the sample size and the sample values. Therefore, an estimate of the population mean, variance or standard deviation changes every time you pick a new sample. For a good estimate, a larger sample size is recommended.
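The effect of sample size can be seen in a small simulation. The sketch below builds a hypothetical population (normally distributed values, chosen only for illustration), draws many samples of two different sizes, and measures how much the sample mean varies from sample to sample.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 values drawn from a normal distribution.
population = [rng.gauss(170, 10) for _ in range(10_000)]

def spread_of_sample_means(n, repeats=200):
    """Standard deviation of the sample mean across repeated samples of size n."""
    means = [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(means)

spread_small = spread_of_sample_means(10)
spread_large = spread_of_sample_means(1000)
print(spread_small, spread_large)
```

The spread for the larger samples should come out much smaller, which is the sense in which a larger sample gives a more reliable estimate.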
Sample Mean, Variance and Standard Deviation in Open Office Calc

• AVERAGE function calculates sample mean.
• VAR function calculates sample variance.
• STDEV function calculates sample standard deviation.
Covariance and Correlation in Open Office Calc

• COVAR function calculates covariance between two variables.
Syntax: COVAR(Data1,Data2)

• CORREL function calculates correlation between two variables.
Syntax: CORREL(Data1,Data2)
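For readers who want to see what these two functions compute, here is a rough Python equivalent for paired data. The function names `covar` and `correl` simply mirror the spreadsheet names. The sketch assumes that COVAR uses the population form (dividing by n rather than n − 1), which is worth checking against the Calc documentation. The height and weight values are made up.

```python
import math

def covar(data1, data2):
    """Population covariance of paired data (divides by n)."""
    n = len(data1)
    m1, m2 = sum(data1) / n, sum(data2) / n
    return sum((a - m1) * (b - m2) for a, b in zip(data1, data2)) / n

def correl(data1, data2):
    """Pearson correlation coefficient of paired data."""
    return covar(data1, data2) / math.sqrt(covar(data1, data1) * covar(data2, data2))

# Hypothetical paired observations, made up for illustration.
heights = [150, 160, 170, 180]
weights = [52, 58, 69, 77]
print(covar(heights, weights), correl(heights, weights))
```

The correlation does not depend on whether n or n − 1 is used, because the factor cancels between the numerator and the denominator.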
INT function in Open Office Calc

INT function rounds a number down to the nearest integer.
Syntax: INT(number)

Example:

• INT(4.7) = 4
• INT(-4.7) = -5
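The rounding-down behaviour, including the negative case, matches the floor function. A short Python comparison:

```python
import math

# math.floor rounds toward negative infinity, like the INT examples.
print(math.floor(4.7))   # 4
print(math.floor(-4.7))  # -5

# By contrast, Python's int() truncates toward zero.
print(int(-4.7))         # -4
```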
INDIRECT function in Open Office Calc
Returns the reference specified by a text string. For example, INDIRECT("B2") returns the contents of cell B2.
Syntax: INDIRECT(reference)
VLOOKUP function in Open Office Calc
This function checks whether a specific value is contained in the first column of an array and then returns the value in the same row of the requested column.

Syntax:
VLOOKUP(SearchCriterion, Array, Index, SortOrder)

SearchCriterion is the value searched for in the first column of the array.

Array is the reference, which must comprise at least two columns.

Index is the number of the column in the array that contains the value to be returned. The first column has the number 1.

SortOrder is an optional parameter that indicates whether the first column of the array is sorted in ascending order.
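The lookup logic can be sketched in a few lines of Python. This is a simplified illustration rather than Calc's implementation: the hypothetical function `vlookup` handles only exact matches, ignores SortOrder, and returns `None` when nothing is found (Calc reports an error instead). The table contents are made up.

```python
def vlookup(search_criterion, array, index):
    """Return the value in column `index` (1-based) of the first row whose
    first column equals `search_criterion`; None if there is no such row."""
    for row in array:
        if row[0] == search_criterion:
            return row[index - 1]
    return None

# Hypothetical lookup table: (product code, name, unit price).
table = [
    ("A1", "pencil", 0.50),
    ("B2", "notebook", 2.25),
    ("C3", "eraser", 0.75),
]

print(vlookup("B2", table, 3))  # 2.25
print(vlookup("Z9", table, 2))  # None
```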
RAND function in Open Office Calc

RAND function returns a random number between 0 and 1.
Syntax: RAND()

Remark:

RANDBETWEEN function returns a random integer between two integers, Bottom and Top (both inclusive).
Syntax: RANDBETWEEN(Bottom,Top)
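Python's `random` module offers close analogues, shown here as a rough comparison rather than an exact equivalent of the spreadsheet functions. The bounds 1 and 6 are arbitrary and simulate a die roll.

```python
import random

# random.random() returns a float in the half-open interval [0, 1).
r = random.random()

# random.randint(bottom, top) returns an integer with both endpoints included.
bottom, top = 1, 6
k = random.randint(bottom, top)

print(r, k)
```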
Multiple Operation Tool in Open Office Calc
This tool is used to recalculate a formula based on one or more variables.
To use this tool, we first set up an example of the formula based on values stored in some cells, and then give a range of cells over which the values of the variables change.
Multiple Operation Tool in Open Office Calc

[Figure: the Multiple Operations dialog, with two callouts: "Cell reference to the formula" and "Enter the cell reference to the corresponding cell that is part of the formula".]
