
Correlation and Regression
Correlation Analysis
Correlation analysis is a group of techniques used to measure
the strength of the association/relationship between variables.

- Pearson Correlation Coefficient (parametric test)
- Spearman Rank Correlation (nonparametric test)
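To make the contrast between the two measures concrete, here is a minimal sketch that computes both on the sales data from the worked example later in this document. The function names (`pearson_r`, `average_ranks`, `spearman_rho`) are illustrative choices, not part of the original slides, and Spearman's coefficient is computed here as Pearson's r applied to ranks, with tied values sharing their average rank.

```python
from math import sqrt

# Sales data from the worked example in this document.
contacts = [71, 64, 100, 105, 75, 79, 82, 68, 110]
sales = [25, 14, 37, 40, 18, 10, 22, 12, 42]


def pearson_r(x, y):
    """Pearson correlation: co-variation scaled by both spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


r = pearson_r(contacts, sales)
rho = spearman_rho(contacts, sales)
print(f"Pearson r    = {r:.4f}")
print(f"Spearman rho = {rho:.4f}")
```

Pearson's r uses the actual values, while Spearman's coefficient only uses their order, which is why the two numbers can differ on the same data.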
Pearson Correlation Coefficient

Pearson r      Description
1.0            Perfect (positive/negative) correlation
0.75 – 0.99    Very high (positive/negative) correlation
0.5 – 0.74     Moderately high (positive/negative) correlation
0.25 – 0.49    Moderately low (positive/negative) correlation
0.1 – 0.24     Very low (positive/negative) correlation
0              No correlation

The ranges refer to the magnitude of r; the sign of r gives the direction
of the relationship (positive or negative).
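The interpretation table can be applied mechanically. The sketch below is one possible encoding, with two assumptions that the table itself does not state: each band is taken to run from its lower bound up to the start of the next band (so 0.745 counts as "Moderately high"), and magnitudes below 0.1 are reported as "No correlation". The function name `describe_pearson_r` is hypothetical.

```python
def describe_pearson_r(r):
    """Map a Pearson r value to the verbal description in the table."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("Pearson r must lie between -1 and 1")
    size = abs(r)
    # Assumption: magnitudes below 0.1 are reported as "No correlation".
    if size < 0.1:
        return "No correlation"
    direction = "positive" if r > 0 else "negative"
    if size == 1.0:
        return f"Perfect {direction} correlation"
    # Assumption: each band runs from its lower bound up to the next band.
    bands = [
        (0.75, "Very high"),
        (0.5, "Moderately high"),
        (0.25, "Moderately low"),
        (0.1, "Very low"),
    ]
    for lower, label in bands:
        if size >= lower:
            return f"{label} {direction} correlation"


# The value reported as "Multiple R" in this document's summary output.
print(describe_pearson_r(0.902811914))
```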
Simple Linear Regression
A simple linear regression attempts to model the relationship
between two variables by fitting a linear equation to observed
data. One variable is considered to be an explanatory variable,
and the other is considered to be a dependent variable.

Dependent variable – the variable that is being estimated or predicted.
Independent variable – the variable that provides a basis for
estimation. It is the predictor variable.
Simple Linear Regression
• The linear regression model postulates that

  $Y = \beta_0 + \beta_1 X$

  where:
  $Y$ = dependent/response variable
  $X$ = independent/explanatory variable
  $\beta_0$ and $\beta_1$ = regression coefficients
  $\beta_0$ = $Y$-intercept of the regression line
  $\beta_1$ = slope of the regression line

In general, the goal of linear regression is to find the line that
best predicts $Y$ from $X$, that is, to find the line that best estimates
the regression model by determining $b_0$ and $b_1$ that best estimate
$\beta_0$ and $\beta_1$.
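As an illustration of how the intercept and slope can be estimated, the sketch below assumes that "best" means ordinary least squares, which is the method the summary output later in this document appears to report. Under that assumption the slope is the ratio of the summed cross-deviations to the summed squared deviations of the independent variable, and the intercept follows from the two means. The function name `fit_line` is illustrative.

```python
def fit_line(x, y):
    """Least-squares estimates (intercept, slope) for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the fitted line passes through the means
    return intercept, slope


# Sales data from the worked example in this document.
contacts = [71, 64, 100, 105, 75, 79, 82, 68, 110]
sales = [25, 14, 37, 40, 18, 10, 22, 12, 42]

intercept, slope = fit_line(contacts, sales)
print(f"sales = {intercept:.4f} + {slope:.4f} * contacts")
```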
Simple Linear Regression
• Coefficient of Determination, $r^2$

• $r^2$ is used to determine the proportion of the variance (fluctuation)
  of one variable that is predictable from the other variable. It allows
  us to determine how certain one can be in making predictions from a
  certain model/graph.
• The coefficient of determination has values from 0 to 1 and measures
  how well the regression line represents the data. It represents the
  proportion of the variation in the dependent variable that is explained
  by the line of best fit.
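One way to see what the coefficient of determination measures is to compute it as one minus the ratio of unexplained variation (squared residuals around the fitted line) to total variation (squared deviations around the mean). The sketch below does this for the sales data from this document's example, assuming a least-squares fit; the function name `r_squared` is illustrative.

```python
def r_squared(x, y):
    """Coefficient of determination for a least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Unexplained variation: squared residuals around the fitted line.
    ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    # Total variation: squared deviations around the mean of y.
    ss_total = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_residual / ss_total


# Sales data from the worked example in this document.
contacts = [71, 64, 100, 105, 75, 79, 82, 68, 110]
sales = [25, 14, 37, 40, 18, 10, 22, 12, 42]

r2 = r_squared(contacts, sales)
print(f"r^2 = {r2:.4f} ({r2:.1%} of the variation in sales is explained)")
```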
Example:
The following are the numbers of sales contacts made by 9 salespersons during
a week and the number of sales made.
Salesperson       1    2    3    4    5    6    7    8    9
Sales contacts   71   64  100  105   75   79   82   68  110
Sales            25   14   37   40   18   10   22   12   42

a. Plot a scatter diagram.


b. Compute the correlation coefficient and interpret.
c. Find the equation of the regression line.
d. Compute the coefficient of determination and interpret.
e. Estimate the number of sales when 90 sales contacts are made.
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.902811914
R Square             0.815069353
Adjusted R Square    0.788650689
Standard Error       5.696313724
Observations         9

[Figure: X Variable 1 Line Fit Plot, showing Y and Predicted Y against X Variable 1]

ANOVA
             df   SS            MS            F          Significance F
Regression   1    1001.086292   1001.086292   30.85203   0.000855887
Residual     7    227.1359303   32.44799004
Total        8    1228.222222

               Coefficients    Standard Error   t Stat         P-value    Lower 95%      Upper 95%      Lower 95.0%    Upper 95.0%
Intercept      -30.73642142    10.11434195      -3.038894825   0.018876   -54.65303969   -6.819803148   -54.65303969   -6.819803148
X Variable 1   0.65865755      0.11858174       5.554460175    0.000856   0.378256293    0.939058807    0.378256293    0.939058807
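The entries in the summary output are linked by simple arithmetic, which gives a quick way to check that the table has been read correctly. The sketch below recomputes several of them from the ANOVA sums of squares, using the standard definitions of these statistics; the variable names are illustrative.

```python
from math import sqrt

# Values copied from the ANOVA table in the summary output.
n = 9
ss_regression = 1001.086292
ss_residual = 227.1359303
ss_total = 1228.222222
df_regression, df_residual = 1, 7

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

f_stat = ms_regression / ms_residual       # F = MS regression / MS residual
r_square = ss_regression / ss_total        # share of variation explained
multiple_r = sqrt(r_square)                # magnitude of the correlation
adjusted_r_square = 1 - (1 - r_square) * (n - 1) / df_residual
standard_error = sqrt(ms_residual)         # standard error of the estimate

# With a single predictor, F equals the square of the slope's t statistic.
t_slope = 5.554460175

print(f"F                 = {f_stat:.5f}")
print(f"t^2 (slope)       = {t_slope ** 2:.5f}")
print(f"R Square          = {r_square:.9f}")
print(f"Multiple R        = {multiple_r:.9f}")
print(f"Adjusted R Square = {adjusted_r_square:.9f}")
print(f"Standard Error    = {standard_error:.9f}")
```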
