Correlation

Universidad Industrial de Santander, Facultad de Ingenierías Físico Mecánicas, Escuela de
Ingeniería de Sistemas e Informática. Professor: Andrés Leonardo González Gómez, MSc.
Statistics 2: In-class activity No. 10
Correlation workshop
Andres Ricardo Hernandez Torres
Student code: 2122274
1. See Table E11-1 for data on the ratings of quarterbacks for the 2008 National Football League season (The
Sports Network). It is suspected that the rating (y) is related to the average number of yards gained per pass
attempt (x).
format long
x=[8.39,7.67,7.66,7.98,7.21,7.53,8.01,7.66,7.21,7.16,7.93,7.10,6.33,6.76,6.86,7.35,7.22,7.94,6.
y=[105.5,97.4,96.9,96.2,95,93.8,92.7,91.4,90.2,89.4,87.7,87.5,87,86.4,86.4,86,85.4,84.7,84.3,81
[b1,b0,s]=regresion_lineal(x,y,1);
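The helper regresion_lineal is not listed in this worksheet. The sketch below shows what such a function might compute. It assumes the function returns the least-squares slope, the intercept, and an estimate of the error variance, and that the third argument toggles a plot. The name regresion_lineal_sketch and the meaning given to s and show_plot are assumptions, not the course's actual code.

```matlab
function [b1, b0, s] = regresion_lineal_sketch(x, y, show_plot)
% Hypothetical stand-in for regresion_lineal (the original is not shown).
% Fits y = b0 + b1*x by ordinary least squares.
    x = x(:); y = y(:);                        % force column vectors
    n = numel(x);
    Sxx = sum((x - mean(x)).^2);               % corrected sum of squares of x
    Sxy = sum((x - mean(x)).*(y - mean(y)));   % corrected sum of cross-products
    b1 = Sxy / Sxx;                            % slope estimate
    b0 = mean(y) - b1*mean(x);                 % intercept estimate
    e  = y - (b0 + b1*x);                      % residuals
    s  = sum(e.^2) / (n - 2);                  % ASSUMED meaning of s: error variance estimate
    if nargin > 2 && show_plot                 % ASSUMED meaning of the third argument
        figure()
        plot(x, y, 'o', x, b0 + b1*x, '-')
        xlabel('x'); ylabel('y')
    end
end
```

Saved as regresion_lineal_sketch.m, it could be called as [b1,b0,s]=regresion_lineal_sketch(x,y,0) to get the coefficients without a plot.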
(a) Calculate R² for this model and provide a practical interpretation of this quantity.
yy=0;
yy1=0;
n=length(y);
for i=1:1:n
yy=((y(i)-mean(y))*(x(i)-mean(x)))+yy;
yy1=(y(i)-mean(y))*(y(i)-mean(y))+yy1;
end
R21=(b1*yy)/yy1
R21 =
0.671801770078746
The model fits the data reasonably well: the coefficient of determination is R² ≈ 0.672, so about 67.2% of
the variability in quarterback rating is explained by the linear relationship with yards gained per pass attempt.
(b) Prepare a normal probability plot of the residuals from the least squares model. Does the normality
assumption seem to be satisfied?
e=[];
for i=1:1:n
e(i)=(y(i)-(b0+(b1*x(i)))) ;
end
figure()
normplot(e)
(c) Plot the residuals versus the fitted values and against x. Interpret these graphs (The linear regression model
appears to be appropriate).
figure()
yhat=b0+b1*x;            % fitted values
plot(yhat,e,'o')
title("Residuals vs fitted values")
xlabel("Fitted value")
ylabel("e")
figure()
plot(x,e,'o')
title("Residuals vs x")
xlabel("x")
ylabel("e")
2. An article in Technometrics by S. C. Narula and J. F. Wellington [“Prediction, Linear Regression, and a
Minimum Sum of Relative Errors” (1977, Vol. 19)] presents data on the selling price and annual taxes for 24
houses. The data are in the Table E11-2. Refer to the data in table on house-selling price y and taxes paid x.
x2=[5.0500,8.2464,6.6969,7.7841,9.0384,5.9894,7.5422,8.7951,6.0831,8.3607,8.1400,9.1416];
y2=[30.0,36.9,41.9,40.5,43.9,37.5,37.9,44.5,37.9,38.9,36.9,45.8]
y2 = 1×12
30.000000000000000 36.899999999999999 41.899999999999999 40.500000000000000
[b12,b02,s2]=regresion_lineal(x2,y2,1);
(a) Find the residuals for the least squares model.
e2=[];
n2=length(y2);
for j=1:1:n2
e2(j)=(y2(j)-(b02+(b12*x2(j)))) ;   % residual = observed y2 minus fitted value
end
disp("Residuals")
Residuals
fprintf('%9.4f',e2); fprintf('\n')
  -3.3490  -4.0960   4.6110   0.6100   1.0093   1.9036  -1.4113   2.1913   2.0794  -2.3694  -3.8414   2.6624
(b) Prepare a normal probability plot of the residuals and interpret this display.
figure()
normplot(e2)
(c) Plot the residuals versus ŷ_i and versus x_i. Does the normality assumption seem to be satisfied?
figure()
yhat2=b02+b12*x2;        % fitted values
plot(yhat2,e2,'o')
xlabel("Fitted value")
ylabel("e")
figure()
plot(x2,e2,'o')
xlabel("x")
ylabel("e")
(d) What proportion of total variability is explained by the regression model?
yy2=0;
yy22=0;
n=length(y2);
for i=1:1:n
yy2=((y2(i)-mean(y2))*(x2(i)-mean(x2)))+yy2;
yy22=(y2(i)-mean(y2))*(y2(i)-mean(y2))+yy22;
end
R22=(b12*yy2)/yy22
R22 =
0.544645445033293
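As a cross-check of the value above, R² can also be obtained without an explicit loop. The sketch below uses only base MATLAB functions (polyfit, polyval, corrcoef) on the twelve (x2, y2) pairs listed in this problem. The variable names p, yhat2, SSE, SST, R2 and R2_alt are introduced here for illustration.

```matlab
% Cross-check of R^2 for Problem 2 using base MATLAB functions only.
x2 = [5.0500,8.2464,6.6969,7.7841,9.0384,5.9894,7.5422,8.7951,6.0831,8.3607,8.1400,9.1416];
y2 = [30.0,36.9,41.9,40.5,43.9,37.5,37.9,44.5,37.9,38.9,36.9,45.8];

p     = polyfit(x2, y2, 1);          % p(1) = slope, p(2) = intercept
yhat2 = polyval(p, x2);              % fitted values
SSE   = sum((y2 - yhat2).^2);        % residual sum of squares
SST   = sum((y2 - mean(y2)).^2);     % total sum of squares
R2    = 1 - SSE/SST                  % about 0.5446

C      = corrcoef(x2, y2);           % 2x2 sample correlation matrix
R2_alt = C(1,2)^2                    % squared sample correlation, same value
```

Both routes agree because, in simple linear regression, R² equals the square of the sample correlation coefficient between x and y.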
3. The number of pounds of steam used per month by a chemical plant is thought to be related to the average
ambient temperature (in °F) for that month. The past year’s usage and temperatures are in the following table:
x3=[21,24,32,47,50,59,68,74,62,50,41,30];
y3=[185.79,214.47,288.03,424.84,454.58,539.03,621.55,675.06,562.03,452.93,369.95,273.98];
[b13,b03,s3]=regresion_lineal(x3,y3,1);
(a) What proportion of total variability is accounted for by the simple linear regression model?
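yy3=0;
yy32=0;
n=length(y3);
for i3=1:1:n
yy3=((y3(i3)-mean(y3))*(x3(i3)-mean(x3)))+yy3;
yy32=(y3(i3)-mean(y3))*(y3(i3)-mean(y3))+yy32;
end
R23=round((b13*yy3)/yy32,4)
R23 =
0.999900000000000
About 99.99% of the variability in monthly steam usage is accounted for by the linear model on temperature.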
(b) Prepare a normal probability plot of the residuals and interpret this graph.
e3=[];
n3=length(y3);
for j3=1:1:n3
e3(j3)=(y3(j3)-(b03+(b13*x3(j3)))) ;   % residual = observed y3 minus fitted value
end
figure()
normplot(e3)
(c) Plot residuals versus ŷ_i and x_i.
figure()
yhat3=b03+b13*x3;        % fitted values
plot(yhat3,e3,'o')
xlabel("Fitted value")
ylabel("e")
figure()
plot(x3,e3,'o')
xlabel("x")
ylabel("e")
4. Suppose that data are obtained from 20 pairs of (x, y) and the sample correlation coefficient is 0.8.
(a) Test the hypothesis that H0 : ρ = 0 against H1 : ρ ≠ 0 with α = 0.05. Calculate the P-value.
T0=(0.8*sqrt(18))/(sqrt(1-(0.8*0.8)))
T0 =
5.656854249492381
Pvalue=2*(1-tcdf(abs(T0),18))
Pvalue =
2.292887199439875e-05
Since α = 0.05 is greater than the P-value (about 2.29e-05), H0 is rejected: the data give strong evidence of a nonzero correlation.
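For reference, the statistic computed in part (a) is the usual t statistic for testing zero correlation, which has n − 2 degrees of freedom under H0. Written out with n = 20 and r = 0.8 (the symbols below follow standard textbook notation and do not appear in the worksheet itself):

```latex
T_0 = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}}
    = \frac{0.8\sqrt{18}}{\sqrt{1-0.64}}
    = \frac{0.8\sqrt{18}}{0.6}
    \approx 5.657,
\qquad
|T_0| > t_{0.025,\,18} \approx 2.101 .
```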
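(b) Test the hypothesis that H0 : ρ = 0.5 against H1 : ρ ≠ 0.5 with α = 0.05. Calculate the P-value.
Z02=(atanh(0.8)-atanh(0.5))*sqrt(17);   % Fisher z statistic for H0: rho = 0.5, with n-3 = 17
Pvalue2=2*(1-normcdf(abs(Z02)));
fprintf('Z02 = %.4f, Pvalue2 = %.4f\n',Z02,Pvalue2)
Z02 = 2.2648, Pvalue2 = 0.0235
Since the P-value is below α = 0.05, H0 is rejected.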
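When the hypothesized value ρ0 is not zero, the t statistic from part (a) no longer applies. The usual large-sample alternative is based on Fisher's transformation, which is approximately standard normal under H0. A worked version with r = 0.8, ρ0 = 0.5 and n = 20, in standard textbook notation rather than the worksheet's own:

```latex
Z_0 = \left(\operatorname{arctanh} r - \operatorname{arctanh} \rho_0\right)\sqrt{n-3}
    = (1.0986 - 0.5493)\sqrt{17}
    \approx 2.265,
\qquad
|Z_0| > z_{0.025} \approx 1.960 .
```

The statistic exceeds the critical value, so at α = 0.05 the data are not consistent with ρ = 0.5.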
(c) Construct a 95% two-sided confidence interval for the correlation coefficient.
z=norminv(0.975)
z =
1.959963984540054
r_lower=tanh(atanh(0.8)-(z/sqrt(17)))
r_lower =
0.553387644453858
r_upper=tanh(atanh(0.8)+(z/sqrt(17)))
r_upper =
0.917655484096945
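The two limits computed in part (c) come from the approximate interval based on Fisher's transformation. Written out with r = 0.8, n = 20 and a positive critical value z_{α/2} = z_{0.025} ≈ 1.96 (standard textbook notation, not taken from the worksheet):

```latex
\tanh\!\left(\operatorname{arctanh} r - \frac{z_{\alpha/2}}{\sqrt{n-3}}\right)
\;\le\; \rho \;\le\;
\tanh\!\left(\operatorname{arctanh} r + \frac{z_{\alpha/2}}{\sqrt{n-3}}\right)
\quad\Longrightarrow\quad
0.5534 \le \rho \le 0.9177 .
```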
