
Stats101B HW6

YIK LUN, KEI


2017/2/21

6.17

(a)

data<-as.data.frame(c(A=76.95,AB=-51.32,ABC=-2.82,CD=1.27,BD=14.74,
ABCD=-6.25,D=-18.73,BC=20.78,BCD=-7.98,C=-7.84,
AD=9.78,ACD=10.20,B=-67.52,AC=11.69,ABD=-6.50))
x<-data[order(data[,1]),1]              # effect estimates, sorted ascending
lab <- rownames(data)[order(data[,1])]  # effect labels in the same order
y <- (1:15-0.5)/15                      # plotting positions (i-0.5)/15
plot(x,y);text(x+3,y,labels = lab,cex= 0.7)
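[Figure: normal probability plot of the 15 effect estimates (x axis: effect estimate x, roughly -50 to 50; y axis: cumulative probability y, 0 to 1). Points are labeled, from smallest to largest effect: B, AB, D, BCD, C, ABD, ABCD, ABC, CD, AD, ACD, AC, BD, BC, A.]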

(b)

Based on the plot of the effects in part (a), we should include A, B, and AB in our model:

$$y = \text{intercept} + \frac{76.95}{2}\,x_A - \frac{67.52}{2}\,x_B - \frac{51.32}{2}\,x_{AB}$$
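The division by 2 in the model follows from the -1/+1 coding: an effect is the change in response from the low to the high level, which spans two coded units, so each regression coefficient is half the corresponding effect estimate. B and AB enter with negative coefficients because their effect estimates in part (a) are negative. A minimal sketch in R; the name `effects_kept` is introduced here for illustration, and the intercept is left out because the grand mean is not given in this section.

```r
# Effect estimates retained in the model (values from part (a)).
effects_kept <- c(A = 76.95, B = -67.52, AB = -51.32)

# With -1/+1 coding, each regression coefficient is half the effect.
coefs <- effects_kept / 2
print(coefs)
```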

6.24

(a)

The half-normal probability plot suggests that the interactions AB, AC, and BC and the main effects A and C are significant, while factor B is not. The ANOVA table, however, shows that the three two-factor interactions and factor C are significant at the 0.05 level, whereas the p-value for factor A (0.078) is slightly above 0.05.
temp1 <- expand.grid(A=c(-1,1),B=c(-1,1),C=c(-1,1))
temp2 <- expand.grid(A=c(-1,1),B=c(-1,1),C=c(-1,1))
data<-rbind(temp1,temp2)
data$y <- c(50,44,46,42,49,48,47,56,54,42,48,43,46,45,48,54)
reg<-lm(y~.^3,data=data)
source("http://www.stat.ucla.edu/~hqxu/stat201A/R/halfnormal.R")
halfnormalplot(2*coef(reg)[-1],l=T,n=5)

[Figure: half-normal plot of absolute effects against half-normal quantiles. Labeled points: A:C, B:C, A:B, C, and A.]
anova(reg)

## Analysis of Variance Table
##
## Response: y
##           Df Sum Sq Mean Sq F value    Pr(>F)
## A          1  12.25   12.25  4.0833 0.0779708 .
## B          1   2.25    2.25  0.7500 0.4116944
## C          1  36.00   36.00 12.0000 0.0085163 **
## A:B        1  42.25   42.25 14.0833 0.0056019 **
## A:C        1 100.00  100.00 33.3333 0.0004176 ***
## B:C        1  49.00   49.00 16.3333 0.0037282 **
## A:B:C      1   4.00    4.00  1.3333 0.2815369
## Residuals  8  24.00    3.00
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
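The sums of squares in this table can be cross-checked by hand. In a replicated two-level design with N runs, an effect is the difference between the mean response at the high and low levels of its contrast column, and its sum of squares is N times the squared effect divided by 4. The sketch below does this for A:C; the variable names (`d`, `ac`, `effect_ac`, `ss_ac`) are illustrative and not part of the original solution.

```r
# Two replicates of the 2^3 design, in the same run order as the data above.
d <- rbind(expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1)),
           expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1)))
d$y <- c(50, 44, 46, 42, 49, 48, 47, 56, 54, 42, 48, 43, 46, 45, 48, 54)

# Contrast column for the A:C interaction.
ac <- d$A * d$C

# Effect = mean response at the high level minus mean at the low level.
effect_ac <- mean(d$y[ac == 1]) - mean(d$y[ac == -1])

# Sum of squares for a single-degree-of-freedom contrast.
ss_ac <- nrow(d) * effect_ac^2 / 4

print(c(effect = effect_ac, SS = ss_ac))
```

The resulting sum of squares should agree with the A:C row of the table.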

(b)

After dropping the insignificant terms (B and A:B:C), the residual plots below show no evidence of model inadequacy.
colnames(data)

## [1] "A" "B" "C" "y"


reg<-lm(y~A+C+A:B+A:C+B:C,data=data)
par(mfrow=c(2,2))
plot(reg)

## hat values (leverages) are all = 0.375
## and there are no factor predictors; no plot no. 5

[Figure: diagnostic plots for the reduced model: Residuals vs Fitted, Normal Q-Q, and Scale-Location. Labeled observations: 1, 8, and 14.]

(c)

Based on the interaction plots below, we recommend 1st class mail, color brochures, and an offered price of $24.95 to achieve the greatest number of orders.
interaction.plot(data$A,data$B,data$y,fixed=T,main="Interaction AB")

[Figure: "Interaction AB": mean of data$y plotted against data$A, with one line for each level of data$B.]
interaction.plot(data$B,data$C,data$y,fixed=T,main="Interaction BC")

[Figure: "Interaction BC": mean of data$y plotted against data$B, with one line for each level of data$C.]
interaction.plot(data$A,data$C,data$y,fixed=T,main="Interaction AC")

[Figure: "Interaction AC": mean of data$y plotted against data$A, with one line for each level of data$C.]
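As a numeric cross-check on the recommendation in part (c), the treatment combination with the highest average response can be read off the cell means. This is a sketch rather than part of the original solution. It assumes, as the recommendation implies, that the +1 levels of A, B, and C correspond to 1st class mail, color brochures, and the $24.95 price. The names `d`, `cell_means`, and `best` are illustrative.

```r
# Two replicates of the 2^3 design, in the same run order as the data above.
d <- rbind(expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1)),
           expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1)))
d$y <- c(50, 44, 46, 42, 49, 48, 47, 56, 54, 42, 48, 43, 46, 45, 48, 54)

# Average the two replicates at each of the 8 treatment combinations.
cell_means <- aggregate(y ~ A + B + C, data = d, FUN = mean)

# Row with the largest mean response.
best <- cell_means[which.max(cell_means$y), ]
print(best)
```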

6.30

(a)

In this analysis, A (the pan material) is significant (p = 0.001) and B (the stirring method) is marginally significant (p = 0.089). None of the other effects is close to significant.
A=rep(c(-1,1), 32)
B=rep(rep(c(-1,1), c(2,2)), 16)
C=rep(rep(c(-1,1), c(4,4)), 8)
y<- c(11,15,9,16,10,12,10,15, 9,10,12,17,11,13,12,12,
10,16,11,15,15,14,13,15, 10,14,11,12,8,13,10,13,
11,12,11,13,6,9,7,12, 10,9,11,13,8,13,7,12,
8,6,11,11,9,14,17,9, 9,15,12,11,14,9,13,14)
reg<- lm( y ~ (A+B+C)^3 )
anova(reg)

## Analysis of Variance Table
##
## Response: y
##           Df Sum Sq Mean Sq F value   Pr(>F)
## A          1  72.25  72.250 11.9527 0.001049 **
## B          1  18.06  18.063  2.9882 0.089385 .
## C          1   0.06   0.062  0.0103 0.919370
## A:B        1   0.06   0.063  0.0103 0.919370
## A:C        1   1.56   1.562  0.2585 0.613154
## B:C        1   1.00   1.000  0.1654 0.685751
## A:B:C      1   0.25   0.250  0.0414 0.839584
## Residuals 56 338.50   6.045
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(b)

Part (a) treats the 64 observations as 8 replicates of each run, which gives 56 error degrees of freedom. That analysis is not a good idea because the resulting estimate of error may not reflect the batch-to-batch variation.

(c)

Factors A and B affect the mean rank of the brownies, whereas the AC interaction affects the standard deviation of the ranks. For the analysis of both the average and the standard deviation of the ranks, the mean square error is now obtained by pooling the apparently unimportant effects. This gives a more accurate estimate of error than the one obtained by assuming that all observations were replicates.
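A minimal sketch of the pooling idea for the mean response: average the eight scores for each run, keep only A and B in the model, and let the remaining five degrees of freedom estimate the error. Keeping only A and B follows the conclusion stated above; the names `scores`, `avg`, and `fit_mean` are illustrative and not part of the original solution.

```r
# The 64 scores from part (a), in the same order.
y <- c(11,15,9,16,10,12,10,15, 9,10,12,17,11,13,12,12,
       10,16,11,15,15,14,13,15, 10,14,11,12,8,13,10,13,
       11,12,11,13,6,9,7,12, 10,9,11,13,8,13,7,12,
       8,6,11,11,9,14,17,9, 9,15,12,11,14,9,13,14)

# Rows are the 8 runs of the 2^3 design, columns are the 8 scores per run.
scores <- matrix(y, nrow = 8, ncol = 8)
avg <- rowMeans(scores)

# Coded factor levels for the 8 runs in standard order.
A <- rep(c(-1, 1), 4)
B <- rep(rep(c(-1, 1), c(2, 2)), 2)

# C and all interactions are pooled into the residual (8 - 3 = 5 df).
fit_mean <- lm(avg ~ A + B)
anova(fit_mean)
```

The same pattern applies to the standard deviations: replace `avg` with the row-wise `sd` and keep only the terms judged important for dispersion.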
data <- matrix(y,nrow=8,ncol=8,byrow=F)
A=rep(c(-1,1), 4)
B=rep(rep(c(-1,1), c(2,2)), 2)
C=rep(rep(c(-1,1), c(4,4)))
sdy<-apply(data,1, sd)
avgy<-rowMeans(data)
reg<- lm(avgy ~ (A+B+C)^3 )
normalplot(2*coef(reg)[-1],l=T,n=5)

[Figure: normal plot of the effects on the run averages against normal quantiles. Labeled points: A, B, A:B:C, A:C, and B:C.]
reg<- lm(sdy ~ (A+B+C)^3 )
normalplot(2*coef(reg)[-1],l=T,n=5)

[Figure: normal plot of the effects on the run standard deviations against normal quantiles. Labeled points: C, A, B:C, A:B, and A:C.]

6.35

(a)

In the half-normal plot, factors A, B, and C and the AB interaction lie far from the other points, so we conclude that these terms are significant. The model should be

$$y = \beta_0 + \beta_1 x_A + \beta_2 x_B + \beta_3 x_{AB} + \beta_4 x_C$$
y <- c(0.00340,0.00362,0.00301,0.00182,0.00280,0.00290,0.00252,
0.00160,0.00336,0.00344,0.00308,0.00184,0.00269,0.00284,0.00253,0.00163)
A=rep(c(-1,1), 8)
B=rep(rep(c(-1,1), c(2,2)), 4)
C=rep(rep(c(-1,1), c(4,4)), 2)
D=rep(c(-1,1), c(8,8))
reg<- lm(y~(A+B+C+D)^4)
halfnormalplot(2*coef(reg)[-1], l=T, n=5)

[Figure: half-normal plot of absolute effects (scale 0 to about 8e-04) against half-normal quantiles. Labeled points: B, A:B, C, A, and B:C.]

(b)

The plot of residuals versus fitted values shows a slight downward and then upward pattern in the residuals. In the normal Q-Q plot, however, most points lie along the line, so the normality assumption appears to be satisfied.
reg<- lm(y~A+B+C+A:B)
anova(reg)

## Analysis of Variance Table
##
## Response: y
##           Df     Sum Sq    Mean Sq F value    Pr(>F)
## A          1 8.5562e-07 8.5562e-07  61.425 7.936e-06 ***
## B          1 3.0800e-06 3.0800e-06 221.114 1.249e-08 ***
## C          1 1.0302e-06 1.0302e-06  73.960 3.263e-06 ***
## A:B        1 1.4400e-06 1.4400e-06 103.377 6.261e-07 ***
## Residuals 11 1.5322e-07 1.3930e-08
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
par(mfrow=c(2,2))
plot(reg)

## hat values (leverages) are all = 0.3125
## and there are no factor predictors; no plot no. 5

[Figure: diagnostic plots for the model y ~ A + B + C + A:B: Residuals vs Fitted, Normal Q-Q, and Scale-Location. Labeled observations: 2, 4, and 16.]

(c)

After the reciprocal transformation (newy = 1/y), the residual plots are more consistent with the constant variance assumption. In the normal Q-Q plot, most points lie along the line, so the normality assumption is also satisfied.
newy <- 1/y
reg<- lm(newy~(A+B+C+D)^4)
normalplot(2*coef(reg)[-1], l=T, n=5)

[Figure: normal plot of the effects on 1/y against normal quantiles. Labeled points: B, A:B, A, C, and B:D.]
reg<- lm(newy~A+B+C+A:B)
anova(reg)

## Analysis of Variance Table
##
## Response: newy
##           Df Sum Sq Mean Sq F value    Pr(>F)
## A          1  42611   42611 1205.11 1.359e-12 ***
## B          1  89386   89386 2527.99 2.367e-14 ***
## C          1  18762   18762  530.63 1.168e-10 ***
## A:B        1  55130   55130 1559.16 3.332e-13 ***
## Residuals 11    389      35
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
par(mfrow=c(2,2))
plot(reg)

## hat values (leverages) are all = 0.3125
## and there are no factor predictors; no plot no. 5

[Figure: diagnostic plots for the model newy ~ A + B + C + A:B: Residuals vs Fitted, Normal Q-Q, and Scale-Location. Labeled observations: 8, 10, and 13.]

(d)
$$\frac{1}{\text{Surface Roughness}} = 397.807 + 51.606\,x_A + 74.744\,x_B + 34.244\,x_C + 58.699\,x_{AB}$$
summary(reg)

##
## Call:
## lm(formula = newy ~ A + B + C + A:B)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -7.2577 -3.9861 -0.4486  2.4799  8.9713
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  397.807      1.487  267.60  < 2e-16 ***
## A             51.606      1.487   34.72 1.36e-12 ***
## B             74.744      1.487   50.28 2.37e-14 ***
## C             34.244      1.487   23.04 1.17e-10 ***
## A:B           58.699      1.487   39.49 3.33e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.946 on 11 degrees of freedom
## Multiple R-squared:  0.9981, Adjusted R-squared:  0.9974
## F-statistic:  1456 on 4 and 11 DF,  p-value: 6.716e-15
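Because the fitted model is for the reciprocal of surface roughness, a prediction on the original scale is obtained by inverting the fitted value. The sketch below hard-codes the coefficients from the summary output; the vector `b` and the function name `predict_roughness` are hypothetical and not part of the original solution.

```r
# Coefficients of the model for 1 / (surface roughness), from summary(reg).
b <- c(intercept = 397.807, A = 51.606, B = 74.744, C = 34.244, AB = 58.699)

# Predicted surface roughness at coded levels xA, xB, xC (each -1 or +1).
predict_roughness <- function(xA, xB, xC) {
  inv <- b[["intercept"]] + b[["A"]] * xA + b[["B"]] * xB +
    b[["C"]] * xC + b[["AB"]] * xA * xB
  1 / inv
}

# All three factors at their low level; the two observed values at this
# setting (D low and D high) were 0.00340 and 0.00336.
predict_roughness(-1, -1, -1)
```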

6.45

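$$\mathrm{se}(\text{Effect}) = \frac{2S}{\sqrt{n\,2^k}}$$

$$\text{CI} = \text{Effect} \pm t_{\alpha/2,\,N-p}\,\mathrm{se}(\text{Effect})$$

$$\text{length of CI} = 2\,t_{\alpha/2,\,N-p}\,\frac{2S}{\sqrt{n\,2^k}}$$

Therefore

$$2\,t_{0.975,\,8n-8}\,\frac{2\sqrt{4}}{\sqrt{8n}} \le 1.5$$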
If we want the length of a 95 percent confidence interval on any effect to be less than or equal to 1.5, we
would need at least 14 replicates.
2*qt(0.975,16*8-8)*(4/sqrt(16*8))

## [1] 1.400022
2*qt(0.975,15*8-8)*(4/sqrt(15*8))

## [1] 1.446989
2*qt(0.975,14*8-8)*(4/sqrt(14*8))

## [1] 1.499035
2*qt(0.975,13*8-8)*(4/sqrt(13*8))

## [1] 1.55715
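The trial-and-error calls above can be wrapped in a small function that searches for the smallest number of replicates directly. This is a sketch under the same assumptions as those calls: a 2^3 design (k = 3) and S = 2, the value implied by the factor 4 = 2S. The names `ci_length` and `n_min` are illustrative.

```r
# Length of the confidence interval on an effect for n replicates of a 2^k
# design with error standard deviation S.
ci_length <- function(n, S = 2, k = 3, conf = 0.95) {
  N <- n * 2^k                       # total number of runs
  df <- N - 2^k                      # error degrees of freedom
  2 * qt(1 - (1 - conf) / 2, df) * 2 * S / sqrt(N)
}

# Smallest n in the search range whose interval length is at most 1.5.
n <- 2:30
n_min <- min(n[ci_length(n) <= 1.5])
n_min
```

Since the interval length decreases as n grows, the first value that satisfies the bound is the required number of replicates.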

7.14
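We should conduct a blocking analysis, with each replicate treated as a block. The half-normal probability plot shows that the interactions AB, AC, and BC and the main effects A and C are all significant, while factor B is not. The block effect itself is negligible (p = 0.79 in the ANOVA table).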
temp1 <- expand.grid(A=c(-1,1),B=c(-1,1),C=c(-1,1))
temp2 <- expand.grid(A=c(-1,1),B=c(-1,1),C=c(-1,1))
block <- rep(c(-1,1),c(8,8))
data<-rbind(temp1,temp2)
data<-cbind(data,block)
data$y <- c(50,44,46,42,49,48,47,56,54,42,48,43,46,45,48,54)
reg<-lm(y~(A+B+C)^3+block,data=data)
halfnormalplot(2*coef(reg)[-1],l=T,n=5)

[Figure: half-normal plot of absolute effects against half-normal quantiles for the blocked model. Labeled points: A:C, B:C, A:B, C, and A.]
anova(reg)

## Analysis of Variance Table
##
## Response: y
##           Df Sum Sq Mean Sq F value    Pr(>F)
## A          1  12.25  12.250  3.6105 0.0991858 .
## B          1   2.25   2.250  0.6632 0.4422655
## C          1  36.00  36.000 10.6105 0.0139146 *
## block      1   0.25   0.250  0.0737 0.7938773
## A:B        1  42.25  42.250 12.4526 0.0096129 **
## A:C        1 100.00 100.000 29.4737 0.0009777 ***
## B:C        1  49.00  49.000 14.4421 0.0067124 **
## A:B:C      1   4.00   4.000  1.1789 0.3135412
## Residuals  7  23.75   3.393
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
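The block sum of squares in this table can be checked from the two block totals, since each replicate is run as one block. A sketch with illustrative names (`block_totals`, `ss_block`) that are not part of the original solution:

```r
# Responses and block labels, in the same order as the data above.
y <- c(50, 44, 46, 42, 49, 48, 47, 56, 54, 42, 48, 43, 46, 45, 48, 54)
block <- rep(c(-1, 1), c(8, 8))

# Total response in each block (replicate).
block_totals <- tapply(y, block, sum)

# Single-degree-of-freedom contrast between the two blocks.
ss_block <- (block_totals[["1"]] - block_totals[["-1"]])^2 / length(y)
print(ss_block)
```

Because the two block totals differ by only 2, the block sum of squares is small relative to the residual mean square, which is why the block term is far from significant.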
reg <- lm(y~A+C+A:B+A:C+B:C,data=data)
anova(reg)

## Analysis of Variance Table
##
## Response: y
##           Df Sum Sq Mean Sq F value    Pr(>F)
## A          1  12.25  12.250  4.0496 0.0718913 .
## C          1  36.00  36.000 11.9008 0.0062287 **
## A:B        1  42.25  42.250 13.9669 0.0038641 **
## A:C        1 100.00 100.000 33.0579 0.0001853 ***
## C:B        1  49.00  49.000 16.1983 0.0024200 **
## Residuals 10  30.25   3.025
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
par(mfrow=c(2,2))
plot(reg)

## hat values (leverages) are all = 0.375
## and there are no factor predictors; no plot no. 5

[Figure: diagnostic plots for the reduced model: Residuals vs Fitted, Normal Q-Q, and Scale-Location. Labeled observations: 1, 8, and 14.]

