
LECTURE 3

ANOVA(1) FOR A NORMAL, SINGULAR MODEL


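Consider the experiment with observations
\[ X_{ij} = \mu + \alpha_i + \varepsilon_{ij}, \quad i = 1, \dots, r, \quad j = 1, \dots, n_i, \]
where $\varepsilon_{i1}, \dots, \varepsilon_{in_i}$, $i = 1, \dots, r$, are independent, identically distributed random variables, $N(0, \sigma^2)$.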
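Matrix form:
\[ \mathbf{X} = \mathbf{Y}\boldsymbol{\theta} + \boldsymbol{\varepsilon} \]
\[ \mathbf{X} = (X_{11}, \dots, X_{1n_1}, \dots, X_{r1}, \dots, X_{rn_r})', \qquad \boldsymbol{\varepsilon} = (\varepsilon_{11}, \dots, \varepsilon_{1n_1}, \dots, \varepsilon_{r1}, \dots, \varepsilon_{rn_r})' \]
\[ \boldsymbol{\theta} = (\mu, \alpha_1, \dots, \alpha_r)' \in \mathbb{R}^{r+1} \]
\[
\mathbf{Y} =
\begin{pmatrix}
1 & 1 & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots \\
1 & 1 & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots \\
1 & 0 & 0 & \cdots & 1 \\
\vdots & \vdots & \vdots & & \vdots \\
1 & 0 & 0 & \cdots & 1
\end{pmatrix}_{(n_1 + n_2 + \dots + n_r;\ r+1)}
\]
\[ \operatorname{rank} \mathbf{Y} = r, \qquad n = \sum_{i=1}^{r} n_i \]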
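The rank deficiency of this design matrix can be checked numerically. The following is a minimal sketch in R, assuming hypothetical group sizes `n_i` (they are not taken from the lecture); the matrix has an intercept column followed by one indicator column per level.

```r
# Hypothetical group sizes n_1, n_2, n_3 (so r = 3); illustrative values only
n_i <- c(4, 2, 3)
r <- length(n_i)
group <- factor(rep(seq_len(r), times = n_i))

# Columns: intercept (mu) followed by one indicator per level (alpha_1..alpha_r)
Y <- cbind(1, model.matrix(~ group - 1))

dim(Y)        # sum(n_i) rows, r + 1 columns
qr(Y)$rank    # numerical rank; it is r, one less than the number of columns
```

Because the intercept column is the sum of the indicator columns, the matrix cannot have full column rank, which is why a restriction on the parameters is needed.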
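To obtain a nonsingular model (and hence a unique least-squares estimator), we introduce a linear restriction on the components of the parameter $\boldsymbol{\theta}$. Write
\[ \boldsymbol{\theta} = \boldsymbol{\theta}_1 + \boldsymbol{\theta}_2, \qquad \boldsymbol{\theta}_1 = (\mu, 0, \dots, 0)', \qquad \boldsymbol{\theta}_2 = (0, \alpha_1, \dots, \alpha_r)' \]
and consider the subspaces
\[ V = \{ \mathbf{Y}\boldsymbol{\theta} \mid \boldsymbol{\theta} \in \mathbb{R}^{r+1} \} \]
\[ V_1 = \{ \mathbf{Y}\boldsymbol{\theta} \mid \boldsymbol{\theta} = \boldsymbol{\theta}_1,\ \mu \in \mathbb{R} \} \]
\[ V_2 = \{ \mathbf{Y}\boldsymbol{\theta} \mid \boldsymbol{\theta} = \boldsymbol{\theta}_2,\ (\alpha_1, \dots, \alpha_r) \in \mathbb{R}^{r} \} \]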
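The new restriction is introduced so that
\[ V_1 \perp V_2 \iff \mu \sum_{i=1}^{r} n_i \alpha_i = 0. \]
Assuming $\mu \neq 0$, we obtain the linear restriction
\[ \sum_{i=1}^{r} n_i \alpha_i = 0 \iff \alpha_r = -\sum_{i=1}^{r-1} \frac{n_i}{n_r}\, \alpha_i. \]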
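The matrix form of the experiment with this new restriction added is
\[ \mathbf{X} = \mathbf{Y}\boldsymbol{\theta} + \boldsymbol{\varepsilon} \]
\[ \mathbf{X} = (X_{11}, \dots, X_{1n_1}, \dots, X_{r1}, \dots, X_{rn_r})', \qquad \boldsymbol{\varepsilon} = (\varepsilon_{11}, \dots, \varepsilon_{1n_1}, \dots, \varepsilon_{r1}, \dots, \varepsilon_{rn_r})' \]
\[ \boldsymbol{\theta} = (\mu, \alpha_1, \dots, \alpha_{r-1})' \in \mathbb{R}^{r} \]
\[
\mathbf{Y} =
\begin{pmatrix}
1 & 1 & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots \\
1 & 1 & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots \\
1 & -\frac{n_1}{n_r} & -\frac{n_2}{n_r} & \cdots & -\frac{n_{r-1}}{n_r} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & -\frac{n_1}{n_r} & -\frac{n_2}{n_r} & \cdots & -\frac{n_{r-1}}{n_r}
\end{pmatrix}_{(n_1 + n_2 + \dots + n_r;\ r)}
\]
\[ \operatorname{rank} \mathbf{Y} = r \]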
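The least-squares estimator (LSE) is obtained by direct computation:
\[ \hat{\boldsymbol{\theta}}_{LS}(\mathbf{X}) = (\mathbf{Y}'\mathbf{Y})^{-1}\mathbf{Y}'\mathbf{X} \]
\[ \hat{\boldsymbol{\theta}}_{LS}(\mathbf{X}) = (\hat{\mu}, \hat{\alpha}_1, \dots, \hat{\alpha}_{r-1})' \]
\[ \hat{\mu} = \overline{X} = \frac{1}{n} \sum_{i=1}^{r} \sum_{j=1}^{n_i} X_{ij} \]
\[ \hat{\alpha}_i = \overline{X}_{i\cdot} - \overline{X}, \quad i = 1, \dots, r-1 \]
\[ \hat{\alpha}_r = -\sum_{i=1}^{r-1} \frac{n_i}{n_r}\, \hat{\alpha}_i = \overline{X}_{r\cdot} - \overline{X} \]
where
\[ \overline{X}_{i\cdot} = \frac{1}{n_i} \sum_{j=1}^{n_i} X_{ij}, \quad i = 1, \dots, r. \]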
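As a numerical check of these closed-form estimates, here is a minimal sketch in R. The group sizes, group means and seed are hypothetical, chosen only for illustration; the restricted design matrix is built by hand and the normal equations are solved directly.

```r
set.seed(1)                       # arbitrary seed, for reproducibility only
n_i <- c(4, 2, 3)                 # hypothetical unbalanced group sizes
r <- length(n_i)
group <- rep(seq_len(r), times = n_i)
x <- rnorm(sum(n_i), mean = c(5, 8, 2)[group], sd = 1)

# Restricted design: columns (mu, alpha_1, ..., alpha_{r-1});
# rows of the last group carry -n_i / n_r in the alpha columns
Y <- matrix(0, nrow = sum(n_i), ncol = r)
Y[, 1] <- 1
for (i in seq_len(r - 1)) {
  Y[group == i, i + 1] <- 1
  Y[group == r, i + 1] <- -n_i[i] / n_i[r]
}

theta_hat <- solve(t(Y) %*% Y, t(Y) %*% x)   # solves (Y'Y) theta = Y'x

group_means <- tapply(x, group, mean)
cbind(LSE = drop(theta_hat),
      closed_form = c(mean(x), group_means[-r] - mean(x)))
```

The two columns should agree up to rounding: the first entry is the grand mean and the remaining ones are group means minus the grand mean.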
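The sums of squares are:
\[ SS_t = \sum_{i=1}^{r} \sum_{j=1}^{n_i} \left( X_{ij} - \overline{X} \right)^2 \]
\[ SS_A = \sum_{i=1}^{r} \sum_{j=1}^{n_i} \left( \overline{X}_{i\cdot} - \overline{X} \right)^2 = \sum_{i=1}^{r} n_i \left( \overline{X}_{i\cdot} - \overline{X} \right)^2 \]
\[ SS_{resid} = \sum_{i=1}^{r} \sum_{j=1}^{n_i} \left( X_{ij} - \overline{X}_{i\cdot} \right)^2 \]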
The ANOVA(1) equation, the properties of the random variables $SS_t$, $SS_A$, $SS_{resid}$ and the ANOVA(1) table are the same as in the case of the nonsingular model.
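The critical region for the hypothesis
\[ H : \alpha_i = 0, \quad \forall i = 1, \dots, r \]
against the alternative
\[ H' : \exists\, i : \alpha_i \neq 0 \]
at significance level $\alpha$ (not to be confused with the effects $\alpha_i$) is
\[ R_n = \left\{ (x_{11}, \dots, x_{rn_r}) \;\middle|\; F \geq f_{(r-1;\, n-r):\, 1-\alpha} \right\} \]
where $f_{(r-1;\, n-r):\, 1-\alpha}$ is the quantile of rank $(1 - \alpha)$ of the distribution $F(r-1, n-r)$.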
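The decision rule can be sketched in R as follows. This is an illustration under stated assumptions: the data are simulated with hypothetical group sizes and means, and the statistic is taken to be the usual one-way ratio $F = \frac{SS_A/(r-1)}{SS_{resid}/(n-r)}$, which this section refers to but does not restate.

```r
set.seed(2)                        # arbitrary seed
n_i <- c(6, 5, 7)                  # hypothetical group sizes
r <- length(n_i); n <- sum(n_i)
group <- factor(rep(seq_len(r), times = n_i))
x <- rnorm(n, mean = c(5, 8, 2)[group], sd = 4)

xbar   <- mean(x)
xbar_i <- tapply(x, group, mean)

SS_A     <- sum(n_i * (xbar_i - xbar)^2)
SS_resid <- sum((x - ave(x, group))^2)   # ave() repeats each group mean per observation
SS_t     <- sum((x - xbar)^2)

F_stat <- (SS_A / (r - 1)) / (SS_resid / (n - r))
level  <- 0.05                                     # significance level
f_crit <- qf(1 - level, df1 = r - 1, df2 = n - r)  # quantile of rank 1 - level

c(F = F_stat, critical_value = f_crit)
F_stat >= f_crit        # TRUE means the sample falls in the critical region
anova(lm(x ~ group))    # the reported F value should match F_stat
```

Comparing `F_stat` with the quantile from `qf` is equivalent to comparing the p-value reported by `anova` with the chosen level.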
ANOVA(2) FOR A NORMAL MODEL, WITH INTERACTIONS
Consider the experiment with the same number ($n$) of observations in each cell:
\[ X_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \varepsilon_{ijk}, \quad i = 1, \dots, r, \quad j = 1, \dots, s, \quad k = 1, \dots, n, \]
where $\varepsilon_{ij1}, \dots, \varepsilon_{ijn}$, $i = 1, \dots, r$, $j = 1, \dots, s$, are independent, identically distributed random variables, $N(0, \sigma^2)$.
An experiment with the same number of observations in all cells is called "balanced".
The total number of observations is $nrs$.
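Matrix form:
\[ \mathbf{X} = \mathbf{Y}\boldsymbol{\theta} + \boldsymbol{\varepsilon} \]
\[ \mathbf{X} = (X_{111}, \dots, X_{11n}, \dots, X_{rs1}, \dots, X_{rsn})', \qquad \boldsymbol{\varepsilon} = (\varepsilon_{111}, \dots, \varepsilon_{11n}, \dots, \varepsilon_{rs1}, \dots, \varepsilon_{rsn})' \]
\[ \boldsymbol{\theta} = (\mu, \alpha_1, \dots, \alpha_r, \beta_1, \dots, \beta_s, \gamma_{11}, \dots, \gamma_{rs})' \in \mathbb{R}^{rs+r+s+1} \]
\[
\mathbf{Y} =
\left(
\begin{array}{c|cccc|cccc|cccc}
1 & 1 & 0 & \cdots & 0 & 1 & 0 & \cdots & 0 & 1 & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
1 & 1 & 0 & \cdots & 0 & 1 & 0 & \cdots & 0 & 1 & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
1 & 0 & 0 & \cdots & 1 & 0 & 0 & \cdots & 1 & 0 & 0 & \cdots & 1 \\
\vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
1 & 0 & 0 & \cdots & 1 & 0 & 0 & \cdots & 1 & 0 & 0 & \cdots & 1
\end{array}
\right)_{(nrs;\ rs+r+s+1)}
\]
(the column blocks correspond to $\mu$, the $\alpha_i$, the $\beta_j$ and the $\gamma_{ij}$)
\[ \operatorname{rank} \mathbf{Y} = rs \]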
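The rank of this over-parameterized design matrix can be checked with a small sketch in R. The sizes `r`, `s`, `n` are hypothetical, and the indicator blocks are built with `model.matrix`, which is one convenient way (not necessarily the lecture's) to obtain them.

```r
r <- 3; s <- 2; n <- 2                       # hypothetical balanced layout
# k varies fastest, then j, then i, matching the ordering X_111, ..., X_rsn
d <- expand.grid(k = seq_len(n), j = factor(seq_len(s)), i = factor(seq_len(r)))

Y <- cbind(1,
           model.matrix(~ i - 1, d),         # indicators for alpha_1..alpha_r
           model.matrix(~ j - 1, d),         # indicators for beta_1..beta_s
           model.matrix(~ i:j - 1, d))       # indicators for the gamma_ij cells

dim(Y)        # n*r*s rows, rs + r + s + 1 columns
qr(Y)$rank    # numerical rank; it equals r*s
```

The difference between the number of columns and the rank is $r + s + 1$, which matches the number of independent restrictions introduced next.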
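The model is singular. To obtain a nonsingular model (and hence a unique least-squares estimator), we introduce $(r + s + 1)$ linear restrictions on the components of the parameter $\boldsymbol{\theta}$. Write
\[ \boldsymbol{\theta} = \boldsymbol{\theta}_1 + \boldsymbol{\theta}_2 + \boldsymbol{\theta}_3 + \boldsymbol{\theta}_4 \]
\[ \boldsymbol{\theta}_1 = (\mu, 0, \dots, 0)' \]
\[ \boldsymbol{\theta}_2 = (0, \alpha_1, \dots, \alpha_r, 0, \dots, 0)' \]
\[ \boldsymbol{\theta}_3 = (0, \dots, 0, \beta_1, \dots, \beta_s, 0, \dots, 0)' \]
\[ \boldsymbol{\theta}_4 = (0, \dots, 0, \gamma_{11}, \dots, \gamma_{rs})' \]
and consider the subspaces
\[ V = \{ \mathbf{Y}\boldsymbol{\theta} \mid \boldsymbol{\theta} \in \mathbb{R}^{rs+r+s+1} \} \]
\[ V_1 = \{ \mathbf{Y}\boldsymbol{\theta} \mid \boldsymbol{\theta} = \boldsymbol{\theta}_1,\ \mu \in \mathbb{R} \} \]
\[ V_2 = \{ \mathbf{Y}\boldsymbol{\theta} \mid \boldsymbol{\theta} = \boldsymbol{\theta}_2,\ (\alpha_1, \dots, \alpha_r) \in \mathbb{R}^{r} \} \]
\[ V_3 = \{ \mathbf{Y}\boldsymbol{\theta} \mid \boldsymbol{\theta} = \boldsymbol{\theta}_3,\ (\beta_1, \dots, \beta_s) \in \mathbb{R}^{s} \} \]
\[ V_4 = \{ \mathbf{Y}\boldsymbol{\theta} \mid \boldsymbol{\theta} = \boldsymbol{\theta}_4,\ (\gamma_{11}, \dots, \gamma_{rs}) \in \mathbb{R}^{rs} \} \]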
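The new restrictions are introduced so that
\[ V_1 \perp V_2 \perp V_3 \perp V_4. \]
Assuming $\mu \neq 0$, we obtain the following relations:
\[ \sum_{i=1}^{r} \alpha_i = 0 \]
\[ \sum_{j=1}^{s} \beta_j = 0 \]
\[ \sum_{i=1}^{r} \gamma_{ij} = 0, \quad j = 1, \dots, s \]
\[ \sum_{j=1}^{s} \gamma_{ij} = 0, \quad i = 1, \dots, r \]
These $(r + s + 2)$ relations are not independent: both families of conditions on $\gamma_{ij}$ imply the same additional condition
\[ \sum_{i=1}^{r} \sum_{j=1}^{s} \gamma_{ij} = 0, \]
so there are only $(r + s + 1)$ independent restrictions.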
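The model with restrictions
\[ \mathbf{X} = \mathbf{Y}\boldsymbol{\theta} + \boldsymbol{\varepsilon} \]
\[ \sum_{i=1}^{r} \alpha_i = 0, \qquad \sum_{j=1}^{s} \beta_j = 0 \]
\[ \sum_{i=1}^{r} \gamma_{ij} = 0, \quad j = 1, \dots, s, \qquad \sum_{j=1}^{s} \gamma_{ij} = 0, \quad i = 1, \dots, r \]
is equivalent to a nonsingular model
\[ \mathbf{X} = \mathbf{Y}\boldsymbol{\theta} + \boldsymbol{\varepsilon} \]
\[ \mathbf{X} = (X_{111}, \dots, X_{11n}, \dots, X_{rs1}, \dots, X_{rsn})', \qquad \boldsymbol{\varepsilon} = (\varepsilon_{111}, \dots, \varepsilon_{11n}, \dots, \varepsilon_{rs1}, \dots, \varepsilon_{rsn})' \]
\[ \boldsymbol{\theta} = \left( \mu, \alpha_1, \dots, \alpha_{r-1}, \beta_1, \dots, \beta_{s-1}, \gamma_{11}, \dots, \gamma_{r-1,\,s-1} \right)' \in \mathbb{R}^{rs} \]
\[ \operatorname{rank} \mathbf{Y} = rs \]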
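The least-squares estimator (LSE) is obtained by direct computation:
\[ \hat{\boldsymbol{\theta}}_{LS}(\mathbf{X}) = (\mathbf{Y}'\mathbf{Y})^{-1}\mathbf{Y}'\mathbf{X} \]
\[ \hat{\boldsymbol{\theta}}_{LS}(\mathbf{X}) = \left( \hat{\mu}, \hat{\alpha}_1, \dots, \hat{\alpha}_{r-1}, \hat{\beta}_1, \dots, \hat{\beta}_{s-1}, \hat{\gamma}_{11}, \dots, \hat{\gamma}_{r-1,\,s-1} \right)' \]
\[ \hat{\mu} = \overline{X} = \frac{1}{nrs} \sum_{i=1}^{r} \sum_{j=1}^{s} \sum_{k=1}^{n} X_{ijk} \]
\[ \hat{\alpha}_i = \overline{X}_{i\cdot\cdot} - \overline{X}, \quad i = 1, \dots, r-1 \]
\[ \hat{\beta}_j = \overline{X}_{\cdot j\cdot} - \overline{X}, \quad j = 1, \dots, s-1 \]
\[ \hat{\gamma}_{ij} = \overline{X}_{ij\cdot} - \overline{X}_{i\cdot\cdot} - \overline{X}_{\cdot j\cdot} + \overline{X}, \quad i = 1, \dots, r-1, \quad j = 1, \dots, s-1 \]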
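Using the additional restrictions, we obtain
\[ \hat{\alpha}_r = -\sum_{i=1}^{r-1} \hat{\alpha}_i = \overline{X}_{r\cdot\cdot} - \overline{X} \]
\[ \hat{\beta}_s = -\sum_{j=1}^{s-1} \hat{\beta}_j = \overline{X}_{\cdot s\cdot} - \overline{X} \]
\[ \hat{\gamma}_{rj} = \overline{X}_{rj\cdot} - \overline{X}_{r\cdot\cdot} - \overline{X}_{\cdot j\cdot} + \overline{X}, \quad j = 1, \dots, s \]
\[ \hat{\gamma}_{is} = \overline{X}_{is\cdot} - \overline{X}_{i\cdot\cdot} - \overline{X}_{\cdot s\cdot} + \overline{X}, \quad i = 1, \dots, r \]
where we denote
\[ \overline{X}_{i\cdot\cdot} = \frac{1}{ns} \sum_{j=1}^{s} \sum_{k=1}^{n} X_{ijk}, \qquad \overline{X}_{\cdot j\cdot} = \frac{1}{nr} \sum_{i=1}^{r} \sum_{k=1}^{n} X_{ijk}, \qquad \overline{X}_{ij\cdot} = \frac{1}{n} \sum_{k=1}^{n} X_{ijk}. \]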
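The estimates above are simple functions of the cell, row and column means. The sketch below, in R, computes them on simulated balanced data (the sizes, the mean structure and the seed are hypothetical) and checks that they satisfy the sum-to-zero restrictions.

```r
set.seed(3)                                   # arbitrary seed
r <- 3; s <- 2; n <- 4                        # hypothetical balanced layout
d <- expand.grid(k = seq_len(n), j = seq_len(s), i = seq_len(r))
d$x <- rnorm(nrow(d), mean = 10 + d$i - 2 * d$j + 0.5 * d$i * d$j)

xbar    <- mean(d$x)                          # grand mean
xbar_i  <- tapply(d$x, d$i, mean)             # means over levels of factor A
xbar_j  <- tapply(d$x, d$j, mean)             # means over levels of factor B
xbar_ij <- tapply(d$x, list(d$i, d$j), mean)  # cell means, an r x s matrix

alpha_hat <- xbar_i - xbar
beta_hat  <- xbar_j - xbar
# gamma_hat[i, j] = cell mean - row mean - column mean + grand mean
gamma_hat <- xbar_ij - outer(as.vector(xbar_i), as.vector(xbar_j), "+") + xbar

sum(alpha_hat); sum(beta_hat)                 # both zero up to rounding
rowSums(gamma_hat); colSums(gamma_hat)        # all zero up to rounding
```

In a balanced design the restrictions hold automatically for these plug-in estimates, which is why the last level of each factor needs no separate computation.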
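THE ANOVA(2) EQUATION
We introduce the following sums of squares:
\[ SS_t = \sum_{i=1}^{r} \sum_{j=1}^{s} \sum_{k=1}^{n} \left( X_{ijk} - \overline{X} \right)^2 \]
\[ SS_{resid} = \sum_{i=1}^{r} \sum_{j=1}^{s} \sum_{k=1}^{n} \left( X_{ijk} - \overline{X}_{ij\cdot} \right)^2 \]
\[ SS_A = \sum_{i=1}^{r} \sum_{j=1}^{s} \sum_{k=1}^{n} \left( \overline{X}_{i\cdot\cdot} - \overline{X} \right)^2 = ns \sum_{i=1}^{r} \left( \overline{X}_{i\cdot\cdot} - \overline{X} \right)^2 \]
\[ SS_B = \sum_{i=1}^{r} \sum_{j=1}^{s} \sum_{k=1}^{n} \left( \overline{X}_{\cdot j\cdot} - \overline{X} \right)^2 = nr \sum_{j=1}^{s} \left( \overline{X}_{\cdot j\cdot} - \overline{X} \right)^2 \]
\[ SS_{AB} = \sum_{i=1}^{r} \sum_{j=1}^{s} \sum_{k=1}^{n} \left( \overline{X}_{ij\cdot} - \overline{X}_{i\cdot\cdot} - \overline{X}_{\cdot j\cdot} + \overline{X} \right)^2 = n \sum_{i=1}^{r} \sum_{j=1}^{s} \left( \overline{X}_{ij\cdot} - \overline{X}_{i\cdot\cdot} - \overline{X}_{\cdot j\cdot} + \overline{X} \right)^2 \]
Then the following relation holds:
\[ SS_t = SS_{resid} + SS_A + SS_B + SS_{AB} \]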
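The decomposition can be verified numerically. A minimal sketch in R follows, on simulated balanced data with hypothetical sizes; `ave()` is used to repeat each mean once per observation so that the triple sums become plain sums over rows.

```r
set.seed(4)                                   # arbitrary seed
r <- 3; s <- 2; n <- 5                        # hypothetical balanced layout
d <- expand.grid(k = seq_len(n), j = seq_len(s), i = seq_len(r))
d$x <- rnorm(nrow(d), mean = 10 + d$i - 2 * d$j)

xbar <- mean(d$x)
m_i  <- ave(d$x, d$i)          # mean of level i of factor A, per observation
m_j  <- ave(d$x, d$j)          # mean of level j of factor B, per observation
m_ij <- ave(d$x, d$i, d$j)     # cell mean, per observation

SS_t     <- sum((d$x - xbar)^2)
SS_resid <- sum((d$x - m_ij)^2)
SS_A     <- sum((m_i - xbar)^2)
SS_B     <- sum((m_j - xbar)^2)
SS_AB    <- sum((m_ij - m_i - m_j + xbar)^2)

all.equal(SS_t, SS_resid + SS_A + SS_B + SS_AB)
```

The identity relies on the design being balanced; with unequal cell counts the cross terms no longer vanish.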
Property 7
Applying Property (3) of the least-squares method for a normal model, we have
\[ \frac{1}{\sigma^2}\, SS_{resid} \sim \chi^2(nrs - rs). \]
Next we consider the hypotheses
\[ H_A : \alpha_i = 0, \ \forall i = 1, \dots, r, \qquad H'_A : \exists\, i : \alpha_i \neq 0 \]
\[ H_B : \beta_j = 0, \ \forall j = 1, \dots, s, \qquad H'_B : \exists\, j : \beta_j \neq 0 \]
\[ H_{AB} : \gamma_{ij} = 0, \ \forall i = 1, \dots, r,\ j = 1, \dots, s, \qquad H'_{AB} : \exists\, i, j : \gamma_{ij} \neq 0 \]
Property 8
Assume that the hypotheses $H_A$, $H_B$, $H_{AB}$ are true, so that the $nrs$ observations $X_{ijk}$ are independent and identically distributed $N(\mu, \sigma^2)$. Then $\overline{X}$ is the maximum likelihood estimator of $\mu$ and we have:
\[ \frac{1}{\sigma^2}\, SS_t \sim \chi^2(nrs - 1) \]
\[ \frac{1}{\sigma^2}\, SS_A \sim \chi^2(r - 1) \]
\[ \frac{1}{\sigma^2}\, SS_B \sim \chi^2(s - 1) \]
\[ \frac{1}{\sigma^2}\, SS_{AB} \sim \chi^2((r - 1)(s - 1)) \]
In addition, the random variables $SS_{resid}$, $SS_A$, $SS_B$ and $SS_{AB}$ are independent, which implies that the following random variables have Fisher distributions:
\[ F_A = \frac{SS_A / (r - 1)}{SS_{resid} / (rs(n - 1))} = \frac{\overline{SS}_A}{\overline{SS}_{resid}} \sim F(r - 1,\ rs(n - 1)) \]
\[ F_B = \frac{SS_B / (s - 1)}{SS_{resid} / (rs(n - 1))} = \frac{\overline{SS}_B}{\overline{SS}_{resid}} \sim F(s - 1,\ rs(n - 1)) \]
\[ F_{AB} = \frac{SS_{AB} / ((r - 1)(s - 1))}{SS_{resid} / (rs(n - 1))} = \frac{\overline{SS}_{AB}}{\overline{SS}_{resid}} \sim F((r - 1)(s - 1),\ rs(n - 1)) \]
(the proof uses Cochran's theorem, as in the case of Property (5) for ANOVA(1))
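THE ANOVA(2) TABLE
\[
\begin{array}{llll}
\text{Source of variation} & SS & \text{Degrees of freedom} & \overline{SS} \\
\hline
\text{factor } A & SS_A & r - 1 & \overline{SS}_A = SS_A / (r - 1) \\
\text{factor } B & SS_B & s - 1 & \overline{SS}_B = SS_B / (s - 1) \\
\text{interaction } A \,\&\, B & SS_{AB} & (r - 1)(s - 1) & \overline{SS}_{AB} = SS_{AB} / ((r - 1)(s - 1)) \\
\text{random sampling} & SS_{resid} & rs(n - 1) & \overline{SS}_{resid} = SS_{resid} / (rs(n - 1)) \\
\text{total} & SS_t & nrs - 1 &
\end{array}
\]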
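In R, a table of this form is produced by `anova` applied to a linear model with the formula `x ~ A * B`. The sketch below uses simulated balanced data; the factor names `A` and `B`, the sizes and the seed are hypothetical.

```r
set.seed(5)                                   # arbitrary seed
r <- 3; s <- 2; n <- 5                        # hypothetical balanced layout
d <- expand.grid(k = seq_len(n), B = factor(seq_len(s)), A = factor(seq_len(r)))
d$x <- rnorm(nrow(d), mean = 10 + as.integer(d$A) - 2 * as.integer(d$B), sd = 2)

tab <- anova(lm(x ~ A * B, data = d))
tab          # rows: A, B, A:B, Residuals

# Degrees of freedom: r - 1, s - 1, (r - 1)(s - 1), rs(n - 1)
tab$Df
```

The `Mean Sq` column corresponds to the mean sums of squares in the last column of the table, and each `F value` is the ratio of a mean square to the residual mean square.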
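Comment
The following hypotheses can also be considered:
factor A is non-influential: $H_{A\&AB} : \{ \alpha_i = \gamma_{ij} = 0,\ i = 1, \dots, r,\ j = 1, \dots, s \}$
factor B is non-influential: $H_{B\&AB} : \{ \beta_j = \gamma_{ij} = 0,\ i = 1, \dots, r,\ j = 1, \dots, s \}$
The statistics used to test these hypotheses are:
\[ SS_{A\&AB} = SS_A + SS_{AB}, \qquad \frac{1}{\sigma^2}\, SS_{A\&AB} \sim \chi^2(r - 1 + (r - 1)(s - 1)) = \chi^2(s(r - 1)) \]
\[ \overline{SS}_{A\&AB} = SS_{A\&AB} / (s(r - 1)) \]
\[ F_{A\&AB} = \overline{SS}_{A\&AB} / \overline{SS}_{resid} \sim F(s(r - 1),\ rs(n - 1)) \]
\[ SS_{B\&AB} = SS_B + SS_{AB}, \qquad \frac{1}{\sigma^2}\, SS_{B\&AB} \sim \chi^2(s - 1 + (r - 1)(s - 1)) = \chi^2(r(s - 1)) \]
\[ \overline{SS}_{B\&AB} = SS_{B\&AB} / (r(s - 1)) \]
\[ F_{B\&AB} = \overline{SS}_{B\&AB} / \overline{SS}_{resid} \sim F(r(s - 1),\ rs(n - 1)) \]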
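One way to carry out the test that factor A is non-influential in R is sketched below, under stated assumptions: simulated balanced data with hypothetical names and sizes, generated so that A truly has no effect. The pooled statistic is computed by hand and then compared with a nested-model comparison (model with B only versus the full model), which in a balanced design yields the same F value.

```r
set.seed(6)                                   # arbitrary seed
r <- 3; s <- 2; n <- 5                        # hypothetical balanced layout
d <- expand.grid(k = seq_len(n), B = factor(seq_len(s)), A = factor(seq_len(r)))
d$x <- rnorm(nrow(d), mean = 10 - 2 * as.integer(d$B), sd = 2)  # no A effect

full <- lm(x ~ A * B, data = d)
tab  <- anova(full)
SS   <- setNames(tab[["Sum Sq"]], rownames(tab))
df_resid <- r * s * (n - 1)

SS_A_AB <- SS["A"] + SS["A:B"]                          # pooled sum of squares
F_A_AB  <- (SS_A_AB / (s * (r - 1))) / (SS["Residuals"] / df_resid)
p_value <- pf(F_A_AB, s * (r - 1), df_resid, lower.tail = FALSE)

# Same test as a comparison of nested models
cmp <- anova(lm(x ~ B, data = d), full)
c(by_hand = unname(F_A_AB), nested = cmp$F[2])
```

The analogous test for factor B swaps the roles of the two factors, with $r(s-1)$ numerator degrees of freedom.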
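ANOVA(2) FOR A NORMAL MODEL, WITHOUT INTERACTIONS
Consider the experiment with the same number ($n$) of observations in each cell:
\[ X_{ijk} = \mu + \alpha_i + \beta_j + \varepsilon_{ijk}, \quad i = 1, \dots, r, \quad j = 1, \dots, s, \quad k = 1, \dots, n, \]
where $\varepsilon_{ij1}, \dots, \varepsilon_{ijn}$, $i = 1, \dots, r$, $j = 1, \dots, s$, are independent, identically distributed random variables, $N(0, \sigma^2)$.
The total number of observations is $nrs$.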
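Matrix form:
\[ \mathbf{X} = \mathbf{Y}\boldsymbol{\theta} + \boldsymbol{\varepsilon} \]
\[ \mathbf{X} = (X_{111}, \dots, X_{11n}, \dots, X_{rs1}, \dots, X_{rsn})', \qquad \boldsymbol{\varepsilon} = (\varepsilon_{111}, \dots, \varepsilon_{11n}, \dots, \varepsilon_{rs1}, \dots, \varepsilon_{rsn})' \]
\[ \boldsymbol{\theta} = (\mu, \alpha_1, \dots, \alpha_r, \beta_1, \dots, \beta_s)' \in \mathbb{R}^{r+s+1} \]
\[
\mathbf{Y} =
\left(
\begin{array}{c|cccc|cccc}
1 & 1 & 0 & \cdots & 0 & 1 & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
1 & 1 & 0 & \cdots & 0 & 1 & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
1 & 0 & 0 & \cdots & 1 & 0 & 0 & \cdots & 1 \\
\vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
1 & 0 & 0 & \cdots & 1 & 0 & 0 & \cdots & 1
\end{array}
\right)_{(nrs;\ r+s+1)}
\]
\[ \operatorname{rank} \mathbf{Y} = r + s - 1 \]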
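The model is singular. To obtain a nonsingular model (and hence a unique least-squares estimator), we introduce two linear restrictions on the components of the parameter $\boldsymbol{\theta}$. Write
\[ \boldsymbol{\theta} = \boldsymbol{\theta}_1 + \boldsymbol{\theta}_2 + \boldsymbol{\theta}_3 \]
\[ \boldsymbol{\theta}_1 = (\mu, 0, \dots, 0)' \]
\[ \boldsymbol{\theta}_2 = (0, \alpha_1, \dots, \alpha_r, 0, \dots, 0)' \]
\[ \boldsymbol{\theta}_3 = (0, \dots, 0, \beta_1, \dots, \beta_s)' \]
and consider the subspaces
\[ V = \{ \mathbf{Y}\boldsymbol{\theta} \mid \boldsymbol{\theta} \in \mathbb{R}^{r+s+1} \} \]
\[ V_1 = \{ \mathbf{Y}\boldsymbol{\theta} \mid \boldsymbol{\theta} = \boldsymbol{\theta}_1,\ \mu \in \mathbb{R} \} \]
\[ V_2 = \{ \mathbf{Y}\boldsymbol{\theta} \mid \boldsymbol{\theta} = \boldsymbol{\theta}_2,\ (\alpha_1, \dots, \alpha_r) \in \mathbb{R}^{r} \} \]
\[ V_3 = \{ \mathbf{Y}\boldsymbol{\theta} \mid \boldsymbol{\theta} = \boldsymbol{\theta}_3,\ (\beta_1, \dots, \beta_s) \in \mathbb{R}^{s} \} \]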
The new restrictions are introduced so that
\[ V_1 \perp V_2 \perp V_3. \]
Assuming $\mu \neq 0$, we obtain the following relations:
\[ \sum_{i=1}^{r} \alpha_i = 0, \qquad \sum_{j=1}^{s} \beta_j = 0 \]
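The model with restrictions
\[ \mathbf{X} = \mathbf{Y}\boldsymbol{\theta} + \boldsymbol{\varepsilon}, \qquad \sum_{i=1}^{r} \alpha_i = 0, \qquad \sum_{j=1}^{s} \beta_j = 0 \]
is equivalent to a nonsingular model
\[ \mathbf{X} = \mathbf{Y}\boldsymbol{\theta} + \boldsymbol{\varepsilon} \]
\[ \mathbf{X} = (X_{111}, \dots, X_{11n}, \dots, X_{rs1}, \dots, X_{rsn})', \qquad \boldsymbol{\varepsilon} = (\varepsilon_{111}, \dots, \varepsilon_{11n}, \dots, \varepsilon_{rs1}, \dots, \varepsilon_{rsn})' \]
\[ \boldsymbol{\theta} = \left( \mu, \alpha_1, \dots, \alpha_{r-1}, \beta_1, \dots, \beta_{s-1} \right)' \in \mathbb{R}^{r+s-1} \]
\[ \operatorname{rank} \mathbf{Y} = r + s - 1 \]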
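The least-squares estimator (LSE) is obtained by direct computation:
\[ \hat{\boldsymbol{\theta}}_{LS}(\mathbf{X}) = (\mathbf{Y}'\mathbf{Y})^{-1}\mathbf{Y}'\mathbf{X} \]
\[ \hat{\boldsymbol{\theta}}_{LS}(\mathbf{X}) = \left( \hat{\mu}, \hat{\alpha}_1, \dots, \hat{\alpha}_{r-1}, \hat{\beta}_1, \dots, \hat{\beta}_{s-1} \right)' \]
\[ \hat{\mu} = \overline{X} = \frac{1}{nrs} \sum_{i=1}^{r} \sum_{j=1}^{s} \sum_{k=1}^{n} X_{ijk} \]
\[ \hat{\alpha}_i = \overline{X}_{i\cdot\cdot} - \overline{X}, \quad i = 1, \dots, r-1 \]
\[ \hat{\beta}_j = \overline{X}_{\cdot j\cdot} - \overline{X}, \quad j = 1, \dots, s-1 \]
Using the additional restrictions, we obtain
\[ \hat{\alpha}_r = -\sum_{i=1}^{r-1} \hat{\alpha}_i = \overline{X}_{r\cdot\cdot} - \overline{X} \]
\[ \hat{\beta}_s = -\sum_{j=1}^{s-1} \hat{\beta}_j = \overline{X}_{\cdot s\cdot} - \overline{X} \]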
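THE ANOVA(2) EQUATION
We introduce the following sums of squares:
\[ SS_t = \sum_{i=1}^{r} \sum_{j=1}^{s} \sum_{k=1}^{n} \left( X_{ijk} - \overline{X} \right)^2 \]
\[ SS_{resid} = \sum_{i=1}^{r} \sum_{j=1}^{s} \sum_{k=1}^{n} \left( X_{ijk} - \overline{X}_{i\cdot\cdot} - \overline{X}_{\cdot j\cdot} + \overline{X} \right)^2 \]
\[ SS_A = \sum_{i=1}^{r} \sum_{j=1}^{s} \sum_{k=1}^{n} \left( \overline{X}_{i\cdot\cdot} - \overline{X} \right)^2 = ns \sum_{i=1}^{r} \left( \overline{X}_{i\cdot\cdot} - \overline{X} \right)^2 \]
\[ SS_B = \sum_{i=1}^{r} \sum_{j=1}^{s} \sum_{k=1}^{n} \left( \overline{X}_{\cdot j\cdot} - \overline{X} \right)^2 = nr \sum_{j=1}^{s} \left( \overline{X}_{\cdot j\cdot} - \overline{X} \right)^2 \]
Then the following relation holds:
\[ SS_t = SS_{resid} + SS_A + SS_B \]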
Property 9
Applying Property (3) of the least-squares method for a normal model, we have
\[ \frac{1}{\sigma^2}\, SS_{resid} \sim \chi^2(nrs - r - s + 1). \]
Next we consider the hypotheses
\[ H_A : \alpha_i = 0, \ \forall i = 1, \dots, r, \qquad H'_A : \exists\, i : \alpha_i \neq 0 \]
\[ H_B : \beta_j = 0, \ \forall j = 1, \dots, s, \qquad H'_B : \exists\, j : \beta_j \neq 0 \]
Property 10
Assume that the hypotheses $H_A$ and $H_B$ are true, so that the $nrs$ observations $X_{ijk}$ are independent and identically distributed $N(\mu, \sigma^2)$. Then $\overline{X}$ is the maximum likelihood estimator of $\mu$ and we have:
\[ \frac{1}{\sigma^2}\, SS_t \sim \chi^2(nrs - 1) \]
\[ \frac{1}{\sigma^2}\, SS_A \sim \chi^2(r - 1) \]
\[ \frac{1}{\sigma^2}\, SS_B \sim \chi^2(s - 1) \]
In addition, the random variables $SS_{resid}$, $SS_A$ and $SS_B$ are independent, which implies that the following random variables have Fisher distributions:
\[ F_A = \frac{SS_A / (r - 1)}{SS_{resid} / (nrs - r - s + 1)} = \frac{\overline{SS}_A}{\overline{SS}_{resid}} \sim F(r - 1,\ nrs - r - s + 1) \]
\[ F_B = \frac{SS_B / (s - 1)}{SS_{resid} / (nrs - r - s + 1)} = \frac{\overline{SS}_B}{\overline{SS}_{resid}} \sim F(s - 1,\ nrs - r - s + 1) \]
(the proof uses Cochran's theorem, as in the case of Property (5) for ANOVA(1))
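THE ANOVA(2) TABLE
\[
\begin{array}{llll}
\text{Source of variation} & SS & \text{Degrees of freedom} & \overline{SS} \\
\hline
\text{factor } A & SS_A & r - 1 & \overline{SS}_A = SS_A / (r - 1) \\
\text{factor } B & SS_B & s - 1 & \overline{SS}_B = SS_B / (s - 1) \\
\text{random sampling} & SS_{resid} & nrs - r - s + 1 & \overline{SS}_{resid} = SS_{resid} / (nrs - r - s + 1) \\
\text{total} & SS_t & nrs - 1 &
\end{array}
\]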
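For the model without interactions, the corresponding R formula is `x ~ A + B`. The following sketch, on simulated balanced data with hypothetical names and sizes, compares the residual sum of squares computed from its definition with the one reported by `anova`, and checks the residual degrees of freedom.

```r
set.seed(7)                                   # arbitrary seed
r <- 3; s <- 4; n <- 2                        # hypothetical balanced layout
d <- expand.grid(k = seq_len(n), B = factor(seq_len(s)), A = factor(seq_len(r)))
d$x <- rnorm(nrow(d), mean = 10 + as.integer(d$A) + as.integer(d$B))

tab <- anova(lm(x ~ A + B, data = d))
tab          # rows: A, B, Residuals

# Residual sum of squares from its definition in the additive model
m_i <- ave(d$x, d$A)
m_j <- ave(d$x, d$B)
SS_resid <- sum((d$x - m_i - m_j + mean(d$x))^2)

c(by_hand = SS_resid, from_anova = tab["Residuals", "Sum Sq"])
tab["Residuals", "Df"] == n * r * s - r - s + 1
```

Compared with the model with interactions, the interaction sum of squares is absorbed into the residual, which is why the residual degrees of freedom grow by $(r-1)(s-1)$.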
THE aov AND anova FUNCTIONS IN R
aov {stats} R Documentation
Fit an Analysis of Variance Model
Description: Fit an analysis of variance model by a call to lm for each stratum.
Usage: aov(formula, data = NULL, projections = FALSE, qr = TRUE, contrasts = NULL, ...)
Arguments
formula A formula specifying the model.
aov is designed for balanced designs, and the results can be hard to interpret without balance: beware that missing values in the response(s) will likely lose the balance.
anova {stats} R Documentation
Anova Tables
Description: Compute analysis of variance (or deviance) tables for one or more fitted model objects.
Usage: anova(object, ...)
Arguments
object an object containing the results returned by a model fitting function (e.g., lm or glm).
... additional objects of the same type.
Value
This (generic) function returns an object of class anova. These objects represent analysis-of-variance and analysis-of-deviance tables. When given a single argument it produces a table which tests whether the model terms are significant.
When given a sequence of objects, anova tests the models against one another in the order specified.
APPLICATION 1
We generate data for a model with one factor with 3 levels.
We run a balanced experiment, with 20 observations in each cell.
X1<-c(rnorm(20,5,4))
X2<-c(rnorm(20,8,4))
X3<-c(rnorm(20,2,4))
v<-c(X1,X2,X3)
k=20
n=3
tf=gl(n, k, length = n*k, labels = 1:n, ordered = FALSE)
tf
[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[39] 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
Levels: 1 2 3
Using the aov function:
av<-aov(v ~ tf)
av
Call:
aov(formula = v ~ tf)
Terms:
tf Residuals
Sum of Squares 428.6167 1191.0570
Deg. of freedom 2 57
Residual standard error: 4.571186
summary(av)
Df Sum Sq Mean Sq F value Pr(>F)
tf 2 428.62 214.31 10.256 0.0001568
Residuals 57 1191.06 20.90
p-value = 0.0001568 < 0.05
We decide to reject the hypothesis of equality of means.
Using the anova function:
model<-lm(v~tf)
anova(model)
Analysis of Variance Table
Response: v
Df Sum Sq Mean Sq F value Pr(>F)
tf 2 428.62 214.31 10.256 0.0001568
Residuals 57 1191.06 20.90
APPLICATION 2
PlantGrowth {datasets} R Documentation
Results from an Experiment on Plant Growth
Description: Results from an experiment to compare yields (as measured by dried weight of plants) obtained under a control and two different treatment conditions.
Usage: PlantGrowth
Format: A data frame of 30 cases on 2 variables (10 observations for control, 10 observations for treatment 1 and 10 observations for treatment 2)
[, 1] weight numeric
[, 2] group factor
The levels of group are ctrl, trt1, and trt2.
PlantGrowth
weight group
1 4.17 ctrl
2 5.58 ctrl
3 5.18 ctrl
4 6.11 ctrl
5 4.50 ctrl
6 4.61 ctrl
7 5.17 ctrl
8 4.53 ctrl
9 5.33 ctrl
10 5.14 ctrl
11 4.81 trt1
12 4.17 trt1
13 4.41 trt1
14 3.59 trt1
15 5.87 trt1
16 3.83 trt1
17 6.03 trt1
18 4.89 trt1
19 4.32 trt1
20 4.69 trt1
21 6.31 trt2
22 5.12 trt2
23 5.54 trt2
24 5.50 trt2
25 5.37 trt2
26 5.29 trt2
27 4.92 trt2
28 6.15 trt2
29 5.80 trt2
30 5.26 trt2
model1<-lm(weight ~group, data = PlantGrowth)
av1<-anova(model1)
av1
Analysis of Variance Table
Response: weight
Df Sum Sq Mean Sq F value Pr(>F)
group 2 3.7663 1.8832 4.8461 0.01591
Residuals 27 10.4921 0.3886
p-value = 0.01591 < 0.05
We decide to reject the hypothesis of equality of means.
FURTHER TOPICS
1) Post-hoc testing of ANOVAs
Multiple comparison procedures are commonly used in an analysis of variance after obtaining a significant omnibus test result, like the ANOVA F-test. The significant ANOVA result suggests rejecting the global null hypothesis $H_0$ that the means are the same across the groups being compared. Multiple comparison procedures are then used to determine which means differ. In a one-way ANOVA involving $k$ group means, there are $k(k-1)/2$ pairwise comparisons.
A number of methods have been proposed for this problem, some of which are:
1.1. Single-step procedures
* Tukey-Kramer method (Tukey's HSD) (1951)
* Scheffé method (1953)
1.2. Multi-step procedures based on the Studentized range statistic
* Duncan's new multiple range test (1955)
* The Nemenyi test is similar to Tukey's range test in ANOVA.
* Student-Newman-Keuls post-hoc analysis
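As an illustration of a single-step procedure, Tukey's HSD is available in base R as `TukeyHSD`, which takes a fitted `aov` object. The sketch below applies it to the `PlantGrowth` data from Application 2; the choice of this dataset and of the 95% confidence level is only for illustration.

```r
# One-way ANOVA on the built-in PlantGrowth data, followed by Tukey's HSD
fit <- aov(weight ~ group, data = PlantGrowth)
th  <- TukeyHSD(fit, conf.level = 0.95)
th           # one row per pair of groups: difference, interval, adjusted p-value

# Number of pairwise comparisons for k groups: k(k - 1)/2
k <- nlevels(PlantGrowth$group)
choose(k, 2)
```

Pairs whose confidence interval excludes zero are the ones whose means are declared different at the chosen family-wise level.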
2) Non-parametric analysis of variance
The Kruskal-Wallis one-way analysis of variance by ranks is a non-parametric method for testing equality of population medians among groups. It is identical to a one-way analysis of variance (ANOVA(1)) with the data replaced by their ranks. It is an extension of the Mann-Whitney U test to 3 or more groups.
Since it is a non-parametric method, the Kruskal-Wallis test does not assume a normal population, unlike the analogous one-way analysis of variance. However, the test does assume an identically shaped and scaled distribution for each group, except for any difference in medians.
Multiple comparisons can be done using pairwise comparisons (for example using Wilcoxon rank sum tests) and using a correction to determine if the post-hoc tests are significant (for example a Bonferroni correction).
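Both steps are available in base R. The following sketch runs the Kruskal-Wallis test and Bonferroni-corrected pairwise Wilcoxon rank sum tests on the `PlantGrowth` data; the dataset is reused only for illustration, and `exact = FALSE` is passed because the data contain ties, for which exact p-values are not computed.

```r
# Omnibus non-parametric test across the three groups
kw <- kruskal.test(weight ~ group, data = PlantGrowth)
kw

# Pairwise Wilcoxon rank sum tests with a Bonferroni correction
pw <- pairwise.wilcox.test(PlantGrowth$weight, PlantGrowth$group,
                           p.adjust.method = "bonferroni", exact = FALSE)
pw$p.value   # lower-triangular matrix of adjusted p-values
```

The Bonferroni correction multiplies each raw p-value by the number of comparisons (capped at 1), which controls the family-wise error rate at the cost of some power.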