
PCA - the correlation between a component X and an observed variable Y

The variables are on the rows and the components on the columns => A12 = 0.6

Correlation = A12 * sqrt(alpha(2))
= 0.42
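A minimal sketch of this rule in Python. The eigenvalue itself is not shown in the extract, so alpha(2) = 0.49 is assumed here because it reproduces the reported 0.42; the other loadings are placeholders.

import numpy as np

# Loadings: variables on the rows, components on the columns.
# Only A[0, 1] = 0.6 comes from the exercise; the rest are placeholders.
A = np.array([[0.5, 0.6],
              [0.7, 0.3]])
alpha = np.array([1.8, 0.49])        # assumed eigenvalues; alpha[1] = 0.49 gives 0.42

# Correlation between component k and variable j: A[j, k] * sqrt(alpha[k])
print(round(A[0, 1] * np.sqrt(alpha[1]), 2))   # 0.42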

Bayesian discrimination

The prior of each class is its share of the instances, Pc*Pi is the conditional probability times the prior, and the posterior is Pc*Pi divided by the sum of the Pc*Pi values.

Class   No. of instances   Cond. prob.   Prior prob.   Pc*Pi   Posterior prob.
A       200                0.8           0.1           0.05    0.12
B       600                0.4           0.2           0.07    0.17
C       400                0.5           0.1           0.06    0.14
D       900                0.4           0.3           0.10    0.26
E       1400               0.3           0.4           0.12    0.30
Total   3500                                           0.39
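A short Python sketch of the same computation, reproducing the table above:

import numpy as np

counts = np.array([200, 600, 400, 900, 1400])   # classes A..E, 3500 instances in total
p_cond = np.array([0.8, 0.4, 0.5, 0.4, 0.3])    # conditional probabilities

prior = counts / counts.sum()
joint = p_cond * prior                # Pc * Pi
posterior = joint / joint.sum()       # Bayes rule: normalize over the classes

print(joint.round(2))                 # [0.05 0.07 0.06 0.1  0.12]
print(round(joint.sum(), 2))          # 0.39
print(posterior.round(2))             # [0.12 0.17 0.14 0.26 0.3 ]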

PCA - minimum number of components needed to cover at least x% of the total variance

S = 6 (total variance)
Cumulative share of the variance covered by the first 1, 2, 3 components:
0.416667   0.666667   0.833333
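A sketch of the rule in Python. The individual eigenvalues are not shown in the extract, so the values below are hypothetical ones that sum to S = 6 and reproduce the cumulative shares above.

import numpy as np

alpha = np.array([2.5, 1.5, 1.0, 0.5, 0.3, 0.2])   # hypothetical eigenvalues, total S = 6
coverage = np.cumsum(alpha) / alpha.sum()
print(coverage[:3].round(6))            # [0.416667 0.666667 0.833333]

threshold = 0.80                        # "at least x% of the total variance"
k = np.argmax(coverage >= threshold) + 1
print(k)                                # 3 components are enough for 80%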

Communality matrix
0.85 => 85% (the communality = the share of the variable's variance explained by the retained factors)

PCA - number of significant principal components according to the Cattell and Kaiser criteria

No.   alpha   eps(k) = alpha(k) - alpha(k+1)   eps(k) - eps(k+1)
1     2.6     0.1                              -1.3
2     2.5     1.4                               0.7
3     1.1     0.7                               0.5
4     0.4     0.2                               0.15
5     0.2     0.05                             -0.05
6     0.15    0.1
7     0.05

=> The first difference smaller than 0 is -1.3 and corresponds to component 1, so the number of significant components is 2.
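A small Python helper for the Cattell rule used above (the first negative second difference at position k means k+1 components are kept); Kaiser's rule (eigenvalues greater than 1) is shown for comparison.

import numpy as np

alpha = np.array([2.6, 2.5, 1.1, 0.4, 0.2, 0.15, 0.05])

eps = -np.diff(alpha)            # alpha(k) - alpha(k+1)
eps2 = -np.diff(eps)             # eps(k) - eps(k+1): first value is -1.3

k = np.argmax(eps2 < 0) + 1      # position of the first negative second difference
print(k + 1)                     # Cattell: 2 significant components
print(int((alpha > 1).sum()))    # Kaiser: 3 components with eigenvalue > 1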

Cluster analysis - optimal number of clusters

Amplitude = max - min = 140000
A horizontal line is drawn (from x = 0, at y = 140000).
The number of lines the horizontal line intersects = the number of clusters = 2
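As a sketch in Python: cutting a dendrogram at a given height leaves one cluster more than the number of merges that happen above that height. The merge heights below are hypothetical; only the cut level of 140000 comes from the exercise.

# Hypothetical dendrogram merge heights; only the cut level is from the exercise.
merge_heights = [30000, 55000, 90000, 310000]
cut = 140000

clusters = sum(h > cut for h in merge_heights) + 1
print(clusters)   # 2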

Canonical analysis - amount of common variance

Rxz                        Rxz^2
 0.1   0.8  -0.4           0.01   0.64   0.16
 0.1  -0.7  -0.4           0.01   0.49   0.16
-0.3  -0.1   0.8           0.09   0.01   0.64
 0.1   0.4   0.7           0.01   0.16   0.49

SUM (column 2) = 1.3
Variance = SUM * alpha(2) = 0.26
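A sketch of the computation in Python; alpha(2) = 0.2 is assumed here because it reproduces the reported 0.26 (the eigenvalue itself is not shown in the extract).

import numpy as np

Rxz = np.array([[ 0.1,  0.8, -0.4],
                [ 0.1, -0.7, -0.4],
                [-0.3, -0.1,  0.8],
                [ 0.1,  0.4,  0.7]])

col_sums = (Rxz ** 2).sum(axis=0)       # [0.12, 1.3, 1.45]
alpha2 = 0.2                            # assumed: 1.3 * 0.2 reproduces the reported 0.26
print(round(col_sums[1], 2))            # 1.3
print(round(col_sums[1] * alpha2, 2))   # 0.26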

PCA - variance explained by the x-th principal component

       X1     X2     X3     X4
C1     0.9    0.5    0.5    0.6
C2     0.3    0.3    0.6    0.1
C3    -0.5    0.9    0.5   -0.1
C4    -0.3   -0.1   -0.6    0.1

C2^2   0.09   0.09   0.36   0.01
Variance explained by C2 = 0.55
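The same rule as a short Python snippet: the variance explained by a component is the sum of its squared correlations with the observed variables.

import numpy as np

c2 = np.array([0.3, 0.3, 0.6, 0.1])    # correlations of C2 with X1..X4
print(round((c2 ** 2).sum(), 2))       # 0.55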

PCA - the coordinates of an instance on the principal component axes

Formula for the eigenvectors: a_k = R_k / sqrt(alpha(k))

alpha(k) = the sum of the squared correlations in column k
alpha(1) = 2.19          instance x: 10  5  15
alpha(2) = 1.3
sqrt(alpha(1)) = 1.48
sqrt(alpha(2)) = 1.14

a1 = ( 0.61,  0.54, -0.47,  0.34)
a2 = ( 0.18,  0.26,  0.53,  0.79)

Coordinates (dot products c_k = x * a_k):
c1 = 8.45
c2 = 26.75
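A sketch in Python. The correlation columns R1 and R2 are not shown in the extract, so they are taken here as (0.9, 0.8, -0.7, 0.5) and (0.2, 0.3, 0.6, 0.9), which reproduce alpha(1) = 2.19, alpha(2) = 1.3 and the eigenvectors above; the fourth coordinate of the instance is also missing and is assumed to be 20, which is consistent with the reported scores c1 = 8.45 and c2 = 26.75.

import numpy as np

# Assumed correlation columns, consistent with alpha(1) = 2.19 and alpha(2) = 1.3.
R1 = np.array([0.9, 0.8, -0.7, 0.5])
R2 = np.array([0.2, 0.3,  0.6, 0.9])

a1 = R1 / np.sqrt((R1 ** 2).sum())   # a_k = R_k / sqrt(alpha(k))
a2 = R2 / np.sqrt((R2 ** 2).sum())

x = np.array([10, 5, 15, 20])        # instance; the last value is an assumption
print(round(x @ a1, 2))              # c1 = 8.45
print(round(x @ a2, 2))              # c2 = 26.75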

Contingency table - the value of a particular theoretical frequency

         Metro   RATB   Own car
<25      65      30     5
26-35    55      20     25
36-45    30      10     60
46-55    35      15     50
55+      40      50     10
Total    500

i = the 26-35 age group
j = own car
ni = the absolute number of people in the 26-35 age group = 100
nj = the absolute number of people with their own car = 150
n = total = 500
tij = ni * nj / n = 30
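A sketch of the expected-frequency computation in Python (t_ij = n_i * n_j / n under independence):

import numpy as np

# Rows: <25, 26-35, 36-45, 46-55, 55+; columns: Metro, RATB, own car.
table = np.array([[65, 30,  5],
                  [55, 20, 25],
                  [30, 10, 60],
                  [35, 15, 50],
                  [40, 50, 10]])

n = table.sum()            # 500
ni = table[1].sum()        # row total for 26-35: 100
nj = table[:, 2].sum()     # column total for own car: 150
print(ni * nj / n)         # t_ij = 30.0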
The variables are on the rows and the components on the columns => A11 = 0.3
Correlation = A11 * sqrt(alpha(1))
= 0.6

Class   No. of instances   Cond. prob.   Prior prob.   Pc*Pi   Posterior prob.
A       100                0.4           0.1           0.04    0.13
B       100                0.8           0.1           0.08    0.27
C       600                0.2           0.6           0.12    0.40
D       200                0.3           0.2           0.06    0.20
Total   1000                                           0.30

0.9 => 90%

Following a principal component analysis on 7 observed variables, the following eigenvalues were obtained: (3, 1.6, 0.9, 0.8, 0.3, 0.25, 0.15).
What is the number of significant principal components according to Cattell's criterion?

No.   alpha   eps(k) = alpha(k) - alpha(k+1)   eps(k) - eps(k+1)
1     3       1.4                               0.7
2     1.6     0.7                               0.6
3     0.9     0.1                              -0.4
4     0.8     0.5                               0.45
5     0.3     0.05                             -0.05
6     0.25    0.1
7     0.15

=> The first difference smaller than 0 is -0.4 and corresponds to component 3, so the number of significant components is 4.

Amplitude = max - min = 0.5
A horizontal line is drawn (from x = 0, at y = 0.5).
The number of lines the horizontal line intersects = the number of clusters = 4

Rxz                        Rxz^2
 0.1   0.8  -0.4           0.01   0.64   0.16
 0.1  -0.7  -0.4           0.01   0.49   0.16
-0.3  -0.1   0.8           0.09   0.01   0.64
 0.1   0.4   0.7           0.01   0.16   0.49

SUM (column 1) = 0.12
Variance = SUM * alpha(1) = 0.096

Rxz                              Rxz^2
 0.9   0.3  -0.2   0.5           0.81   0.09   0.04   0.25
 0.8   0.2   0.3   0.1           0.64   0.04   0.09   0.01
-0.7  -0.5   0.5   0.2           0.49   0.25   0.25   0.04
-0.3  -0.7  -0.6   0.1           0.09   0.49   0.36   0.01
SUM (column 3) = 0.74

The variables are on the rows and the components on the columns => A21 = 0.3
Correlation = A21 * sqrt(alpha(2))
= 0.8

Class   No. of instances   Cond. prob.   Prior prob.   Pc*Pi   Posterior prob.
A       400                0.4           0.4           0.16    0.46
B       100                0.7           0.1           0.07    0.20
C       300                0.2           0.3           0.06    0.17
D       200                0.3           0.2           0.06    0.17
Total   1000                                           0.35
      C1     C2     C3     C4      R1     R2=C2-C1   R3=C3-C2   R4=C4-C3
X1    0.64   0.8    0.96   1       0.64   0.16       0.16       0.04
X2    0.49   0.85   0.89   1       0.49   0.36       0.04       0.11
X3    0.35   0.5    0.99   1       0.35   0.15       0.49       0.01
X4    0.64   0.8    0.89   1       0.64   0.16       0.09       0.11
Variance                           2.12   0.83       0.78       0.27

Amplitude = max - min = 1638
A horizontal line is drawn (from x = 0, at y = 1638).
The number of lines the horizontal line intersects = the number of clusters = 3


      C1     C2     C3     C4      C5
X1    0.9    0.5    0.1    0.2    -0.1
X2    0.8    0.2   -0.1    0.1     0.05
X3    0.7   -0.1    0.6   -0.1    -0.1
X4    0.8   -0.2   -0.3   -0.05   -0.05
X5   -0.4    0.9    0.7   -0.15    0.2

C2^2 (per variable): 0.25   0.04   0.01   0.04   0.81
Variance explained by C2 = 1.15


No.   alpha   eps(k) = alpha(k) - alpha(k+1)   eps(k) - eps(k+1)
1     3.35    1.75                              1.05
2     1.6     0.7                               0.2
3     0.9     0.5                               0.4
4     0.4     0.1                               0.05
5     0.3     0.05                              0
6     0.25    0.05
7     0.2

=> Since there is no negative difference, all 7 components are significant.
a1 a2 a3 a4 a5
0.3 0.6 0.3 0.6 -0.36
0.4 -0.5 0.5 0.1 -0.35
-0.25 -0.2 0.9 0.15 0.56
-0.1 -0.1 0.8 0.16 0.14
-0.35 0.7 -0.6 -0.58 0.05

The components are on the rows and the variables on the columns => A33 = 0.9

Correlation = A33 * sqrt(alpha(3))
= 0.45

Class   No. of instances   Cond. prob.   Prior prob.   Pc*Pi   Posterior prob.
A       100                0.5           0.1           0.05    0.10
B       500                0.5           0.5           0.25    0.50
C       300                0.6           0.3           0.18    0.36
D       100                0.2           0.1           0.02    0.04
Total   1000                                           0.50

alpha   eps(k) = alpha(k) - alpha(k+1)   eps(k) - eps(k+1)
2.9     0.3                              -1.2
2.6     1.5                               1.2
1.1     0.3                               0
0.8     0.3                               0.1
0.5     0.2                               0
0.3     0.2
0.1

=> The first difference smaller than 0 is -1.2 and corresponds to component 1, so the number of significant components is 2.
Rxz     Z1     Z2     Z3     Z4
X1      0.2    0.8   -0.4   -0.4
X2      0.1   -0.7   -0.4   -0.3
X3     -0.3   -0.1    0.8   -0.2
X4      0.1    0.4    0.7    0.5
X5      0.4    0.1    0.6   -0.6

Rxz^2   Z1     Z2     Z3     Z4
X1      0.04   0.64   0.16   0.16
X2      0.01   0.49   0.16   0.09
X3      0.09   0.01   0.64   0.04
X4      0.01   0.16   0.49   0.25
X5      0.16   0.01   0.36   0.36

Canonical correlations: 0.9  0.3  0.2  0.1  => alpha(3) = 0.2
SUM (column Z3) = 1.81
Variance = SUM * alpha(3) = 0.362
No. of instances   Cond. prob.   Prior prob.   Pc*Pi   Posterior prob.
200                0.8           0.1           0.08    0.16
600                0.4           0.3           0.12    0.24
400                0.5           0.2           0.10    0.20
800                0.5           0.4           0.20    0.40
Total 2000                                     0.50
PCA - cosine, score

x     a1     a2      a3      x*a1   x*a2   x*a3
2     0.4   -0.25    0.8     0.8   -0.5    1.6
4     0.5    0.75    0.6     2      3      2.4
2     0.1    0.75    0.5     0.2    1.5    1

C (scores)                   3      4      5
C^2                          9      16     25

cos = 0.5
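A sketch of the computation in Python. The scores are the column sums of the x*a products; the squared cosine of the instance on an axis is taken here as C_k^2 / sum(C^2), which for the third axis gives 25/50 = 0.5, matching the reported value.

import numpy as np

x = np.array([2, 4, 2])
A = np.array([[0.4, -0.25, 0.8],     # columns a1, a2, a3
              [0.5,  0.75, 0.6],
              [0.1,  0.75, 0.5]])

C = x @ A                            # scores on the three axes
cos2 = C ** 2 / (C ** 2).sum()       # squared cosines
print(C)                             # [3. 4. 5.]
print(cos2.round(2))                 # [0.18 0.32 0.5 ]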

Linear classification - cluster analysis

x      F1    F2     F3     F4       F1(x)   F2(x)   F3(x)   F4(x)
10     10    9      15     10       100     90      150     100
-8     -8   -5     -10    -10       64      40      80      80
15     12    20     16     15       180     300     240     225
                                    1000    800     900     850
Total                               1344    1230    1370    1255
=> Group 3
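A sketch in Python, taking the 1000 / 800 / 900 / 850 row as the constant terms of the classification functions (the extract does not label that row): each function is evaluated as the dot product of x with its coefficients plus its constant, and the instance is assigned to the group with the largest score.

import numpy as np

x = np.array([10, -8, 15])
coef = np.array([[10,  9,  15,  10],     # columns: F1..F4 coefficients
                 [-8, -5, -10, -10],
                 [12, 20,  16,  15]])
const = np.array([1000, 800, 900, 850])  # assumed constant terms

scores = x @ coef + const
print(scores)                            # [1344 1230 1370 1255]
print(int(np.argmax(scores)) + 1)        # Group 3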

Factor analysis - factor loadings

g3.csv
      F1     F2     F3
X1    0.6   -0.6    0.5
X2   -0.7    0.5   -0.2
X3   -0.8    0.4   -0.3
X4    0.8   -0.5   -0.1

      F1     F2     F3     F1^2   F2^2   F3^2   Communality   1 - Communality
X1    0.6   -0.6    0.5    0.36   0.36   0.25   0.97          0.03
X2   -0.7    0.5   -0.2    0.49   0.25   0.04   0.78          0.22
X3   -0.8    0.4   -0.3    0.64   0.16   0.09   0.89          0.11
X4    0.8   -0.5   -0.1    0.64   0.25   0.01   0.9           0.1
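The same computation as a short Python sketch: the communality of each variable is the sum of its squared loadings on the retained factors.

import numpy as np

loadings = np.array([[ 0.6, -0.6,  0.5],
                     [-0.7,  0.5, -0.2],
                     [-0.8,  0.4, -0.3],
                     [ 0.8, -0.5, -0.1]])

communality = (loadings ** 2).sum(axis=1)
print(communality.round(2))         # [0.97 0.78 0.89 0.9 ]
print((1 - communality).round(2))   # specific variance: [0.03 0.22 0.11 0.1 ]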

Cluster analysis - dendrogram plot, complete linkage

x     a1     a2      a3      a4      x*a1
3     0.5   -0.49   -0.51   -0.5     1.5
1     0.5   -0.5     0.47    0.55    0.5
2    -0.4   -0.53    0.5    -0.49   -0.8
1    -0.4   -0.48   -0.51    0.49   -0.4
                            c11 =    0.8

0.5

x      F1    F2    F3       F1(x)   F2(x)   F3(x)
10     10    9     15       100     90      150
-5     -5   -6    -6        25      30      30
15     15    20    10       225     300     150
                            1000    800     1500
Total                       1350    1220    1830
=> Group 3
Discriminant analysis - the histogram method

HISTOGRAM METHOD
[the (X1, X2) training points and the per-group histograms of X2 are shown as a chart in the original]

For x = (1, 6) the value of X2 = 6 => GROUP 2
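A minimal sketch of the histogram method in Python, with hypothetical training data (the original chart is not reproduced in this extract): each group gets a histogram of X2, and a new point goes to the group with the most training values in the bin that contains its X2.

# Hypothetical training values of X2 per group; the real data is only in the chart.
groups = {1: [1, 2, 2, 3], 2: [5, 6, 6, 7], 3: [8, 8, 9, 10]}

def classify(x2, bin_width=1.0):
    # Count, per group, the training values falling in the bin centred on x2.
    counts = {g: sum(abs(v - x2) <= bin_width / 2 for v in vals)
              for g, vals in groups.items()}
    return max(counts, key=counts.get)

print(classify(6))   # group 2 for X2 = 6 with this hypothetical data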

Canonical analysis

X      Y      Z      X^2    Y^2    Z^2    ALPHA
1      0.9    0.7    1      0.81   0.49   2.3
0.8    0.8    0.6    0.64   0.64   0.36   1.64
0.7    0.5    0.5    0.49   0.25   0.25   0.99
0.5    0.3    0.4    0.25   0.09   0.16   0.5
0.3    0.2    0.3    0.09   0.04   0.09   0.22
0.2    0.1    0.1    0.04   0.01   0.01   0.06
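A quick check of the table's arithmetic in Python (each ALPHA value is the row sum of the squared entries):

import numpy as np

XYZ = np.array([[1.0, 0.9, 0.7],
                [0.8, 0.8, 0.6],
                [0.7, 0.5, 0.5],
                [0.5, 0.3, 0.4],
                [0.3, 0.2, 0.3],
                [0.2, 0.1, 0.1]])

alpha = (XYZ ** 2).sum(axis=1)
print(alpha.round(2))   # [2.3  1.64 0.99 0.5  0.22 0.06]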

Bivariate analysis

Number of axes = min(p, q) - 1 = min(3, 5) - 1 = 2

The first two axes cover the entire inertia.

Discriminant analysis - discriminating power

n = 100
number of eigenvalues = min(10, q) - 1
number of eigenvalues = 3 => q = 4

alpha   lambda
0.8     128
0.6     48
0.2     8
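The lambda values above are reproduced by lambda = (alpha / (1 - alpha)) * (n - q) / (q - 1); this formula is inferred from the reported numbers rather than stated in the extract. A sketch:

import numpy as np

n, q = 100, 4
alpha = np.array([0.8, 0.6, 0.2])

lam = alpha / (1 - alpha) * (n - q) / (q - 1)
print(lam.round(2))                           # [128.  48.   8.]
print((np.cumsum(lam) / lam.sum()).round(3))  # cumulative discriminating power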

Factor analysis - KMO

Correlation coefficients
R     X1     X2     X3     X4          R^2   X1     X2     X3     X4
X1    1      0.9    0.5    0.6         X1    1      0.81   0.25   0.36
X2    0.9    1      0.5    0.9         X2    0.81   1      0.25   0.81
X3    0.5    0.5    1      0.4         X3    0.25   0.25   1      0.16
X4    0.6    0.9    0.4    1           X4    0.36   0.81   0.16   1

Partial correlation coefficients
A     X1     X2     X3     X4          A^2   X1     X2     X3     X4
X1    1     -1      0.4    0.9         X1    1      1      0.16   0.81
X2   -1      1     -0.5   -1           X2    1      1      0.25   1
X3    0.4   -0.5    1      0.4         X3    0.16   0.25   1      0.16
X4    0.9   -1      0.4    1           X4    0.81   1      0.16   1

We summed the entries on row 3, leaving out the elements on the main diagonal:
sum of R^2 on row 3 = 0.25 + 0.25 + 0.16 = 0.66
sum of A^2 on row 3 = 0.16 + 0.25 + 0.16 = 0.57

KMO(X3) = 0.66 / (0.66 + 0.57) = 0.536585365853658   WEAK
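A sketch of the per-variable KMO computation in Python, using the off-diagonal sums of squared simple and partial correlations:

import numpy as np

R = np.array([[ 1.0,  0.9,  0.5,  0.6],   # simple correlations
              [ 0.9,  1.0,  0.5,  0.9],
              [ 0.5,  0.5,  1.0,  0.4],
              [ 0.6,  0.9,  0.4,  1.0]])
A = np.array([[ 1.0, -1.0,  0.4,  0.9],   # partial correlations
              [-1.0,  1.0, -0.5, -1.0],
              [ 0.4, -0.5,  1.0,  0.4],
              [ 0.9, -1.0,  0.4,  1.0]])

j = 2                               # variable X3
r2 = (R[j] ** 2).sum() - 1          # off-diagonal sum: 0.66
a2 = (A[j] ** 2).sum() - 1          # off-diagonal sum: 0.57
print(r2 / (r2 + a2))               # KMO(X3) = 0.5365853... (weak)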
HISTOGRAM METHOD
[the (X1, X2) training points and the per-group histograms of X2 are shown as a chart in the original]

For x = (1.5, 6) the value of X2 = 6 => GROUP 3

Rxz                        Ryu
 0.1   0.8  -0.4           -0.3  -0.8   0.6
 0.1  -0.7  -0.4           -0.1   0.9   0.5
-0.3  -0.1   0.8            0.1   0.4  -0.9
 0.1   0.4   0.7

Rxz^2                      Ryu^2
0.01   0.64   0.16         0.09   0.64   0.36
0.01   0.49   0.16         0.01   0.81   0.25
0.09   0.01   0.64         0.01   0.16   0.81
0.01   0.16   0.49
Sum    0.12   1.3    1.45  Sum    0.11   1.61   1.42 (VY)
VX     0.033  0.322        SY     0.142
SX     0.036  0.26   0.145

VX = the variances explained by the canonical roots
SX = the informational redundancies

sum = 0.35     n = lambda^2 / sum
n = 40

n = 100     number of eigenvalues = min(10, q) - 1
number of eigenvalues = 7 => q = 8

alpha   lambda       cumulative share
0.9     118.2857     0.50757977313686
0.8     52.57143     0.733170783419909
0.75    39.42857     0.902364041132195
0.5     13.14286
0.3     5.632653
0.2     3.285714
0.05    0.691729
Total   233.0387

