
PRINCIPAL COMPONENT ANALYSIS

INTRODUCTION
This method is used to study an individuals × variables table in which all the variables are numeric.
We first present an exploratory approach that describes the individuals across their multiple dimensions and visualizes the relationships between the variables.
Principal Component Analysis (PCA) follows. This method produces a map of the individuals based on the similarities between them, and a map of the variables based on their correlations.
It is also possible to represent the individuals and the variables simultaneously on a single map.
In addition, the graphical representation of the data needs to be complemented by a typology of the individuals.
We therefore also present ascending hierarchical classification with Ward's criterion, which is very well suited to numeric data.

Exploratory analysis of multidimensional data

Table 1 serves as the running example for this presentation. The rows of the table are car models from a single model year, and the columns are six technical characteristics: engine displacement (cylinder capacity), power, speed, weight, length and width.

Table 1. Characteristics of the 24 car models
Columns: No. | Model | Displacement | Power | Speed | Weight | Length | Width

Descriptive study of the individuals

The 24 individuals can be represented together with their 6 characteristics using the star plot of Figure 1.


Figure 1. Star plot (axes: Speed, Power, Displacement, Weight, Length, Width)

Each individual is represented by a hexagon, and each vertex of the hexagon corresponds to one variable. For the individual represented, the distance from the origin to a vertex is proportional to the deviation of the variable's value from its minimum: it is smallest when the characteristic takes its minimum value and largest when it takes its maximum value.
The 24 individuals are shown in Figure 2. A few particular cases are worth pointing out:

The Peugeot 205 Rallye, Seat Ibiza SXi and Citroën AX Sport have high speed and high power relative to their other characteristics.

The Nissan Vanette and VW Caravelle are characterized by low speeds.

One Renault model has low power relative to its displacement: it is a diesel.

In general, all the characteristics move in the same direction. The star plots grow steadily from small cars such as the Ford Fiesta and Fiat Uno to the largest ones such as the BMW 530i, Rover 827i and Renault 25.

Figure 2. Star plots of the individuals

Descriptive study of the variables

Table 2. Statistical summaries of the data
Elementary statistics (columns): Variable | Mean | Variance | Standard deviation | Minimum | Maximum
Correlations: matrix of correlations between Displacement, Power, Speed, Weight, Length and Width

Table 2 presents a few elementary statistics and the matrix of correlations between the variables.

Note that the variances are computed by dividing by n and not by (n − 1), because this is a geometric analysis of the data and no statistical inference is involved. Inferential statistics, by contrast, studies a sample and draws conclusions about the whole population.

To give a complete view of the data and of the interrelationships between the variables, we have constructed the plot in Figure 3.

Figure 3. Plot of the correlations between variables

All the variables are positively correlated with one another. This means that there is a size factor and that, as a first analysis, the cars can be ordered from the smallest to the largest. This can also be seen by looking globally at the star plots in Figure 2.

Speed is strongly correlated with Power and less so with the other variables.

The variables of the group (Displacement, Length, Weight) are strongly correlated with one another:
Cor(Displacement, Length) = 0.86
Cor(Displacement, Weight) = 0.90
Cor(Length, Weight) = 0.92
Other notable correlations are those between Length and Width and between Power and Speed.

We can summarize and visualize this first analysis with an ascending hierarchical classification of the variables, taking the correlations between variables (all positive here) as the similarity index.

If some correlations between variables are negative, their absolute values are used.

The method proceeds as follows:
In the first step, the two most correlated variables are grouped: Weight and Length (0.92).
In the second step, we look for the strongest remaining correlation. It is the one between Weight and Displacement (0.90). Displacement therefore joins the group (Weight, Length).
In the third step, the group (Power, Speed) is formed, with a correlation of 0.894.
In the fourth step, Width joins the group (Displacement, Weight, Length), at the level Cor(Width, Length).
Finally, in the fifth step, the two groups (Power, Speed) and (Displacement, Weight, Length, Width) merge. The strongest correlation between a variable of the first group and a variable of the second group is the one between Power and Displacement (0.861).

This iterative aggregation procedure is visualized in Figure 4 by a dendrogram. The aggregation index is the strongest correlation between the variables of one group and those of the other group at the moment of merging.
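The aggregation procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `cluster_variables` is hypothetical, and the 4 × 4 matrix mixes correlations quoted in the text (0.92, 0.905, 0.864, 0.861) with two made-up values (0.75 and 0.70) for the Power row, since the full correlation matrix is not reproduced here.

```python
import numpy as np

def cluster_variables(names, corr):
    """Merge, at each step, the two groups whose strongest
    cross-group |correlation| is the largest (hypothetical helper)."""
    sim = np.abs(np.asarray(corr, dtype=float))
    groups = [(name,) for name in names]
    index = {g: [i] for i, g in enumerate(groups)}  # group -> column indices
    merges = []
    while len(groups) > 1:
        best = None
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                ia, ib = index[groups[a]], index[groups[b]]
                level = sim[np.ix_(ia, ib)].max()  # aggregation index
                if best is None or level > best[0]:
                    best = (level, a, b)
        level, a, b = best
        merged = groups[a] + groups[b]
        index[merged] = index[groups[a]] + index[groups[b]]
        merges.append((groups[a], groups[b], float(level)))
        groups = [g for k, g in enumerate(groups) if k not in (a, b)] + [merged]
    return merges

names = ["Displacement", "Weight", "Length", "Power"]
corr = [
    [1.000, 0.905, 0.864, 0.861],
    [0.905, 1.000, 0.920, 0.750],  # 0.750 is a made-up value
    [0.864, 0.920, 1.000, 0.700],  # 0.700 is a made-up value
    [0.861, 0.750, 0.700, 1.000],
]
merges = cluster_variables(names, corr)
for left, right, level in merges:
    print(left, "+", right, "at", level)
```

With these inputs the merge order is Weight with Length, then Displacement, then Power, which mirrors the first steps described in the text.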

Figure 4. Ascending hierarchical classification of the variables (maximum-correlation method; axes: aggregation index, variable)

We can also measure the similarity between each variable and the set of all variables (itself included) using squared correlations. This amounts to measuring the importance of a variable.

For example, the importance of Displacement is the mean of its squared correlations with all the variables:

$$\frac{1}{6}\left(1 + 0.861^2 + 0.693^2 + 0.905^2 + 0.864^2 + 0.709^2\right) = \frac{4.29}{6} = 0.715$$
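The arithmetic above is easy to re-check. The short sketch below simply averages the squared correlations quoted in the formula; the variable names are illustrative.

```python
# Correlations of Displacement with the six variables (itself included),
# as quoted in the formula above.
correlations = [1.0, 0.861, 0.693, 0.905, 0.864, 0.709]

sum_of_squares = sum(r ** 2 for r in correlations)
importance = sum_of_squares / len(correlations)

print(round(sum_of_squares, 2))  # 4.29
print(round(importance, 3))      # 0.715
```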

Table 3 gives the similarity of each variable with the whole set of variables.

Table 3. Similarity of each variable with the set of all variables

The variable that best summarizes the set of 6 variables is therefore Displacement. Speed is more independent of the others.

Principal Component Analysis

The data to be analysed take the form of an individuals × variables table. There are p variables $X_1, \dots, X_j, \dots, X_p$ observed on n individuals. We denote by:
- $x_{ij}$ the value taken by variable $X_j$ for individual i;
- $x_i = (x_{i1}, \dots, x_{ip})$ the set of characteristics of individual i;
- $\bar{x}_j$, $s_j^2$ and $s_j$ the mean, variance and standard deviation of variable $X_j$, with

$$s_j^2 = \frac{1}{n}\sum_{i=1}^{n}(x_{ij} - \bar{x}_j)^2$$
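The choice between dividing by n and by (n − 1) maps directly onto the `ddof` argument of NumPy's variance. The sketch below uses a small made-up sample, not data from Table 1, to show the two conventions side by side.

```python
import numpy as np

# Made-up sample, used only to contrast the two conventions.
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
n = len(x)

var_n = x.var(ddof=0)          # divide by n (convention used in this text)
var_n_minus_1 = x.var(ddof=1)  # divide by n - 1 (inferential convention)

print(var_n, var_n_minus_1)
# The two differ only by the factor (n - 1) / n.
print(np.isclose(var_n, var_n_minus_1 * (n - 1) / n))
```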

Note that the differences found between French and American Principal Component Analysis programs in the computed principal components come from dividing by n or by (n − 1) when computing the variance.

Principal Component Analysis consists in finding a small number of new variables $Y_1, \dots, Y_m$, called principal components, that are uncorrelated with one another and summarize the original data as well as possible.

Several criteria lead to the principal components. The inertia criterion is the oldest (Pearson) and has the following advantages:
- the approach is geometric, which gives a deeper understanding of the method and of the intermediate results that help interpret it;
- Correspondence Analysis becomes a generalization of Principal Component Analysis and is better understood in this geometric framework;
- the output of French principal component analysis programs corresponds to this approach.

At the same time, the inertia criterion is much more complex than the other two criteria, proposed by Hotelling: the correlation criterion and the variance criterion.

We also present these criteria, which correspond to the results produced by American PCA programs.

Presentation of PCA following Pearson's geometric approach

The cloud of points associated with the data and its characteristics

In this geometric approach, the data are associated with the cloud of points $N = \{x_1, \dots, x_i, \dots, x_n\}$ in a space of dimension p: each vector $x_i = (x_{i1}, \dots, x_{ip})$ of characteristics of individual i is treated as a point in a p-dimensional space.

The centre of gravity of the cloud N is the point g whose coordinates are the means of the variables:

$$g = \frac{1}{n}\sum_{i=1}^{n} x_i = (\bar{x}_1, \dots, \bar{x}_j, \dots, \bar{x}_p) = \bar{x}$$

For our example:

g = (1906.1, 114, 183, 1111, 422, 169).

The vector g represents, in a sense, the characteristics of an average car.

The spread of the cloud around its centre of gravity is measured by the total inertia of the cloud N, defined by:

$$I(N, g) = \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{p}(x_{ij} - \bar{x}_j)^2$$

The total inertia can be computed directly, since it equals the sum of the variances of the variables:

$$I(N, g) = \frac{1}{n}\sum_{i=1}^{n} d^2(x_i, g) = \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{p}(x_{ij} - \bar{x}_j)^2 = \sum_{j=1}^{p}\frac{1}{n}\sum_{i=1}^{n}(x_{ij} - \bar{x}_j)^2 = \sum_{j=1}^{p} s_j^2$$

For the example we obtain:

I(N, g) = 267072 + 1441 + 609 + 50824 + 1638 + 56 = 321640.

The inertia of the cloud is due mainly to Displacement (267072/321640, about 83%). This is a consequence of the choice of measurement units: had displacement been measured in litres, its exaggerated weight in the inertia would have disappeared.

In practice, it is often preferable to obtain a description of the data that does not depend on the choice of units.
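The effect of the measurement unit can be made concrete with the variances quoted above. The sketch below assumes, as the magnitudes suggest but the text does not state, that displacement is recorded in cubic centimetres, so that converting to litres divides its variance by 1000².

```python
# Variances of the six variables, as quoted in the text.
variances = {
    "Displacement": 267072, "Power": 1441, "Speed": 609,
    "Weight": 50824, "Length": 1638, "Width": 56,
}

total_inertia = sum(variances.values())
share_displacement = variances["Displacement"] / total_inertia
print(total_inertia)                 # 321640
print(round(share_displacement, 3))  # 0.83

# Assumption: displacement is in cm^3; in litres its variance shrinks by 1000**2.
variances_litres = dict(variances, Displacement=variances["Displacement"] / 1000 ** 2)
total_litres = sum(variances_litres.values())
share_litres = variances_litres["Displacement"] / total_litres
print(share_litres)  # negligible share once the unit changes
```

The share of one variable in the raw inertia depends entirely on its unit, which is the motivation for standardizing the variables.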

Homogeneous data can be obtained by transforming the original data into standardized (centred and reduced) variables: each variable $X_j$ is associated with the standardized variable

$$X_j^* = \frac{X_j - \bar{x}_j}{s_j}$$

with mean 0 and variance 1.

The new table studied consists of the quantities

$$x_{ij}^* = \frac{x_{ij} - \bar{x}_j}{s_j}$$

Individual i is now associated with the point $x_i^* = (x_{i1}^*, \dots, x_{ip}^*)$.

The new cloud of points is $N^* = \{x_1^*, \dots, x_n^*\}$.
The centre of gravity of the cloud $N^*$ is 0 and its total inertia equals the number p of variables.
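The two properties just stated (centre of gravity at 0, total inertia equal to p) can be verified numerically. The sketch below uses randomly generated data as a stand-in for Table 1; the means and scales are arbitrary choices, not values from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 24 individuals, 3 variables on very different scales.
X = rng.normal(loc=[1900.0, 110.0, 180.0], scale=[500.0, 40.0, 25.0], size=(24, 3))
n, p = X.shape

mean = X.mean(axis=0)
std = X.std(axis=0, ddof=0)    # divide by n, as in the text
X_star = (X - mean) / std      # standardized (centred and reduced) table

centre = X_star.mean(axis=0)                # centre of gravity of N*
total_inertia = (X_star ** 2).sum() / n     # inertia about the origin

print(np.allclose(centre, 0.0))  # True
print(round(total_inertia, 6))   # 3.0, the number of variables
```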

The first principal axis and the first principal component

We study the construction and the properties of the first principal component, then illustrate its interpretation with an example.

The first step of this construction is to find the first principal axis of the cloud of points $N^*$.

The first principal axis

We look for a line $\Delta_1$ that passes as well as possible through the middle of the cloud of points $N^*$.

The spread of the cloud $N^*$ around a line $\Delta$ is measured by the inertia $I(N^*, \Delta)$ of the cloud $N^*$ relative to the line $\Delta$:

$$I(N^*, \Delta) = \frac{1}{n}\sum_{i=1}^{n} d^2(x_i^*, y_i)$$

where $y_i = P(x_i^*)$ is the orthogonal projection of the point $x_i^*$ onto the line $\Delta$.

The line $\Delta_1$ sought is the one that minimizes $I(N^*, \Delta)$; it is called the first principal axis of the cloud $N^*$.

It can be shown that the line $\Delta_1$ passes through the origin O, the centre of gravity of the cloud $N^*$ of standardized data, and is spanned by the unit vector $u_1$, the normalized eigenvector of the correlation matrix R of the variables $X_j$ associated with the largest eigenvalue $\lambda_1$.

The eigenvalues and eigenvectors of the matrix R are given in Table 4. The search for the first principal axis $\Delta_1$ is visualized in Figure 5.
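The minimization property can be checked numerically. The sketch below, on randomly generated data (an assumption, not the car data), computes the inertia of a standardized cloud about lines through the origin and compares the leading eigenvector of R with random directions. The helper name `inertia_about_line` is illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical correlated data: 50 individuals, 4 variables.
X = rng.normal(size=(50, 4)) @ rng.normal(size=(4, 4))
Xs = (X - X.mean(axis=0)) / X.std(axis=0)  # standardized cloud N*
n, p = Xs.shape
R = Xs.T @ Xs / n                          # correlation matrix

def inertia_about_line(points, u):
    """Mean squared distance from the points to the line through 0 spanned by u."""
    u = u / np.linalg.norm(u)
    projections = np.outer(points @ u, u)
    return ((points - projections) ** 2).sum() / len(points)

eigvals, eigvecs = np.linalg.eigh(R)       # eigenvalues in ascending order
lambda1, u1 = eigvals[-1], eigvecs[:, -1]

best = inertia_about_line(Xs, u1)
others = [inertia_about_line(Xs, u) for u in rng.normal(size=(200, p))]

print(np.isclose(best, p - lambda1))        # residual inertia is p - lambda_1
print(all(best <= o + 1e-12 for o in others))
```

For a unit vector u, the inertia about the line equals p − uᵀRu, which is why the eigenvector of the largest eigenvalue gives the minimum.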

Table 4. Eigenvalues and eigenvectors of the correlation matrix

Figure 5. Search for the first principal axis

For the car example we obtained:

$\lambda_1 = 4.656$
$u_1 = (0.4434;\ 0.4182;\ 0.3497;\ 0.4252;\ 0.4246;\ 0.3811)$.

The result can be checked:

$$\|u_1\| = \sqrt{\sum_{j=1}^{6} u_{1j}^2} = \sqrt{0.4434^2 + \dots + 0.3811^2} = 1$$

The first principal component

The first principal component $Y_1$ is a new variable defined for each individual i as the algebraic length of the projection of the point $x_i^*$ onto the axis $\Delta_1$. The value $Y_1(i)$ is therefore the scalar product of the vectors $u_1$ and $x_i^*$:

$$Y_1(i) = \overline{Oy_i} = \sum_{j=1}^{p} u_{1j}\left(\frac{x_{ij} - \bar{x}_j}{s_j}\right)$$

For instance, the value of the first principal component $Y_1$ for the Rover is:

Y1(Rover) = 0.44*1.49 + 0.41*1.67 + 0.34*1.58 + 0.43*1.13 + 0.43*1.17 + 0.38*0.83 = 3.19

Globally, $Y_1$ is therefore written:
Y1 = 0.44 Displacement* + 0.41 Power* + 0.34 Speed* + 0.43 Weight* + 0.43 Length* + 0.38 Width*.

The values of $Y_1$ for each individual are given in Table 5.
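Both checks in this passage, the unit norm of $u_1$ and the Rover's score, reduce to sums of products. The sketch below uses the four-decimal coefficients of $u_1$ and the Rover's standardized values as quoted in the text; because those inputs are rounded, the score it returns agrees with the reported 3.19 only to within rounding.

```python
import math

# Coefficients of u1 and standardized Rover values, as quoted in the text.
u1 = [0.4434, 0.4182, 0.3497, 0.4252, 0.4246, 0.3811]
rover_star = [1.49, 1.67, 1.58, 1.13, 1.17, 0.83]

norm_u1 = math.sqrt(sum(c * c for c in u1))
y1_rover = sum(c * x for c, x in zip(u1, rover_star))  # scalar product <u1, x*>

print(round(norm_u1, 3))   # 1.0
print(round(y1_rover, 2))  # close to the 3.19 reported in the text
```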

Table 5. Squared distances to the origin, principal components and squared cosines

The first principal component $Y_1$ is centred, being a linear combination of centred variables.
It can be shown that its variance equals $\lambda_1$:

$$\mathrm{Var}(Y_1) = \frac{1}{n}\sum_{i=1}^{n} Y_1^2(i) = \frac{1}{n}\sum_{i=1}^{n} d^2(y_i, 0) = I(\{y_1, \dots, y_n\}, 0) = \lambda_1$$

The variance of the first principal component $Y_1$ equals the inertia of the cloud of points projected onto $\Delta_1$, relative to the centre of gravity O.

The correlations between the variables $X_j$ and the principal component $Y_1$ can be computed with the formula:

$$\mathrm{cor}(X_j, Y_1) = \sqrt{\lambda_1}\, u_{1j}$$

It follows that the similarity of $Y_1$ with the set of variables equals

$$\frac{1}{p}\sum_{j=1}^{p}\mathrm{cor}^2(X_j, Y_1) = \frac{\lambda_1}{p}$$

For our example we obtain 4.656/6 = 0.776, to be compared with the 0.715 of Displacement in Table 3.
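The formula $\mathrm{cor}(X_j, Y_1) = \sqrt{\lambda_1}\,u_{1j}$ and the resulting share $\lambda_1/p$ can be evaluated directly from the values quoted in the text ($\lambda_1 = 4.656$ and the coefficients of $u_1$). The sketch below is only a numerical check; the dictionary layout is an illustrative choice.

```python
import math

lambda1 = 4.656  # largest eigenvalue, as used in the text's computations
p = 6
u1 = {
    "Displacement": 0.4434, "Power": 0.4182, "Speed": 0.3497,
    "Weight": 0.4252, "Length": 0.4246, "Width": 0.3811,
}

# cor(X_j, Y1) = sqrt(lambda1) * u_1j
correlations = {name: math.sqrt(lambda1) * c for name, c in u1.items()}
similarity = sum(r ** 2 for r in correlations.values()) / p

for name, r in correlations.items():
    print(f"{name}: {r:.2f}")
print(round(similarity, 3))  # 0.776, i.e. lambda1 / p
```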


The correlations between the $X_j$ and $Y_1$ appear in the first column of Table 6.

Table 6. Correlations between variables and principal components

Since the first principal component $Y_1$ is strongly and positively correlated with all the variables, it can be interpreted as a size factor, ranking the cars from the smallest (Y1(Fiat Uno) = −3.76; Y1(Ford Fiesta) = −3.50) to the largest (Y1(Renault 25) = 3.44; Y1(BMW 530i) = 3.95).
Global quality of the first principal component

To measure the global quality of the first principal component as a summary of the data, we use the decomposition of the total inertia.

Since the vector $y_i$ is the orthogonal projection of the vector $x_i^*$ onto the line $\Delta_1$, we have:

$$d^2(x_i^*, 0) = d^2(y_i, 0) + d^2(x_i^*, y_i)$$

hence:

$$\frac{1}{n}\sum_{i=1}^{n} d^2(x_i^*, 0) = \frac{1}{n}\sum_{i=1}^{n} d^2(y_i, 0) + \frac{1}{n}\sum_{i=1}^{n} d^2(x_i^*, y_i)$$

The total inertia $I(N^*, 0) = \frac{1}{n}\sum_{i=1}^{n} d^2(x_i^*, 0) = p$ therefore splits into two parts:
- the first term, $\frac{1}{n}\sum_{i=1}^{n} d^2(y_i, 0) = I(\{y_1, \dots, y_n\}, 0)$, is the total inertia of the cloud $\{y_1, \dots, y_n\}$ of the projections of the points $x_i^*$ onto the axis $\Delta_1$. This quantity is the inertia explained by the axis $\Delta_1$ and equals $\lambda_1$;
- the second term, $\frac{1}{n}\sum_{i=1}^{n} d^2(x_i^*, y_i) = I(N^*, \Delta_1)$, is the residual inertia of the cloud around the axis $\Delta_1$.

For the car example we obtain:
total inertia: p = 6
inertia explained by $\Delta_1$: $\lambda_1$ = 4.656
residual inertia: p − $\lambda_1$ = 1.344

The global quality of the first principal component is measured by the share of inertia explained, $\lambda_1 / p$. This is again the closeness of the principal component $Y_1$ to the set of variables.
In the example, the share of inertia explained by $\Delta_1$ is 4.656/6 = 0.776. We can say that 77.6% of the total inertia is explained by the elongation of the cloud along the first principal axis.

Quality of representation of the individuals on the first principal axis

The quality of representation of each individual on the axis $\Delta_1$ is measured by the squared cosine of the angle between the vector $x_i^*$ and the axis $\Delta_1$:

$$\cos^2(x_i^*, \Delta_1) = \frac{d^2(y_i, 0)}{d^2(x_i^*, 0)} = \frac{Y_1(i)^2}{d^2(x_i^*, 0)}$$

For the Rover we have:

Y1(Rover) = 3.19
d²(Rover, 0) = 1.49² + 1.67² + 1.58² + 1.13² + 1.17² + 0.83² = 10.8
cos²(Rover, Δ1) = 10.18/10.80 = 0.94

The Rover is well represented on the principal axis $\Delta_1$.

The squared distances of each individual to the origin and the squared cosines are given in Table 5.
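The Rover's quality of representation follows from two numbers: its squared distance to the origin and its squared score on the first axis. The sketch below recomputes both from the standardized values and the score quoted in the text.

```python
# Standardized Rover values and its first-component score, as quoted in the text.
rover_star = [1.49, 1.67, 1.58, 1.13, 1.17, 0.83]
y1_rover = 3.19

squared_distance = sum(x * x for x in rover_star)  # d^2(Rover, 0)
cos2 = y1_rover ** 2 / squared_distance            # cos^2(Rover, Delta_1)

print(round(squared_distance, 1))  # 10.8
print(round(cos2, 2))              # 0.94
```

A squared cosine close to 1 means the point lies almost on the axis, so its projection loses little information.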


The second principal axis and the second principal component

We now present the construction and the properties of the second principal component.

The second principal axis

We look for an axis $\Delta_2$ orthogonal to $\Delta_1$ that minimizes the inertia $I(N^*, \Delta)$.

This second principal axis $\Delta_2$ passes through the origin O and is spanned by the vector $u_2$, the normalized eigenvector of the correlation matrix R associated with the second largest eigenvalue $\lambda_2$.
The eigenvalue $\lambda_2$ and the eigenvector $u_2$ for the car example are given in Table 4. The search for the second principal axis $\Delta_2$ is visualized in Figure 6.

Figure 6. Search for the second principal axis

Let $z_i$ and $a_i$ denote the projections of the point $x_i^*$ onto the axis $\Delta_2$ and onto the plane $(\Delta_1, \Delta_2)$ respectively. The vectors $y_i$ and $z_i$ are also the projections of the points $a_i$ onto the axes $\Delta_1$ and $\Delta_2$.
From the decomposition:

$$d^2(x_i^*, 0) = d^2(a_i, 0) + d^2(x_i^*, a_i) = d^2(y_i, 0) + d^2(z_i, 0) + d^2(x_i^*, a_i)$$

we deduce:

$$I(N^*, 0) = I(\{y_1, \dots, y_n\}, 0) + I(\{z_1, \dots, z_n\}, 0) + I(N^*, (\Delta_1, \Delta_2)) \qquad (1)$$

where

$$I(N^*, (\Delta_1, \Delta_2)) = \frac{1}{n}\sum_{i=1}^{n} d^2(x_i^*, a_i)$$

is the inertia of the cloud $N^*$ relative to the plane $(\Delta_1, \Delta_2)$. It can be shown that $I(N^*, (\Delta_1, \Delta_2))$ is minimal compared with the inertia relative to any other plane.
The plane $(\Delta_1, \Delta_2)$ is called the first principal plane. It is the plane that passes as well as possible through the middle of the cloud $N^*$ in the sense of the inertia criterion.
The second principal component

The second principal component $Y_2$ is a new variable defined for each individual i by:

$$Y_2(i) = \text{algebraic length of the segment } [0, z_i] = \sum_{j=1}^{p} u_{2j}\left(\frac{x_{ij} - \bar{x}_j}{s_j}\right)$$

For the example, $Y_2$ can be written globally as:

Y2 = 0.03 Displacement* + 0.42 Power* + 0.66 Speed* − 0.26 Weight* − 0.30 Length* − 0.48 Width*

Its values are given in Table 5. For the Rover, $Y_2$ takes the value 0.77.

The second principal component $Y_2$ is centred and has variance $\lambda_2$. We can write:

$$\mathrm{Var}(Y_2) = \frac{1}{n}\sum_{i=1}^{n} Y_2^2(i) = \frac{1}{n}\sum_{i=1}^{n} d^2(z_i, 0) = I(\{z_1, \dots, z_n\}, 0) = \lambda_2 \qquad (2)$$

Moreover, the correlation between $Y_1$ and $Y_2$ is zero. The correlations between the variables $X_j$ and $Y_2$ are computed with the formula:

$$\mathrm{cor}(X_j, Y_2) = \sqrt{\lambda_2}\, u_{2j}$$
The correlations between the variables and the principal component $Y_2$ in our example are given in Table 6. We can see that $Y_2$ is positively correlated with the engine variables (Displacement, Power, Speed) and negatively correlated with the comfort variables (Weight, Length, Width).
The second principal component $Y_2$ thus contrasts sporty cars, whose engine is very powerful relative to their comfort
(Y2(Peugeot 205 Rallye) = 1.48, Y2(Audi 90 Quattro) = 1.36),
with family cars, whose comfort is high relative to their engine
(Y2(VW Caravelle) = −2.38, Y2(Nissan Vanette) = −1.82).


Global quality of the second principal component and of the first two principal components

From equations (1) and (2) it follows that the share of inertia explained by the second principal axis equals $\lambda_2 / p$, and that the share explained by the plane $(\Delta_1, \Delta_2)$ equals $(\lambda_1 + \lambda_2)/p$.

In the example, $\Delta_2$ explains 0.9152/6, about 15.3%, of the total inertia, and the plane $(\Delta_1, \Delta_2)$ explains (4.656 + 0.9152)/6, about 92.9%, of the total inertia.

Quality of representation of the individuals on the second principal axis and on the first principal plane

The quality of representation of each point $x_i^*$ on the axis $\Delta_2$ and on the plane $(\Delta_1, \Delta_2)$ is measured by the squared cosines of the angles between the vector $x_i^*$ and, respectively, the axis $\Delta_2$ or the plane $(\Delta_1, \Delta_2)$.

On $\Delta_2$:

$$\cos^2(x_i^*, \Delta_2) = \frac{d^2(z_i, 0)}{d^2(x_i^*, 0)} = \frac{Y_2(i)^2}{d^2(x_i^*, 0)}$$

On $(\Delta_1, \Delta_2)$:

$$\cos^2(x_i^*, (\Delta_1, \Delta_2)) = \frac{d^2(a_i, 0)}{d^2(x_i^*, 0)} = \frac{d^2(y_i, 0) + d^2(z_i, 0)}{d^2(x_i^*, 0)} = \cos^2(x_i^*, \Delta_1) + \cos^2(x_i^*, \Delta_2)$$

For the Rover, we have:

cos²(Rover, Δ2) = 0.06
cos²(Rover, (Δ1, Δ2)) = 0.94 + 0.06 = 1.00
Allowing for rounding, we can say that the Rover lies in the first principal plane.

General results

Extending the results of the previous sections, we obtain a set of p principal axes $\Delta_1, \dots, \Delta_p$ spanned by the orthonormal eigenvectors $u_1, \dots, u_p$ of the correlation matrix R, associated with its eigenvalues $\lambda_1, \dots, \lambda_p$ arranged in decreasing order.
Figure 7 visualizes this new frame of reference.

Figure 7. Principal axes and principal components

The principal components $Y_1, \dots, Y_p$ are defined by

$$Y_h(i) = \sum_{j=1}^{p} u_{hj}\, x_{ij}^*$$

They are the coordinates of the points $x_i^*$ in the new frame. It can be shown that they are centred, have variance $\lambda_h$ and are uncorrelated with one another.

The points $x_i^*$ can be expressed in this new frame:

$$x_i^* = \sum_{h=1}^{p} Y_h(i)\, u_h$$

The formulas that follow are very important and are derived directly from the construction of the principal components.

Data reconstitution formula:

$$x_{ij}^* = \sum_{h=1}^{p} Y_h(i)\, u_{hj} \qquad (3)$$

Reconstitution formula for the matrix of correlations between variables:

$$\mathrm{cor}(X_j, X_l) = \sum_{h=1}^{p} \lambda_h u_{hj} u_{hl} \qquad (4)$$

Decomposition of the squared distance from a point to the origin:

$$d^2(x_i^*, 0) = \|x_i^*\|^2 = \sum_{h=1}^{p} Y_h(i)^2$$

from which we deduce:

(i) $\sum_{h=1}^{p} \cos^2(x_i^*, \Delta_h) = 1$

(ii) $\sum_{h=1}^{p} \lambda_h = p$

Correlations between the variables $X_j$ and the principal components $Y_h$:

$$\mathrm{cor}(X_j, Y_h) = \sqrt{\lambda_h}\, u_{hj} \qquad (5)$$

We deduce that the similarity of the principal component $Y_h$ with the variables $X_1, \dots, X_p$ equals

$$\frac{1}{p}\sum_{j=1}^{p} \mathrm{cor}^2(X_j, Y_h) = \frac{\lambda_h}{p}$$

that is, the share of inertia explained by the principal axis $\Delta_h$.
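The general results above can all be verified on any data set by diagonalizing the correlation matrix. The sketch below uses randomly generated data (an assumption, not the car data) and NumPy's `numpy.linalg.eigh`; the variable names are illustrative, and column h of `U` plays the role of $u_h$.

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical correlated data: 30 individuals, 5 variables.
X = rng.normal(size=(30, 5)) @ rng.normal(size=(5, 5))
n, p = X.shape
Xs = (X - X.mean(axis=0)) / X.std(axis=0)  # standardized table x*_ij
R = Xs.T @ Xs / n                          # correlation matrix

eigvals, eigvecs = np.linalg.eigh(R)       # ascending eigenvalues
order = np.argsort(eigvals)[::-1]          # reorder to decreasing
lam = eigvals[order]
U = eigvecs[:, order]                      # U[:, h] is the eigenvector u_h

Y = Xs @ U                                 # Y[i, h] = Y_h(i)

X_rebuilt = Y @ U.T                        # formula (3)
R_rebuilt = (U * lam) @ U.T                # formula (4)
var_Y = (Y ** 2).mean(axis=0)              # variances of the components
cor_XY = Xs.T @ (Y / np.sqrt(lam)) / n     # empirical cor(X_j, Y_h)

print(np.allclose(X_rebuilt, Xs))              # (3) holds
print(np.allclose(R_rebuilt, R))               # (4) holds
print(np.allclose(var_Y, lam))                 # Var(Y_h) = lambda_h
print(np.allclose(cor_XY, U * np.sqrt(lam)))   # (5) holds
print(np.isclose(lam.sum(), p))                # (ii) holds
```

The eigenvectors returned by `eigh` are defined up to sign, so individual components may be flipped relative to a published table without affecting any of these identities.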

The Mahalanobis distance

To measure the distance between an individual and the centre of gravity of the cloud, the Mahalanobis distance is often used.
It is defined as follows. We first build principal components $Z_h$, preferably on the original data rather than on the standardized data. To do so, we use the eigenvectors $v_h$ of the covariance matrix of the variables $X_j$ and compute the variables $Z_h$ with the formula:

$$Z_h(i) = \sum_{j=1}^{p} v_{hj}\,(x_{ij} - \bar{x}_j)$$

The Mahalanobis distance $d_M(x_i, \bar{x})$ between the point $x_i$ and the centre of gravity $\bar{x}$ of the cloud of original data is then defined by:

$$d_M^2(x_i, \bar{x}) = \sum_{h=1}^{p} Z_h^*(i)^2$$

where $Z_h^*$ is the variable $Z_h$ reduced (divided by its standard deviation).
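The definition through reduced principal components can be compared with the usual matrix form $(x_i - \bar{x})^T S^{-1} (x_i - \bar{x})$, where S is the covariance matrix. The sketch below does this on randomly generated data (an assumption); the name `mu` for the eigenvalues of S is an illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(3)
mixing = np.array([[2.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 0.5]])
X = rng.normal(size=(40, 3)) @ mixing      # hypothetical raw data
n, p = X.shape

centre = X.mean(axis=0)
Xc = X - centre
S = Xc.T @ Xc / n                          # covariance matrix (divide by n)

mu, V = np.linalg.eigh(S)                  # eigenvalues mu_h, eigenvectors v_h
Z = Xc @ V                                 # Z_h(i) = sum_j v_hj (x_ij - mean_j)
Z_star = Z / np.sqrt(mu)                   # reduced components
d2_components = (Z_star ** 2).sum(axis=1)  # definition used in the text

# Classical matrix form, for comparison.
d2_direct = np.einsum("ij,jk,ik->i", Xc, np.linalg.inv(S), Xc)

print(np.allclose(d2_components, d2_direct))
print(round(d2_components.mean(), 6))      # equals p when S divides by n
```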

Graphical representations

These are graphical representations of the individuals and of the variables.

Map of the individuals

The projections of the points $x_i^*$ onto the first principal plane $(\Delta_1, \Delta_2)$ have as coordinates on the principal axes $\Delta_1$, $\Delta_2$ the values $Y_1(i)$ and $Y_2(i)$.

Plotting the points $A_i = (Y_1(i), Y_2(i))$ therefore gives the best summary of the data in a plane. This map of the individuals is shown in Figure 8.

The interpretation of the axes given earlier is confirmed: along the first axis the cars are ordered by size, from the smallest (Fiat Uno, Ford Fiesta) to the largest (Renault 25, BMW 530i), and along the second axis by character, from family cars (Nissan Vanette, VW Caravelle) to sporty ones (Citroen AX Sport, Peugeot 205 Rallye).

Figure 8. First principal plane and correlation circle

Map of the variables

The variables are represented in a plane by the points $B_j = (\mathrm{cor}(X_j, Y_1), \mathrm{cor}(X_j, Y_2))$. This gives the plot in Figure 8, called the correlation circle.
It shows clearly that the first principal component, positively correlated with all the variables of the problem, is a size factor, and that the second principal component, which contrasts (Speed, Power) with (Width, Length and Weight), ranks the models according to their sporty or family character.

The length $R_j$ of the variable vector $B_j$ is the multiple correlation $R(X_j; Y_1, Y_2)$ between the variable $X_j$ and the two principal components. Indeed:

$$\|B_j\|^2 = \mathrm{cor}^2(X_j, Y_1) + \mathrm{cor}^2(X_j, Y_2) = R^2(X_j; Y_1, Y_2)$$

because the variables $Y_1$, $Y_2$ are uncorrelated with each other. For the example we obtain:

Variable       R_j
Displacement   0.96
Power          0.98
Speed          0.97
Weight         0.92
Length         0.97
Width          0.93

All the variables are well represented on the correlation circle.
Since the share of inertia explained by the first principal plane is very large (about 92.9%), the correlations between variables are well reconstituted using only the first two terms of formula (4):

$$\mathrm{cor}(X_j, X_l) \approx \sum_{h=1}^{2} \lambda_h u_{hj} u_{hl}$$

which, taking (5) into account, becomes

$$\mathrm{cor}(X_j, X_l) \approx \sum_{h=1}^{2} \mathrm{cor}(X_j, Y_h)\,\mathrm{cor}(X_l, Y_h)$$

Thus the correlation between the variables $X_j$ and $X_l$ can be approximated by the scalar product $\langle B_j, B_l \rangle$ of the vectors $B_j$ and $B_l$.

Example: cor(Displacement, Power) = 0.861 is well approximated by

cor(Displacement, Y1) cor(Power, Y1) + cor(Displacement, Y2) cor(Power, Y2) = 0.96 × 0.89 + 0.03 × 0.40 = 0.8664
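The approximation in this example is a plain scalar product between two points of the correlation circle. The sketch below recomputes it, together with the lengths $R_j$ of the two vectors, from the rounded correlations quoted in the example.

```python
import math

# Points of the correlation circle, (cor with Y1, cor with Y2), from the example.
B_displacement = (0.96, 0.03)
B_power = (0.89, 0.40)

approx_cor = sum(a * b for a, b in zip(B_displacement, B_power))
R_displacement = math.hypot(*B_displacement)  # length of B_j, multiple correlation
R_power = math.hypot(*B_power)

print(round(approx_cor, 4))      # 0.8664, against an observed correlation of 0.861
print(round(R_displacement, 2))  # 0.96
print(round(R_power, 2))         # 0.98
```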

Since the cosine of the angle between the vectors $B_j$ and $B_l$ is given by

$$\cos(B_j, B_l) = \frac{\langle B_j, B_l \rangle}{\|B_j\|\,\|B_l\|}$$

the correlation between $X_j$ and $X_l$ can be written approximately as:

$$\mathrm{cor}(X_j, X_l) \approx \|B_j\|\,\|B_l\|\,\cos(B_j, B_l)$$

Thus the correlations between the variables $X_j$ are approximately reconstituted on the correlation circle from the lengths of the variable vectors and the cosines of the angles between them.
One can check, for example, that the dendrogram in Figure 4 reflects well the relative positions of the variable vectors on the correlation circle.

The biplot

With a few precautions regarding the scale of representation, the two plots (the first principal plane and the correlation circle) can be superimposed, giving an enriched representation.

This simultaneous representation of individuals and variables is called a biplot, a term introduced by Gabriel.
We first assume that most of the total inertia is explained by the first principal plane. If this is not the case, the results that follow must be restricted to points that are well represented on the first principal plane and to variables that are very strongly correlated with the first two principal components.

Under these assumptions, the data reconstitution formula

$$x_{ij}^* = \sum_{h=1}^{p} Y_h(i)\, u_{hj}$$

gives a good approximation of the values $x_{ij}^*$ using only the first two dimensions:

$$x_{ij}^* \approx \sum_{h=1}^{2} Y_h(i)\, u_{hj}$$

Writing $Y_h^* = Y_h / \sqrt{\lambda_h}$ for the reduced principal component $Y_h$, and using the fact that $\mathrm{cor}(X_j, Y_h) = \sqrt{\lambda_h}\, u_{hj}$, this formula becomes:

$$x_{ij}^* \approx \sum_{h=1}^{2} Y_h^*(i)\,\mathrm{cor}(X_j, Y_h) \qquad (6)$$

Example: we have

$$x^*_{(\text{Rover},\ \text{Displacement})} = \frac{2675 - 1906.1}{516.79} = 1.49$$

which is well reconstituted by

$$Y_1^*(\text{Rover})\,\mathrm{cor}(\text{Displacement}, Y_1) + Y_2^*(\text{Rover})\,\mathrm{cor}(\text{Displacement}, Y_2) = \frac{1}{\sqrt{4.656}} \times 3.19 \times 0.96 + \frac{1}{\sqrt{0.9152}} \times 0.77 \times 0.03 = 1.44$$
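The Rover example can be reproduced step by step. The sketch below uses only values quoted in the text: the eigenvalues 4.656 and 0.9152, the Rover's scores 3.19 and 0.77, and the correlations 0.96 and 0.03 of Displacement with the first two components.

```python
import math

lambda1, lambda2 = 4.656, 0.9152
y1_rover, y2_rover = 3.19, 0.77
cor_disp_y1, cor_disp_y2 = 0.96, 0.03

# Exact standardized displacement of the Rover.
x_exact = (2675 - 1906.1) / 516.79

# Two-dimensional reconstitution, formula (6), with reduced components Y*/sqrt(lambda).
x_approx = (y1_rover / math.sqrt(lambda1)) * cor_disp_y1 \
         + (y2_rover / math.sqrt(lambda2)) * cor_disp_y2

print(round(x_exact, 2))   # 1.49
print(round(x_approx, 2))  # 1.44
```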

Formula (6) expresses the fact that $x_{ij}^*$ is approximately reconstituted by the scalar product of the vectors $A_i^* = (Y_1^*(i), Y_2^*(i))$ and $B_j = (\mathrm{cor}(X_j, Y_1), \mathrm{cor}(X_j, Y_2))$.
Let $P_{ij}$ denote the projection of the vector $A_i^*$ onto the axis $(B_j)$ spanned by the vector $B_j$. This notation is visualized in Figure 9.

Figure 9. Individual points and variable axes

The algebraic length $\overline{OP_{ij}}$ equals

$$\overline{OP_{ij}} = \frac{\sum_{h=1}^{2} Y_h^*(i)\,\mathrm{cor}(X_j, Y_h)}{\sqrt{\mathrm{cor}^2(X_j, Y_1) + \mathrm{cor}^2(X_j, Y_2)}}$$

Since the denominator equals the multiple correlation $R_j$ between $X_j$ and the first two principal components, we have: $x_{ij}^* \approx R_j\,\overline{OP_{ij}}$.

Thus the projections of the individual points $A_i^*$ onto the variable axes $(B_j)$ have algebraic lengths proportional to the data $x_{ij}^*$. The distribution of the projections $P_{ij}$ on the axis $(B_j)$ therefore reflects well the distribution of the values $x_{ij}^*$ of the variable $X_j^*$ and, consequently, that of the values $x_{ij}$ of the original variable $X_j$.

In Figure 10 we built the biplot, the simultaneous representation of the individuals and the variables, as follows:

- the individuals are represented by the points $A_i^* = (Y_1^*(i), Y_2^*(i))$;
- the variables are represented by the axes $(B_j)$, placed on the plot using the points $(3\,\mathrm{cor}(X_j, Y_1),\ 3\,\mathrm{cor}(X_j, Y_2))$. The coefficient 3 was chosen to make the variable points easier to see.

Figure 10. Biplot: simultaneous representation of the individuals and the variables

We can thus check that the projection of the cars onto the Speed axis reproduces well the distribution of the original data: the projections of the fastest cars (BMW 530i, Renault 25, Audi 90 Quattro) are clearly opposed to those of the slowest (Ford Fiesta, Nissan Vanette, Fiat Uno, VW Caravelle).
Likewise, the projections of the cars onto the Width axis clearly contrast the widest car (VW Caravelle) with the narrowest (Fiat Uno).

Presentation of Principal Component Analysis (PCA) following Hotelling's approach

The construction of the principal components presented so far is laborious, but it leads to a very complete set of results.

Hotelling proposed 2 criteria that give the principal components more directly, but in that case the geometric dimension of the problem is lost.

We present the correlation criterion first, then the variance criterion.

The correlation criterion

We look for m standardized and uncorrelated variables $F_1, \dots, F_m$ that maximize the criterion:

$$\sum_{h=1}^{m}\left[\frac{1}{p}\sum_{j=1}^{p} \mathrm{cor}^2(X_j, F_h)\right] \qquad (7)$$

In other words, we try to summarize the original variables $X_1, \dots, X_p$ by a smaller number of variables $F_1, \dots, F_m$ that are uncorrelated with one another and represent the main dimensions of the phenomenon studied.

It can be shown that the maximum of formula (7) is attained for the variables

$$F_h = Y_h^* = \frac{Y_h}{\sqrt{\lambda_h}}$$

which are precisely the reduced principal components. The value of the maximum is $(\lambda_1 + \dots + \lambda_m)/p$.

The variance criterion

We look for m variables $Z_1, \dots, Z_m$ of the form $Z_h = \sum_{j=1}^{p} v_{hj} X_j$, with orthonormal vectors $v_h = (v_{h1}, \dots, v_{hp})$, that maximize the criterion:

$$\sum_{h=1}^{m} \mathrm{Var}(Z_h) \qquad (8)$$

It can be shown that the maximum of formula (8) is attained for the normalized eigenvectors $v_1, \dots, v_m$ of the covariance matrix of the variables $X_j$ associated with its m largest eigenvalues $\mu_1, \dots, \mu_m$, and that its value is $\mu_1 + \dots + \mu_m$.

Taking m = p gives

$$\mu_1 + \dots + \mu_p = \sum_{j=1}^{p} \mathrm{Var}(X_j)$$

The sum of the first m eigenvalues is the variance explained by the m variables $Z_1, \dots, Z_m$.
If we work with the standardized variables $X_1^*, \dots, X_p^*$, then $Z_h = Y_h$ and we obtain

$$\sum_{h=1}^{m} \mathrm{Var}(Z_h) = \lambda_1 + \dots + \lambda_m$$

Ascending hierarchical classification with Ward's criterion

This method leads to another way of summarizing the data: building a typology (or partition) of the individuals into classes such that individuals in the same class are alike (similar), while individuals in different classes are distinct and far apart (dissimilar).

Quality of a typology

Consider a typology of our set of individuals into k classes, containing $n_1, \dots, n_k$ individuals respectively.
Let $G_1, \dots, G_k$ denote the corresponding typology of the associated cloud of points $N = \{x_1, \dots, x_n\}$, and $g_1, \dots, g_k$ the centres of gravity of these classes.

The total inertia of the cloud N decomposes as follows:

$$I(N, g) = \sum_{i=1}^{k} \frac{n_i}{n}\, d^2(g_i, g) + \sum_{i=1}^{k} \frac{n_i}{n}\, I(G_i, g_i)$$

The first term on the right is called the between-class inertia and measures how far the classes are from one another.

This term is denoted $I(G_1, \dots, G_k)$ and represents the inertia explained by the typology.
The second term on the right is called the within-class inertia and measures the homogeneity of the classes.

The quality of the typology is measured by the ratio of the between-class inertia to the total inertia.
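The decomposition of the total inertia into between-class and within-class parts can be checked on a small example. The sketch below uses randomly generated points and an arbitrary partition into three classes; both are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(12, 2))                    # hypothetical cloud of points
labels = np.array([0] * 5 + [1] * 4 + [2] * 3)  # arbitrary typology in 3 classes
n = len(X)

g = X.mean(axis=0)
total = ((X - g) ** 2).sum() / n                # I(N, g)

between = 0.0
within = 0.0
for k in np.unique(labels):
    G = X[labels == k]
    g_k = G.mean(axis=0)
    weight = len(G) / n
    between += weight * ((g_k - g) ** 2).sum()          # (n_k/n) d^2(g_k, g)
    within += weight * ((G - g_k) ** 2).sum() / len(G)  # (n_k/n) I(G_k, g_k)

quality = between / total  # share of inertia explained by the typology
print(np.isclose(total, between + within))
print(0.0 <= quality <= 1.0)
```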

Ward's criterion

When, in the typology $G_1, \dots, G_k$, two classes $G_i$ and $G_j$ are replaced by their union $G_i \cup G_j$, the between-class inertia decreases.
This decrease

$$D(G_i, G_j) = I(G_1, \dots, G_i, \dots, G_j, \dots, G_k) - I(G_1, \dots, G_i \cup G_j, \dots, G_k)$$

can be computed and equals

$$D(G_i, G_j) = \frac{n_i n_j}{n\,(n_i + n_j)}\, d^2(g_i, g_j)$$

This criterion, used to measure the distance between two classes $G_i$ and $G_j$, is called Ward's aggregation criterion.
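The claim that Ward's formula equals the loss of between-class inertia can be verified numerically. The sketch below, on randomly generated points with an arbitrary three-class partition (both assumptions), compares the closed-form value with the difference of between-class inertia before and after a merge. The function names are illustrative.

```python
import numpy as np

def between_inertia(X, labels):
    """Between-class inertia I(G_1, ..., G_k) of a partition."""
    n, g = len(X), X.mean(axis=0)
    return sum(
        (labels == k).sum() / n * ((X[labels == k].mean(axis=0) - g) ** 2).sum()
        for k in np.unique(labels)
    )

def ward_criterion(X, labels, i, j):
    """Closed form n_i n_j / (n (n_i + n_j)) * d^2(g_i, g_j)."""
    n = len(X)
    G_i, G_j = X[labels == i], X[labels == j]
    n_i, n_j = len(G_i), len(G_j)
    d2 = ((G_i.mean(axis=0) - G_j.mean(axis=0)) ** 2).sum()
    return n_i * n_j / (n * (n_i + n_j)) * d2

rng = np.random.default_rng(5)
X = rng.normal(size=(10, 3))                    # hypothetical points
labels = np.array([0, 0, 0, 1, 1, 1, 1, 2, 2, 2])

before = between_inertia(X, labels)
merged = np.where(labels == 1, 0, labels)       # merge class 1 into class 0
after = between_inertia(X, merged)

loss = before - after
closed_form = ward_criterion(X, labels, 0, 1)
print(np.isclose(loss, closed_form))
```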

Example:
Let us take $G_1 = \{x^*_{\text{Citroen BX}}\}$ and $G_2 = \{x^*_{\text{Peugeot 405}}\}$. We have

$$d^2(x^*_{\text{Citroen BX}}, x^*_{\text{Peugeot 405}}) = \frac{(1769 - 1769)^2}{267072} + \frac{(90 - 90)^2}{1441} + \frac{(182 - 180)^2}{609} + \frac{(1060 - 1080)^2}{50824} + \frac{(424 - 440)^2}{1638} + \frac{(168 - 169)^2}{56} = 0.189$$

$$D(x^*_{\text{Citroen BX}}, x^*_{\text{Peugeot 405}}) = \frac{1 \times 1}{24\,(1 + 1)} \times 0.189 = 0.00393$$

Ascending hierarchical classification

The ascending hierarchical classification algorithm is iterative. At a given step, we start from a partition of the set of individuals into k classes $G_1, \dots, G_k$ and merge the two classes $G_i$ and $G_j$ that minimize Ward's criterion $D(G_i, G_j)$.

During this iteration the between-class inertia therefore decreases by an amount equal to $D(G_i, G_j)$. At the initial step each individual forms its own class, and the between-class inertia is then equal to the total inertia.

At the final step only one class remains, so the between-class inertia is zero. The sum of the losses of between-class inertia over the successive steps is therefore equal to the total inertia. At each step, an index is computed by dividing the loss of between-class inertia by the total inertia.

The typology retained is the one obtained at the step corresponding to a sharp increase in the index.
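The algorithm just described can be written as a naive loop. The sketch below is a minimal illustration on randomly generated standardized data (an assumption), not the program used for the example; the function name `ward_hac` is hypothetical, individuals are numbered from 0, and new classes are numbered from n upwards. It checks that the losses of between-class inertia add up to the total inertia.

```python
import numpy as np

def ward_hac(X):
    """Naive ascending hierarchical classification with Ward's criterion.
    Returns a list of (new_class_id, merged_id_1, merged_id_2, D)."""
    n = len(X)
    clusters = {i: [i] for i in range(n)}  # each individual starts alone
    next_id = n
    history = []
    while len(clusters) > 1:
        ids = list(clusters)
        best = None
        for a in range(len(ids)):
            for b in range(a + 1, len(ids)):
                A, B = clusters[ids[a]], clusters[ids[b]]
                d2 = ((X[A].mean(axis=0) - X[B].mean(axis=0)) ** 2).sum()
                D = len(A) * len(B) / (n * (len(A) + len(B))) * d2
                if best is None or D < best[0]:
                    best = (D, ids[a], ids[b])
        D, i, j = best
        clusters[next_id] = clusters.pop(i) + clusters.pop(j)
        history.append((next_id, i, j, float(D)))
        next_id += 1
    return history

rng = np.random.default_rng(6)
X = rng.normal(size=(8, 3))                 # hypothetical data
Xs = (X - X.mean(axis=0)) / X.std(axis=0)   # standardized, total inertia = 3

history = ward_hac(Xs)
total_inertia = (Xs ** 2).sum() / len(Xs)
losses = [step[3] for step in history]
index_percent = [100 * D / total_inertia for D in losses]

print(np.isclose(sum(losses), total_inertia))
print([round(v, 2) for v in index_percent])
```

The quadratic search over pairs is fine for a few dozen individuals; library implementations use faster update formulas and may scale the merge heights differently.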
Application
We carried out an ascending hierarchical classification of the standardized data of the example using Ward's criterion.

Table 7 shows how the algorithm unfolds, and the results are visualized with the dendrogram (classification tree) in Figure 11.

In the first step, the Citroen BX (6) and Peugeot 405 (4) models are grouped; their Ward distance is $D(x_6^*, x_4^*) = 0.00393$. The aggregation index, $D(x_6^*, x_4^*)/6$ (about 0.07%), appears in the dendrogram at the level of the Citroen BX, that is, of the element that precedes the other one. The class is numbered 25.
In the second step, class 26 is obtained by grouping the Ford Sierra (12) and the Peugeot 405 Break (11); its aggregation index $D(x_{12}^*, x_{11}^*)/6$ equals 0.15%.
In the third step, class 27 is built by grouping a Renault model (2) and class 25; its aggregation index $D(x_2^*, (x_4^*, x_6^*))/6$ equals 0.19%.

The algorithm continues in the same way until the last step, which merges class 43, the small cars (Civic, Seat, ..., Fiesta), with class 46, formed by the rest of the sample.
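The first merge can be re-derived from the raw values of the two cars and the variances of the six variables. The sketch below assumes, following the order of the terms in the worked example, that the first number of each difference belongs to the Citroen BX and the second to the Peugeot 405.

```python
# Variances of the six variables (Displacement, Power, Speed, Weight, Length, Width).
variances = [267072, 1441, 609, 50824, 1638, 56]
citroen_bx = [1769, 90, 182, 1060, 424, 168]   # assumed order of the raw values
peugeot_405 = [1769, 90, 180, 1080, 440, 169]
n = 24   # number of individuals
p = 6    # total inertia of the standardized cloud

# Squared distance between the standardized points.
d2 = sum((a - b) ** 2 / v for a, b, v in zip(citroen_bx, peugeot_405, variances))

# Ward's criterion for two singleton classes (n_i = n_j = 1).
ward = (1 * 1) / (n * (1 + 1)) * d2
index_percent = 100 * ward / p

print(round(d2, 3))             # 0.189
print(round(ward, 5))           # 0.00393
print(round(index_percent, 3))  # aggregation index in percent
```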
Ward's criterion, cumulated backwards from the last iteration, gives the inertia explained by the successive typologies. Indeed, at the last iteration we have:

I(43, 46) = D(43, 46) = 3.07202

since the inertia I(47) explained by class 47, which contains all the observations, is zero. Consequently, the inertia explained by the two-class typology (43 and 46) is 3.07202, and the share of inertia explained is 3.07202/6, about 51.2%.

Class 46 was formed by merging class 44 (the BMW and Audi group) and class 45 (Tipo, ..., VW Caravelle).

From D(44, 45) = I(43, 44, 45) − I(43, 46), we deduce:

I(43, 44, 45) = D(43, 46) + D(44, 45) = 3.07202 + 1.42919 = 4.50121

and the share of inertia explained by this three-class typology is 4.50121/6, about 75.0%.

Moving further back towards the start of the algorithm, class 45 comes from merging class 38 (containing Espace, Omega and VW Caravelle) and class 42 (containing Tipo and the Renault models).

From D(38, 42) = I(43, 44, 38, 42) − I(43, 44, 45) we deduce:
I(43, 44, 38, 42) = D(43, 46) + D(44, 45) + D(38, 42) =
= 3.07202 + 1.42919 + 0.29270 = 4.79391.
Since the gain in explained inertia is small when moving from the three-class typology (43, 44, 45) to the four-class typology (43, 44, 38, 42), we adopt the three-class typology.
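The cumulation of Ward's criterion from the last iteration backwards is a running sum. The sketch below applies it to the three values quoted in the text and divides by the total inertia p = 6 of the standardized data to obtain the shares of explained inertia.

```python
total_inertia = 6  # p variables, standardized data

# Ward's criterion at the last three merges, as quoted in the text.
D_43_46 = 3.07202
D_44_45 = 1.42919
D_38_42 = 0.29270

explained = {
    2: D_43_46,                      # typology (43, 46)
    3: D_43_46 + D_44_45,            # typology (43, 44, 45)
    4: D_43_46 + D_44_45 + D_38_42,  # typology (43, 44, 38, 42)
}
shares = {k: v / total_inertia for k, v in explained.items()}

for k in sorted(explained):
    print(k, "classes:", round(explained[k], 5), f"({100 * shares[k]:.1f}%)")
```

Going from 2 to 3 classes adds roughly 24 points of explained inertia, while going from 3 to 4 adds about 5, which is the basis for retaining three classes.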

Table 7. Ascending hierarchical classification: description of the classes formed
Columns: Class | Preceding element | Following element | Number of elements contained | Ward's criterion | Index (%)

Table 8. Means of the variables by class and Fisher test

Figure 11. Dendrogram

Figure 12. Visualization of the three-class typology

To interpret this typology more precisely, we plotted it on the principal plane in Figure 12 and built Table 8, in which the variables are arranged in decreasing order of the Fisher test between the variables and the typology.

The class of small cars corresponds to class 43:
Honda Civic, Seat Ibiza SXi, Citroen AX Sport, Peugeot 205 Rallye, Peugeot 205, Fiat Uno, Ford Fiesta.
The class of medium-sized cars corresponds to class 45:
Fiat Tipo, Renault 19, Citroen BX, Peugeot 405, Renault 21, Espace, Opel Omega, Ford Sierra, Peugeot 405 Break, Nissan Vanette, VW Caravelle.
The class of large cars corresponds to class 44:
Audi 90 Quattro, BMW 325ix, Ford Scorpio, Renault 25, BMW 530i, Rover 827i.

Conclusion
This chapter has presented all the elements needed to interpret the output of a principal component analysis program.
The Statgraphics and SPAD.N programs were used to process the example presented.
For readers who wish to learn more about Principal Component Analysis, at both the theoretical and the practical level, we recommend the following works:
Bouroche and Saporta; Jackson; Jolliffe; Lebart, Morineau and Fénélon; Lebart, Morineau and Tabard (1977); Saporta; Saporta and Ştefănescu.

Regarding methods for building typologies, we particularly recommend Everitt and the ACECLUS, CLUSTER and FASTCLUS procedures of the SAS program.

BIBLIOGRAPHY
Michel Tenenhaus - Methodes Statistiques en Gestion. Dunod, Paris, 1994.
Gilbert Saporta, Viorica Ştefănescu - Analiza datelor şi Informatică. Editura Economică, 1996.
