
PRINCIPAL COMPONENT ANALYSIS

INTRODUCTION
With this method we study an individuals × variables table in which all the variables are numeric.

We first present an exploratory approach that describes the individuals along their multiple dimensions and visualizes the relationships between the variables.

Principal Component Analysis (PCA) follows. This method yields a map of the individuals based on their similarities and a map of the variables based on their correlations.

It is also possible to obtain a simultaneous representation of the individuals and the variables on a single map.

Moreover, the graphical representation of the data needs to be completed with a typology of the individuals. We therefore also present the ascending hierarchical classification method that uses Ward's criterion, which is very well suited to numeric data.

Exploratory analysis of multidimensional data

Table 1 serves as the running example for this presentation. Its rows are car models from a single model year and its columns are technical characteristics: engine displacement, power, speed, weight, length and width.

Table 1
Characteristics of the 24 car models
No. | Model | Displacement | Power | Speed | Weight | Length | Width


Descriptive study of the individuals

The 24 individuals can be represented together with their 6 characteristics using the star plot of Figure 1.

Figure 1
Star plot (axes: Speed, Power, Weight, Displacement, Length, Width)

Each individual is represented by a hexagon, and each vertex of the hexagon corresponds to one variable. For a given individual, the distance from a vertex to the origin is proportional to the deviation of the variable's value from its minimum: it is smallest when the characteristic is at its minimum and largest when it is at its maximum.

The 24 individuals are shown in Figure 2.

Figure 2
Star plots of the individuals

A few particular cases are worth pointing out:
- The Peugeot 205 Rallye, Seat Ibiza SXi and Citroën AX Sport have high speed and high power relative to their other characteristics.
- The Nissan Vanette and VW Caravelle are characterized by low speeds.
- One Renault model has low power relative to its displacement; it is a diesel.

In general, all the characteristics vary in the same direction. The star plots grow steadily from the small cars, such as the Ford Fiesta and Fiat Uno, to the largest ones, such as the BMW 530i, Rover 827i and Renault 25.
Descriptive study of the variables

Table 2
Summary statistics of the data
Elementary statistics: Variable | Mean | Variance | Standard deviation | Minimum | Maximum
Correlations: Variables | Displacement | Power | Speed | Weight | Length

Table 2 presents a few elementary statistics and the correlation matrix of the variables.

Note that the variances are computed by dividing by n and not by (n-1), because this is a geometric analysis of the data and no statistical inference is involved. (Inferential statistics studies a sample and draws conclusions about the whole population.)

To obtain a complete view of the data and of the relationships between the variables, we built the plot in Figure 3.

Figure 3
Plot of the correlations between the variables

All the variables are positively correlated with one another. This means that there is a size factor and that, to a first approximation, the cars can be ordered from the smallest to the largest. This is also visible when looking at the star plots of Figure 2 as a whole.

Speed is strongly correlated with Power and less so with the other variables.

The variables of the group (Displacement, Length, Weight) are strongly correlated with one another:
Cor(Displacement, Length) = 0.86
Cor(Displacement, Weight) = 0.90
Cor(Length, Weight) = 0.92
Other notable correlations are those between Length and Width and between Power and Speed.

We can summarize and visualize this first analysis with an ascending hierarchical classification of the variables, taking their correlations (all positive here) as the similarity index between variables.

If some correlations between variables are negative, we use their absolute values.


Let us describe this method.

In the first step, the two most correlated variables are merged: Weight and Length (0.92).

In the second step, we look for the strongest remaining correlation. It is the one between Weight and Displacement (0.90). The variable Displacement therefore joins the group (Weight, Length).

In the third step, the group (Power, Speed) is formed with a correlation of 0.894.

In the fourth step, Width joins the group (Displacement, Weight, Length) through its correlation with Length.

Finally, in the fifth step, the two groups (Power, Speed) and (Displacement, Weight, Length, Width) merge. The strongest correlation between a variable of the first group and a variable of the second group is the one between Power and Displacement.

This iterative aggregation procedure is visualized in Figure 4 by a dendrogram. The aggregation index is the strongest correlation between the variables of one group and those of the other group at the moment they are merged.

Figure 4
Ascending hierarchical classification of the variables (maximum correlation method)
Axes: aggregation index, variable

We can also measure the similarity between each variable and the set of all variables (itself included) using squared correlations. This amounts to measuring the importance of a variable.

For example, the importance of the variable Displacement is computed as the mean of all its squared correlations with the set of variables:

$$\frac{1}{6}\left(1 + 0.861^2 + 0.693^2 + 0.905^2 + 0.864^2 + 0.709^2\right) = \frac{4.29}{6} = 0.715$$

Table 3 gives the similarity of each variable to the whole set of variables:

Table 3
Similarity of each variable to the set of all variables

The variable that best summarizes the set of 6 variables is thus Displacement. Speed is a variable that is more independent of the others.
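The importance computation above can be reproduced with a few lines of Python. This is only a sketch: the function name `importance` is ours, and the six correlations are those quoted in the text for Displacement (Cilindree), in the order Displacement, Power, Speed, Weight, Length, Width.

```python
# Correlations of Displacement (Cilindree) with the six variables,
# itself included, as quoted in the text.
displacement_cors = [1.0, 0.861, 0.693, 0.905, 0.864, 0.709]

def importance(correlations):
    """Mean of the squared correlations of one variable with all variables."""
    return sum(r * r for r in correlations) / len(correlations)

displacement_importance = importance(displacement_cors)
print(round(displacement_importance, 3))
```

Applying the same function to each row of the correlation matrix would give the other entries of Table 3.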

Principal Component Analysis

The data to be analyzed take the form of an individuals × variables table. There are p variables $X_1, \ldots, X_j, \ldots, X_p$ observed on n individuals $i = 1, \ldots, n$. We denote by $x_{ij}$ the value taken by variable $X_j$ for individual i, by $x_i = (x_{i1}, \ldots, x_{ip})$ the set of characteristics of individual i, and by $\bar{x}_j$, $s_j^2 = \frac{1}{n}\sum_{i=1}^{n}(x_{ij}-\bar{x}_j)^2$ and $s_j$ the mean, variance and standard deviation of variable $X_j$.

Note that the differences observed between French and American Principal Component Analysis programs in the computed principal components come from dividing by n or by (n-1) in the variance.

Principal Component Analysis consists in looking for a small number of new variables $Y_1, \ldots, Y_m$, called principal components, that are uncorrelated with one another and summarize the initial data as well as possible.

Several criteria lead to the principal components. The inertia criterion is the oldest (Pearson) and has the following advantages:

- the approach is geometric, which gives a deeper understanding of the method and of the intermediate results that help interpret it;
- Correspondence Analysis becomes a generalization of Principal Component Analysis and is better understood in this geometric framework;
- the output of French principal component analysis programs follows this approach.

On the other hand, the inertia criterion is much more complex than the two other criteria, proposed by Hotelling: the correlation criterion and the variance criterion. We also present these criteria, since they correspond to the results produced by American PCA programs.

Presentation of PCA following Pearson's geometric approach

The cloud of points associated with the data and its characteristics

In this geometric approach, the data are associated with the cloud of points $N = \{x_1, \ldots, x_i, \ldots, x_n\}$ in a space of dimension p: each vector $x_i$ of characteristics $(x_{i1}, \ldots, x_{ip})$ of individual i is regarded as a point in a p-dimensional space.

The center of gravity of the cloud N is the point g whose coordinates are the means of the variables:

$$g = \frac{1}{n}\sum_{i=1}^{n} x_i = (\bar{x}_1, \ldots, \bar{x}_j, \ldots, \bar{x}_p) = \bar{x}.$$

For our example:
g = (1906.1, 114, 183, 1111, 422, 169).
The vector g represents, in a sense, the characteristics of an average car.

The spread of the cloud around its center of gravity is measured by the total inertia of the cloud N, defined by

$$I(N, g) = \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{p}(x_{ij}-\bar{x}_j)^2.$$

The total inertia can be computed directly, since it equals the sum of the variances of the variables:

$$I(N, g) = \frac{1}{n}\sum_{i=1}^{n} d^2(x_i, g) = \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{p}(x_{ij}-\bar{x}_j)^2 = \sum_{j=1}^{p}\frac{1}{n}\sum_{i=1}^{n}(x_{ij}-\bar{x}_j)^2 = \sum_{j=1}^{p} s_j^2.$$

For the example we obtain:

I(N,g) = 267072 + 1441 + 609 + 50824 + 1638 + 56 = 321640.

The inertia of the cloud is due mainly to Displacement. This is a consequence of the choice of measurement units: had we measured displacement in litres, its exaggerated weight in the inertia would have disappeared.
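The effect of the measurement unit on the total inertia can be checked numerically. The sketch below uses the six variances quoted above; it assumes (the text does not say so explicitly) that displacement is recorded in cubic centimetres, so that converting to litres divides its variance by 1000².

```python
# Variances quoted in the text, in the order of the six variables.
variances = {
    "Displacement": 267072,  # assumed to be in cm^3
    "Power": 1441,
    "Speed": 609,
    "Weight": 50824,
    "Length": 1638,
    "Width": 56,
}

total_inertia = sum(variances.values())
share_cm3 = variances["Displacement"] / total_inertia

# Same data with displacement expressed in litres: the variance of a
# variable divided by 1000 is divided by 1000**2.
variances_litres = dict(variances)
variances_litres["Displacement"] = variances["Displacement"] / 1000 ** 2
share_litres = variances_litres["Displacement"] / sum(variances_litres.values())

print(total_inertia, round(share_cm3, 3), share_litres)
```

With the first unit, Displacement carries about 83% of the total inertia; with the second, its share becomes negligible, which is why the analysis is then carried out on standardized variables.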

In practice, it is often preferable to obtain a description of the data that does not depend on the choice of units.

We can make the data homogeneous by transforming the initial variables into standardized (centered and reduced) variables: to each variable $X_j$ we associate the standardized variable

$$X_j^* = \frac{X_j - \bar{x}_j}{s_j},$$

which has mean 0 and variance 1.

The new table under study consists of the quantities

$$x_{ij}^* = \frac{x_{ij} - \bar{x}_j}{s_j}.$$

Individual i is now associated with the point $x_i^* = (x_{i1}^*, \ldots, x_{ip}^*)$.
The new cloud of points is $N^* = \{x_1^*, \ldots, x_n^*\}$.
The center of gravity of the cloud $N^*$ is 0 and its total inertia equals the number p of variables.
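The following sketch illustrates the standardization step and checks that the total inertia of the standardized cloud equals p. The small table `toy` is hypothetical (it is not taken from Table 1), and the helper names are ours; variances are computed by dividing by n, as in the text.

```python
import math

def column_stats(table):
    """Per-column means and standard deviations (division by n, not n-1)."""
    n, p = len(table), len(table[0])
    means = [sum(row[j] for row in table) / n for j in range(p)]
    sds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in table) / n)
           for j in range(p)]
    return means, sds

def standardize(table):
    """Replace each value by (x_ij - mean_j) / s_j."""
    means, sds = column_stats(table)
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in table]

def total_inertia(table):
    """Mean squared distance of the rows to their center of gravity."""
    means, _ = column_stats(table)
    return sum((x - m) ** 2 for row in table for x, m in zip(row, means)) / len(table)

# Hypothetical data: 4 individuals, 3 variables on very different scales.
toy = [[1000, 50, 150], [1600, 90, 175], [2000, 120, 190], [2700, 170, 215]]
standardized = standardize(toy)
inertia_raw = total_inertia(toy)
inertia_std = total_inertia(standardized)
print(inertia_raw, inertia_std)
```

The raw inertia is dominated by the first column, whereas the standardized inertia is 3, the number of variables, whatever the original units.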

First principal axis and first principal component

We study the construction and properties of the first principal component, then illustrate its interpretation with an example.

The first stage of this construction consists in finding the first principal axis of the cloud of points $N^*$.

First principal axis

We look for a line $\Delta_1$ that passes as well as possible through the middle of the cloud $N^*$.

The spread of the cloud $N^*$ around a line $\Delta$ is measured by the inertia $I(N^*, \Delta)$ of the cloud relative to the line:

$$I(N^*, \Delta) = \frac{1}{n}\sum_{i=1}^{n} d^2(x_i^*, y_i),$$

where $y_i$ is the orthogonal projection $P_\Delta(x_i^*)$ of the point $x_i^*$ onto the line $\Delta$.

The line $\Delta_1$ that minimizes $I(N^*, \Delta)$ is called the first principal axis of the cloud $N^*$.

It can be shown that $\Delta_1$ passes through the origin O, the center of gravity of the cloud $N^*$ of standardized data, and is spanned by the unit vector $u_1$, the normalized eigenvector of the correlation matrix R of the variables $X_j$ associated with the largest eigenvalue $\lambda_1$.

The eigenvalues and eigenvectors of the matrix R are given in Table 4. The search for the first principal axis $\Delta_1$ is visualized in Figure 5.

Table 4
Eigenvalues and eigenvectors of the correlation matrix

Figure 5
Search for the first principal axis

For the car example we obtained:

$$\lambda_1 = 4.6745$$
$$u_1 = (0.4434;\ 0.4182;\ 0.3497;\ 0.4252;\ 0.4246;\ 0.3811).$$

The result can be checked:

$$\|u_1\| = \sqrt{\sum_{j=1}^{6} u_{1j}^2} = \sqrt{0.4434^2 + \cdots + 0.3811^2} = 1$$

First principal component

The first principal component $Y_1$ is a new variable defined for each individual i as the algebraic length of the projection of the point $x_i^*$ onto the axis $\Delta_1$. The value $Y_1(i)$ is therefore the scalar product of the vectors $u_1$ and $x_i^*$:

$$Y_1(i) = \overline{Oy_i} = \sum_{j=1}^{p} u_{1j}\,\frac{x_{ij} - \bar{x}_j}{s_j}.$$

For example, the value of the first principal component $Y_1$ for the Rover is

$$Y_1(\text{Rover}) = 0.44 \times 1.49 + 0.41 \times 1.67 + 0.34 \times 1.58 + 0.43 \times 1.13 + 0.43 \times 1.17 + 0.38 \times 0.83 = 3.19.$$

Globally, $Y_1$ can therefore be written:

$$Y_1 = 0.44\,\text{Displacement}^* + 0.41\,\text{Power}^* + 0.34\,\text{Speed}^* + 0.43\,\text{Weight}^* + 0.43\,\text{Length}^* + 0.38\,\text{Width}^*.$$
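The projection can be sketched in Python as a plain scalar product. The vector `u1` is the eigenvector quoted in the text and `rover_std` holds the Rover's standardized values as they appear, rounded to two decimals, in the worked computation; the variable names are ours. Because the inputs are rounded, the result only approximates the published value 3.19.

```python
import math

# First eigenvector of the correlation matrix, as quoted in the text.
u1 = [0.4434, 0.4182, 0.3497, 0.4252, 0.4246, 0.3811]

# Standardized characteristics of the Rover (rounded values from the text).
rover_std = [1.49, 1.67, 1.58, 1.13, 1.17, 0.83]

# u1 should be a unit vector, up to rounding of its coordinates.
norm_u1 = math.sqrt(sum(u * u for u in u1))

# Y1(Rover) is the scalar product <u1, x*_Rover>.
y1_rover = sum(u * x for u, x in zip(u1, rover_std))

print(round(norm_u1, 4), round(y1_rover, 2))
```

The small gap with 3.19 comes from rounding: the text uses two-decimal coefficients, while this sketch uses the four-decimal eigenvector with two-decimal data.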
The values of $Y_1$ for each individual are given in Table 5.

Table 5
Squared distances to the origin, principal components and squared cosines

The first principal component $Y_1$ is centered, being a linear combination of centered variables.
It can be shown that its variance equals $\lambda_1$:

$$\mathrm{Var}(Y_1) = \frac{1}{n}\sum_{i=1}^{n} Y_1^2(i) = \frac{1}{n}\sum_{i=1}^{n} d^2(y_i, 0) = I(\{y_1, \ldots, y_n\}, 0) = \lambda_1.$$

The variance of the first principal component $Y_1$ equals the inertia of the cloud of points projected onto $\Delta_1$, relative to the center of gravity O.

The correlations between the variables $X_j$ and the principal component $Y_1$ can be computed with the formula

$$\mathrm{cor}(X_j, Y_1) = \sqrt{\lambda_1}\,u_{1j}.$$

It follows that the similarity of $Y_1$ to the set of variables equals

$$\frac{1}{p}\sum_{j=1}^{p}\mathrm{cor}^2(X_j, Y_1) = \frac{\lambda_1}{p}.$$

For our example we obtain $\frac{4.656}{6} = 0.776$, to be compared with the 0.715 of Displacement in Table 3.
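As a numerical check, the correlations between the variables and the first component can be derived from the eigenvector, and their mean square compared with λ1/p. The sketch assumes λ1 = 4.656, the value used in this worked computation, together with the eigenvector quoted earlier; the variable names are ours.

```python
import math

lambda_1 = 4.656  # eigenvalue used in the worked example
u1 = [0.4434, 0.4182, 0.3497, 0.4252, 0.4246, 0.3811]
p = len(u1)

# cor(X_j, Y_1) = sqrt(lambda_1) * u_1j
cor_with_y1 = [math.sqrt(lambda_1) * u for u in u1]

# Mean squared correlation, which should be close to lambda_1 / p.
mean_squared_cor = sum(c * c for c in cor_with_y1) / p

print([round(c, 2) for c in cor_with_y1], round(mean_squared_cor, 3))
```

The agreement is exact only if u1 has norm 1; here the coordinates are rounded to four decimals, so the two quantities match to about three decimals.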
The correlations between the $X_j$ and $Y_1$ appear in the first column of Table 6.

Table 6
Correlations between variables and principal components

Since the first principal component $Y_1$ is strongly and positively correlated with all the variables, it can be interpreted as a size factor, ranking the cars from the smallest ($Y_1$(Fiat Uno) = -3.76; $Y_1$(Ford Fiesta) = -3.50) to the largest ($Y_1$(Renault 25) = 3.44; $Y_1$(BMW 530i) = 3.95).
Overall quality of the first principal component

To measure the overall quality of the first principal component, seen as a summary of the data, we use the decomposition of the total inertia.

Since the vector $y_i$ is the orthogonal projection of the vector $x_i^*$ onto the line $\Delta_1$, we have

$$d^2(x_i^*, 0) = d^2(y_i, 0) + d^2(x_i^*, y_i),$$

hence

$$\frac{1}{n}\sum_{i=1}^{n} d^2(x_i^*, 0) = \frac{1}{n}\sum_{i=1}^{n} d^2(y_i, 0) + \frac{1}{n}\sum_{i=1}^{n} d^2(x_i^*, y_i).$$

The total inertia

$$I(N^*, 0) = \frac{1}{n}\sum_{i=1}^{n} d^2(x_i^*, 0) = p$$

therefore splits into two parts:

- the first term, $\frac{1}{n}\sum_{i=1}^{n} d^2(y_i, 0) = I(\{y_1, \ldots, y_n\}, 0)$, is the total inertia of the cloud $\{y_1, \ldots, y_n\}$ of the projections of the points $x_i^*$ onto the axis $\Delta_1$. This quantity is the inertia explained by the axis $\Delta_1$ and equals $\lambda_1$;
- the second term, $\frac{1}{n}\sum_{i=1}^{n} d^2(x_i^*, y_i) = I(N^*, \Delta_1)$, is the residual inertia of the cloud around the axis $\Delta_1$.

For the car example we obtain:
- total inertia: p = 6
- inertia explained by $\Delta_1$: $\lambda_1$ = 4.656
- residual inertia: p - $\lambda_1$ = 1.344

The overall quality of the first principal component is measured by the share of explained inertia $\lambda_1 / p$, which is again the similarity of $Y_1$ to the set of variables.

In the example, the share of inertia explained by $\Delta_1$ equals $\frac{4.656}{6} = 0.776$. We can say that 77.6% of the total inertia is explained by the elongation of the cloud along the first principal axis.

Quality of the representation of the individuals on the first principal axis

The quality of the representation of each individual on the axis $\Delta_1$ is measured by the squared cosine of the angle between the vector $x_i^*$ and the axis $\Delta_1$:

$$\cos^2(x_i^*, \Delta_1) = \frac{d^2(y_i, 0)}{d^2(x_i^*, 0)} = \frac{Y_1(i)^2}{d^2(x_i^*, 0)}.$$

For the Rover we have:

$$Y_1(\text{Rover}) = 3.19$$
$$d^2(\text{Rover}, 0) = 1.49^2 + 1.67^2 + 1.58^2 + 1.13^2 + 1.17^2 + 0.83^2 = 10.8$$
$$\cos^2(\text{Rover}, \Delta_1) = \frac{10.18}{10.80} = 0.94$$

The Rover is well represented on the principal axis $\Delta_1$.
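The squared cosine for the Rover can be recomputed from the figures given in the text. In this sketch, `rover_std` holds the rounded standardized values and `y1_rover` the published component value; the variable names are ours.

```python
# Standardized characteristics of the Rover (rounded values from the text)
# and its coordinate on the first principal axis.
rover_std = [1.49, 1.67, 1.58, 1.13, 1.17, 0.83]
y1_rover = 3.19

# Squared distance of the point to the origin (the center of gravity).
squared_distance = sum(x * x for x in rover_std)

# Squared cosine of the angle between the point and the first axis.
cos2_axis1 = y1_rover ** 2 / squared_distance

print(round(squared_distance, 2), round(cos2_axis1, 2))
```

A value close to 1 means the point lies almost on the axis; a value close to 0 means the projection says little about the individual.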
The squared distances of each individual to the origin and the squared cosines are given in Table 5.

Second principal axis and second principal component

We now present the construction and properties of the second principal component.

Second principal axis

We look for an axis $\Delta_2$ that is orthogonal to $\Delta_1$ and minimizes the inertia $I(N^*, \Delta)$.

This second principal axis $\Delta_2$ passes through the origin O and is spanned by the vector $u_2$, the normalized eigenvector of the correlation matrix R associated with the second largest eigenvalue $\lambda_2$.
The eigenvalue $\lambda_2$ and the eigenvector $u_2$ for the car example are given in Table 4. The search for the second principal axis $\Delta_2$ is visualized in Figure 6.

Figure 6
Search for the second principal axis

Let $z_i$ and $a_i$ denote the projections of the point $x_i^*$ onto the axis $\Delta_2$ and onto the plane $(\Delta_1, \Delta_2)$ respectively. The vectors $y_i$ and $z_i$ are also the projections of the point $a_i$ onto the axes $\Delta_1$ and $\Delta_2$.
From the decomposition

$$d^2(x_i^*, 0) = d^2(a_i, 0) + d^2(x_i^*, a_i) = d^2(y_i, 0) + d^2(z_i, 0) + d^2(x_i^*, a_i)$$

we deduce

$$I(N^*, O) = I(\{y_1, \ldots, y_n\}, 0) + I(\{z_1, \ldots, z_n\}, 0) + I(N^*, (\Delta_1, \Delta_2)) \qquad (1)$$

where

$$I(N^*, (\Delta_1, \Delta_2)) = \frac{1}{n}\sum_{i=1}^{n} d^2(x_i^*, a_i)$$

is the inertia of the cloud $N^*$ relative to the plane $(\Delta_1, \Delta_2)$. It can be shown that $I(N^*, (\Delta_1, \Delta_2))$ is minimal compared with the inertia relative to any other plane.

The plane $(\Delta_1, \Delta_2)$ is called the first principal plane. It is the plane that passes as well as possible through the middle of the cloud $N^*$ in the sense of the inertia criterion.

Second principal component

The second principal component $Y_2$ is a new variable defined for each individual i by:

$Y_2(i)$ = algebraic length of the segment $[0, z_i]$,

$$Y_2(i) = \sum_{j=1}^{p} u_{2j}\,\frac{x_{ij} - \bar{x}_j}{s_j}.$$

For the example, $Y_2$ can be written globally as:

$$Y_2 = 0.03\,\text{Displacement}^* + 0.42\,\text{Power}^* + 0.66\,\text{Speed}^* - 0.26\,\text{Weight}^* - 0.30\,\text{Length}^* - 0.48\,\text{Width}^*$$

Its values are given in Table 5. For the Rover, $Y_2$ takes the value 0.77.
The second principal component $Y_2$ is centered and has variance $\lambda_2$.
We can write:

$$\mathrm{Var}(Y_2) = \frac{1}{n}\sum_{i=1}^{n} Y_2(i)^2 = \frac{1}{n}\sum_{i=1}^{n} d^2(z_i, 0) = I(\{z_1, \ldots, z_n\}, 0) = \lambda_2 \qquad (2)$$

Moreover, the correlation between $Y_1$ and $Y_2$ is zero. The correlations between the variables $X_j$ and $Y_2$ are computed with the formula

$$\mathrm{cor}(X_j, Y_2) = \sqrt{\lambda_2}\,u_{2j}.$$

The correlations between the variables and the principal component $Y_2$ in our example are given in Table 6. We can see that $Y_2$ is positively correlated with the "engine" variables (Displacement, Power, Speed) and negatively correlated with the "comfort" variables (Weight, Length, Width).

The second principal component $Y_2$ thus contrasts sporty cars, whose engine is powerful relative to their comfort ($Y_2$(Peugeot 205 Rallye) = 1.48, $Y_2$(Audi 90 Quattro) = 1.36), with family cars, whose comfort is high relative to their engine ($Y_2$(VW Caravelle) = -2.38, $Y_2$(Nissan Vanette) = -1.82).

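Overall quality of the second principal component and of the first two principal components

From equations (1) and (2) it follows that the share of inertia explained by the second principal axis equals $\frac{\lambda_2}{p}$, and that explained by the plane $(\Delta_1, \Delta_2)$ equals $\frac{\lambda_1 + \lambda_2}{p}$.

In the example, with $\lambda_1 = 4.656$ and $\lambda_2 = 0.9152$, the axis $\Delta_2$ explains $0.9152/6 \approx 15.3\%$ of the total inertia, and the plane $(\Delta_1, \Delta_2)$ explains $(4.656 + 0.9152)/6 \approx 92.9\%$ of the total inertia.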

Quality of the representation of the individuals on the second principal axis and on the first principal plane

The quality of the representation of each point $x_i^*$ on the axis $\Delta_2$ and on the plane $(\Delta_1, \Delta_2)$ is measured by the squared cosines of the angles between the vector $x_i^*$ and, respectively, the axis $\Delta_2$ or the plane $(\Delta_1, \Delta_2)$.

On $\Delta_2$:

$$\cos^2(x_i^*, \Delta_2) = \frac{d^2(z_i, 0)}{d^2(x_i^*, 0)} = \frac{Y_2(i)^2}{d^2(x_i^*, 0)}$$

On $(\Delta_1, \Delta_2)$:

$$\cos^2(x_i^*, (\Delta_1, \Delta_2)) = \frac{d^2(a_i, 0)}{d^2(x_i^*, 0)} = \frac{d^2(y_i, 0) + d^2(z_i, 0)}{d^2(x_i^*, 0)} = \cos^2(x_i^*, \Delta_1) + \cos^2(x_i^*, \Delta_2)$$

For the Rover we have:

$$\cos^2(\text{Rover}, \Delta_2) = 0.06$$
$$\cos^2(\text{Rover}, (\Delta_1, \Delta_2)) = 0.94 + 0.06 = 1.00$$

Up to rounding, we can say that the Rover lies in the first principal plane.

General results

Extending the results of the previous sections, we obtain a set of p principal axes $\Delta_1, \ldots, \Delta_p$ spanned by the orthonormal eigenvectors $u_1, \ldots, u_p$ of the correlation matrix R, associated with the eigenvalues $\lambda_1, \ldots, \lambda_p$ arranged in decreasing order.

Figure 7 shows this new coordinate system.

Figure 7
Principal axes and principal components

The principal components $Y_1, \ldots, Y_p$ are defined by

$$Y_h(i) = \sum_{j=1}^{p} u_{hj}\,x_{ij}^*.$$

They are the coordinates of the points $x_i^*$ in the new coordinate system.

It can be shown that they are centered, have variance $\lambda_h$ and are uncorrelated with one another.
The points $x_i^*$ can be expressed in this new coordinate system:

$$x_i^* = \sum_{h=1}^{p} Y_h(i)\,u_h$$

The following formulas are very important and follow directly from the construction of the principal components.

Data reconstruction formula:

$$x_{ij}^* = \sum_{h=1}^{p} Y_h(i)\,u_{hj} \qquad (3)$$

Reconstruction formula for the correlation matrix of the variables:

$$\mathrm{cor}(X_j, X_l) = \sum_{h=1}^{p} \lambda_h\,u_{hj}\,u_{hl} \qquad (4)$$

Decomposition of the squared distance of a point to the origin:

$$d^2(x_i^*, 0) = \|x_i^*\|^2 = \sum_{h=1}^{p} Y_h(i)^2$$

from which we deduce:

(i) $\sum_{h=1}^{p} \cos^2(x_i^*, \Delta_h) = 1$

(ii) $\sum_{h=1}^{p} \lambda_h = p$

Correlations between the variables $X_j$ and the principal components $Y_h$:

$$\mathrm{cor}(X_j, Y_h) = \sqrt{\lambda_h}\,u_{hj} \qquad (5)$$

We deduce that the similarity of the principal component $Y_h$ to the variables $X_1, \ldots, X_p$ equals

$$\frac{1}{p}\sum_{j=1}^{p}\mathrm{cor}^2(X_j, Y_h) = \frac{\lambda_h}{p},$$

that is, the share of inertia explained by the principal axis $\Delta_h$.
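These general results can be verified by hand in the simplest case, p = 2. The sketch below rests on a standard fact that is not part of the text: for two standardized variables with correlation r > 0, the correlation matrix has eigenvalues 1 + r and 1 - r with eigenvectors (1, 1)/√2 and (1, -1)/√2. The two data columns are hypothetical, and the helper names are ours.

```python
import math

def standardize(column):
    """Center and reduce a column, dividing the variance by n."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / n)
    return [(v - mean) / sd for v in column]

def mean_product(a, b):
    """(1/n) * sum(a_i * b_i): covariance of two centered variables."""
    return sum(x * y for x, y in zip(a, b)) / len(a)

# Hypothetical, positively correlated data.
x1 = standardize([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = standardize([1.5, 1.9, 3.2, 3.8, 5.1])
r = mean_product(x1, x2)

# Closed-form eigen-decomposition of [[1, r], [r, 1]] for r > 0.
eigenvalues = [1 + r, 1 - r]
s = 1 / math.sqrt(2)
u = [[s, s], [s, -s]]

# Principal components Y_h(i) = sum_j u_hj * x*_ij.
Y = [[u[h][0] * a + u[h][1] * b for a, b in zip(x1, x2)] for h in range(2)]

variances = [mean_product(Y[h], Y[h]) for h in range(2)]    # should be lambda_h
covariance_y1_y2 = mean_product(Y[0], Y[1])                 # should be 0
cor_x1_y1 = mean_product(x1, Y[0]) / math.sqrt(variances[0])  # formula (5)
x1_rebuilt = [Y[0][i] * u[0][0] + Y[1][i] * u[1][0] for i in range(len(x1))]  # formula (3)

print(eigenvalues, variances, covariance_y1_y2)
```

For p > 2 the eigenvectors no longer have a closed form and a numerical eigen-solver is needed, but the same checks apply unchanged.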

Mahalanobis distance

To measure the distance between an individual and the center of gravity of the cloud, the Mahalanobis distance is often used.

It is defined as follows. We first build the principal components $Z_h$, preferably on the original data rather than on the standardized data. To do so, we use the eigenvectors $v_h$ of the covariance matrix of the variables $X_j$ and compute the variables $Z_h$ with the formula

$$Z_h(i) = \sum_{j=1}^{p} v_{hj}\,(x_{ij} - \bar{x}_j)$$

The Mahalanobis distance $d_M(x_i, \bar{x})$ between the point $x_i$ and the center of gravity $\bar{x}$ of the cloud of original data is defined by

$$d_M^2(x_i, \bar{x}) = \sum_{h=1}^{p} Z_h^*(i)^2$$

where $Z_h^*$ is the reduced variable $Z_h$, that is, $Z_h$ divided by its standard deviation.

Graphical representations

These are graphical representations of the individuals and of the variables.

Map of the individuals

The projections of the points $x_i^*$ onto the first principal plane $(\Delta_1, \Delta_2)$ have as coordinates on the principal axes $\Delta_1$, $\Delta_2$ the values $Y_1(i)$ and $Y_2(i)$. Plotting the points $A_i = (Y_1(i), Y_2(i))$ therefore gives the best planar summary of the data. This map of the individuals is shown in Figure 8.

It confirms the interpretation of the axes given earlier: along the first axis the cars are ordered by size, from the smallest (Fiat Uno, Ford Fiesta) to the largest (Renault 25, BMW 530i), and along the second axis by character, from the family cars (Nissan Vanette, VW Caravelle) to the sporty ones (Citroen AX Sport, Peugeot 205 Rallye).

Figure 8
First principal plane and correlation circle
Map of the variables

The variables are represented in a plane by the points

$$B_j = (\mathrm{cor}(X_j, Y_1),\ \mathrm{cor}(X_j, Y_2)).$$

This gives the plot in Figure 8, called the correlation circle.

It shows clearly that the first principal component, positively correlated with all the variables of the problem, is a size factor, and that the second principal component, which contrasts (Speed, Power) with (Width, Length and Weight), ranks the models according to their sporty or family character.

The length $R_j$ of the variable vector $B_j$ is the multiple correlation $R(X_j; Y_1, Y_2)$ between the variable $X_j$ and the two principal components. Indeed,

$$\|B_j\|^2 = \mathrm{cor}^2(X_j, Y_1) + \mathrm{cor}^2(X_j, Y_2) = R^2(X_j; Y_1, Y_2),$$

because the variables $Y_1$, $Y_2$ are uncorrelated. For the example we obtain:

Variable | $R_j$
Displacement | 0.96
Power | 0.98
Speed | 0.97
Weight | 0.92
Length | 0.97
Width | 0.93

All the variables are well represented on the correlation circle.

Since the share of inertia explained by the first principal plane is very large, the correlations between variables are well reconstructed using only the first two terms of formula (4):

$$\mathrm{cor}(X_j, X_l) \approx \sum_{h=1}^{2} \lambda_h\,u_{hj}\,u_{hl},$$

which, taking (5) into account, becomes

$$\mathrm{cor}(X_j, X_l) \approx \sum_{h=1}^{2} \mathrm{cor}(X_j, Y_h)\,\mathrm{cor}(X_l, Y_h).$$

The correlation between the variables $X_j$ and $X_l$ can thus be approximated by the scalar product $\langle B_j, B_l \rangle$ of the vectors $B_j$ and $B_l$.

Example: cor(Displacement, Power) is well approximated by

$$\mathrm{cor}(\text{Displacement}, Y_1)\,\mathrm{cor}(\text{Power}, Y_1) + \mathrm{cor}(\text{Displacement}, Y_2)\,\mathrm{cor}(\text{Power}, Y_2) = 0.96 \times 0.89 + 0.03 \times 0.40 = 0.8664$$
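The approximation can be reproduced as a scalar product of two points of the correlation circle. The coordinates below are the correlations quoted in the example for Displacement and Power; the names `b_displacement` and `b_power` are ours.

```python
import math

# Coordinates on the correlation circle: (cor with Y1, cor with Y2),
# as quoted in the worked example.
b_displacement = (0.96, 0.03)
b_power = (0.89, 0.40)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Two-term approximation of cor(Displacement, Power).
approx_cor = dot(b_displacement, b_power)

# Lengths of the variable vectors, i.e. the multiple correlations R_j.
r_displacement = math.sqrt(dot(b_displacement, b_displacement))
r_power = math.sqrt(dot(b_power, b_power))

# Cosine of the angle between the two variable vectors.
cos_angle = approx_cor / (r_displacement * r_power)

print(round(approx_cor, 4), round(r_displacement, 2), round(r_power, 2))
```

The approximation 0.8664 is close to the observed correlation 0.861, and the two lengths agree with the values 0.96 and 0.98 listed for $R_j$.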
Since the cosine of the angle between the vectors $B_j$ and $B_l$ is given by

$$\cos(B_j, B_l) = \frac{\langle B_j, B_l \rangle}{\|B_j\|\,\|B_l\|},$$

the correlation between $X_j$ and $X_l$ can be written approximately as

$$\mathrm{cor}(X_j, X_l) \approx \|B_j\|\,\|B_l\|\,\cos(B_j, B_l).$$

The correlations between the variables $X_j$ are thus approximately reconstructed on the correlation circle from the lengths of the variable vectors and the cosines of the angles between them.

One can check, for example, that the dendrogram of Figure 4 reflects well the relative positions of the variable vectors on the correlation circle.

The biplot

With some care about the scale of the representation, the two plots, the first principal plane and the correlation circle, can be superimposed, which gives an enriched representation.

This simultaneous representation of the individuals and the variables is called a biplot, a term introduced by Gabriel.

We first assume that most of the total inertia is explained by the first principal plane. If this is not the case, the results that follow must be restricted to points that are well represented on the first principal plane and to variables that are very strongly correlated with the first two principal components.

Under these assumptions, the data reconstruction formula

$$x_{ij}^* = \sum_{h=1}^{p} Y_h(i)\,u_{hj}$$

gives a good approximation of the values $x_{ij}^*$ using only the first two dimensions:

$$x_{ij}^* \approx \sum_{h=1}^{2} Y_h(i)\,u_{hj}.$$

Writing $Y_h^* = \frac{Y_h}{\sqrt{\lambda_h}}$ for the reduced principal component $Y_h$ and using the fact that $\mathrm{cor}(X_j, Y_h) = \sqrt{\lambda_h}\,u_{hj}$, this formula becomes

$$x_{ij}^* \approx \sum_{h=1}^{2} Y_h^*(i)\,\mathrm{cor}(X_j, Y_h) \qquad (6)$$

Example. We have

$$x^*_{(\text{Rover},\,\text{Displacement})} = \frac{2675 - 1906.1}{516.79} = 1.49,$$

which is well reconstructed by

$$Y_1^*(\text{Rover})\,\mathrm{cor}(\text{Displacement}, Y_1) + Y_2^*(\text{Rover})\,\mathrm{cor}(\text{Displacement}, Y_2) = \frac{1}{\sqrt{4.656}} \times 3.19 \times 0.96 + \frac{1}{\sqrt{0.9152}} \times 0.77 \times 0.03 = 1.44.$$
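The reconstruction in formula (6) can be checked on the Rover's displacement. All figures in this sketch are taken from the example (raw value, mean, standard deviation, eigenvalues, component values and correlations); the variable names are ours.

```python
import math

# Exact standardized value of the Rover's displacement.
x_star = (2675 - 1906.1) / 516.79

# Ingredients of formula (6), all quoted in the example.
lambda_1, lambda_2 = 4.656, 0.9152
y1_rover, y2_rover = 3.19, 0.77
cor_disp_y1, cor_disp_y2 = 0.96, 0.03

# Reduced principal components Y*_h = Y_h / sqrt(lambda_h).
y1_reduced = y1_rover / math.sqrt(lambda_1)
y2_reduced = y2_rover / math.sqrt(lambda_2)

# Two-dimensional reconstruction of the standardized value.
x_star_approx = y1_reduced * cor_disp_y1 + y2_reduced * cor_disp_y2

print(round(x_star, 2), round(x_star_approx, 2))
```

The gap between 1.49 and 1.44 is the part of the value carried by the four remaining principal axes.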

Formula (6) states that $x_{ij}^*$ is approximately reconstructed by the scalar product of the vectors $A_i^* = (Y_1^*(i), Y_2^*(i))$ and $B_j = (\mathrm{cor}(X_j, Y_1), \mathrm{cor}(X_j, Y_2))$.
Let $P_{ij}$ denote the projection of the vector $A_i^*$ onto the axis $\Delta(B_j)$ spanned by the vector $B_j$.
This notation is illustrated in Figure 9.

The algebraic length $\overline{OP_{ij}}$ equals

$$\overline{OP_{ij}} = \frac{1}{\sqrt{\mathrm{cor}^2(X_j, Y_1) + \mathrm{cor}^2(X_j, Y_2)}}\sum_{h=1}^{2} Y_h^*(i)\,\mathrm{cor}(X_j, Y_h)$$

Figure 9
Individual points and variable axes

Since the denominator equals the multiple correlation $R_j$ between $X_j$ and the first two principal components, we have

$$x_{ij}^* \approx R_j\,\overline{OP_{ij}}.$$

The projections of the individual points $A_i^*$ onto the variable axes $\Delta(B_j)$ therefore have algebraic lengths proportional to the data $x_{ij}^*$. The distribution of the projections $P_{ij}$ on the axis $\Delta(B_j)$ thus reflects well the distribution of the values $x_{ij}^*$ of the variable $X_j^*$ and, consequently, that of the values $x_{ij}$ of the original variable $X_j$.

In Figure 10 we built the biplot, the simultaneous representation of the individuals and the variables, as follows:

- the individuals are represented by the points $A_i^* = (Y_1^*(i), Y_2^*(i))$;
- the variables are represented by the axes $\Delta(B_j)$, placed on the plot using the points $(3\,\mathrm{cor}(X_j, Y_1),\ 3\,\mathrm{cor}(X_j, Y_2))$. The factor 3 was chosen to make the variable points easier to see.

Figure 10
Biplot: simultaneous representation of the individuals and the variables

One can check that the projection of the cars onto the Speed axis reproduces the distribution of the original data well: the projections of the fastest cars (BMW 530i, Renault 25, Audi 90 Quattro) stand clearly apart from those of the slowest (Ford Fiesta, Nissan Vanette, Fiat Uno, VW Caravelle).
Likewise, the projections of the cars onto the Width axis clearly separate the widest car (VW Caravelle) from the narrowest (Fiat Uno).

Presentation of Principal Component Analysis (PCA) following Hotelling's approach

The construction of the principal components presented so far is laborious, but it leads to a very complete set of results.

Hotelling proposed two criteria that give the principal components more directly, at the cost of losing the geometric dimension of the problem.
We present the correlation criterion first, then the variance criterion.

Correlation criterion

We look for m standardized and uncorrelated variables $F_1, \ldots, F_m$ that maximize the criterion

$$\sum_{h=1}^{m}\left[\frac{1}{p}\sum_{j=1}^{p}\mathrm{cor}^2(X_j, F_h)\right] \qquad (7)$$

In other words, we try to summarize the original variables $X_1, \ldots, X_p$ by a smaller number of variables $F_1, \ldots, F_m$ that are uncorrelated with one another and represent the main dimensions of the phenomenon under study.

It can be shown that the maximum of (7) is reached for the variables

$$F_h = Y_h^* = \frac{Y_h}{\sqrt{\lambda_h}},$$

which are precisely the reduced principal components. The value of the maximum is $(\lambda_1 + \cdots + \lambda_m)/p$.

Variance criterion

We look for m variables $Z_1, \ldots, Z_m$ of the form $Z_h = \sum_{j=1}^{p} v_{hj} X_j$, with orthonormal vectors $v_h = (v_{h1}, \ldots, v_{hp})$, that maximize the criterion

$$\sum_{h=1}^{m}\mathrm{Var}(Z_h) \qquad (8)$$

It can be shown that the maximum of (8) is reached for the normalized eigenvectors $v_1, \ldots, v_m$ of the covariance matrix of the variables $X_j$ associated with its m largest eigenvalues $\mu_1, \ldots, \mu_m$, and that its value is $\mu_1 + \cdots + \mu_m$.
Taking m = p gives

$$\mu_1 + \cdots + \mu_p = \sum_{j=1}^{p}\mathrm{Var}(X_j)$$

The sum of the first m eigenvalues is the variance explained by the m variables $Z_1, \ldots, Z_m$.
If we work with the standardized variables $X_1^*, \ldots, X_p^*$, then $Z_h = Y_h$ and we obtain

$$\sum_{h=1}^{m}\mathrm{Var}(Z_h) = \lambda_1 + \cdots + \lambda_m.$$

Ascending hierarchical classification using Ward's criterion

This method provides another way of summarizing the data: building a typology (or partition) of the individuals into classes such that individuals in the same class are similar, while individuals in different classes are dissimilar.

Quality of a typology

Consider a typology of our set of individuals into k classes containing $n_1, \ldots, n_k$ individuals respectively.

Let $G_1, \ldots, G_k$ denote the corresponding typology of the associated cloud of points $N = \{x_1, \ldots, x_n\}$, and $g_1, \ldots, g_k$ the centers of gravity of these classes.
The total inertia of the cloud N decomposes as follows:

$$I(N, g) = \sum_{i=1}^{k}\frac{n_i}{n}\,d^2(g_i, g) + \sum_{i=1}^{k}\frac{n_i}{n}\,I(G_i, g_i).$$

The first term on the right is called the between-class inertia and measures how far the classes are from one another. It is denoted $I(G_1, \ldots, G_k)$ and is the inertia explained by the typology.
The second term on the right is called the within-class inertia and measures the homogeneity of the classes.
The quality of the typology is measured by the ratio of the between-class inertia to the total inertia.
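The decomposition of the total inertia into between-class and within-class parts can be verified on a small example. The points and the partition below are hypothetical, and the helper names are ours.

```python
def centroid(points):
    """Center of gravity of a list of points."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(len(points[0])))

def sqdist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def inertia(points, center):
    """Mean squared distance of the points to a center."""
    return sum(sqdist(p, center) for p in points) / len(points)

# Hypothetical typology of 6 points into 3 classes.
classes = [
    [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)],
    [(5.0, 5.0), (6.0, 5.0)],
    [(10.0, 0.0)],
]
cloud = [p for c in classes for p in c]
n = len(cloud)
g = centroid(cloud)

total = inertia(cloud, g)
between = sum(len(c) / n * sqdist(centroid(c), g) for c in classes)
within = sum(len(c) / n * inertia(c, centroid(c)) for c in classes)
quality = between / total  # share of inertia explained by the typology

print(total, between + within, round(quality, 3))
```

A quality close to 1 means that the classes are compact and well separated; moving a point to the wrong class lowers the ratio.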

Ward's criterion

When, in the typology $G_1, \ldots, G_k$, two classes $G_i$ and $G_j$ are replaced by their union $G_i \cup G_j$, the between-class inertia decreases.
This decrease,

$$D(G_i, G_j) = I(G_1, \ldots, G_i, \ldots, G_j, \ldots, G_k) - I(G_1, \ldots, G_i \cup G_j, \ldots, G_k),$$

can be computed and equals

$$D(G_i, G_j) = \frac{n_i\,n_j}{n\,(n_i + n_j)}\,d^2(g_i, g_j)$$

This criterion, used to measure the distance between two classes $G_i$ and $G_j$, is called Ward's aggregation criterion.

Example: take $G_1 = \{x^*_{\text{Citroen BX}}\}$ and $G_2 = \{x^*_{\text{Peugeot 405}}\}$. We have

$$d^2(x^*_{\text{Citroen BX}}, x^*_{\text{Peugeot 405}}) = \frac{(1769 - 1769)^2}{267072} + \frac{(90 - 90)^2}{1442} + \frac{(182 - 180)^2}{609} + \frac{(1060 - 1080)^2}{50824} + \frac{(424 - 440)^2}{1638} + \frac{(168 - 169)^2}{56} = 0.189$$

$$D(x^*_{\text{Citroen BX}}, x^*_{\text{Peugeot 405}}) = \frac{1 \times 1}{24\,(1 + 1)} \times 0.189 = 0.00393$$
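This worked example translates directly into code. The raw values of the two cars and the variances are those of the example; the function name `ward_distance` is ours. Dividing each squared difference by the variance is equivalent to computing the squared distance on standardized data.

```python
# Raw characteristics of the two cars, in the order
# Displacement, Power, Speed, Weight, Length, Width.
citroen_bx = [1769, 90, 182, 1060, 424, 168]
peugeot_405 = [1769, 90, 180, 1080, 440, 169]
variances = [267072, 1442, 609, 50824, 1638, 56]

# Squared distance between the standardized points.
d2 = sum((a - b) ** 2 / v for a, b, v in zip(citroen_bx, peugeot_405, variances))

def ward_distance(n_i, n_j, n, squared_centroid_distance):
    """Loss of between-class inertia when two classes are merged."""
    return n_i * n_j / (n * (n_i + n_j)) * squared_centroid_distance

# Two singleton classes in a cloud of 24 individuals.
D = ward_distance(1, 1, 24, d2)

print(round(d2, 3), round(D, 5))
```

For singleton classes the centers of gravity are the points themselves, so the criterion reduces to the squared distance divided by 2n.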

Ascending hierarchical classification

The ascending hierarchical classification algorithm is iterative. At a given step, we start from a partition of the set of individuals into k classes $G_1, \ldots, G_k$ and merge the two classes $G_i$ and $G_j$ that minimize Ward's criterion $D(G_i, G_j)$.
During this iteration the between-class inertia therefore decreases by $D(G_i, G_j)$. At the initial step each individual forms its own class, and the total inertia then equals the between-class inertia.

At the final step only one class remains, so the between-class inertia is zero. The sum of the losses of between-class inertia over all the steps therefore equals the total inertia. At each step an index is computed by dividing the loss of between-class inertia by the total inertia.

The typology retained is the one obtained at the step where the index increases sharply.
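The algorithm can be sketched as follows. This is a deliberately naive illustration, not the implementation used by the programs mentioned in the text: at each step it recomputes Ward's criterion for every pair of classes. The toy points are hypothetical and the function names are ours.

```python
def centroid(points):
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(len(points[0])))

def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def ward_clustering(points):
    """Merge classes two by two, minimizing Ward's criterion at each step.

    Returns the list of losses of between-class inertia, one per merge.
    """
    n = len(points)
    clusters = [[p] for p in points]  # initially, one class per individual
    losses = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                na, nb = len(clusters[a]), len(clusters[b])
                d = (na * nb / (n * (na + nb))
                     * sqdist(centroid(clusters[a]), centroid(clusters[b])))
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
        losses.append(d)
    return losses

# Hypothetical cloud of 5 points in the plane.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0), (10.0, 0.0)]
g = centroid(points)
total_inertia = sum(sqdist(p, g) for p in points) / len(points)

losses = ward_clustering(points)
indices = [d / total_inertia for d in losses]  # aggregation index per step

print([round(100 * i, 1) for i in indices])
```

The losses add up to the total inertia, as stated above, and a sharp jump in the sequence of indices signals a merge of two clearly distinct classes.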

Application

We carried out an ascending hierarchical classification of the standardized data of the example using Ward's criterion.
Table 7 shows how the algorithm unfolds, and the results are visualized with the dendrogram (classification tree) of Figure 11.

In the first step, the Citroen BX and Peugeot 405 models are merged; their Ward distance is $D(x_6^*, x_4^*) = 0.00393$.

The aggregation index is $D(x_6^*, x_4^*)/6$, that is, the Ward loss divided by the total inertia p = 6, or about 0.07%. It appears in the dendrogram at the level of the Citroen BX, the element that precedes the other one. The new class is numbered 25.

In the second step, class 26 is obtained by merging the Ford Sierra and the Peugeot 405 Break; its aggregation index $D(x_{12}^*, x_{11}^*)/6$ equals 0.15%.

In the third step, class 27 is built by merging a Renault model (individual 2) with class 25; its aggregation index $D(x_2^*, (x_4^*, x_6^*))/6$ equals 0.19%.
The algorithm proceeds in the same way until the last step, in which class 43 of small cars (Civic, Seat, ..., Fiesta) is merged with class 46, formed by the rest of the sample.

Cumulating Ward's criterion from the last iteration backwards gives the inertia explained by the successive typologies. Indeed, at the last iteration we have

I(43,46) = D(43,46) = 3.07202

since the inertia I(47) explained by class 47, which contains all the observations, is zero. Consequently, the inertia explained by the two-class typology (43, 46) equals 3.07202, and the share of explained inertia is 3.07202/6 = 51.2%.

Class 46 was formed by merging classes 44 (Audi, BMW, ..., BMW) and 45 (Tipo, R19, ..., VW Caravelle).

From D(44,45) = I(43,44,45) - I(43,46) we deduce:

I(43,44,45) = D(43,46) + D(44,45) = 3.07202 + 1.42919 = 4.50121

and the share of inertia explained by this three-class typology is 4.50121/6 = 75.0%.
Continuing towards the beginning of the algorithm, class 45 comes from merging the classes {Espace, Omega, ..., VW Caravelle} and {Tipo, R19, ..., R21}, numbered 38 and 42 in Table 7.

From D(38,42) = I(43,44,38,42) - I(43,44,45) we deduce:
I(43,44,38,42) = D(43,46) + D(44,45) + D(38,42) =
= 3.07202 + 1.42919 + 0.29270 = 4.79391.
Since the explained inertia increases only slightly when moving from the three-class typology (75.0%) to the four-class typology (79.9%), we adopt the three-class typology of the data.
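The shares of explained inertia used to choose the number of classes can be recomputed from the last three Ward losses. The sketch uses the three values of the example and the fact that the total inertia of standardized data equals p = 6; the variable names are ours.

```python
# Ward losses of the last three merges, from the last one backwards:
# D(43,46), D(44,45), D(38,42).
last_losses = [3.07202, 1.42919, 0.29270]
total_inertia = 6  # number of standardized variables

# Cumulating the losses backwards gives the inertia explained by the
# typologies in 2, 3 and 4 classes.
explained = []
running = 0.0
for loss in last_losses:
    running += loss
    explained.append(running)

shares = [e / total_inertia for e in explained]
gains = [b - a for a, b in zip(shares, shares[1:])]  # gain per extra class

print([round(100 * s, 1) for s in shares])
```

Going from 2 to 3 classes adds about 24 points of explained inertia, while going from 3 to 4 adds fewer than 5, which is the argument for stopping at three classes.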

Table 7
Ascending hierarchical classification: description of the classes formed
Class | Preceding element | Following element | Number of elements | Ward criterion | Index (%)

Table 8
Means of the variables by class and Fisher test

Figure 11
Dendrogram

Figure 12
Visualization of the three-class typology

To interpret this typology more precisely, we plotted it on the principal plane in Figure 12 and built Table 8, in which the variables are arranged in decreasing order of the Fisher test between each variable and the typology.

The class of small cars corresponds to class 43:
Honda Civic, Seat Ibiza SXi, Citroen AX Sport, Peugeot 205 Rallye, Peugeot 205, Fiat Uno, Ford Fiesta.

The class of medium cars corresponds to class 45:
Fiat Tipo, Renault 19, Citroen BX, Peugeot 405, Renault 21, Espace, Opel Omega, Ford Sierra, Peugeot 405 Break, Nissan Vanette, VW Caravelle.

The class of large cars corresponds to class 44:
Audi 90 Quattro, BMW 325ix, Ford Scorpio, Renault 25, BMW 530i, Rover 827i.
Conclusion

This chapter has presented all the elements needed to interpret the output of a principal component analysis program. The Statgraphics and SPAD.N programs were used to process the example.

For readers who wish to learn more about Principal Component Analysis, at both the theoretical and the practical level, we recommend the following works: Bouroche and Saporta; Jackson; Jolliffe; Lebart, Morineau and Fénélon; Lebart, Morineau and Tabard (1977); Saporta; Saporta and Ștefănescu.

As regards methods for building typologies, we recommend in particular Everitt and the ACECLUS, CLUSTER and FASTCLUS procedures of the SAS program.

BIBLIOGRAPHY

Michel Tenenhaus, Méthodes Statistiques en Gestion, Dunod, Paris, 1994.
Gilbert Saporta, Viorica Ștefănescu, Analiza datelor și Informatică, Editura Economică, 1996.
