
Computational Intelligence Project

The data are for the year 2012.


Attributes:
Country area (Suprafata)
Country population (Populatie)
GDP (PIB)
GDP per capita
GDP growth rate (measured in percent)
Inflation = inflation rate (%)
Trade = sum of exports and imports of goods and services (% of GDP)
Employment rate


Sources:
http://data.worldbank.org/indicator/NY.GDP.MKTP.KD.ZG/countries
http://ec.europa.eu/eurostat/tgm/table.do?tab=table&language=en&pcode=tsdec420&tableSelection=3&footnotes=yes&labeling=labels
http://ec.europa.eu/eurostat/tgm/table.do?tab=table&init=1&language=en&pcode=tec00115&plugin=1

1. Compute the descriptive statistics of the data set: mean, dispersion,
variances, covariance and correlation matrices, histograms.
2. Identify possible dependencies between variables: regression equations,
the estimated coefficients of the regression lines, plots of the regression
lines, and interpretations.
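# Read the tab-separated data file; the first row holds the column names
a <- read.table("tariue.txt", header = TRUE, sep = "\t")
attach(a)   # make the columns accessible by name
summary(a)  # minimum, quartiles, mean and maximum of each column

# Histogram of each attribute
hist(Suprafata)
hist(Populatie)
hist(GDP)
hist(GDPgrowthrate)
hist(GDPpercapita)
hist(Employmentrate)
hist(Trade)
hist(Inflation)

# Regression line: dependence of the employment rate on inflation
lin <- lm(Employmentrate ~ Inflation)
summary(lin)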
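Task 1 also asks for variances and for the covariance and correlation matrices, which `summary()` does not report. The following is a minimal sketch of how they could be computed in base R. It does not read `tariue.txt`; instead it assumes a small illustrative data frame `d` (a hypothetical name) built from the first six rows of the Suprafata, Populatie and PIB values listed later in this document.

```r
# Illustrative subset (assumption): first six countries from the data listed in this document
d <- data.frame(
  Suprafata = c(83879, 30528, 110899, 9251, 87661, 42915),
  Populatie = c(8408121, 11094850, 7327224, 82011, 4275984, 5580516),
  PIB       = c(307, 375.881, 39.927, 17.72, 43.682, 245.252)
)

means   <- colMeans(d)      # mean of each variable
vars    <- sapply(d, var)   # sample variance of each variable
sds     <- sapply(d, sd)    # sample standard deviation of each variable
cov_mat <- cov(d)           # covariance matrix
cor_mat <- cor(d)           # correlation matrix (Pearson)

print(means)
print(vars)
print(cov_mat)
print(round(cor_mat, 3))
```

The diagonal of `cov_mat` holds the same variances as `vars`, and the off-diagonal entries of `cor_mat` give a first indication of which pairs of variables are worth a regression in task 2.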

3. PCA: eigenvectors, eigenvalues, criteria for selecting the principal
components, scree plot, score matrix, biplot: plots + interpretations.
4. SVM: building the training and test sets; various forms of the kernel
function (linear, polynomial, sigmoid, radial); number of support vectors in
each case; predictions, confusion matrix, accuracy rate of the model in each
case, Cohen's kappa coefficient.
5. Cluster analysis: k-means, k-medoids, fuzzy clustering, hierarchical
clustering, dendrograms, plots, interpretations; various values for the
number of clusters, comments on the cluster silhouettes, confusion matrix,
accuracy rate of the model in each case.
6. Decision trees.
7. SOM: building the maps.
8. Naive Bayes classifier.
9. Neural networks: prediction, accuracy rate of the model, Cohen's kappa
coefficient.
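The PCA items in task 3 (eigenvalues, eigenvectors, selection criteria, scree plot, scores, biplot) can be sketched with base R's `prcomp`. This is only an illustration under stated assumptions: the data frame `d` is a hypothetical six-row subset of the Suprafata, Populatie and PIB values listed in this document, the variables are standardized first, and the Kaiser rule (keep eigenvalues greater than 1) is used as one possible selection criterion.

```r
# Illustrative subset (assumption): first six countries from the data listed in this document
d <- data.frame(
  Suprafata = c(83879, 30528, 110899, 9251, 87661, 42915),
  Populatie = c(8408121, 11094850, 7327224, 82011, 4275984, 5580516),
  PIB       = c(307, 375.881, 39.927, 17.72, 43.682, 245.252)
)

# PCA on centered and scaled variables (equivalent to using the correlation matrix)
pca <- prcomp(d, center = TRUE, scale. = TRUE)

eig       <- pca$sdev^2        # eigenvalues
eigvec    <- pca$rotation      # eigenvectors (loadings), one column per component
explained <- eig / sum(eig)    # proportion of variance explained by each component
keep      <- which(eig > 1)    # Kaiser criterion: components with eigenvalue > 1
scores    <- pca$x             # score matrix: observations in component coordinates

print(eig)
print(round(cumsum(explained), 3))
screeplot(pca, type = "lines") # scree plot
biplot(pca)                    # observations and variable loadings on PC1 and PC2
```

Because the variables are standardized, the eigenvalues sum to the number of variables, so each one can be read directly as "how many original variables' worth of variance" a component carries.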

# Data set for the 28 EU member states (country names kept in Romanian)
fisier <- data.frame(
  Tari = c("Austria", "Belgia", "Bulgaria", "Cipru", "Croatia", "Danemarca",
           "Estonia", "Finlanda", "Franta", "Germania", "Grecia", "Irlanda",
           "Italia", "Letonia", "Lituania", "Luxemburg", "Malta", "Polonia",
           "Portugalia", "Regatul Unit", "Republica Ceha", "Romania",
           "Slovacia", "Slovenia", "Spania", "Suedia", "Tarile de Jos",
           "Ungaria"),
  # Area
  Suprafata = c(83879, 30528, 110899, 9251, 87661, 42915, 45227, 338434,
                632833, 357137, 131957, 69797, 301336, 64562, 65300, 2586,
                316, 312679, 92211, 248527, 78866, 238390, 49036, 20273,
                505990, 438575, 41540, 93023),
  # Population
  Populatie = c(8408121, 11094850, 7327224, 82011, 4275984, 5580516,
                1325217, 5401267, 65287861, 80327900, 11123034, 4582707,
                59394207, 2044813, 3003641, 524853, 417546, 38538447,
                10542398, 63495303, 10505445, 20095996, 5404322, 2055496,
                46818219, 9482855, 16730348, 9931925),
  # GDP
  PIB = c(307, 375.881, 39.927, 17.72, 43.682, 245.252, 17.415, 192.35,
          2032, 2666, 193.347, 163.938, 1567, 22.257, 32.94, 42.918, 6.88,
          381.48, 165.107, 1933, 152.926, 131.579, 71.096, 35.319, 1029,
          407.82, 599.338, 96.98),
  # GDP growth rate (%)
  RataPIB = c(0.9, 1.6, 2, -2.4, -2.2, -0.7, 4.7, -1.5, 0.3, 0.4, -6.6, -0.3,
              -2.3, 4.8, 3.8, 0.2, 2.5, 1.8, -3.3, 0.7, 0, 0.6, 1.6, -2.6,
              -2.1, -0.3, -1.6, -1.5),
  # Employment rate (%)
  RataAngajare = c(70.3, 61.7, 60, 64.8, 50.2, 72.2, 69.4, 72.5, 65.1, 71.5,
                   45.2, 59.4, 50.5, 6.4, 67.9, 64.1, 46.6, 57.5, 63, 68.4,
                   62.5, 56.3, 57.3, 64.6, 54.6, 76.8, 71.9, 56.4),
  # Inflation rate (%)
  RataInflatie = c(2.6, 2.6, 2.4, 3.1, 3.4, 2.4, 4.2, 3.2, 2.2, 2.1, 1, 1.9,
                   3.3, 2.3, 3.2, 2.9, 3.2, 3.7, 2.8, 2.8, 3.5, 3.4, 3.7,
                   2.8, 2.4, 0.9, 2.8, 5.7)
)
fisier
set.seed(5)

# The data are clustered with the k-means algorithm,
# using 3 clusters and at most 15 iterations.
km <- kmeans(fisier[, 2:4], 3, 15)

print(km)
# Pairwise scatter plots of the clustered variables, colored by cluster
plot(fisier[, 2:4], col = km$cluster)
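Task 5 asks for several values of the number of clusters. A minimal sketch of such a comparison with base R's `kmeans` follows. It rests on assumptions not in the original: a hypothetical six-row subset `d` of the Suprafata, Populatie and PIB values listed in this document, standardized variables (so that the population column, which is several orders of magnitude larger, does not dominate the distances), and the total within-cluster sum of squares as the comparison measure.

```r
# Illustrative subset (assumption): first six countries from the data listed in this document
d <- data.frame(
  Suprafata = c(83879, 30528, 110899, 9251, 87661, 42915),
  Populatie = c(8408121, 11094850, 7327224, 82011, 4275984, 5580516),
  PIB       = c(307, 375.881, 39.927, 17.72, 43.682, 245.252)
)

set.seed(5)
x  <- scale(d)   # standardize: mean 0 and standard deviation 1 per column
ks <- 2:4        # candidate numbers of clusters

# Fit k-means for each k, with several random starts to avoid poor local optima
fits <- lapply(ks, function(k) kmeans(x, centers = k, iter.max = 15, nstart = 25))

# Total within-cluster sum of squares for each k (smaller means tighter clusters)
wss <- sapply(fits, function(f) f$tot.withinss)
names(wss) <- ks
print(wss)

# Cluster assignment for k = 3
print(fits[[2]]$cluster)
```

Plotting `wss` against `ks` gives the usual "elbow" view; on the full 28-country data set the same loop would apply with `fisier[, 2:4]` in place of `d`.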

4. SVM
library(e1071)
library(scales)
data <- read.table(file = "Tari.txt", header = TRUE, sep = "\t")
data
train <- data         # the training set is the initial data set itself
t <- ncol(train)      # t = number of columns of the training set
t
target <- train[, t]  # target holds the last column of the training set, which
                      # contains the qualitative variable (calitate => Ratainf)
model <- svm(Ratai ~ ., data = train)
model
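Task 4 also asks for the confusion matrix, the accuracy rate and Cohen's kappa. The sketch below shows how these could be computed in base R once predicted class labels are available (with e1071 they would typically come from `predict(model, newdata)`). The two label vectors are hypothetical values invented for illustration; they are not results from this project.

```r
# Hypothetical actual and predicted class labels (illustration only, not project results)
lv        <- c("low", "high")
actual    <- factor(c("low", "low", "high", "high", "low", "high", "low", "high"), levels = lv)
predicted <- factor(c("low", "high", "high", "high", "low", "low", "low", "high"), levels = lv)

# Confusion matrix: rows are actual classes, columns are predicted classes
cm <- table(Actual = actual, Predicted = predicted)

# Accuracy rate: share of observations on the diagonal
accuracy <- sum(diag(cm)) / sum(cm)

# Agreement expected by chance, from the row and column totals
expected <- sum(rowSums(cm) * colSums(cm)) / sum(cm)^2

# Cohen's kappa: agreement beyond chance, rescaled so that 1 means perfect agreement
cohen_kappa <- (accuracy - expected) / (1 - expected)

print(cm)
print(accuracy)
print(cohen_kappa)
```

The same three quantities can be recomputed for each kernel (linear, polynomial, sigmoid, radial) to compare the models on equal terms.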
NEURAL NETWORK

1-2.
Call:
lm(formula = PIB ~ Suprafata)

Residuals:
      1       2       3       4       5       6
 154.90  178.65  -89.32 -197.50 -105.22   58.50

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  2.230e+02  1.381e+02   1.615    0.182
Suprafata   -8.458e-04  1.958e-03  -0.432    0.688

Residual standard error: 171.3 on 4 degrees of freedom
Multiple R-squared:  0.04458,  Adjusted R-squared:  -0.1943
F-statistic: 0.1867 on 1 and 4 DF,  p-value: 0.688
Interpretation: the equation PIB = 2.230e+02 - 8.458e-04 * Suprafata +
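The following sketch fits the same kind of regression and plots the regression line. The output above reports six residuals and 4 degrees of freedom, so it was fitted on six observations; the source does not say which ones. As an assumption, the sketch uses the first six countries from the data listed in this document, for which the estimated coefficients come out close to those reported. The name `first6` is hypothetical.

```r
# Assumption: the six observations are the first six countries in the data set
first6 <- data.frame(
  Suprafata = c(83879, 30528, 110899, 9251, 87661, 42915),
  PIB       = c(307, 375.881, 39.927, 17.72, 43.682, 245.252)
)

# Simple linear regression of GDP (PIB) on area (Suprafata)
fit <- lm(PIB ~ Suprafata, data = first6)
print(coef(fit))   # intercept and slope of the regression line

# Scatter plot with the fitted regression line
plot(first6$Suprafata, first6$PIB, xlab = "Suprafata", ylab = "PIB")
abline(fit)

# Slope p-value and R-squared, used to judge the strength of the linear relation
s       <- summary(fit)
p_value <- s$coefficients["Suprafata", "Pr(>|t|)"]
r2      <- s$r.squared
print(c(p_value = p_value, r2 = r2))
```

With a slope p-value far above 0.05 and an R-squared near zero, these six observations give no evidence of a linear dependence of PIB on Suprafata.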