MEETING 6
WILDAN BUDIAWAN Z
OUTLINE
What is classification?
Preparing and comparing classification methods/algorithms
CLASSIFICATION OVERVIEW
Databases are rich with hidden information that can be used to make intelligent
business decisions.
Classification and prediction are two forms of data analysis that can be used to
extract models describing important data classes or to predict future data trends.
A model is built describing a predetermined set of data classes or concepts.
The individual tuples making up the training set used to build it are referred to as
training samples and are randomly selected from the sample population.
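The random selection of training samples can be sketched in a few lines of Python. This is only an illustration: the function name `draw_training_samples`, the toy `population`, and the fixed `seed` are assumptions made for the example and do not come from the slides.

```python
import random

def draw_training_samples(population, n_train, seed=0):
    """Randomly pick n_train tuples as training samples; return (training, remaining)."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    chosen = set(rng.sample(range(len(population)), n_train))
    training = [t for i, t in enumerate(population) if i in chosen]
    remaining = [t for i, t in enumerate(population) if i not in chosen]
    return training, remaining

# Hypothetical sample population of (id, class label) tuples.
population = [(i, "Yes" if i % 2 == 0 else "No") for i in range(10)]
training, remaining = draw_training_samples(population, n_train=7)
print(len(training), len(remaining))
```

The tuples that are not drawn can be held back to estimate the accuracy of the model once it has been built.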
Preparing the data for classification:
Data cleaning
Criteria for comparing classification methods:
Accuracy
Speed
Robustness
Scalability
Interpretability
DATA MINING MEETING 6
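$$\text{similarity}(T, S) = \frac{\sum_{i=1}^{n} f(T_i, S_i)\, w_i}{\sum_{i=1}^{n} w_i}$$
T: new case
S: stored case
n: number of attributes in each case
i: index of an individual attribute, from 1 to n
f: closeness function for attribute i between cases T and S
w_i: weight given to attribute i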
Case  Gender     Education  Religion   Problematic
1     L          S1         Islam      Yes
2     (missing)  SMA        Christian  No
3     (missing)  SMA        Islam      No
(L = male, P = female; S1 = bachelor's degree, SMA = senior high school)
Attribute weights
Attribute  Weight
Gender     0.5
Education  1
Religion   0.75

Closeness of Gender values
Value 1  Value 2  Closeness
L        L        1
P        P        1
L        P        0.5
P        L        0.5

Closeness of Education values
Value 1  Value 2  Closeness
S1       S1       1
SMA      SMA      1
S1       SMA      0.4
SMA      S1       0.4

Closeness of Religion values
Islam and Christian: 0.75 (the value used in the calculation for case 1)
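Calculate
NEW CASE
Gender: L
Education: SMA
Religion: Christian
Class: ???

Closeness of the new case to case 1
a: closeness of the Gender values (L and L) = 1
b: weight of the Gender attribute = 0.5
c: closeness of the Education values (SMA and S1) = 0.4
d: weight of the Education attribute = 1
e: closeness of the Religion values (Christian and Islam) = 0.75
f: weight of the Religion attribute = 0.75

$$\text{similarity} = \frac{(a \times b) + (c \times d) + (e \times f)}{b + d + f} = \frac{(1 \times 0.5) + (0.4 \times 1) + (0.75 \times 0.75)}{0.5 + 1 + 0.75} = \frac{1.4625}{2.25} = 0.65$$

Similarity = 0.65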
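The weighted similarity between the new case and case 1 can be checked with a short Python sketch. The dictionary layout and the names `WEIGHTS`, `CLOSENESS`, `closeness`, and `similarity` are assumptions made for illustration. The gender of case 1 (L) and the Islam/Christian closeness (0.75) are read off the worked calculation, and treating identical values as closeness 1 is assumed to hold for Religion as it does for the other two attributes.

```python
# Attribute weights from the weight table.
WEIGHTS = {"gender": 0.5, "education": 1.0, "religion": 0.75}

# Closeness for pairs of different values; identical values are scored 1.
CLOSENESS = {
    "gender": {frozenset(["L", "P"]): 0.5},
    "education": {frozenset(["S1", "SMA"]): 0.4},
    # Assumption: only this pair is recoverable, taken from the worked calculation.
    "religion": {frozenset(["Islam", "Christian"]): 0.75},
}

def closeness(attribute, value_a, value_b):
    """Closeness of two values of one attribute (symmetric in its arguments)."""
    if value_a == value_b:
        return 1.0
    return CLOSENESS[attribute][frozenset([value_a, value_b])]

def similarity(new_case, stored_case):
    """Weighted sum of attribute closeness divided by the sum of the weights."""
    weighted = sum(
        closeness(name, new_case[name], stored_case[name]) * weight
        for name, weight in WEIGHTS.items()
    )
    return weighted / sum(WEIGHTS.values())

new_case = {"gender": "L", "education": "SMA", "religion": "Christian"}
case_1 = {"gender": "L", "education": "S1", "religion": "Islam"}  # gender inferred

print(round(similarity(new_case, case_1), 2))  # 0.65
```

In a nearest-neighbor classifier the same function would be evaluated against every stored case, and the new case would take the class of the stored case with the highest similarity.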
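$$P(H \mid X) = \frac{P(X \mid H)\, P(H)}{P(X)}$$
Where:
X: data sample whose class is unknown
H: hypothesis that X belongs to a specific class
P(H|X): probability of hypothesis H given X (posterior probability)
P(H): probability of hypothesis H (prior probability)
P(X|H): probability of X given hypothesis H
P(X): probability of X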
Income  Student  Credit Rating  Buy Computer
High    No       Fair           No
High    No       Excellent      No
High    No       Fair           Yes
Medium  No       Fair           Yes
Low     Yes      Fair           Yes
Low     Yes      Excellent      No
Low     Yes      Excellent      Yes
Medium  No       Fair           No
Low     Yes      Fair           Yes
Medium  Yes      Fair           Yes
Medium  Yes      Excellent      Yes
Medium  No       Excellent      Yes
Low     Yes      Fair           Yes
Medium  No       Excellent      No
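The class probabilities and the per-attribute conditional probabilities can be counted directly from the table. The sketch below is a minimal illustration; the helper names `prior` and `likelihood` are assumptions. It covers only the three attribute columns listed in the table, since the Age values are not available there.

```python
# Columns copied from the table, one entry per row.
income = ["High", "High", "High", "Medium", "Low", "Low", "Low",
          "Medium", "Low", "Medium", "Medium", "Medium", "Low", "Medium"]
student = ["No", "No", "No", "No", "Yes", "Yes", "Yes",
           "No", "Yes", "Yes", "Yes", "No", "Yes", "No"]
credit = ["Fair", "Excellent", "Fair", "Fair", "Fair", "Excellent", "Excellent",
          "Fair", "Fair", "Fair", "Excellent", "Excellent", "Fair", "Excellent"]
buys = ["No", "No", "Yes", "Yes", "Yes", "No", "Yes",
        "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"]

def prior(labels, cls):
    """P(class): share of rows that carry the given class label."""
    return labels.count(cls) / len(labels)

def likelihood(column, value, labels, cls):
    """P(attribute = value | class): share of the class's rows with that value."""
    in_class = [v for v, label in zip(column, labels) if label == cls]
    return in_class.count(value) / len(in_class)

print(round(prior(buys, "Yes"), 3))  # 0.643
print(round(prior(buys, "No"), 3))   # 0.357
for cls in ("Yes", "No"):
    print(cls,
          likelihood(income, "Medium", buys, cls),
          likelihood(student, "Yes", buys, cls),
          likelihood(credit, "Fair", buys, cls))
```

The two priors agree with the values 0.643 and 0.357 used in the class calculation.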
New case:
Age: <=30
Income: Medium
Student: Yes
Credit Rating: Fair
Class: ???
Compute the class probabilities
Compute the probabilities for attribute Age
Compute the probabilities for attribute Income
Compute the probabilities for attribute Student
Compute the probabilities for attribute Credit Rating
Calculate the class
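The per-attribute probabilities are combined by multiplication. This rests on the naive Bayes assumption that the attributes are conditionally independent given the class, which the slides do not state explicitly. The symbol C for the class and x_k for the k-th attribute value are notation introduced here. For the new case this gives:

```latex
P(X \mid C) = \prod_{k=1}^{4} P(x_k \mid C)
            = P(\text{Age} \le 30 \mid C)\,
              P(\text{Income} = \text{Medium} \mid C)\,
              P(\text{Student} = \text{Yes} \mid C)\,
              P(\text{Credit Rating} = \text{Fair} \mid C)
```

The product is evaluated once for C = Yes and once for C = No, and each result is then multiplied by the class probability P(C).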
Calculate
Class = Yes:
P(X | Buy Computer = Yes) * P(Buy Computer = Yes) = 0.044 * 0.643 = 0.028
Class = No:
P(X | Buy Computer = No) * P(Buy Computer = No) = 0.019 * 0.357 = 0.007
Because 0.028 > 0.007, the new case is classified as Buy Computer = Yes.
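The final comparison can be sketched in Python using the four numbers from the calculation above. The variable names are illustrative, and the normalization step at the end is an addition that does not appear in the slides. It divides each score by their sum, which plays the role of P(X) in Bayes' theorem.

```python
# P(X | class) * P(class), with the values taken from the calculation above.
scores = {
    "Yes": 0.044 * 0.643,
    "No": 0.019 * 0.357,
}

# The predicted class is the one with the largest score.
predicted = max(scores, key=scores.get)

# Optional: dividing by the sum of the scores gives posterior probabilities.
total = sum(scores.values())
posteriors = {cls: score / total for cls, score in scores.items()}

print(predicted)
print({cls: round(score, 3) for cls, score in scores.items()})
print({cls: round(p, 2) for cls, p in posteriors.items()})
```

Normalizing does not change which class wins, since both scores are divided by the same P(X). It only turns the scores into values that sum to 1.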