Type of method
Infrastructure, preparation, exploration, analysis, interpretation - exploration
Supervised, unsupervised
Classification - prediction
Process

Training Data

RANK            YEARS  TENURED
Assistant Prof  3      no
Assistant Prof  7      yes
Professor       2      yes
Associate Prof  7      yes
Assistant Prof  6      no
Associate Prof  3      no

Classifier (Model)

IF rank = professor OR years > 6 THEN tenured = yes

Unseen Data

NAME     RANK            YEARS  TENURED
Tom      Assistant Prof  2      no
Merlisa  Associate Prof  7      no
George   Professor       5      yes
Joseph   Assistant Prof  7      yes

(Jeff, Professor, 4)
Tenured?
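The process above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `predict_tenured` is made up here, and the rule is read as predicting "yes" when either condition holds and "no" otherwise. The rows are the unseen examples from the slide.

```python
# Model from the slide: IF rank = professor OR years > 6 THEN tenured = yes.
# Anything that does not match the rule is assumed to be "no".
def predict_tenured(rank: str, years: int) -> str:
    return "yes" if rank.lower() == "professor" or years > 6 else "no"

# Unseen (test) rows from the slide: (name, rank, years, actual label).
test_rows = [
    ("Tom", "Assistant Prof", 2, "no"),
    ("Merlisa", "Associate Prof", 7, "no"),
    ("George", "Professor", 5, "yes"),
    ("Joseph", "Assistant Prof", 7, "yes"),
]

# Apply the model to each test row and compare with the known label.
predictions = [predict_tenured(rank, years) for _, rank, years, _ in test_rows]
correct = sum(p == row[3] for p, row in zip(predictions, test_rows))
accuracy = correct / len(test_rows)

# Classify the new, unlabeled example (Jeff, Professor, 4).
jeff = predict_tenured("Professor", 4)
print(predictions, accuracy, jeff)
```

On these four rows the rule misclassifies Merlisa (7 years, not tenured), which is why a model is evaluated on data it was not trained on.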
Decision trees
Information gain:

$$Info(D) = -\sum_{i=1}^{m} p_i \log_2(p_i)$$

$$Info_A(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|} \times Info(D_j)$$
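As a worked example, the two quantities above can be computed for the attribute rank on the six training rows of the tenure example. This is a minimal sketch: the function names `info` and `info_after_split` are made up for illustration, and the gain is taken as Info(D) minus Info_A(D), the usual definition, which the slide does not spell out.

```python
from collections import Counter
from math import log2

def info(labels):
    """Info(D): entropy of the class labels, in bits."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def info_after_split(rows, attr_index):
    """Info_A(D): entropy of each partition D_j, weighted by |D_j| / |D|."""
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attr_index], []).append(row[-1])
    return sum(len(part) / len(rows) * info(part) for part in partitions.values())

# Training rows from the tenure example: (rank, years, tenured).
rows = [
    ("Assistant Prof", 3, "no"),
    ("Assistant Prof", 7, "yes"),
    ("Professor", 2, "yes"),
    ("Associate Prof", 7, "yes"),
    ("Assistant Prof", 6, "no"),
    ("Associate Prof", 3, "no"),
]

info_d = info([row[-1] for row in rows])         # 3 yes, 3 no
gain_rank = info_d - info_after_split(rows, 0)   # assumed: Gain = Info(D) - Info_A(D)
print(round(info_d, 4), round(gain_rank, 4))
```

With three "yes" and three "no" labels, Info(D) is 1 bit; splitting on rank lowers the expected information to about 0.79 bits, a gain of roughly 0.21.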
Concepts
Overfitting
Pruning: postpruning and prepruning
Naïve Bayes
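Bayes theorem:

$$P(C_i \mid X) = \frac{P(X \mid C_i)\,P(C_i)}{P(X)}$$

Independence assumption:

$$P(X \mid C_i) = \prod_{k=1}^{n} P(x_k \mid C_i) = P(x_1 \mid C_i) \times P(x_2 \mid C_i) \times \dots \times P(x_n \mid C_i)$$

For a continuous attribute, the conditional probability is estimated with a Gaussian density:

$$g(x_i, \mu_i, \sigma_i) = \frac{1}{\sqrt{2\pi}\,\sigma_i}\, e^{-\frac{(x_i - \mu_i)^2}{2\sigma_i^2}}$$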
Try it:
Outlook   Temp  Humidity  Windy  Play
Sunny     Hot   High      False  No
Sunny     Hot   High      True   No
Overcast  Hot   High      False  Yes
Rainy     Mild  High      False  Yes
Rainy     Cool  Normal    False  Yes
Rainy     Cool  Normal    True   No
Overcast  Cool  Normal    True   Yes
Sunny     Mild  High      False  No
Sunny     Cool  Normal    False  Yes
Rainy     Mild  Normal    False  Yes
Sunny     Mild  Normal    True   Yes
Overcast  Mild  High      True   Yes
Overcast  Hot   Normal    False  Yes
Rainy     Mild  High      True   No
Outlook      Yes  No   Yes   No
Sunny        2    3    2/9   3/5
Overcast     4    0    4/9   0/5
Rainy        3    2    3/9   2/5

Temperature  Yes  No   Yes   No
Hot          2    2    2/9   2/5
Mild         4    2    4/9   2/5
Cool         3    1    3/9   1/5

Humidity     Yes  No   Yes   No
High         3    4    3/9   4/5
Normal       6    1    6/9   1/5

Windy        Yes  No   Yes   No
False        6    2    6/9   2/5
True         3    3    3/9   3/5

Play         Yes  No   Yes   No
             9    5    9/14  5/14
Outlook  Temp.  Humidity  Windy  Play
Sunny    Cool   High      True   ?
Likelihood of the two classes:
For yes = 2/9 × 3/9 × 3/9 × 3/9 × 9/14 = 0.0053
For no = 3/5 × 1/5 × 4/5 × 3/5 × 5/14 = 0.0206
Conversion into a probability by normalization:
P(yes) = 0.0053 / (0.0053 + 0.0206) = 0.205
P(no) = 0.0206 / (0.0053 + 0.0206) = 0.795
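The calculation above can be checked with a short Python sketch that counts directly from the 14 weather rows. The function name `class_scores` and the tuple layout are assumptions made for illustration; exact fractions are used so that rounding happens only at the end.

```python
from collections import Counter
from fractions import Fraction

# The 14 weather rows: (outlook, temp, humidity, windy, play).
data = [
    ("Sunny", "Hot", "High", False, "No"),
    ("Sunny", "Hot", "High", True, "No"),
    ("Overcast", "Hot", "High", False, "Yes"),
    ("Rainy", "Mild", "High", False, "Yes"),
    ("Rainy", "Cool", "Normal", False, "Yes"),
    ("Rainy", "Cool", "Normal", True, "No"),
    ("Overcast", "Cool", "Normal", True, "Yes"),
    ("Sunny", "Mild", "High", False, "No"),
    ("Sunny", "Cool", "Normal", False, "Yes"),
    ("Rainy", "Mild", "Normal", False, "Yes"),
    ("Sunny", "Mild", "Normal", True, "Yes"),
    ("Overcast", "Mild", "High", True, "Yes"),
    ("Overcast", "Hot", "Normal", False, "Yes"),
    ("Rainy", "Mild", "High", True, "No"),
]

def class_scores(instance):
    """Return P(C) * product of P(x_k | C) for each class C (unnormalized)."""
    class_counts = Counter(row[-1] for row in data)
    scores = {}
    for label, n_label in class_counts.items():
        score = Fraction(n_label, len(data))  # prior P(C)
        for i, value in enumerate(instance):
            matches = sum(1 for row in data if row[-1] == label and row[i] == value)
            score *= Fraction(matches, n_label)  # P(x_k | C) from counts
        scores[label] = score
    return scores

# New instance: Outlook = Sunny, Temp = Cool, Humidity = High, Windy = True.
scores = class_scores(("Sunny", "Cool", "High", True))
total = sum(scores.values())
posterior = {label: float(score / total) for label, score in scores.items()}
print({k: round(float(v), 4) for k, v in scores.items()})
print({k: round(v, 3) for k, v in posterior.items()})
```

Normalizing by the sum of the two scores avoids computing P(X) explicitly, since it is the same for both classes.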
Concepts
Zero-frequency problem
Smoothing / Laplacian correction
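In the weather counts, Outlook = Overcast never occurs together with Play = No, so its estimated probability is 0/5 and any product containing it becomes zero. A minimal sketch of the Laplacian correction follows, assuming the common add-one variant; the function name `laplace` and the parameter `k` are made up for illustration.

```python
from fractions import Fraction

# Outlook counts for Play = No from the weather data: Sunny 3, Overcast 0, Rainy 2.
counts_no = {"Sunny": 3, "Overcast": 0, "Rainy": 2}

def laplace(counts, k=1):
    """Add k to every count so that no estimated probability is exactly zero."""
    total = sum(counts.values()) + k * len(counts)
    return {value: Fraction(n + k, total) for value, n in counts.items()}

# Raw relative frequencies: Overcast gets probability 0.
unsmoothed = {v: Fraction(n, sum(counts_no.values())) for v, n in counts_no.items()}

# Smoothed estimates: each count is increased by 1, the total by 3.
smoothed = laplace(counts_no)
print(unsmoothed["Overcast"], smoothed["Overcast"])
```

With add-one smoothing the three estimates become 4/8, 1/8 and 3/8, which still sum to 1.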
K-nearest neighbor
Concepts
Lazy learner
Distance function
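Both concepts show up in even a very small k-nearest-neighbor sketch. The one below assumes Euclidean distance and a simple majority vote; the function name `knn_predict` and the 2-D points are hypothetical and invented only for illustration, since the slides give no data for this method.

```python
from collections import Counter
from math import dist  # Euclidean distance between two points (Python 3.8+)

def knn_predict(training, query, k=3):
    """Lazy learner: no model is built in advance; all work happens at query time."""
    # Distance function: rank the stored examples by distance to the query.
    nearest = sorted(training, key=lambda item: dist(item[0], query))[:k]
    # Majority vote among the k nearest neighbors.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical 2-D training points with class labels (not from the slides).
training = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.5), "A"),
    ((6.0, 6.0), "B"), ((6.5, 7.0), "B"), ((7.0, 6.5), "B"),
]

pred_a = knn_predict(training, (1.8, 1.6))
pred_b = knn_predict(training, (6.2, 6.4))
print(pred_a, pred_b)
```

Swapping `dist` for another distance function changes which neighbors count as nearest, so the choice of distance is part of the model.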
Which ones?
And now
Assignment classification, classification 2