1. Choosing the Training Experience
• The type of training experience E available to
a system can have a significant impact on the
success or failure of the learning system.
• Type of training experience (feedback)
– Direct
– Indirect
• The degree to which the learner controls the sequence
of training examples
– Direct
– Indirect
Example 1: Credit Approval
• Training experience:
direct feedback
• Target function f:
data → {+, -}
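A minimal sketch of what direct feedback could look like for the credit approval example: every training example pairs an applicant record with the correct output of f. The record fields (income, debt), the sample values, and the nearest-neighbor rule are assumptions made only for illustration; the slides do not specify them.

```python
from typing import List, Tuple

# Hypothetical applicant record: (annual_income, existing_debt).
Applicant = Tuple[float, float]
# A training example pairs a record with its label: "+" (approve) or "-" (reject).
Example = Tuple[Applicant, str]

# Direct feedback: each example carries the correct value of the target function f.
TRAINING: List[Example] = [
    ((60.0, 5.0), "+"),
    ((45.0, 30.0), "-"),
    ((80.0, 10.0), "+"),
    ((30.0, 25.0), "-"),
]


def squared_distance(a: Applicant, b: Applicant) -> float:
    """Squared Euclidean distance between two applicant records."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def nearest_neighbor(x: Applicant, examples: List[Example]) -> str:
    """Approximate f by returning the label of the closest training example."""
    _, label = min(examples, key=lambda ex: squared_distance(ex[0], x))
    return label
```

Calling `nearest_neighbor((70.0, 8.0), TRAINING)` returns the label of the closest labeled applicant. Any other classifier could stand in for the nearest-neighbor rule here.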
Example 2: Checkers Problem
• Checkers Learning Problem
– A computer program that learns to
play checkers might improve its
performance, as measured by its
ability to win at the class of tasks
involving playing checkers games,
through experience obtained by
playing games against itself
• Task T: playing checkers
• Performance measure P: % of games
won against opponents
• Training experience E: playing practice
games against itself
Direct training experiences
[Figure: 8 opening moves (as Black)]
Good Training Data
Assumptions
• Let us assume that our system will train by
playing games against itself,
• and that it is allowed to generate as much training
data as time permits.
• Feedback chosen:
– Indirect (playing against itself)
– Advantage: no external trainer is needed, so as much
data as desired can be generated
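The self-play assumption can be sketched as follows. To keep the example short and runnable, a toy take-away game stands in for checkers (one pile of stones, each turn removes 1 to 3, whoever takes the last stone wins). The game, the random move policy, and all function names are assumptions for illustration only. The feedback is indirect: each game yields a trace of states and only the final winner, with no label on individual moves.

```python
import random
from typing import List, Tuple

State = Tuple[int, int]  # (stones_left, player_to_move)


def self_play_game(start_stones: int, rng: random.Random) -> Tuple[List[State], int]:
    """Play one toy game against itself with random moves.

    Returns the trace of visited states and the winner (0 or 1).
    Only the final outcome is known, which is what makes the feedback indirect.
    """
    if start_stones <= 0:
        raise ValueError("start_stones must be positive")
    stones, player = start_stones, 0
    trace: List[State] = []
    while True:
        trace.append((stones, player))
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return trace, player  # the player who took the last stone wins
        player = 1 - player


def generate_training_data(n_games: int, start_stones: int = 10, seed: int = 0):
    """Generate as many self-play games as requested; no external trainer is needed."""
    rng = random.Random(seed)
    return [self_play_game(start_stones, rng) for _ in range(n_games)]


games = generate_training_data(n_games=100)
```

The number of games is limited only by the time budget, which mirrors the advantage listed above.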
2. Choosing the Target Function
• The program needs to learn how to
choose the best move for any given board
state
– Target function: ChooseMove: Board → Move,
or M = ChooseMove(B)
– Input: Board (the set of legal board states B)
– Output: Move (the set of legal moves M)
2. Choosing the Target Function:
An Alternative
• Assign a score to every board
• V: Board → Real (higher score for a better board)
• Move: choose the legal successor board state
with the highest V
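A minimal sketch of how a move could be selected once an evaluation function V is available: score every legal successor board and pick the one with the highest score. The names `choose_move`, `legal_successors`, and the integer "boards" in the usage line are hypothetical placeholders, not part of the slides.

```python
from typing import Callable, Iterable, TypeVar

Board = TypeVar("Board")


def choose_move(
    board: Board,
    legal_successors: Callable[[Board], Iterable[Board]],
    V: Callable[[Board], float],
) -> Board:
    """Return the legal successor board state with the highest V score.

    The chosen move is implied by the successor board that is returned.
    """
    return max(legal_successors(board), key=V)


# Toy usage: boards are integers, successors add 1 to 3, and V prefers boards near 5.
best = choose_move(3, lambda b: [b + 1, b + 2, b + 3], lambda b: -abs(b - 5))
```

With this design the learner only has to approximate V; move selection itself needs no learning.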
Target Function V: Board → Real
3. Choosing a Representation for the Target Function
• Collection of rules
• Neural Net
• Polynomial Function of board features
• Linear Function of board features
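The last option, a linear function of board features, can be sketched as below. The form w0 + w1·x1 + ... + w6·x6 and the six features (black pieces, red pieces, black kings, red kings, black pieces threatened by red, red pieces threatened by black) follow Mitchell, Chapter 1. The weight values are made up for illustration.

```python
from typing import Sequence


def v_hat(weights: Sequence[float], features: Sequence[float]) -> float:
    """Linear evaluation function: w0 + w1*x1 + ... + wn*xn."""
    if len(weights) != len(features) + 1:
        raise ValueError("expected one more weight than features (w0 is the bias)")
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))


# Illustrative weights w0..w6 (assumed values, not learned).
weights = [0.5, 1.0, -1.0, 2.0, -2.0, -0.5, 0.5]
# Features x1..x6 for a board with 3 black pieces, 1 black king, and no red pieces.
features = [3, 0, 1, 0, 0, 0]

score = v_hat(weights, features)
```

A linear form is the least expressive of the listed representations, but it needs far less training data than a polynomial or a neural network.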
4. Choosing the Learning Algorithm
• The problem of learning a checkers strategy reduces to
the problem of learning values for the coefficients w0
through w6 in the target function representation
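One way to learn the coefficients is the LMS (least mean squares) weight update used in Mitchell, Chapter 1: for each training example, move every weight in proportion to the prediction error and the corresponding feature value. The sketch below assumes this is the intended algorithm. The function name, the learning rate, and the training value of 100 for a winning board are illustrative choices.

```python
from typing import List, Sequence


def lms_update(
    weights: Sequence[float],
    features: Sequence[float],
    v_train: float,
    eta: float = 0.01,
) -> List[float]:
    """One LMS step: w_i <- w_i + eta * (v_train - v_hat) * x_i, with x_0 = 1."""
    xs = [1.0] + list(features)
    if len(weights) != len(xs):
        raise ValueError("expected one more weight than features (w0 is the bias)")
    prediction = sum(w * x for w, x in zip(weights, xs))
    error = v_train - prediction
    return [w + eta * error * x for w, x in zip(weights, xs)]


# Start from zero weights and apply one update for a single training example.
w = [0.0] * 7
x = [3, 0, 1, 0, 0, 0]
w = lms_update(w, x, v_train=100.0)
```

Weights attached to features whose value is zero stay unchanged, so each example only adjusts the coefficients that contributed to its prediction.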
Design Choices
Other Examples
• Checkers player
– Target function: ChooseMove: Board → Move
– Estimation function: V: Board → Real; Move: max(V(legal successor board state))
– Dataset: <Move>* → Boolean
• Price prediction
– Target function: PredictPrice: Date, Real* → Real
– Dataset: <Date, Real>*
• Top games prediction
– Target function: Ranker: Date, Game* → <game, rank>*
– Dataset: <Date, <Game, Rank>*>*
• Fraud detection
– Target function: isFraud: Transaction → 0-100
– Estimation function: isFraud: Transaction → 0-100
– Dataset: <transaction, boolean>*
• Best buzzer
– Target function: ChooseBuzzer: Product, buzzer* → <Buzzer, rank>*
– Estimation function: Profiler: user, tweet → <gender, education, age, interest>; ChooseBuzzer: product_spec → <buzzer, rank>*
– Dataset: <user, tweets, gender, education, age, interest>*
13/09/2018 MLK-DPL/IF4071
References
• Mitchell, T. (1997). Machine Learning.
McGraw-Hill. Chapter 1.