Lecture #12
Learning Automaton
The learning cycle begins with an input to the learning
automaton from the environment.
This input elicits one of a finite number of possible responses
from the automaton.
The environment receives and evaluates the response and then
provides some form of feedback.
The automaton uses this feedback to alter its stimulus-response
mapping structure so that its behavior improves.
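The cycle above can be sketched as a short loop. This is a minimal illustration under stated assumptions, not the lecture's own algorithm: the names `learning_cycle` and `toy_environment`, the three-action setup, the learning rate of 0.1, and the reward-only update rule (favourable feedback shifts probability toward the chosen response, unfavourable feedback changes nothing) are all hypothetical choices made for the example.

```python
import random


def learning_cycle(probs, environment, rate, steps, rng):
    """Repeat: respond, receive feedback, adjust the response probabilities."""
    probs = list(probs)
    for _ in range(steps):
        # Response: sample one of the finitely many possible actions.
        action = rng.choices(range(len(probs)), weights=probs)[0]
        # Feedback: the environment evaluates the response.
        favourable = environment(action)
        # Update: on favourable feedback, shift probability toward the action.
        # On unfavourable feedback this simple rule leaves the vector unchanged.
        if favourable:
            probs = [p + rate * (1 - p) if j == action else (1 - rate) * p
                     for j, p in enumerate(probs)]
    return probs


def toy_environment(action):
    """Hypothetical environment: only action 2 earns favourable feedback."""
    return action == 2


final = learning_cycle([1 / 3, 1 / 3, 1 / 3], toy_environment,
                       rate=0.1, steps=500, rng=random.Random(0))
print([round(p, 3) for p in final])
```

The update keeps the vector summing to one, because the amount added to the chosen action equals the total amount removed from the others.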
October 14, 2008 Artificial Intelligence, Lecturer #12 4
Learning Automata (Cont.)
(Example: Best Temperature Control Setting)
Initial probability values (uniform over the 11 settings)
1/11 1/11 1/11 …………… 1/11 1/11
Control Selection
50 55 60 65 70 75 80 85 90 95 100
Since the probability values are uniformly distributed, every setting is
equally likely to be selected.
If the response is good, the automaton increases the probability of the selected
setting by a positive increment and reduces all other probabilities.
If the response is bad, the automaton reduces the probability of that temperature setting.
This process continues until the good selection has a probability near one and the
others are near zero.
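The temperature example can be simulated in a few lines. The slides do not give the exact update formulas, so this sketch assumes a standard linear reward-penalty scheme. The reward rate `a`, the penalty rate `b`, the choice of 70 as the "good" setting, and the function names `reward`, `penalize`, and `simulate` are hypothetical and chosen only for illustration.

```python
import random

# Temperature settings from the example: 50, 55, ..., 100.
SETTINGS = list(range(50, 101, 5))


def reward(probs, i, a=0.1):
    """Good response: raise probs[i] and shrink every other entry."""
    return [p + a * (1 - p) if j == i else (1 - a) * p
            for j, p in enumerate(probs)]


def penalize(probs, i, b=0.02):
    """Bad response: shrink probs[i] and share the removed mass among the rest."""
    r = len(probs)
    return [(1 - b) * p if j == i else b / (r - 1) + (1 - b) * p
            for j, p in enumerate(probs)]


def simulate(steps=2000, seed=0, best=70):
    """Run the select / evaluate / update loop from a uniform start."""
    rng = random.Random(seed)
    probs = [1 / len(SETTINGS)] * len(SETTINGS)
    for _ in range(steps):
        i = rng.choices(range(len(SETTINGS)), weights=probs)[0]
        # Assumed environment: the response is good only for the best setting.
        good = SETTINGS[i] == best
        probs = reward(probs, i) if good else penalize(probs, i)
    return probs


if __name__ == "__main__":
    for setting, p in zip(SETTINGS, simulate()):
        print(setting, round(p, 3))
```

Both updates preserve a total probability of one. With the reward rate larger than the penalty rate, probability mass tends to accumulate on the setting that is consistently rewarded.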
P(a1)    Q(b1)
P(a2)    Q(b2)
  ⋮        ⋮
P(ak)    Q(bk)
  ⋮        ⋮
Questions/Suggestions?