Documente Academic
Documente Profesional
Documente Cultură
I.
INTRODUCTION
651
Proceedings of the 4th International Conference on Autonomous Robots and Agents, Feb 10-12, 2009, Wellington, New Zealand
may cause the system become unstable, and the process must
be stopped in order to prevent damages. So, BELBIC cannot be
applied on such systems. To solve this problem, another
approach is introduced.
In this paper we employed BELBIC to control an inverted
pendulum. As the pendulum angle is very sensitive to the
control signal and any wrong changes in control signal makes
the system oscillates and the pendulum falls down. So, if we
use BELBIC as a complete model free controller that learn
from scratch individually the learning phase will be impossible,
and the pendulum will fall down many times. For solving the
problem and accelerating the learning phase, a new approach is
used. First, BELBIC learns from a classical simple controller
of the system imitatively. The classical controller can be a
simple one that only stabilizes the system, regardless of good
performance and robustness. Then the output of BELBIC is
gradually applied to the system and it replaces the initial
controller. The important part in switching between controllers
is changing the emotional signal of BELBIC due to change in
objective. When BELBIC learns to imitate behavior of the
initial controller, the objective is reducing the error between
controllers output, and when BELBIC replaces the initial
controller the objective is reduction of tracking and angle error.
This paper organized as follows: Section 2 briefly
introduces BELBIC, in Section 3 a description of the proposed
controller is demonstrated, section 4 discusses the simulation
results and finally section 5 concludes the paper.
II.
S th = max( S i )
(1)
BELBIC
(2)
(3)
A - O
i
(4)
( [ (
D V i = a max 0, S i stress -
A )])
j
(5)
652
Proceedings of the 4th International Conference on Autonomous Robots and Agents, Feb 10-12, 2009, Wellington, New Zealand
A. Proposed Controller
According to idea of hierarchical controller structure, to
design a controller to satisfy various objectives, at first it is
assumed that the objectives can be decoupled and then a
separate controller is designed to satisfy each objective. After
that, outputs of these controllers must be fused together. Fig. 2
shows the proposed BELBIC structure. As there are two major
objectives, position tracking and pendulum angle regulating,
two BELBICs are employed. The cart position error and its
first derivation are defined as sensory signals for one of
BELBICs and the pendulum angel and its first derivation are
for the other. In most of the previously reported structures of
BELBIC, they have only one neuron, because the sensory input
signal was one dimensional. In our structure, as each BELBIC
has two sensory inputs, they must have more than one neuron
and for this task two neurons seems to be adequate.
D W i = b ( S i ( E - stress )
(6)
CONTROLLER DESIGN
653
Proceedings of the 4th International Conference on Autonomous Robots and Agents, Feb 10-12, 2009, Wellington, New Zealand
Stress
0.1
0.08
654
0.01
0.1
0.005
0.05
0
Angle-Error
Position-Error
Stress
0.06
0.04
0.02
1
0.1
0
0.05
-1
Control-Force
Position-Error
0.08
Stress
0.04
0.02
0.06
0.06
0.04
0.02
1
0
Derivation-of-Control-Force
-1
-1
-0.5
0.5
Control-Force
RESULTS
Proceedings of the 4th International Conference on Autonomous Robots and Agents, Feb 10-12, 2009, Wellington, New Zealand
0.5
60
80
100
120
140
40
60
80
100
120
140
0.05
0.04
Pendulum Angle
Stress
40
0.06
0.02
0.1
50
0.05
40
30
Imitative-Stress
20
-0.05
20
time
0.06
Stress
Actual Position
-0.5
20
0.08
Desired Position
0.04
0.5
Desired Position
Cart Position
0.02
0.1
50
0.05
40
30
Performance-Stress
20
-0.5
20
time
Actual Position
40
60
80
100
120
140
40
60
80
100
120
140
Pendulum Angle
0.05
Stress
0.03
0.02
-0.05
20
0.01
0.1
50
0.05
40
30
error
20
time
No-Disturbance
BELBIC- fuzzy stress
Double PID
655
IAE (position)
3.301
3.925
IAE (angle)
0.367
0.457
IACF
9.432
12.219
IADCF
9.262
14.353
Proceedings of the 4th International Conference on Autonomous Robots and Agents, Feb 10-12, 2009, Wellington, New Zealand
[4]
[5]
[6]
[7]
[8]
Employing
Disturbance
DISTURBANCE
BELBICfuzzy
stress
Double
PID
IAE
(position)
IAE (angle)
IACF
[9]
MADCF
STD
STD
STD
STD
5.699
0.915
0.476
0.174
73.247
3.152
151.26
6.245
11.725
2.141
1.692
0.371
82.705
3.247
160.58
7.831
V.
[10]
[11]
[12]
CONCLUSIONS
[13]
[14]
[15]
[16]
[17]
[18]
[19]
REFERENCES
[1]
[2]
[3]
A.P. Shon, D.B. Grimes, C.L. Baker, R.P.N. Rao, A.N. Meltzoff, A
Model-Based Goal-Directed Bayesian Framework for Imitation
Learning in Humans and Machines, Cognitive Science, 2004.
M.I. Kuniyoshi and I. Inoue, Learning by watching: extracting reusable
task knowledge from visual observation of human performance, in
Proc. of IEEE Trans. on Robotics and Automation, vol. 10, no. 6, pp.
799822, 1994.
Alissandrakis, C. L. Nehaniv and K. Dautenhahn, Learning how to do
things with imitation, in Proc. Learning How to Do Things, AAAI Fall
656
[20]
[21]
[22]