Emotional Control of Inverted Pendulum System: A Soft Switching from Imitative to Emotional Learning

Proceedings of the 4th International Conference on Autonomous Robots and Agents, Feb 10-12, 2009, Wellington, New Zealand
Mehrsan Javan-Roshtkhari, Arash Arami and Caro Lucas

Control and Intelligent Processing Center of Excellence School of ECE, University of Tehran
Tehran, Iran
m.javan@ece.ut.ac.ir, a.arami@ece.ut.ac.ir, lucas@ut.ac.ir
Abstract: Model-free control of unidentified systems with unstable
equilibria poses serious problems. To surmount these difficulties,
an existing model-based controller is first used as a mentor for an
emotional-learning controller. This learning phase prepares the
controller to behave like the mentor while preventing any
instability. Next, control is softly switched from the model-based
controller to the emotional one, using a FIS^1. The emotional
stress is likewise softly switched from the mentor-imitator output
difference to a combination of objectives generated by a FIS that
attentionally modulates the stresses. To evaluate the proposed
model-free controller, a laboratory inverted pendulum^2 is employed.
Keywords: BELBIC, imitative learning, fuzzy inference system,
model-free control, inverted pendulum system
I. INTRODUCTION
Imitation is a powerful mechanism for knowledge sharing,
particularly for intelligent agents. Imitative learning is an
approach in artificial intelligence for transferring knowledge from
an expert agent to another agent without any brain-to-brain
transfer [1]. Imitation accelerates learning and improves the
performance of the learner. In learning by imitation, an agent that
wants to learn (the imitator) tries to achieve the same result that
the mentor has achieved. This differs from mimicking, because the
imitator does not copy the mentor's actions; it only performs an
action that leads to the same result in the environment. Imitative
learning is employed in many tasks in which an expert agent exists
and a new agent must learn the proper action for the task, and it is
widely used in robotics [2-5].
1 Fuzzy inference system
2 The Digital Pendulum Control System (crane system), manufactured
by Feedback Instruments Limited, England
978-1-4244-2713-0/09/$25.00 ©2009 IEEE
The development of new algorithms based on biological systems is an
area of active interest. The most important aspect of any
intelligent system is its capability to learn. Although learning can
be carried out in various ways, the main aim is to adapt parameters
so as to improve the performance of the system and to cope with
changes in the environment [6]. As the emotional behavior of humans
and other animals is an important part of their intelligence,
modeling this process leads to an intelligent system with fast
learning ability [7]. Although evolution encodes emotional reactions
in animals, mammals can also learn them very quickly. In biological
systems, emotional reactions are used for fast decision making in
complex environments or emergency situations. The main part of the
mammalian brain responsible for emotional processes is called the
limbic system. Several attempts have been made to model the limbic
system [8, 9].
Computational models of the Amygdala and the Orbitofrontal cortex,
which are the main parts of the limbic system, were first introduced
in [10]. Based on the work in [10], the brain emotional learning
based intelligent controller (BELBIC) was introduced in [11]. The
fast learning ability of BELBIC makes it a powerful model-free
controller for many tasks. BELBIC has been applied to several
problems, such as the control of intelligent washing machines
[12, 13], and a modified version of BELBIC is employed in [14] for
controlling heating, ventilating and air conditioning (HVAC)
systems. Moreover, BELBIC has been used in time series prediction
[15] and sensor-data fusion [16]. The real-time implementation of
BELBIC for interior permanent magnet synchronous motor (IPMSM)
drives was first introduced in [17]. The controller was successfully
implemented in real time on a digital signal processor board for a
laboratory 1-hp IPMSM, and the results show fast response, simple
implementation, robustness with respect to uncertainties such as
manufacturing imperfections, and good disturbance rejection. Another
real-time implementation of BELBIC, for position tracking and swing
damping of a laboratory overhead crane under computer control via
MATLAB external mode, is described in [18]. The stability of the
brain emotional learning (BEL) system, which is used in control as
BELBIC, is discussed in [19], where a general guideline for choosing
control parameters to ensure stability is also described. In
addition, nonlinear combinations of objectives are used in [20] to
design emotional stresses for BELBIC to control an overhead crane
under uncertainties and disturbances.
The main drawback of model-free controllers that learn without any
prior knowledge of the system's dynamics, such as reinforcement
learning based controllers and BELBIC, is that in the early stages
of learning they may perform poorly because they produce wrong
control signals. This preliminary phase of learning can result in
instability in some cases. After this first period, if no
instability occurs, the controller can learn the proper control
signals and gradually improve performance. Although BELBIC learns
fast, it has the same problem, only over a shorter period of time.
If the system is inherently unstable, applying these controllers may
drive the system unstable, and the process must be stopped to
prevent damage. BELBIC therefore cannot be applied directly to such
systems. To solve this problem, another approach is introduced.
In this paper we employ BELBIC to control an inverted pendulum. The
pendulum angle is very sensitive to the control signal, and any
wrong change in the control signal makes the system oscillate and
the pendulum fall. So, if we use BELBIC as a completely model-free
controller that learns from scratch on its own, the learning phase
becomes impractical and the pendulum falls many times. To solve this
problem and accelerate the learning phase, a new approach is used.
First, BELBIC learns imitatively from a simple classical controller
of the system. The classical controller can be a simple one that
only stabilizes the system, regardless of performance and
robustness. Then the output of BELBIC is gradually applied to the
system until it replaces the initial controller. The important part
of switching between controllers is changing the emotional signal of
BELBIC to reflect the change in objective. While BELBIC learns to
imitate the behavior of the initial controller, the objective is to
reduce the error between the two controllers' outputs; once BELBIC
replaces the initial controller, the objective is to reduce the
tracking and angle errors.
This paper is organized as follows: Section 2 briefly introduces
BELBIC, Section 3 describes the proposed controller, Section 4
discusses the simulation results, and Section 5 concludes the paper.
II. BELBIC
The BELBIC structure is a simple computational model of the most
important parts of the limbic system of the brain, the Amygdala and
the Orbitofrontal cortex. Fig. 1 shows the schematic diagram of the
BELBIC structure, each part of which is described briefly below
[10].
As depicted in Fig. 1, the system consists of four main parts.
Sensory input signals first enter the Thalamus, a simple model of
the real thalamus in the brain, in which some simple pre-processing
of the sensory input signals is done. After pre-processing in the
Thalamus, the signal is sent to the Amygdala and the Sensory Cortex.
The Sensory Cortex is responsible for subdivision and discrimination
of the coarse output from the Thalamus, which it then sends to the
Amygdala and the Orbitofrontal cortex. The Amygdala is a small
structure in the medial temporal lobe of the brain that is thought
to be responsible for the emotional evaluation of stimuli. This
evaluation is in turn used as a basis for emotional states and
emotional reactions, and is used to signal attention and to lay down
long-term memories. The last part, the Orbitofrontal Cortex, is
supposed to inhibit inappropriate responses from the Amygdala, based
on the context given by the hippocampus [10]. In this section, the
description of these parts and of the learning algorithm follows
[10].
Figure 1. Structure of BELBIC [10]
As the thalamus must provide a fast response to stimuli, in this
model the maximum over all sensory inputs $S$ is sent directly to
the Amygdala as an additional input (Eq. 1). Unlike the other inputs
to the Amygdala, the thalamic input is not projected into the
Orbitofrontal cortex, so it cannot itself be inhibited.
$$S_{th} = \max_i (S_i) \tag{1}$$
In the Amygdala, each A node has a plastic connection weight $V$.
The sensory input is multiplied by this weight to form the output of
the node.
$$A_i = S_i V_i \tag{2}$$
In the Orbitofrontal cortex, each O node is similar to the A nodes,
and its output is calculated by applying the connection weight $W$
to the input signal.
$$O_i = S_i W_i \tag{3}$$
The model output is computed as follows:
$$E = \sum_i A_i - \sum_i O_i \tag{4}$$
Here the A nodes produce their outputs in proportion to their
contribution to predicting the stress, while the O nodes inhibit the
output $E$ if necessary.
As seen in Fig. 1, except for the thalamic signal going directly to
the Amygdala, the Amygdala and the Orbitofrontal Cortex receive the
same input signals. The main difference between them lies in their
learning rules.
The connection weights $V_i$ are adjusted in proportion to the
difference between the reinforcement signal and the activation of
the A nodes. The term $\alpha$ is a constant used to adjust the
learning speed.
$$\Delta V_i = \alpha \left( \max\left[ 0,\; S_i \left( \mathrm{stress} - \sum_j A_j \right) \right] \right) \tag{5}$$
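As a minimal sketch of Eqs. (1)-(5), the following Python function computes the model output and applies one Amygdala update. The function name, the list-based representation, the default learning rate, and the choice to feed the thalamic signal to one extra Amygdala node are assumptions made for illustration; they are not taken from [10] or [11].

```python
def amygdala_step(S, V, W, stress, alpha=0.1):
    """One forward pass and one Amygdala update following Eqs. (1)-(5).

    S: sensory inputs.
    V: Amygdala weights, len(S) + 1 entries; the last entry belongs to the
       thalamic node (assumed wiring, see the note above).
    W: Orbitofrontal weights, len(S) entries.
    Returns the model output E and the updated Amygdala weights.
    """
    s_th = max(S)                              # Eq. (1): thalamic signal
    s_amy = list(S) + [s_th]                   # Amygdala also receives S_th
    A = [s * v for s, v in zip(s_amy, V)]      # Eq. (2)
    O = [s * w for s, w in zip(S, W)]          # Eq. (3): no thalamic input
    E = sum(A) - sum(O)                        # Eq. (4)
    sum_A = sum(A)
    # Eq. (5): max(0, .) makes every weight change non-negative,
    # so the Amygdala weights never decrease.
    V_new = [v + alpha * max(0.0, s * (stress - sum_A))
             for s, v in zip(s_amy, V)]
    return E, V_new


# Example call with illustrative numbers (not from the paper).
E, V = amygdala_step(S=[0.5, 0.2], V=[0.0, 0.0, 0.0], W=[0.0, 0.0], stress=1.0)
```

When the summed Amygdala activation already exceeds the stress, the update leaves the weights unchanged for positive inputs; reducing the output in that case is left to the Orbitofrontal weights.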
As mentioned before, the task of the Amygdala is to learn the
associations between the sensory and the emotional input in order to
generate an output. Eq. 5 differs from similar associative learning
rules in that the weight adaptation is monotonic, i.e., the weights
$V$ cannot decrease. At first this may seem to be a drawback of the
learning rule, but it has a biological justification: once an
emotional reaction is learned in the Amygdala, it should become
permanent. It is the Orbitofrontal cortex that inhibits
inappropriate reactions of the Amygdala.
The Orbitofrontal cortex learning rule is very similar to the
Amygdala rule:
$$\Delta W_i = \beta \left( S_i \left( E - \mathrm{stress} \right) \right) \tag{6}$$
The reinforcement signal for the O nodes is defined as the
difference between the model output $E$ and the stress signal. In
other words, the O nodes compare the expected and the received
reinforcement signal and inhibit the output of the model if there is
a mismatch. The main difference between the adaptation rules of the
Orbitofrontal cortex and the Amygdala is that the Orbitofrontal
connection weights can both increase and decrease as needed to track
the required inhibition of the Amygdala. The parameter $\beta$ is
another learning rate constant.
As discussed, BELBIC learns from its emotional signal and produces
its output based on its sensory inputs and connection weights. In
[19] the stability of BELBIC is demonstrated using the cell-to-cell
mapping method.
III. CONTROLLER DESIGN
As mentioned before, the test bed for evaluating the controller
performance is an inverted pendulum, which is a well known SIMO
system. Controlling the inverted pendulum is a hard and interesting
control task: the controller must track a reference signal while
stabilizing the pendulum. The system used to evaluate the controller
performance is a nonlinear model of a laboratory inverted pendulum
system provided by Feedback Ltd.
Due to the nonlinearity of the system's state equations and the
nonlinear properties of the driving motor and friction, designing a
model-based controller is hard. We used BELBIC as a model-free
controller to control the inverted pendulum. The main challenge in
using BELBIC as a model-free controller for unstable systems, or for
stable systems with an unstable equilibrium point such as our test
bed, is the learning phase at the beginning. As BELBIC has no
information about the system's dynamics, the performance of the
controlled system may be very poor at the beginning of the learning
process, and the pendulum falls. BELBIC learns fast, and in theory
it should learn the proper control action from its sensory inputs
and emotional stress within a short time. In this task, however, the
pendulum angle is very sensitive to the control signal, and any
wrong change in the control signal makes the system unstable. We
first used BELBIC as the only controller of the system. It is
possible for BELBIC to learn the proper control strategy this way,
but in our simulations we found that the process takes too long and
is probably infeasible in real applications.
To accelerate the learning process and avoid destabilizing the
system, we propose a new approach consisting of two parts. In the
first part, a simple stabilizing controller is used as the main
control system, and BELBIC learns to imitate the behavior of this
controller. In the second part, after BELBIC has imitatively learned
to stabilize the system from the initial controller, that controller
is replaced with BELBIC. Owing to its learning capability, BELBIC
then learns to enhance the system's performance.
A. Proposed Controller
Following the idea of a hierarchical controller structure, a
controller that must satisfy several objectives is designed by first
assuming that the objectives can be decoupled and designing a
separate controller for each objective. The outputs of these
controllers must then be fused together. Fig. 2 shows the proposed
BELBIC structure. As there are two major objectives, position
tracking and pendulum angle regulation, two BELBICs are employed.
The cart position error and its first derivative are the sensory
signals for one BELBIC, and the pendulum angle and its first
derivative are the sensory signals for the other. Most previously
reported BELBIC structures have only one neuron, because the sensory
input signal was one dimensional. In our structure each BELBIC has
two sensory inputs, so it must have more than one neuron, and two
neurons seem to be adequate for this task.
Figure 2. Structure of controller
The emotional stress signal, which is described below, couples the
two separate controllers. With this kind of stress signal there is
no need for a complex fusion block to combine the outputs of the
controllers; a summation operator is adequate [16]. The
computational cost of output fusion is thereby reduced to the cost
of fusing some main and auxiliary objectives in the stress generator
block. Likewise, changing the control objectives and switching from
imitative learning to normal learning requires no change in the
controller structure; changing the emotional stress signal is
enough.
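The two-unit structure described above can be sketched as follows. This is an illustrative reading only: the class and function names, the default learning rates, and the omission of the thalamic input are assumptions, and the learning rules follow the Amygdala and Orbitofrontal update rules of Section II (Eqs. 5 and 6) as they are stated there.

```python
class BelbicUnit:
    """A BELBIC with one A node and one O node per sensory input.

    Simplification (assumption): the thalamic input is left out.
    """

    def __init__(self, n_inputs, alpha=0.05, beta=0.05):
        self.V = [0.0] * n_inputs   # Amygdala weights
        self.W = [0.0] * n_inputs   # Orbitofrontal weights
        self.alpha = alpha          # Amygdala learning rate
        self.beta = beta            # Orbitofrontal learning rate

    def output(self, S):
        A = [s * v for s, v in zip(S, self.V)]
        O = [s * w for s, w in zip(S, self.W)]
        return sum(A) - sum(O)

    def learn(self, S, stress):
        A = [s * v for s, v in zip(S, self.V)]
        sum_A = sum(A)
        E = self.output(S)          # output before the weights change
        for i, s in enumerate(S):
            # Amygdala rule: weights can only grow.
            self.V[i] += self.alpha * max(0.0, s * (stress - sum_A))
            # Orbitofrontal rule: weights can grow or shrink.
            self.W[i] += self.beta * s * (E - stress)


def control_force(position_unit, angle_unit,
                  pos_err, pos_err_dot, angle, angle_dot):
    """Fuse the two controllers with a plain sum, as in the text."""
    return (position_unit.output([pos_err, pos_err_dot])
            + angle_unit.output([angle, angle_dot]))


# Two units with two neurons each; both would be trained with one
# shared stress signal, which is what couples them.
position_unit = BelbicUnit(2)
angle_unit = BelbicUnit(2)
u = control_force(position_unit, angle_unit, 0.1, 0.0, 0.05, 0.0)
```

Because both units receive the same stress, switching the objective only means passing a different `stress` value to `learn`; the structure itself stays unchanged.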
B. Stress generation
As stated before, BELBIC can show various behaviors when different
stress signals are applied to it. So, to satisfy different control
objectives, a proper stress signal must be defined for each
objective. The ability to achieve further objectives is obtained by
defining different stress signals.
1) Imitative learning
In imitative learning, the objective is for BELBIC to produce a
control signal similar to that of the initial controller. Reducing
the difference between these two control signals is therefore the
main goal in this part, and there are no other control objectives.
The emotional stress signal thus consists of two signals, the error
of the control signal, $e_u$, and the first derivative of this
error:
$$\mathrm{Stress} = w_1 e_u + w_2 \dot{e}_u + w_3 e_u \dot{e}_u \tag{7}$$
The reason for adding $\dot{e}_u$ to the above sum is that $e_u$ may
sometimes become zero while the two control signals have completely
different behaviors.
2) Improving the performance
After BELBIC has imitatively learned the control action from the
initial controller, it becomes the main controller of the system and
replaces the initial controller. At this point the control objective
changes: reducing the position tracking error and the angle error
become the new control objectives. The stress signal must therefore
be modified.
To satisfy more than two objectives, a more complicated combination
of the stress signals associated with each objective is necessary.
To make BELBIC capable of satisfying more objectives, an emotional
stress signal formed by combining stresses is applied, as described
below. To generate the proper stress signal for all objectives, at
any time the more important objective must be attended to more than
the others. We use linguistic rules to generate this attention
mechanism.
a) Fuzzy stress generation
In order to enhance the behavior of the system, the major objectives
are defined first, and some extra objectives must also be attended
to. The tracking error of the cart position and the error of the
pendulum angle are the main concerns, and they need to be decreased
as much as possible. One of the extra objectives is to avoid
reaching the edges of the rail, which would interrupt the operation.
To impose this behavior on the cart, closeness to the edges of the
rail must be punished via the stress signal. Another important index
that must be considered in every control task is the energy of the
control force and its variations. This objective is imposed on the
stress generation unit such that when the stresses of the previous
parts are small, BELBIC tries to decrease the control forces, and
when these stresses are significant, the limit on the control force
is relaxed to allow fast responses. Figure 3 shows the stress
generating function. By using a set of linguistic rules, the most
salient objectives can be reinforced to generate an emotional stress
appropriate to the current situation.
To generate the stress signal, we wrote linguistic rules and
imported them into a Sugeno fuzzy inference system [21]. Generating
the stress signal this way makes BELBIC capable of attending to the
important parts of the stress at any time. Four variables, the
errors of the cart position and the pendulum angle, the control
force, and its first derivative, and 16 rules are employed for
generating the emotional stress. Fig. 3 shows some of the resulting
fuzzy surfaces.
Figure 3. Resulting fuzzy surfaces [three surface plots of Stress:
versus Angle-Error and Position-Error; versus Control-Force and
Position-Error; versus Derivation-of-Control-Force and Control-Force]
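To illustrate how linguistic rules can be turned into a stress value, here is a reduced zero-order Sugeno sketch. It is not the authors' rule base: it uses only two of the four inputs (position error and angle error) and four hypothetical rules instead of 16, and the membership shapes, scales, and rule consequents are invented for the example.

```python
def small(x, scale):
    """Membership of |x| in 'small': 1 at zero, falling linearly to 0 at scale."""
    return max(0.0, 1.0 - abs(x) / scale)


def large(x, scale):
    """Membership of |x| in 'large', the complement of 'small'."""
    return 1.0 - small(x, scale)


def sugeno_stress(pos_err, angle_err, pos_scale=0.5, angle_scale=0.1):
    """Zero-order Sugeno inference: weighted average of constant consequents.

    Each rule is (firing strength, stress level). The firing strength is
    the product of the memberships in the rule's antecedent.
    """
    p_s, p_l = small(pos_err, pos_scale), large(pos_err, pos_scale)
    a_s, a_l = small(angle_err, angle_scale), large(angle_err, angle_scale)
    rules = [
        (p_s * a_s, 0.0),   # both errors small   -> no stress
        (p_l * a_s, 0.5),   # only position large -> moderate stress
        (p_s * a_l, 0.8),   # only angle large    -> high stress
        (p_l * a_l, 1.0),   # both large          -> maximum stress
    ]
    total = sum(w for w, _ in rules)   # equals 1 for complementary sets
    return sum(w * z for w, z in rules) / total


stress = sugeno_stress(pos_err=0.25, angle_err=0.0)
```

In this hypothetical rule base the angle error is weighted more heavily than the position error; the rule consequents are where such an attention ordering between objectives would be encoded.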
C. Switching between controllers and stress signals
As we observed in experimental results, hard switching between the
controllers and between the stress signals makes the BELBICs
unstable. So instead of hard switching, soft switching must be
employed: the BELBIC control system must gradually replace the
initial controller while its emotional stress is gradually changed
at the same time. To do this, we employed a fuzzy inference system,
a common solution for soft switching. Human linguistic rules can
easily be imported into a Sugeno fuzzy inference system [21]. Fig. 4
shows the fuzzy surfaces for this task. The inputs of this fuzzy
system are the two stresses mentioned above (for imitative learning
and for improving performance) and the error in the imitative
learning phase (the difference between the initial controller output
and the BELBIC output). This fuzzy switch is used for switching both
the controllers and the stresses.
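The effect of soft switching can be sketched as a convex blend of the two control signals and of the two stresses, driven by a single weight between 0 and 1. In the sketch below a smooth time ramp stands in for the fuzzy switch, which is an assumption: the paper derives the switching from a fuzzy inference system, not from time alone. The 30 s and 50 s defaults echo the interval reported in the results section, while the function names and the weights w1, w2, w3 are hypothetical.

```python
def imitative_stress(e_u, e_u_dot, w1=1.0, w2=0.5, w3=0.1):
    """Imitation stress of Eq. (7); the weight values are placeholders."""
    return w1 * e_u + w2 * e_u_dot + w3 * e_u * e_u_dot


def switch_weight(t, t_start=30.0, t_end=50.0):
    """Smooth ramp from 0 (mentor only) to 1 (BELBIC only).

    Stand-in for the fuzzy switch; uses the smoothstep polynomial.
    """
    x = min(1.0, max(0.0, (t - t_start) / (t_end - t_start)))
    return x * x * (3.0 - 2.0 * x)


def soft_switch(lam, u_mentor, u_belbic, stress_imitative, stress_performance):
    """Blend both the applied control signal and the stress with weight lam."""
    u = (1.0 - lam) * u_mentor + lam * u_belbic
    stress = (1.0 - lam) * stress_imitative + lam * stress_performance
    return u, stress


# Halfway through the transition both controllers contribute equally.
lam = switch_weight(40.0)
u, stress = soft_switch(lam, u_mentor=2.0, u_belbic=5.0,
                        stress_imitative=imitative_stress(0.0, 1.0),
                        stress_performance=0.9)
```

Note that `imitative_stress(0.0, 1.0)` is nonzero even though the control error itself is zero, which is the role the derivative term plays in Eq. (7).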
IV. RESULTS
To validate the proposed controller, its results are compared with
those of the originally supplied controller, which consists of two
PID controllers [22]. The same original controller serves as the
initial controller for the imitative learning phase. Without
imitative learning, BELBIC did not learn the proper control signal
in more than 150 seconds of training, and the pendulum fell many
times during this period.
Figure 4. Resulting fuzzy surfaces [surface plots of Stress versus
Imitative-Stress, Performance-Stress, and error, each against time]
Figure 5. Results of originally supplied controller (Double PID) in
presence of disturbance [panels: Cart Position (desired and actual)
and Pendulum Angle versus time]
Figure 6. Results of BELBIC with fuzzy stress in presence of
disturbance [panels: Cart Position (desired and actual) and Pendulum
Angle versus time]
To evaluate the ability of the controller to reject disturbances, a
random voltage drawn from a Gaussian distribution with zero mean and
a variance of 0.1 is applied to the motor at certain instants. The
time at which this voltage is applied is a random variable drawn
from a uniform distribution. The disturbance is applied 8 times
between the 55th second and the end of the simulation. The results
of the original PID controller and the proposed controller are
depicted in Figs. 5 and 6, respectively. As can be seen, BELBIC
completely imitates the behavior of the original controller within
about 10 seconds of the start. After that, according to the fuzzy
switch structure, both controllers control the system from 30 to 50
seconds, and thereafter BELBIC controls the system on its own. It is
clear that after imitative learning, the performance of BELBIC in
reducing the tracking and angle errors is far better.
BELBIC clearly shows better performance in tracking and disturbance
rejection, which is a result of its learning capability.
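The disturbance protocol described above can be sketched as a schedule of random voltage pulses. The function name, the simulation end time of 150 s, and the use of a seed are assumptions for illustration; the text does not state the end time or how long each pulse lasts.

```python
import math
import random


def disturbance_schedule(n_events=8, t_first=55.0, t_end=150.0,
                         variance=0.1, seed=None):
    """Return a time-sorted list of (time, voltage) disturbance events.

    Times are uniform on [t_first, t_end]; voltages are Gaussian with
    zero mean and the given variance. t_end is a hypothetical value.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(variance)   # random.gauss expects a standard deviation
    times = sorted(rng.uniform(t_first, t_end) for _ in range(n_events))
    return [(t, rng.gauss(0.0, sigma)) for t in times]


# A fresh schedule would be drawn for each repeated experiment.
events = disturbance_schedule(seed=1)
```

Since the event times differ from run to run, a single run is not representative, which is why repeated experiments with summary statistics are needed for a fair comparison.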
To obtain a meaningful comparison of these controllers, four
performance measures are defined as follows and calculated for both
control systems, the originally supplied controller and BELBIC. As
the disturbance is applied at randomly selected times, the
experiments were carried out 20 times, and the mean and standard
deviation of the following measures are calculated.
IAE: Integral of Absolute Error (for cart position and pendulum
angle)
IACF: Integral of Absolute values of Control Force
IADCF: Integral of Absolute values of the derivative of Control
Force (shows the fluctuations of the control force)
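The measures listed above can be approximated from sampled signals as in the sketch below. The sampling step `dt`, the trapezoidal rule, the finite-difference derivative, and all names are assumptions; the paper does not say how the integrals were evaluated.

```python
import statistics


def integral_abs(samples, dt):
    """Trapezoidal approximation of the integral of |x(t)|."""
    a = [abs(x) for x in samples]
    return dt * (sum(a) - 0.5 * (a[0] + a[-1]))


def performance_measures(pos_err, angle_err, force, dt):
    """Compute the four measures for one run of uniformly sampled signals."""
    # Finite-difference estimate of the control force derivative.
    force_dot = [(b - a) / dt for a, b in zip(force, force[1:])]
    return {
        "IAE_position": integral_abs(pos_err, dt),
        "IAE_angle": integral_abs(angle_err, dt),
        "IACF": integral_abs(force, dt),
        # Rectangle rule over the difference quotients.
        "IADCF": dt * sum(abs(d) for d in force_dot),
    }


def summarize(values):
    """Mean and sample standard deviation of one measure over repeated runs."""
    return statistics.mean(values), statistics.stdev(values)


m = performance_measures(pos_err=[1.0, 1.0, 1.0, 1.0],
                         angle_err=[0.0, 0.0, 0.0, 0.0],
                         force=[0.0, 1.0, 0.0, 1.0], dt=0.5)
```

With these definitions IADCF reduces to the total variation of the sampled control force, so a chattering control signal is penalized even when its magnitude is small.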
These performance measures are calculated for the two controllers in
normal operation, without applying any disturbance, and the results
are given in Table I.
TABLE I. PERFORMANCE MEASURES OF VARIOUS CONTROLLERS WITHOUT DISTURBANCE
No-Disturbance    BELBIC-fuzzy stress    Double PID
IAE (position)    3.301                  3.925
IAE (angle)       0.367                  0.457
IACF              9.432                  12.219
IADCF             9.262                  14.353
Table I shows the fast learning ability of BELBIC for tracking: its
position and angle IAE are lower than those of the double PID
controller. The control force, which is penalized by the stress
signal, is also lower than that of the other controller and
oscillates less.
In the presence of disturbance, the above measures are calculated
for the controllers and the results are presented in Table II. As
can be seen, in the presence of disturbance BELBIC again shows
better performance, although its performance decreases slightly in
comparison with normal operation. It also shows better disturbance
rejection and robustness.
TABLE II. PERFORMANCE MEASURES OF VARIOUS CONTROLLERS WITH DISTURBANCE
Employing Disturbance   IAE (position)     IAE (angle)      IACF               IADCF
                        Mean     STD       Mean    STD      Mean     STD       Mean     STD
BELBIC-fuzzy stress     5.699    0.915     0.476   0.174    73.247   3.152     151.26   6.245
Double PID              11.725   2.141     1.692   0.371    82.705   3.247     160.58   7.831
V. CONCLUSIONS
In this paper a new approach to employing a model-free controller
with learning ability is introduced, based on soft switching between
two phases of learning. Although BELBIC has a rapid and powerful
learning capability, it cannot simply be used to control systems
with unstable equilibria. The experimental results show that by
employing imitative learning in the first phase, BELBIC can rapidly
learn to produce a proper control signal for a system with an
unstable equilibrium point. After BELBIC has learned imitatively
from a simple, classically designed controller, gradually altering
the objectives and stress signals reduces the tracking and angle
errors. Moreover, it shows more robustness to disturbances. Another
advantage of BELBIC is that it produces a smoother control force
with lower energy.
The fuzzy stress generation leads to better performance in terms of
tracking and angle error than the alternative method of stress
generation. Another interesting result is that BELBIC performs
better in the presence of disturbance than the originally supplied
controller, which is a model-based controller well tuned for this
task. This is the effect of the learning capability of BELBIC, which
can produce a more appropriate control force under various working
conditions.
REFERENCES
[1] A.P. Shon, D.B. Grimes, C.L. Baker, R.P.N. Rao, A.N. Meltzoff, "A Model-Based Goal-Directed Bayesian Framework for Imitation Learning in Humans and Machines," Cognitive Science, 2004.
[2] M.I. Kuniyoshi and I. Inoue, "Learning by watching: extracting reusable task knowledge from visual observation of human performance," IEEE Trans. on Robotics and Automation, vol. 10, no. 6, pp. 799-822, 1994.
[3] A. Alissandrakis, C. L. Nehaniv and K. Dautenhahn, "Learning how to do things with imitation," in Proc. Learning How to Do Things, AAAI Fall Symp. Series, Sea Crest Oceanfront Resort & Conf. Center, North Falmouth, MA, USA, pp. 1-8, Nov. 3-5, 2000.
[4] H. Mobahi, M. N. Ahmadabadi, B. N. Araabi, "Concept Oriented Imitation, Towards Verbal Human-Robot Interaction," Proc. of IEEE International Conference on Robotics and Automation, Barcelona, Spain, April 2005, pp. 1507-12.
[5] M. Lopes, J. Santos-Victor, "Visual learning by imitation with motor representations," IEEE Transactions on Systems, Man and Cybernetics, Part B, vol. 35, no. 3, pp. 438-449, 2005.
[6] D. Shahmirzadi, "Computational Modeling of the Brain Limbic System and Its Application in Control Engineering," MSc thesis, Texas A&M University, USA, 2005.
[7] C. Balkenius and J. Moren, "Emotional learning: a computational model of the Amygdala," Cybernetics and Systems, 2001.
[8] C. Balkenius, J. Moren, "A Computational Model of Emotional Conditioning in the Brain," Workshop on Grounding Emotions in Adaptive Systems, Zurich, 1998.
[9] J. Moren, "Emotion and Learning: A computational model of the amygdala," PhD thesis, Lund University, Lund, Sweden, 2002.
[10] J. Moren, C. Balkenius, "A Computational Model of Emotional Learning in the Amygdala," From Animals to Animats 6, Proc. of the 6th International Conference on the Simulation of Adaptive Behavior, Cambridge, MIT Press, pp. 383-391, 2000.
[11] C. Lucas, D. Shahmirzadi, N. Sheikholeslami, "Introducing BELBIC: Brain Emotional Learning Based Intelligent Controller," International Journal of Intelligent Automation and Soft Computing, pp. 11-22, 2004.
[12] R.M. Milasi, C. Lucas, B. N. Araabi, "Intelligent Modeling and Control of Washing Machine Using Locally Linear Neuro-Fuzzy (LLNF) Modeling and Modified Brain Emotional Learning Based Intelligent Controller (BELBIC)," Asian Journal of Control, 8 (4), 2005.
[13] R.M. Milasi, M.R. Jamali, C. Lucas, "Intelligent Washing Machine: A Bioinspired and Multi-Objective Approach," accepted in International Journal of Control, Automation, and Systems.
[14] N. Sheikholeslami, D. Shahmirzadi, E. Semsar, C. Lucas, M. J. Yazdanpanah, "Applying Brain Emotional Learning Algorithm for Multivariable Control of HVAC Systems," International Journal of Intelligent and Fuzzy Systems, 17 (1), pp. 35-46, 2006.
[15] A. Gholipour, C. Lucas, D. Shahmirzadi, "Purposeful prediction of space weather phenomena by simulated emotional learning," IASTED International Journal of Modelling and Simulation, vol. 24, no. 2, pp. 65-72, 2004.
[16] D. Shahmirzadi, C. Lucas, R. Langari, "Intelligent signal fusion algorithm using BEL - brain emotional learning," 7th Joint Conference on Information Sciences, JCIS03, 1st Symposium on Brain-Like Computer Architecture, September 26-30, Cary, NC, 2003.
[17] R. M. Milasi, C. Lucas, B. N. Araabi, T. S. Radwan, M. A. Rahman, "Implementation of Emotional Controller for Interior Permanent Magnet Synchronous Motor Drive," IEEE / IAS 41st Annual Meeting: Industry Applications, Tampa, Florida, USA, October 8-12, 2006.
[18] M. R. Jamali, A. Arami, B. Hosseini, B. Moshiri, C. Lucas, "Real Time Emotional Control for Anti-Swing and Positioning Control of SIMO Overhead Travelling Crane," Int. Journal of Innovative Computing, Information and Control (IJICIC), vol. 4, no. 9, pp. 2333-2344, September 2008.
[19] D. Shahmirzadi and R. Langari, "Stability of Amygdala Learning System using Cell-to-Cell Mapping Algorithm," Journal of Intelligent System and Control, 497-119, 2005.
[20] A. Arami, M. Javan Roshtkhari and C. Lucas, "A Fast Model Free Intelligent Controller Based on Fused Emotions: A Practical Case Implementation," 16th IEEE Mediterranean Conference on Control and Automation, June 25-27, 2008, Corsica, France.
[21] T. Takagi and M. Sugeno, "Derivation of fuzzy control rules from human operator's control actions," Proc. of IFAC Symp. on Fuzzy Information, Knowledge Representation and Decision Analysis, pp. 55-60, July 1983.
[22] Feedback Instruments Ltd, Digital Pendulum Control Experiments, Manual: 33-935/936-1V60, 2002.
