The learning theory of Thorndike represents the original S-R framework of behavioral
psychology: Learning is the result of associations forming between stimuli and responses.
Such associations or "habits" become strengthened or weakened by the nature and frequency
of the S-R pairings. The paradigm for S-R theory was trial and error learning in which certain
responses come to dominate others due to rewards. The hallmark of connectionism (like all
behavioral theory) was that learning could be adequately explained without referring to any
unobservable internal states.
Thorndike's theory consists of three primary laws: (1) law of effect - responses to a situation
which are followed by a rewarding state of affairs will be strengthened and become habitual
responses to that situation, (2) law of readiness - a series of responses can be chained together
to satisfy some goal which will result in annoyance if blocked, and (3) law of exercise -
connections become strengthened with practice and weakened when practice is discontinued.
A corollary of the law of effect was that responses that reduce the likelihood of achieving a
rewarding state (i.e., punishments, failures) will decrease in strength.
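The strengthening and weakening dynamics described by these laws can be sketched in a toy simulation. The response names, update constants, and selection rule below are illustrative assumptions, not part of Thorndike's formulation:

```python
import random

def trial_and_error(n_trials=500, rewarded="press_lever", seed=0):
    """Toy sketch of the laws of effect and exercise: the rewarded
    response is strengthened, non-rewarded responses weaken when tried,
    so the rewarded response comes to dominate selection."""
    rng = random.Random(seed)
    strengths = {"press_lever": 1.0, "scratch": 1.0, "pace": 1.0}
    for _ in range(n_trials):
        # pick a response with probability proportional to its strength
        r = rng.random() * sum(strengths.values())
        for response, s in strengths.items():
            r -= s
            if r <= 0:
                break
        if response == rewarded:
            strengths[response] += 0.2                    # law of effect: reward strengthens
        else:
            strengths[response] = max(0.1, strengths[response] - 0.02)  # weakening

    return strengths

final = trial_and_error()
assert max(final, key=final.get) == "press_lever"
```

Run long enough, the rewarded response's strength only grows while the others decay, mirroring how certain responses come to dominate through trial and error.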
The theory suggests that transfer of learning depends upon the presence of identical elements
in the original and new learning situations; i.e., transfer is always specific, never general. In
later versions of the theory, the concept of "belongingness" was introduced; connections are
more readily established if the person perceives that stimuli or responses go together (c.f.
Gestalt principles). Another concept introduced was "polarity" which specifies that
connections occur more easily in the direction in which they were originally formed than the
opposite. Thorndike also introduced the "spread of effect" idea, i.e., rewards affect not only
the connection that produced them but temporally adjacent connections as well.
Connectionism was meant to be a general theory of learning for animals and humans.
Thorndike was especially interested in the application of his theory to education including
mathematics (Thorndike, 1922), spelling and reading (Thorndike, 1921), measurement of
intelligence (Thorndike et al., 1927) and adult learning (Thorndike et al., 1928).
The classic example of Thorndike's S-R theory was a cat learning to escape from a "puzzle
box" by pressing a lever inside the box. After much trial and error behavior, the cat learns to
associate pressing the lever (S) with opening the door (R). This S-R connection is established
because it results in a satisfying state of affairs (escape from the box). The connection was
established because the S-R pairing occurred many times (the law of exercise), was rewarded
(the law of effect), and formed a single sequence (the law of readiness).
In classical conditioning, an unconditioned stimulus (US) naturally elicits an unconditioned
response (UR). When a neutral stimulus is repeatedly paired with the US, it becomes a
conditioned stimulus (CS) that comes to elicit a conditioned response (CR) on its own. Pavlov
first described this phenomenon in his studies of the "psychic secretions" of salivating dogs.
In addition to the simple procedures described above, some classical conditioning studies are
designed to tap into more complex learning processes. Some common variations are
discussed below.
In this procedure, two CSs and one US are typically used. The CSs may be the same modality
(such as lights of different intensity), or they may be different modalities (such as auditory
CS and visual CS). In this procedure, one of the CSs is designated CS+ and its presentation is
always followed by the US. The other CS is designated CS− and its presentation is never
followed by the US. After a number of trials, the organism learns to discriminate CS+ trials
and CS− trials such that CRs are only observed on CS+ trials.
During reversal training, the CS+ and CS− are reversed and subjects learn to suppress
responding to the previous CS+ and show CRs to the previous CS−.
This is a discrimination procedure in which two different CSs are used to signal two different
interstimulus intervals. For example, a dim light may be presented 30 seconds before a US,
while a very bright light is presented 2 minutes before the US. Using this technique,
organisms can learn to perform CRs that are appropriately timed for the two distinct CSs.
In this procedure, a CS is presented several times before paired CS-US training commences.
The pre-exposure of the subject to the CS before paired training slows the rate of CR
acquisition relative to organisms that are not CS pre-exposed. Also see Latent inhibition for
applications.
Classical conditioning is short-term, usually requiring less time with therapists and less effort
from patients, unlike humanistic therapies.[citation needed] The therapies mentioned are designed
to cause either aversive feelings toward something, or to reduce unwanted fear and aversion.
There are two competing theories of how classical conditioning works. The first, stimulus-
response theory, suggests that an association to the unconditioned stimulus is made with the
conditioned stimulus within the brain, but without involving conscious thought. The second,
stimulus-stimulus theory involves cognitive activity, in which the conditioned stimulus is
associated to the concept of the unconditioned stimulus, a subtle but important distinction.
The opposing theory, put forward by cognitive behaviorists, is stimulus-stimulus theory (S-S
theory), a theoretical model of classical conditioning which holds that a cognitive component
is required to explain conditioning and that S-R theory is an inadequate model. S-R theory suggests that
an animal can learn to associate a conditioned stimulus (CS) such as a bell, with the
impending arrival of food termed the unconditioned stimulus, resulting in an observable
behavior such as salivation. S-S theory suggests that instead the animal salivates to the bell
because it is associated with the concept of food, which is a very fine but important
distinction.
To test this theory, psychologist Robert Rescorla undertook the following experiment.[2] Rats
learned to associate a loud noise as the unconditioned stimulus, and a light as the conditioned
stimulus. The response of the rats was to freeze and cease movement. What would happen
then if the rats were habituated to the US? S-R theory would suggest that the rats would
continue to respond to the CS, but if S-S theory is correct, they would be habituated to the
concept of a loud sound (danger), and so would not freeze to the CS. The experimental results
suggest that S-S was correct, as the rats no longer froze when exposed to the signal light.[3]
The theory continues to be applied in everyday life.[1]
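The contrast between the two theories' predictions in Rescorla's design can be stated as a small decision function (a hypothetical illustration, not Rescorla's own analysis):

```python
def predicted_cr(theory, us_habituated):
    """Contrast the two theories' predictions in Rescorla's design:
    after the US is habituated, S-R theory still predicts a CR to the
    CS, while S-S theory predicts the CR disappears."""
    if theory == "S-R":
        return True                  # direct CS->response link survives US habituation
    if theory == "S-S":
        return not us_habituated     # CR depends on the (now habituated) US concept
    raise ValueError(f"unknown theory: {theory}")

assert predicted_cr("S-R", us_habituated=True) is True
assert predicted_cr("S-S", us_habituated=True) is False  # matches the observed result
```

The observed outcome (no freezing to the light after US habituation) matches the S-S branch.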
One of the earliest literary references to classical conditioning can be found in the comic
novel The Life and Opinions of Tristram Shandy, Gentleman (1759) by Laurence Sterne. The
narrator Tristram Shandy explains how his mother was conditioned by his father's habit of
winding up a clock before having sex with his wife.
Fictional depictions of conditioning also appear in popular culture, most famously the
aversive conditioning portrayed in the novel A Clockwork Orange and its film adaptation.
Operant conditioning is distinguished from classical conditioning (also called
respondent conditioning) in that operant conditioning deals with the modification of
"voluntary behavior" or operant behavior. uperant behavior "operates" on the environment
and is maintained by its consequences, while classical conditioning deals with the
conditioning of respondent behaviors which are elicited by antecedent conditions. Behaviors
conditioned via a classical conditioning procedure are not maintained by consequences.[1]
Reinforcement and punishment, the core tools of operant conditioning, are either positive
(delivered following a response), or negative (withdrawn following a response). This creates
a total of four basic consequences, with the addition of a fifth procedure known as extinction
(i.e. no change in consequences following a response).
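The two-by-two structure of these consequences (stimulus added or withdrawn, behavior increased or decreased) can be made explicit in a short helper function; the function and its argument names are illustrative:

```python
def classify_consequence(stimulus_change, behavior_change):
    """Map the two dimensions of an operant consequence onto the four
    basic procedures: positive/negative x reinforcement/punishment."""
    kind = "reinforcement" if behavior_change == "increases" else "punishment"
    sign = "positive" if stimulus_change == "added" else "negative"
    return f"{sign} {kind}"

assert classify_consequence("added", "increases") == "positive reinforcement"
assert classify_consequence("removed", "increases") == "negative reinforcement"
assert classify_consequence("added", "decreases") == "positive punishment"
assert classify_consequence("removed", "decreases") == "negative punishment"
```

Extinction falls outside this grid: no stimulus change follows the response at all.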
It's important to note that actors are not spoken of as being reinforced, punished, or
extinguished; it is the actions that are reinforced, punished, or extinguished. Additionally,
reinforcement, punishment, and extinction are not terms whose use is restricted to the
laboratory. Naturally occurring consequences can also be said to reinforce, punish, or
extinguish behavior and are not always delivered by people.
Four contexts of operant conditioning
Here the terms positive and negative are not used in their popular sense, but rather: positive
refers to addition, and negative refers to subtraction.
1. Positive reinforcement occurs when a behavior (response) is followed by the addition of an
appetitive stimulus, increasing the frequency of that behavior.
2. Negative reinforcement occurs when a behavior is followed by the removal of an aversive
stimulus, likewise increasing the frequency of that behavior.
3. Positive punishment occurs when a behavior is followed by the addition of an aversive
stimulus, resulting in a decrease in that behavior.
4. Negative punishment occurs when a behavior is followed by the removal of an appetitive
stimulus, resulting in a decrease in that behavior.
5. Extinction occurs when a behavior that had previously been reinforced is no longer
followed by any consequence, and its frequency gradually declines.
The first scientific studies identifying neurons that responded in ways that suggested they
encode for conditioned stimuli came from work by Mahlon DeLong[7][8] and by R.T. "Rusty"
Richardson and DeLong.[8] They showed that nucleus basalis neurons, which release
acetylcholine broadly throughout the cerebral cortex, are activated shortly after a conditioned
stimulus, or after a primary reward if no conditioned stimulus exists. These neurons are
equally active for positive and negative reinforcers, and have been demonstrated to cause
plasticity in many cortical regions.[9] Evidence also exists that dopamine is activated at
similar times. There is considerable evidence that dopamine participates in both
reinforcement and aversive learning.[10] Dopamine pathways project much more densely onto
frontal cortex regions. Cholinergic projections, in contrast, are dense even in the posterior
cortical regions like the primary visual cortex. A study of patients with Parkinson's disease, a
condition attributed to the insufficient action of dopamine, further illustrates the role of
dopamine in positive reinforcement.[11] It showed that while off their medication, patients
learned more readily with aversive consequences than with positive reinforcement. Patients
who were on their medication showed the opposite to be the case, positive reinforcement
proving to be the more effective form of learning when the action of dopamine is high.
The effectiveness of a consequence is altered by several factors, including satiation or
deprivation (how much of the reinforcer the organism has recently had), immediacy (how
soon the consequence follows the response), contingency (how consistently the consequence
follows the response), and the size of the consequence.
Most of these factors exist for biological reasons. The biological purpose of the Principle of
Satiation is to maintain the organism's homeostasis. When an organism has been deprived of
sugar, for example, the effectiveness of the taste of sugar as a reinforcer is high. However, as
the organism reaches or exceeds their optimum blood-sugar levels, the taste of sugar becomes
less effective, perhaps even aversive.
The Principles of Immediacy and Contingency exist for neurochemical reasons. When an
organism experiences a reinforcing stimulus, dopamine pathways in the brain are activated.
This network of pathways "releases a short pulse of dopamine onto many dendrites, thus
broadcasting a rather global reinforcement signal to postsynaptic neurons."[12] This results in
the plasticity of these synapses allowing recently activated synapses to increase their
sensitivity to efferent signals, hence increasing the probability of occurrence for the recent
responses preceding the reinforcement. These responses are, statistically, the most likely to
have been the behavior responsible for successfully achieving reinforcement. But when the
application of reinforcement is either less immediate or less contingent (less consistent), the
ability of dopamine to act upon the appropriate synapses is reduced.
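The Principle of Immediacy (credit falling off with the delay between a response and the reinforcer) can be illustrated with a simple exponential-decay model; the decay constant and example data are hypothetical:

```python
def credit(response_times, reinforcement_time, decay=0.5):
    """Illustrative sketch of the Immediacy principle: the credit a
    response receives falls off exponentially with its delay before
    the reinforcer, so the most recent response is strengthened most."""
    return {r: decay ** (reinforcement_time - t)
            for r, t in response_times.items() if t <= reinforcement_time}

c = credit({"press": 9, "turn": 7, "groom": 4}, reinforcement_time=10)
assert c["press"] > c["turn"] > c["groom"]
```

The response closest in time to reinforcement receives the largest share of credit, which is exactly why delayed or inconsistent reinforcement is less effective.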
Operant variability is what allows a response to adapt to new situations. Operant behavior is
distinguished from reflexes in that its topography (the form of the response) is
subject to slight variations from one performance to another. These slight variations can
include small differences in the specific motions involved, differences in the amount of force
applied, and small changes in the timing of the response. If a subject's history of
reinforcement is consistent, such variations will remain stable because the same successful
variations are more likely to be reinforced than less successful variations. However,
behavioral variability can also be altered when subjected to certain controlling variables.[13]
Avoidance learning
Avoidance learning belongs to negative reinforcement schedules. The subject learns that a
certain response will result in the termination or prevention of an aversive stimulus. There are
two kinds of commonly used experimental settings: discriminated and free-operant avoidance
learning.
In free-operant avoidance, no discrete stimulus is used to signal the occurrence of the
aversive stimulus. Rather, the aversive stimuli (usually shocks) are presented without
explicit warning stimuli. There are two crucial time intervals determining the rate of
avoidance learning. The first one is called the S-S-interval (shock-shock-interval). This is the
amount of time which passes during successive presentations of the shock (unless the operant
response is performed). The other one is called the R-S-interval (response-shock-interval)
which specifies the length of the time interval following an operant response during which no
shocks will be delivered. Note that each time the organism performs the operant response, the
R-S-interval without shocks begins anew.
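The interplay of the S-S and R-S intervals can be sketched as a toy timeline simulation (the interval values and helper name are illustrative):

```python
def count_shocks(press_times, ss=5, rs=20, duration=60):
    """Toy timeline of free-operant avoidance: a shock is due every
    `ss` seconds after the previous shock, but each operant response
    restarts a shock-free `rs`-second clock."""
    shocks = []
    next_shock = ss                           # first shock due at t = ss
    for t in sorted(press_times):
        # deliver any shocks scheduled before this response occurred
        while next_shock <= t:
            shocks.append(next_shock)
            next_shock += ss                  # S-S interval: shock follows shock
        next_shock = max(next_shock, t + rs)  # R-S interval begins anew
    while next_shock <= duration:             # shocks after the last response
        shocks.append(next_shock)
        next_shock += ss
    return shocks

assert count_shocks([]) == [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60]
assert count_shocks([4, 18, 32, 46]) == []   # responding often enough avoids all shocks
```

An organism that responds more often than once per R-S interval never receives a shock, which is the contingency that sustains avoidance responding.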
In 1957, Skinner published Verbal Behavior, a theoretical extension of the work he had
pioneered since 1938. This work extended the theory of operant conditioning to human
behavior previously assigned to the areas of language, linguistics and other areas. Verbal
Behavior is the logical extension of Skinner's ideas, in which he introduced new functional
relationship categories such as intraverbals, autoclitics, mands, tacts and the controlling
relationship of the audience. All of these relationships were based on operant conditioning
and relied on no new mechanisms despite the introduction of new functional categories.
Applied behavior analysis, which is the name of the discipline directly descended from
Skinner's work, holds that behavior is explained in four terms: conditional stimulus (SC), a
discriminative stimulus (Sd), a response (R), and a reinforcing stimulus (Srein or Sr for
reinforcers, sometimes Save for aversive stimuli).[1 ]
This term refers to the choice made by a rat, on a compound schedule called
a multiple schedule, that maximizes its rate of reinforcement in an operant conditioning
context. More specifically, rats were shown to have allowed food pellets to accumulate in a
food tray by continuing to press a lever on a continuous reinforcement schedule instead of
retrieving those pellets. Retrieval of the pellets always instituted a one-minute period of
extinction during which no additional food pellets were available but those that had been
accumulated earlier could be consumed. This finding appears to contradict the usual finding
that rats behave impulsively in situations in which there is a choice between a smaller food
object right away and a larger food object after some delay. See schedules of
reinforcement.[15]
Guthrie's contiguity theory specifies that "a combination of stimuli which has accompanied a
movement will on its recurrence tend to be followed by that movement". According to
Guthrie, all learning was a consequence of association between a particular stimulus and
response. Furthermore, Guthrie argued that stimuli and responses affect specific sensory-
motor patterns; what is learned are movements, not behaviors.
In contiguity theory, rewards or punishment play no significant role in learning since they
occur after the association between stimulus and response has been made. Learning takes
place in a single trial (all or none). However, since each stimulus pattern is slightly different,
many trials may be necessary to produce a general response. One interesting principle that
arises from this position is called "postremity" which specifies that we always learn the last
thing we do in response to a specific stimulus situation.
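Guthrie's all-or-none, single-trial association and the postremity principle can be sketched in a few lines (the stimulus and response labels are illustrative):

```python
def contiguity_learning(episodes):
    """Guthrie-style one-trial learning: for each stimulus situation,
    the association is overwritten so that only the *last* response
    made in that situation is retained (the postremity principle)."""
    learned = {}
    for stimulus, response in episodes:
        learned[stimulus] = response   # all-or-none, single-trial association
    return learned

history = [("puzzle_box", "scratch"),
           ("puzzle_box", "pace"),
           ("puzzle_box", "press_lever")]
assert contiguity_learning(history) == {"puzzle_box": "press_lever"}
```

Because each new response to the same situation overwrites the old association, what survives is always the last thing done, with no role for reward in the update itself.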
Contiguity theory suggests that forgetting is due to interference rather than the passage of
time; stimuli become associated with new responses. Previous conditioning can also be
changed by being associated with inhibiting responses such as fear or fatigue. The role of
motivation is to create a state of arousal and activity which produces responses that can be
conditioned.
Contiguity theory is intended to be a general theory of learning, although most of the research
supporting the theory was done with animals. Guthrie did apply his framework to personality
disorders (e.g. Guthrie, 1938).
The classic experimental paradigm for Contiguity theory is cats learning to escape from a
puzzle box (Guthrie & Horton, 1946). Guthrie used a glass paneled box that allowed him to
photograph the exact movements of cats. These photographs showed that cats learned to
repeat the same sequence of movements associated with the preceding escape from the box.
Improvement comes about because irrelevant movements are unlearned or not included in
successive associations.
1. In order for conditioning to occur, the organism must actively respond (i.e., do things).
2. Since learning involves the conditioning of specific movements, instruction must present
very specific tasks.
3. The last response in a learning situation should be correct since it is the one that will be
associated.
Swiss biologist and psychologist Jean Piaget (1896-1980) observed his children (and their
process of making sense of the world around them) and eventually developed a four-stage
model of how the mind processes new information encountered. He posited that children
progress through stages and that they all do so in the same order. These four stages are:
1. Sensorimotor stage (birth to 2 years old). The infant builds an understanding of
himself or herself and reality (and how things work) through interactions with the
environment. It is able to differentiate between itself and other objects. Learning takes
place via assimilation (the organization of information and absorbing it into existing
schema) and accommodation (when an object cannot be assimilated and the schemata
have to be modified to include the object).
2. Preoperational stage (ages 2 to 7). The child is not yet able to conceptualize
abstractly and needs concrete physical situations. Objects are classified in simple
ways, especially by important features.
3. Concrete operational stage (ages 7 to 11). As physical experience accumulates,
accommodation is increased. The child begins to think abstractly and conceptualize,
creating logical structures that explain his or her physical experiences.