
Operant conditioning

From Wikipedia, the free encyclopedia

Operant conditioning (or instrumental conditioning) is a type of learning in which an individual's behavior is modified
by its consequences; the behavior may change in form, frequency, or strength. The term was coined by B. F. Skinner
in 1937.[1] Operant conditioning is distinguished from classical conditioning (or respondent conditioning) in that
operant conditioning deals with the modification of "voluntary behavior," or operant behavior. Operant behavior
operates on the environment and is maintained by its consequences, while classical conditioning deals with the
conditioning of reflexive behaviors, which are elicited by antecedent conditions. Behaviors conditioned via a
classical conditioning procedure are not maintained by consequences.[2]
Contents
1 Reinforcement, punishment, and extinction
2 Skinner Box
3 Thorndike's law of effect
4 Biological correlates of operant conditioning
5 Factors that alter the effectiveness of consequences
6 Operant variability
7 Avoidance learning
8 Two-process theory of avoidance
9 Verbal Behavior
10 Four term contingency
11 Operant hoarding
12 Questions about the law of effect
13 See also
14 References
15 External links

Reinforcement, punishment, and extinction
Reinforcement and punishment, the core tools of operant conditioning, are either positive (delivered following a response),
or negative (withdrawn following a response). This creates a total of four basic consequences, with the addition of a fifth
procedure known as extinction (i.e. no change in consequences following a response).
It is important to note that actors are not spoken of as being reinforced, punished, or extinguished; it is the actions that are
reinforced, punished, or extinguished. Additionally, reinforcement, punishment, and extinction are not terms whose use is
restricted to the laboratory. Naturally occurring consequences can also be said to reinforce, punish, or extinguish behavior
and are not always delivered by people.

Reinforcement is a consequence that causes a behavior to occur with greater frequency.

Punishment is a consequence that causes a behavior to occur with less frequency.

Extinction is caused by the lack of any consequence following a behavior. When a behavior is inconsequential
(i.e., producing neither favorable nor unfavorable consequences) it will occur less frequently. When a previously
reinforced behavior is no longer reinforced with either positive or negative reinforcement, it leads to a decline in that
behavior.

Four contexts of operant conditioning

Here the terms positive and negative are not used in their popular sense, but rather: positive refers to addition,
and negative refers to subtraction.

What is added or subtracted may be either reinforcement or punishment. Hence positive punishment is sometimes a
confusing term, as it denotes the "addition" of a stimulus or increase in the intensity of a stimulus that is aversive (such as
spanking or an electric shock). The four procedures are:
1. Positive reinforcement (Reinforcement): occurs when a behavior (response) is followed by a stimulus that is
appetitive or rewarding, increasing the frequency of that behavior. In the Skinner box experiment, a stimulus
such as food or a sugar solution can be delivered when the rat engages in a target behavior, such as pressing a
lever.
2. Negative reinforcement (Escape): occurs when a behavior (response) is followed by the removal of
an aversive stimulus, thereby increasing that behavior's frequency. In the Skinner box experiment, negative
reinforcement can be a loud noise continuously sounding inside the rat's cage until it engages in the target
behavior, such as pressing a lever, upon which the loud noise is removed.
3. Positive punishment (Punishment) (also called "Punishment by contingent stimulation"): occurs when a
behavior (response) is followed by a stimulus, such as introducing a shock or loud noise, resulting in a decrease
in that behavior.
4. Negative punishment (Penalty) (also called "Punishment by contingent withdrawal"): occurs when a behavior
(response) is followed by the removal of a stimulus, such as taking away a child's toy following an undesired
behavior, resulting in a decrease in that behavior.
Operant conditioning to change human behavior

1. State goal (the aims of the study)
2. Monitor behavior (log conditions)
3. Reinforce desired behavior (give reward for proper behavior)
4. Reduce incentives to perform undesirable behavior
Also:

Avoidance learning is a type of learning in which a certain behavior results in the cessation of an aversive
stimulus. For example, performing the behavior of shielding one's eyes when in the sunlight (or going outdoors) will
help avoid the aversive stimulation of having light in one's eyes.

Extinction occurs when a behavior (response) that had previously been reinforced is no longer effective. In the
Skinner box experiment, this is the rat pushing the lever and being rewarded with a food pellet several times, and then
pushing the lever again and never receiving a food pellet again. Eventually the rat would cease pushing the lever.

Noncontingent reinforcement refers to delivery of reinforcing stimuli regardless of the organism's (aberrant)
behavior. The idea is that the target behavior decreases because it is no longer necessary to receive the reinforcement.
This typically entails time-based delivery of stimuli identified as maintaining aberrant behavior, which serves to
decrease the rate of the target behavior.[3] As no measured behavior is identified as being strengthened, there is
controversy surrounding the use of the term noncontingent "reinforcement".[4]

Token economy is an exchange system using the principles of operant conditioning where a token is given as a
reward for a desired behaviour. Tokens may later be exchanged for a desired prize or rewards such as power, prestige,
goods or services.

Shaping is a form of operant conditioning in which increasingly accurate approximations of a desired
response are reinforced (a toy sketch of this process follows this list).[5]

Chaining is an instructional procedure which involves reinforcing individual responses occurring in a sequence
to form a complex behavior.[5]

Response cost is a form of punishment in which the removal of an appetitive stimulus follows a response,
reducing the future occurrence of that response.[6]
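Shaping, in particular, has a simple algorithmic core: keep whatever variant of the behavior comes closest to the target so far. The toy Python sketch below is my own illustration, with arbitrary numbers and an arbitrary acceptance rule; it is not an empirical model:

```python
# Toy shaping sketch (my own illustration): only attempts that approximate the
# target more closely than the current response are "reinforced" and persist.
import random

def shape(target=10.0, trials=200):
    response = 0.0                                   # starting behavior, far from target
    for _ in range(trials):
        attempt = response + random.uniform(-2, 2)   # behavior varies a little each time
        if abs(attempt - target) <= abs(response - target):
            response = attempt                       # closer approximation is reinforced
    return response

print(round(shape(), 2))  # drifts toward the target value of 10.0
```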


Skinner Box

Main article: operant conditioning chamber


To show the effects of operant conditioning, B. F. Skinner created what is known as the Skinner box, or operant
conditioning chamber. A rat or other suitably small animal is placed in a typical Skinner box and observed during
learning trials that use operant conditioning principles.[7]
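The logic of such learning trials can be caricatured in a few lines of code. The following toy simulation is my own sketch, with an arbitrary linear update rule; it is not Skinner's apparatus or analysis, only an illustration of a response probability rising under reinforcement and falling again under extinction:

```python
# Toy operant-chamber simulation (my own illustration, arbitrary parameters).
import random

press_p = 0.05                         # low operant level before training

def trial(p, reinforced, step=0.05, decay=0.01):
    """One trial: a press that earns food is strengthened;
    a press that earns nothing is slightly weakened (extinction)."""
    if random.random() < p:            # the subject happens to press the lever
        p = min(1.0, p + step) if reinforced else max(0.0, p - decay)
    return p

for _ in range(500):                   # acquisition: every press is rewarded
    press_p = trial(press_p, reinforced=True)
print(f"after training:  {press_p:.2f}")   # close to 1.0

for _ in range(1000):                  # extinction: the lever no longer pays
    press_p = trial(press_p, reinforced=False)
print(f"after extinction: {press_p:.2f}")  # drifts back toward 0
```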
Thorndike's law of effect

Main article: Law of effect


Operant conditioning, sometimes called instrumental learning, was first extensively studied by Edward L.
Thorndike (1874–1949), who observed the behavior of cats trying to escape from home-made puzzle boxes.[8] When first
constrained in the boxes, the cats took a long time to escape. With experience, ineffective responses occurred less
frequently and successful responses occurred more frequently, enabling the cats to escape in less time over successive
trials. In his law of effect, Thorndike theorized that behaviors followed by satisfying consequences tend to be repeated,
while those that produce unpleasant consequences are less likely to be repeated. In short, some
consequences strengthened behavior and some consequences weakened behavior. Thorndike produced the first
known learning curves through this procedure.
B. F. Skinner (1904–1990) formulated a more detailed analysis of operant conditioning based on reinforcement,
punishment, and extinction. Following the ideas of Ernst Mach, Skinner rejected Thorndike's mediating structures
required by "satisfaction" and constructed a new conceptualization of behavior without any such references. So,
while experimenting with some homemade feeding mechanisms, Skinner invented the operant conditioning
chamber, which allowed him to measure rate of response as a key dependent variable using a cumulative record of lever
presses or key pecks.[9]
Principles of operant conditioning:
1) Discrimination, generalization and the importance of context. Learning takes place in contexts, not in the free range
of any plausible situation. Most behavior is under stimulus control, which develops when a particular response occurs
only in the presence of an appropriate discriminative stimulus. Stimulus control, and its ability to foster stimulus
discrimination and stimulus generalization, is effective even if the stimulus has no meaning to the respondent.
2) Schedules of reinforcement: the pattern with which reinforcements appear is crucial. The special case of presenting
reinforcement after each response is called continuous reinforcement. a) Interval schedules are based on the time
intervals between reinforcements. Fixed interval schedule: reinforcers are presented at fixed time periods, provided
that the appropriate response is made. Variable interval schedule: a behavior is reinforced based on an average time
that has elapsed since the last reinforcement. b) Ratio schedules are based on the ratio of responses to reinforcements.
Fixed ratio schedule: reinforcement is delivered after a specific number of responses have been made. Variable ratio
schedule: the delivery of reinforcement is based on a particular average number of responses. (A minimal code sketch
of these schedules follows below.)
3) Extinction: operant behavior undergoes extinction when the reinforcements stop. Reinforcements occur only when
the proper response has been made, and not always even then, so behaviors do not weaken and extinguish all at once;
how quickly a behavior extinguishes depends partly on how often reinforcement was received.[10]
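As a concrete illustration of the four basic schedules, here is a minimal Python sketch. Class names and parameters are my own assumptions; laboratory implementations differ. Each object answers, for one response, whether reinforcement should be delivered:

```python
# Minimal reinforcement-schedule sketch (my own illustration, not lab code).
import random
import time

class FixedRatio:
    """FR-n: reinforce every n-th response."""
    def __init__(self, n):
        self.n, self.count = n, 0
    def respond(self):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True            # deliver reinforcement
        return False

class VariableRatio:
    """VR-n: reinforce after a random number of responses averaging n."""
    def __init__(self, n):
        self.n, self.count = n, 0
        self.goal = random.randint(1, 2 * n - 1)
    def respond(self):
        self.count += 1
        if self.count >= self.goal:
            self.count, self.goal = 0, random.randint(1, 2 * self.n - 1)
            return True
        return False

class FixedInterval:
    """FI-t: reinforce the first response once t seconds have elapsed."""
    def __init__(self, t):
        self.t, self.start = t, time.monotonic()
    def respond(self):
        if time.monotonic() - self.start >= self.t:
            self.start = time.monotonic()
            return True
        return False

class VariableInterval:
    """VI-t: like FI, but the required wait varies around an average of t."""
    def __init__(self, t):
        self.t, self.start = t, time.monotonic()
        self.wait = random.uniform(0, 2 * t)
    def respond(self):
        if time.monotonic() - self.start >= self.wait:
            self.start, self.wait = time.monotonic(), random.uniform(0, 2 * self.t)
            return True
        return False

# Continuous reinforcement is the degenerate case FixedRatio(1).
schedule = FixedRatio(5)
print([schedule.respond() for _ in range(10)])  # True on every 5th response
```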
Biological correlates of operant conditioning

The first scientific studies identifying neurons that responded in ways that suggested they encode for conditioned stimuli
came from work by Mahlon deLong[11][12] and by R.T. Richardson.[12] They showed that nucleus basalis neurons, which
release acetylcholine broadly throughout the cerebral cortex, are activated shortly after a conditioned stimulus, or after a
primary reward if no conditioned stimulus exists. These neurons are equally active for positive and negative reinforcers,
and have been demonstrated to cause plasticity in many cortical regions.[13] Evidence also exists that dopamine is activated
at similar times. There is considerable evidence that dopamine participates in both reinforcement and aversive learning.[14]
Dopamine pathways project much more densely onto frontal cortex regions. Cholinergic projections, in contrast, are
dense even in the posterior cortical regions like the primary visual cortex. A study of patients with Parkinson's disease, a
condition attributed to the insufficient action of dopamine, further illustrates the role of dopamine in positive
reinforcement.[15] It showed that while off their medication, patients learned more readily with aversive consequences than
with positive reinforcement. Patients who were on their medication showed the opposite to be the case, positive
reinforcement proving to be the more effective form of learning when the action of dopamine is high.
Factors that alter the effectiveness of consequences

When using consequences to modify a response, the effectiveness of a consequence can be increased or decreased by
various factors. These factors can apply to either reinforcing or punishing consequences.
1. Satiation/Deprivation: The effectiveness of a consequence will be reduced if the individual's "appetite" for that
source of stimulation has been satisfied. Inversely, the effectiveness of a consequence will increase as the
individual becomes deprived of that stimulus. If someone is not hungry, food will not be an effective reinforcer
for behavior. Satiation is generally only a potential problem with primary reinforcers, those that do not need to
be learned such as food and water.
2. Immediacy: After a response, how immediately a consequence is then felt determines the effectiveness of the
consequence. More immediate feedback will be more effective than less immediate feedback. If someone's
license plate is caught by a traffic camera for speeding and they receive a speeding ticket in the mail a week
later, this consequence will not be very effective against speeding. But if someone is speeding and is caught in
the act by an officer who pulls them over, then their speeding behavior is more likely to be affected.
3. Contingency: If a consequence does not contingently (reliably, or consistently) follow the target response, its
effectiveness upon the response is reduced. But if a consequence follows the response consistently after
successive instances, its ability to modify the response is increased. The schedule of reinforcement, when
consistent, leads to faster learning. When the schedule is variable the learning is slower. Extinction is more
difficult when learning occurs during intermittent reinforcement and more easily extinguished when learning
occurs during a highly consistent schedule.
4. Size: This is a "cost-benefit" determinant of whether a consequence will be effective. If the size, or amount, of
the consequence is large enough to be worth the effort, the consequence will be more effective upon the
behavior. An unusually large lottery jackpot, for example, might be enough to get someone to buy a one-dollar
lottery ticket (or even to buy multiple tickets). But if a lottery jackpot is small, the same person might not feel it
to be worth the effort of driving out and finding a place to buy a ticket. In this example, it is also useful to note
that "effort" is a punishing consequence. How these opposing expected consequences (reinforcing and
punishing) balance out will determine whether the behavior is performed or not.

Most of these factors exist for biological reasons. The biological purpose of the Principle of Satiation is to maintain the
organism's homeostasis. When an organism has been deprived of sugar, for example, the effectiveness of the taste of sugar
as a reinforcer is high. However, as the organism reaches or exceeds its optimum blood-sugar levels, the taste of sugar
becomes less effective, perhaps even aversive.
The Principles of Immediacy and Contingency exist for neurochemical reasons. When an organism experiences a
reinforcing stimulus, dopamine pathways in the brain are activated. This network of pathways "releases a short pulse of
dopamine onto many dendrites, thus broadcasting a rather global reinforcement signal to postsynaptic neurons."[16] This
results in the plasticity of these synapses allowing recently activated synapses to increase their sensitivity to efferent
signals, hence increasing the probability of occurrence for the recent responses preceding the reinforcement. These
responses are, statistically, the most likely to have been the behavior responsible for successfully achieving reinforcement.
But when the application of reinforcement is either less immediate or less contingent (less consistent), the ability of
dopamine to act upon the appropriate synapses is reduced.
Operant variability

Operant variability is what allows a response to adapt to new situations. Operant behavior is distinguished from reflexes in
that its response topography (the form of the response) is subject to slight variations from one performance to another.
These slight variations can include small differences in the specific motions involved, differences in the amount of force
applied, and small changes in the timing of the response. If a subject's history of reinforcement is consistent, such
variations will remain stable because the same successful variations are more likely to be reinforced than less successful
variations. However, behavioral variability can also be altered when subjected to certain controlling variables.[17]
Avoidance learning

Avoidance learning belongs to negative reinforcement schedules. The subject learns that a certain response will result in
the termination or prevention of an aversive stimulus. There are two kinds of commonly used experimental settings:
discriminated and free-operant avoidance learning.
Discriminated avoidance learning

In discriminated avoidance learning, a novel stimulus such as a light or a tone is followed by an aversive stimulus such as
a shock (CS-US, similar to classical conditioning). During the first trials (called escape-trials) the animal usually
experiences both the CS (Conditioned Stimulus) and the US (Unconditioned Stimulus), showing the operant response to
terminate the aversive US. During later trials, the animal will learn to perform the response during the presentation
of the CS, thus preventing the aversive US from occurring. Such trials are called "avoidance trials."
Free-operant avoidance learning

In this experimental setting, no discrete stimulus is used to signal the occurrence of the aversive stimulus. Rather, the
aversive stimuli (usually shocks) are presented without explicit warning stimuli. There are two crucial time intervals
determining the rate of avoidance learning. The first one is called the S-S interval (shock-shock interval): the
amount of time that passes between successive presentations of the shock (unless the operant response is performed). The
other one is called the R-S interval (response-shock interval), which specifies the length of the time interval following an
operant response during which no shocks will be delivered. Note that each time the organism performs the operant
response, the R-S interval without shocks begins anew.
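The interplay of the two intervals is easy to see in a discrete-time sketch. The toy Python model below is my own illustration, with one-second time steps and arbitrary interval values; it is not a standard experimental protocol:

```python
# Free-operant (Sidman) avoidance sketch: shocks recur every `ss` seconds
# unless a response starts a shock-free `rs`-second period; each response
# resets that period. Parameter values are arbitrary illustrations.
def simulate(response_times, ss=5, rs=20, duration=60):
    """response_times: times (whole seconds) at which the operant response
    occurs. Returns the times at which shocks are delivered."""
    shocks, next_shock = [], ss
    for t in range(duration):
        if t in response_times:
            next_shock = t + rs        # R-S interval: shock-free period resets
        if t >= next_shock:
            shocks.append(t)
            next_shock = t + ss        # S-S interval between successive shocks
    return shocks

print(simulate(set()))        # no responding: shocks at 5, 10, 15, ...
print(simulate({3, 21, 40}))  # timely responding avoids every shock -> []
```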
Two-process theory of avoidance

This theory was originally proposed in order to explain discriminated avoidance learning, in which an organism learns to
avoid an aversive stimulus by escaping from a signal for that stimulus. The theory assumes that two processes take place:
a) Classical conditioning of fear.
During the first trials of the training, the organism experiences the pairing of a CS with an aversive US. The
theory assumes that during these trials an association develops between the CS and the US through classical
conditioning and, because of the aversive nature of the US, the CS comes to elicit a conditioned emotional
reaction (CER) "fear."
b) Reinforcement of the operant response by fear-reduction.
As a result of the first process, the CS now signals fear; this unpleasant emotional reaction serves to motivate
operant responses, and those responses that terminate the CS are reinforced by fear termination. Although, after
this training, the organism no longer experiences the aversive US, the term "avoidance" may be something of a
misnomer, because the theory does not say that the organism "avoids" the US in the sense of anticipating it, but
rather that the organism escapes an aversive internal state that is caused by the CS.
Verbal Behavior

Main article: Verbal Behavior (book)


In 1957, Skinner published Verbal Behavior, a theoretical extension of the work he had pioneered since 1938.
This work extended the theory of operant conditioning to human behavior previously assigned to the areas of
language, linguistics and other areas. Verbal Behavior is the logical extension of Skinner's ideas, in which he
introduced new functional relationship categories such as intraverbals, Autoclitics, mands, tacts and the
controlling relationship of the audience. All of these relationships were based on operant conditioning and relied
on no new mechanisms despite the introduction of new functional categories.
Four term contingency

Applied behavior analysis, which is the name of the discipline directly descended from Skinner's work, holds
that behavior is explained in four terms: a conditioned stimulus (SC), a discriminative stimulus (Sd), a response
(R), and a reinforcing stimulus (Srein or Sr for reinforcers, sometimes Save for aversive stimuli).[18]
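For illustration only, the four terms map naturally onto a small record type. The field names and the example values below are my own informal glosses of the article's symbols, not an established API or a standard ABA data format:

```python
# Hypothetical record type for logging one four-term contingency.
from dataclasses import dataclass

@dataclass
class Contingency:
    conditioned_stimulus: str      # SC: conditioned/contextual stimulus
    discriminative_stimulus: str   # Sd: signals that responding can pay off
    response: str                  # R: the operant behavior
    consequence: str               # Srein (or Save): the consequence delivered

lever_training = Contingency("experimental chamber (context)", "light on",
                             "lever press", "food pellet")
print(lever_training)
```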
Operant hoarding

Operant hoarding refers to the choice made by a rat, on a compound schedule called a multiple
schedule, that maximizes its rate of reinforcement in an operant conditioning context. More specifically, rats
were shown to have allowed food pellets to accumulate in a food tray by continuing to press a lever on
a continuous reinforcement schedule instead of retrieving those pellets. Retrieval of the pellets always instituted
a one-minute period of extinction during which no additional food pellets were available but those that had been
accumulated earlier could be consumed. This finding appears to contradict the usual finding that rats behave
impulsively in situations in which there is a choice between a smaller food object right away and a larger food
object after some delay. See schedules of reinforcement.[19]
Questions about the law of effect

A number of observations seem to show that operant behavior can be established without reinforcement in the
sense defined above. Most cited is the phenomenon of autoshaping (sometimes called "sign tracking"), in which
a stimulus is repeatedly followed by reinforcement, and in consequence the animal begins to respond to the
stimulus. For example, a response key is lighted and then food is presented. When this is repeated a few times, a
pigeon subject begins to peck the key even though food comes whether the bird pecks or not. Similarly, rats
begin to handle small objects, such as a lever, when food is presented nearby.[20][21] Strikingly, pigeons and rats
persist in this behavior even when pecking the key or pressing the lever leads to less food (omission training).[22][23]
These observations and others appear to contradict the law of effect, and they have prompted some researchers
to propose new conceptualizations of operant reinforcement (e.g.,[24][25][26]). A more general view is that
autoshaping is an instance of classical conditioning; the autoshaping procedure has, in fact, become one of the
most common ways to measure classical conditioning. In this view, many behaviors can be influenced by both
classical contingencies (stimulus-reinforcement) and operant contingencies (response-reinforcement), and the
experimenter's task is to work out how these interact.[27]

B. F. Skinner (1904–1990)


Operant Conditioning
Biography
Burrhus Frederic Skinner was born March 20, 1904, in Susquehanna, Pennsylvania. He received his BA in English from Hamilton College in upstate New York. After some
traveling, he decided to go back to school, and earned his master's in psychology in 1930 and his doctorate in 1931, both from Harvard University, where he stayed to do
research until 1936.
In 1936 he moved to Minneapolis to teach at the University of Minnesota. There he met and soon married Yvonne Blue. In 1945, another move took him to the psychology
department at Indiana University, where he became department chair. In 1948, he was invited back to Harvard, where he remained for the rest of his life. He was a very active
man, doing research and guiding hundreds of doctoral candidates as well as writing many books.
On August 18, 1990, B. F. Skinner died of leukemia, after becoming perhaps the most celebrated psychologist since Sigmund Freud.
Skinner accepted the model of classical conditioning as originated by Pavlov and elaborated on by Watson and Guthrie, but he thought this type of conditioning explained only a
small portion of human and animal behavior. He thought that the majority of responses by humans do not result from obvious stimuli. The notion of reinforcement had been
introduced by Thorndike, and Skinner developed this idea much further.
Skinner's Theory: Operant Conditioning
B. F. Skinner's system is based on operant conditioning. The organism, while going about its everyday activities, is in the process of operating on the environment. In the
course of its activities, the organism encounters a special kind of stimulus, called a reinforcing stimulus, or simply a reinforcer. This special stimulus has the effect of increasing
the behavior occurring just before the reinforcer. This is operant conditioning: the behavior is followed by a consequence, and the nature of the consequence modifies the
organism's tendency to repeat the behavior in the future. A behavior followed by a reinforcing stimulus results in an increased probability of that behavior occurring in the
future.
Skinner's observations can be divided into independent variables, which can be manipulated by the experimenter, and dependent variables, which cannot be manipulated by the
experimenter and are thought to be affected by the independent variables.
Independent variables:
Type of reinforcement
Schedule of reinforcement
Dependent variables (measures of learning):
Acquisition rate- how rapidly an animal can be trained to a new operant behavior as a function of reinforcement. Skinner typically deprived his lab animals of food for 24 or
more hours before beginning a schedule of reinforcement. This tended to increase acquisition rate.
Rate of response- this is a measure of learning that is very sensitive to different schedules of reinforcement. In most cases, animals were given intermittent schedules of
reinforcement, so they were called upon to produce the desired response at other times as well. Rate of response is a measure of correct responses throughout a testing schedule,
including the times when reinforcement is not provided after a correct response. It appears as if test animals build expectations when they are given rewards at predictable times
(Animals which are fed at the same time each day become active as that time approaches, and a dog whose master comes home at the same time each day becomes more
attentive around that time of day.) Also, Skinner found that when fixed interval reinforcement was used, the desired behavior would decrease or disappear just after a
reinforcement, but when it was almost time for the next reinforcement, the animal would resume the desired responses.
Extinction rate- The rate at which an operant response disappears following the withdrawal of reinforcement. Skinner found that continuous reinforcement schedules
produced a faster rate of learning in the early stages of a training program, and also a more rapid extinction rate once the reinforcement was discontinued. A behavior no longer
followed by the reinforcing stimulus results in a decreased probability of that behavior occurring in the future.
Types of reinforcement:
1 Primary reinforcement- instinctive behaviors lead to satisfaction of basic survival needs such as food, water, sex, and shelter. No learning takes place because the behaviors
emerge spontaneously.
2 Secondary reinforcement - the reinforcer is not reinforcing by itself, but becomes reinforcing when paired with a primary reinforcer, such as pairing a sound or a light with
food.
3 Generalized reinforcement - stimuli become reinforcing through repeated pairing with primary or secondary reinforcers. Many are culturally reinforced. For example, in
human behavior, wealth, power, fame, strength, and intelligence are valued in many cultures. The external symbols of these attributes are generalized reinforcers. Money, rank,
recognition, degrees and certificates, etc., are strongly reinforcing to many individuals in the cultures that value the attributes they symbolize.
A consequence always follows a behavior; it may be pleasant or unpleasant (noxious), and it may be added to or removed from a situation. The following table summarizes the
various combinations:
Add to a situation after a response:
Pleasant = Positive reinforcement (reward). Increases the probability of the same response occurring again. (Examples: praise, monetary reward, food)
Noxious = Punishment. Administering a painful or unpleasant stimulus after an unwanted response. Decreases the probability of the same response occurring again.
(Examples: corporal punishment, electric shocks, yelling)
Remove from a situation after a response:
Pleasant = Punishment. Decreases the probability of the same response occurring again. (Example: punishing a teenager by taking away his cell phone or car keys.)
Noxious = Negative reinforcement. Removing or decreasing an unpleasant or painful situation after a desirable response is produced. Increases the probability of the same
response occurring again. (Example: time off for good behavior)
Schedules of Reinforcement:
Continuous reinforcement - reinforcement is given every time the animal gives the desired response.
Intermittent reinforcement - reinforcement is given only part of the times the animal gives the desired response.
Ratio reinforcement - a pre-determined proportion of responses will be reinforced.
Fixed ratio reinforcement - reinforcement is given on a regular ratio, such as every fifth time the desired behavior is produced.
Variable (random) ratio reinforcement- reinforcement is given for a predetermined proportion of responses, but randomly instead of on a fixed schedule.
Interval reinforcement- reinforcement is given after a predetermined period of time.
Fixed interval reinforcement - reinforcement is given on a regular schedule, such as every five minutes.
Variable interval reinforcement - reinforcement is given after random amounts of time have passed.
In animal studies, Skinner found that continuous reinforcement in the early stages of training seems to increase the rate of learning. Later, intermittent reinforcement keeps the
response going longer and slows extinction.


Skinner specifically addressed the applications of behaviorism and operant conditioning to educational practice. He believed that the goal of education was to train learners in
survival skills for self and society. The role of the teacher was to reinforce behaviors that contributed to survival skills and to extinguish behaviors that did not. Behaviorist views
have shaped much of contemporary education in both child and adult learning.
Learning Theory Bibliography
Boeree, C. G. (1998). B. F. Skinner. Retrieved September 19, 2003 from http://www.ship.edu/%7Ecgboeree/skinner.html
Lefrancois, 1972
Santrock, 1988
Merriam & Caffarella, 1991
