Conducted at the
Inaugural dissertation
for the attainment of the doctoral degree
of the Faculty of Medicine
of the
Friedrich-Alexander-Universität
Erlangen-Nürnberg
submitted by
Constanze Elisabeth Anna Mikyska
from
Munich
Printed with the permission of the
Faculty of Medicine of the
Friedrich-Alexander-Universität
Erlangen-Nürnberg
To my family
Table of contents
1 Summary
1.1 Summary
1.2 Zusammenfassung
2 Introduction
2.1 Auditory system: anatomy and function
2.1.1 Ear
2.1.2 Sub-cortical auditory relays
2.1.3 Auditory cortex
2.2 Investigation of auditory perception and processing
2.2.1 Psychophysics – psychoacoustics
2.2.2 Brain activity – electroencephalography (EEG)
2.2.2.1 Introduction and history
2.2.2.2 Physiological fundamentals
2.2.2.3 Recording
2.2.2.4 Classification of frequency
2.2.2.5 Artifacts
2.2.2.6 Data analysis: preprocessing and event-related potentials (ERP)
2.2.2.7 Main auditory electrophysiological components
2.3 Auditory attention
2.3.1 Psychological theories
2.3.1.1 Introduction to selective attention
2.3.1.2 Bottleneck theories: early- versus late-selection
2.3.1.3 Other capacity-limitation theories
2.3.2 Electrophysiological findings and theories
2.4 Aims of this dissertation
3.3 Procedure
3.4 EEG recording
3.5 EEG data analysis
3.6 Statistical analysis
3.6.1 Selection of applied methods
3.6.1.1 Analysis of variance (ANOVA)
3.6.1.2 Statistic permutation test
3.6.2 Behavioral data
3.6.3 ERP standards
3.6.4 ERP deviants
4 Results
4.1 Behavioral data
4.2 ERP results of standards
4.2.1 Main attention effect (attended versus ignored)
4.2.2 Influence of the memory task difficulty on attention effects
4.2.3 Timing of attention facilitation and inhibition
4.2.4 Topographies of attention facilitation and inhibition
4.3 ERP results of deviants
4.3.1 Attention enhancement of deviant processing
4.3.2 Memory effect on the P3-Component
5 Discussion
6 References
7 List of abbreviations
8 Publication
9 Acknowledgements
10 Curriculum vitae
1 Summary
1.1 Summary
Objective
Auditory selective attention is a complex brain function that is still not completely
understood. The classic example is the so-called “cocktail party effect” (Cherry,
1953), which describes the impressive ability to focus one’s attention on a single
voice among a multitude of voices. This means that particular stimuli in the
environment are enhanced relative to others of lower priority, which are ignored.
To understand how attention can influence the perception and processing
of sound, background knowledge is essential.
One aim of this dissertation is to provide an overview of the existing literature.
To this end, the auditory system and different methods to measure and evaluate
auditory processes are introduced first, followed by a review of competing
theories that try to explain how auditory attention operates.
The second aim of the dissertation is to specify the mechanisms of auditory attention
and to elucidate how they operate. It is generally accepted that distinct signals
(top-down signals) are important for cognitive control, enabling selective attention
and leading to enhanced processing of task-relevant information. However, it is
unknown whether facilitation and inhibition of stimulus processing are based upon one
mechanism (a unitary gain control mechanism of facilitation) or two mechanisms (net
activity of distinct top-down facilitation and inhibition mechanisms). Results from a
visual fMRI study (de Fockert, 2001) suggest that facilitation and inhibition rely on
distinct mechanisms that would be differentially affected by the availability of
cognitive resources (i.e. for performing a task).
To show that facilitation and inhibition represent distinct mechanisms in auditory
selective attention, we conducted a study in which subjects performed an auditory
attention task while the amount of available cognitive resources was modulated (by
varying the difficulty of a memory task).
Methods
Electrophysiological experiments were conducted in young healthy adults. Sixteen
subjects performed an attention task and a memory task of varying difficulty (no,
easy and difficult memory) at the same time (dual task protocol) while EEG was
recorded. Facilitation and inhibition were measured by comparing
1.2 Zusammenfassung (German summary)
Background and aims
Selective auditory attention is a complex mechanism that is still not fully
understood. The classic example is the so-called “cocktail party effect” (Cherry,
1953). It describes the impressive ability to concentrate attention on a single
speaker and to tune out other conversations. This means that certain stimuli in our
environment are perceived more strongly, whereas stimuli of lower priority are
ignored. To understand how attention influences the perception and processing of
stimuli, the first part of this dissertation gives an overview of the basic
literature. First, the auditory system is presented and different methods for
measuring and evaluating auditory processing are introduced. This is followed by a
brief overview of competing theories that try to explain how selective auditory
attention works.
The second part of this work deals in more detail with the question of the
mechanisms and how they operate. It is generally accepted that certain signals
(top-down signals) are important for cognitive control. They activate selective
auditory attention and thus lead to enhanced processing of a relevant stimulus.
However, it is still unclear whether facilitation and inhibition of stimulus
processing are governed by one mechanism (a unitary, linear gain mechanism of
facilitation) or by two mechanisms (network activity of independent facilitation and
inhibition). Results of a visual fMRI study show that the extent to which distracting
stimuli are inhibited depends on the availability of cognitive resources (e.g. for
solving problems) (de Fockert, 2001). The results indicate that facilitation and
inhibition in the visual system are based on different mechanisms that are
differentially affected by the availability of cognitive resources.
To show that facilitation and inhibition act independently of each other, we
conducted a study in which subjects performed an auditory attention test while the
availability of cognitive resources was varied (different difficulty levels in a
memory test).
Methods
Electrophysiological experiments were conducted with 16 young, healthy adults. The
subjects simultaneously performed (dual task protocol) an attention test and a memory
test of varying difficulty (no, easy and difficult memory) while electrophysiological
signals (EEG) were recorded. Facilitation and inhibition were measured by comparing
the responses to the attended and the ignored stimuli, respectively, with the
responses to the same stimuli in a control condition. In this control condition,
attention was assumed to be directed evenly to all stimuli.
Results and observations
Two ERP components were observed: a negative one in response to the attended stimuli
and a positive one following the ignored stimuli.
The two components showed different frontal scalp topographies and also differed in
the time domain. In addition, they were affected differently by the difficulty of
the memory test.
Practical conclusions
This dissertation offers an insight into the literature on selective auditory
attention and enriches existing knowledge with the results of a new study on the
underlying mechanisms. The study provides evidence that top-down control reflects
the activity of mutually independent facilitation and inhibition mechanisms.
2 Introduction
The auditory system processes acoustic waves, leading to auditory percepts. An
important issue is to understand how attention can influence the perception of sound,
i.e. the processing of sounds; in other words, by which mechanisms and at which
step of sound processing auditory attention operates. To address this question,
several basic principles are introduced first: (1) the anatomy of the auditory
system and the sequence of sound processing from the outer ear to the auditory
cortices, (2) different methods to measure and evaluate auditory processes
(especially electroencephalography), and (3) auditory attention and the attempts of
psychological and physiological theories to elucidate its influence on sound
processing. Finally, (4) the aims of the present study are introduced.
2.1.1 Ear
Outer ear
Sound waves first reach the outer ear, which is composed of the pinna (auricle), the
ear canal (external acoustic meatus) and the eardrum (tympanic membrane). The
pinna, the visible part of the outer ear, collects and focuses sound waves, and directs
them through the ear canal (approximately 30 to 35 mm long and 7 mm in diameter)
to the eardrum, which transmits sound vibrations to the middle ear (see Figure 2.1).
Figure 2.1 – The anatomy of the ear (adapted from Netter, 2006).
Middle ear
The middle ear is an air-filled cavity containing muscles and the three
ossicles (malleus, incus and stapes). Two membrane-covered openings link the middle
ear to the inner ear: the oval (vestibular) window adjoining the
perilymph in the scala vestibuli and the round (cochlear) window adjoining the
perilymph in the scala tympani (see Figure 2.1).
The malleus is attached to the inner surface of the tympanic membrane and transmits
the arriving vibration to the incus and the stapes, which is attached to the membrane
of the oval window. This small bone is stabilized by the stapedius muscle, which
controls the amplitude of sound waves by pulling the stapes away from the oval
window and therefore protects the inner ear from high noise levels (Trepel, 2008).
The tensor tympani muscle functions in a similar manner by pulling the malleus, thus
tensing the tympanic membrane.
From a physical point of view, two mechanisms increase the
efficiency of sound transmission (Schmidt, 1993): (1) the smaller surface of the
membrane of the oval window compared to the surface of the tympanic membrane
causes an enhancement of pressure and (2) the lever system of the ossicles matches
the low impedance of the air in the middle ear to the high
impedance of the fluid in the inner ear.
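As a rough numerical illustration of these two mechanisms, the following sketch multiplies an area ratio by a lever ratio to estimate the pressure gain of the middle ear. The function name and the numbers (about 55 mm² for the tympanic membrane, about 3.2 mm² for the oval window and a lever ratio of about 1.3) are assumptions based on commonly quoted textbook values; they are not taken from this dissertation.

```python
import math


def middle_ear_pressure_gain(area_eardrum_mm2, area_oval_window_mm2, lever_ratio):
    """Pressure gain = (area ratio) * (lever ratio); all inputs are assumed values."""
    return (area_eardrum_mm2 / area_oval_window_mm2) * lever_ratio


# Assumed textbook-style values, for illustration only.
gain = middle_ear_pressure_gain(55.0, 3.2, 1.3)
gain_db = 20 * math.log10(gain)  # pressure ratio expressed in decibels

print(f"pressure gain: {gain:.1f}x ({gain_db:.0f} dB)")
```

With these assumed values the pressure at the oval window is roughly 22 times the pressure at the eardrum, i.e. about 27 dB.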
The middle ear functions only as long as the tympanic cavity is
ventilated and its pressure is matched to that of the atmosphere. This is ensured by
the Eustachian tube, which links the middle ear to the nasopharynx. An upper airway
infection can cause swelling and occlusion of the tube, which can result in an ear
infection as well as in a rupture of the tympanic membrane caused by a pathological
pressure difference (Schmidt, 1993).
Inner ear
The inner ear contains the vestibular system, dedicated to balance and spatial
orientation, and the cochlea, which is essential for hearing. The cochlea is part of the
osseous labyrinth and turns like a snail two and a half times around a core of bone
(modiolus), in which the cochlear nerve runs. This labyrinth is filled with perilymph,
a derivative of the cerebrospinal fluid that is similar to extracellular fluid, and
also contains a membranous labyrinth: the cochlear duct (scala media), filled with
endolymph (a fluid with a high content of potassium, similar to intracellular fluid).
The cochlear duct is bounded by Reissner's membrane above and the basilar membrane
below and also holds the organ of Corti (organum spirale). This is the sensory organ of
hearing and comprises receptor cells (hair cells), different types of supporting
cells (cells of Deiters, Hensen, Claudius and Boettcher) and the basilar membrane
(see Figure 2.2 A and B). The hair cells are arranged in one row of inner and three
rows of outer hair cells and make contact with neurons at their base (see Figure 2.2
B). Additionally, they carry stereocilia (hair bundles) on their free surface, which
are attached to each other by filamentous structures called tip links (Roberts, 1988).
The stereocilia of the outer hair cells are attached to the tectorial membrane, a
colloidal membrane that covers the organ of Corti. Furthermore, the cochlear duct
separates two structures, the scala vestibuli (above) and the scala tympani (below),
which merge at the apex of the cochlea (helicotrema), so that the perilymph can flow
from one scala to the other. The organ of Corti sits on top of the basilar membrane
along the entire length of the scala media.
enter the cell and trigger the release of neurotransmitter (glutamate) at the basal end
of the inner hair cell. Further depolarization is inhibited if the hair bundles are
deflected in the other direction, relaxing the tip links (Schmidt, 2005). The released
glutamate diffuses across the synaptic cleft and binds to the postsynaptic receptor
(AMPA receptor), which triggers a postsynaptic potential that causes action
potentials in the afferent neuron. This process is called transformation. The number
of axons firing and the frequency of the action potentials encode the volume of a
sound (amplitude), i.e. a high volume results in a higher frequency of action
potentials.
contralateral side, without forming any synapse with other neurons. Additionally,
the projections starting from the nucleus cochlearis posterior are partly
excitatory and partly inhibitory, so they can inhibit processing at
subsequent levels of the pathway (Schmidt, 2005).
On the contralateral side all auditory nerve fibers form the lemniscus lateralis,
where the neurons form synapses (nuclei lemnisci lateralis; fourth order
neurons) and either decussate back to the originally ipsilateral side or run to the
colliculus inferior (fifth order neuron), part of the corpora quadrigemina, located in
the mesencephalon (Trepel, 2008). At this location and also in the superior olivary
complex, the direction of the sound is analyzed by specialized neurons that compare
the timing of action potentials coming from both cochleae.
From here, some nerve fibers decussate again to the contralateral side through
the brachium colliculi inferioris, but most of them continue to the corpus
geniculatum mediale of the thalamus, located in the diencephalon. The geniculate
neurons (sixth order neurons) project their axons through the capsula interna to the
primary auditory cortex (radiatio acustica).
This complex and intensely interconnected pathway is crucial to connect both
cochleae to the left and right auditory cortices, which is important for bilateral
processing and comparing sounds from the right and left side (Schmidt, 2005).
the inability to understand speech. The patient is still able to speak fluently, but
the speech makes little or no sense, because words are not linked to their proper
meaning.
The secondary auditory areas also receive afferent input from the angular gyrus, which
gets its information from the secondary visual cortex. This circuit is important for
linking visual and auditory input to its meaning, which is crucial for reading and
writing.
Furthermore, there may be two major streams of information processing
comparable to the ‘what’ and ‘where’ streams in the visual system (Kaas, 1999). To
simplify: the information about spatial location (‘where’) would run from A1 to
posterior higher order areas and continue to the parietal lobe and the posterior parts
of the dorsolateral prefrontal cortex. Object-related properties would be processed
within a ‘what’ pathway composed of the primary auditory cortex, anterior higher
order areas and ventral and medial prefrontal areas.
Given the knowledge about the complex auditory system, the obvious
questions follow: how does the brain interpret acoustic waves to produce a percept,
and with what kind of methods and procedures is it possible to measure, evaluate and
interpret auditory perception and processing?
Magnetic fields produced by electrical currents can also be measured, using
magnetoencephalography (MEG).
Other techniques using imaging technology provide a different view of brain
activity. An important one is magnetic resonance imaging (MRI). It uses
powerful magnets to excite hydrogen nuclei. These atomic nuclei emit a signal while
returning to their initial state (relaxation). The signal can be measured
and computed into structural images of the brain. It is also possible to visualize
brain function with functional MRI (fMRI). Neuronal activity enhances
metabolic processes, resulting in changes of blood flow. Hemoglobin at different
oxygenation levels produces different MRI signals, which reveal the activated
structures in the brain. This is called the blood oxygen level dependent effect
(BOLD effect).
Another imaging technique, positron emission tomography (PET), visualizes
metabolic processes by showing the distribution of a radioactive tracer in the brain.
The tracer is attached to a biologically active molecule and injected into the blood
circulation; the most commonly used is fluorodeoxyglucose (FDG).
Some of these methods are compatible and can be used in a complementary way.
The term level means that the measured sound pressure (Px) is expressed as a
logarithmic ratio to another sound pressure (P0), which is the absolute threshold of
hearing (2 × 10^-5 Pa). This implies that a difference of a few decibels corresponds
to a multiplication of the sound pressure.
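The logarithmic relation can be made concrete with the standard definition of the sound pressure level, L = 20 · log10(Px / P0) in decibels. The following is a minimal sketch under that definition; the function names are chosen here for illustration only.

```python
import math

P0 = 2e-5  # reference sound pressure in Pa (absolute threshold of hearing)


def sound_pressure_level_db(p_pa, p_ref=P0):
    """Sound pressure level in dB for a sound pressure given in Pa."""
    if p_pa <= 0:
        raise ValueError("sound pressure must be positive")
    return 20.0 * math.log10(p_pa / p_ref)


def pressure_from_level(level_db, p_ref=P0):
    """Inverse relation: sound pressure in Pa for a level given in dB."""
    return p_ref * 10 ** (level_db / 20.0)


# A tenfold increase in sound pressure adds 20 dB.
print(sound_pressure_level_db(10 * P0))
# An increase of 6 dB roughly doubles the sound pressure.
print(pressure_from_level(6.0) / P0)
```

Under this definition the threshold pressure P0 itself corresponds to 0 dB, and a level of 130 dB corresponds to a sound pressure of about 63 Pa.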
The most important way to examine a person's hearing ability is an
audiometry test. Different tones are presented through headphones at different levels
and the tested person has to press a button as soon as the tone is heard, to determine
the individual threshold of audibility. Loudness describes how loud
a person perceives a sound to be and therefore cannot be measured objectively. It is
nevertheless related to the sound pressure level and the duration of a sound. If the
sound pressure increases, a sound is perceived as louder, and high frequencies are
heard as a high tone (and vice versa). Furthermore, at a constant sound pressure,
tones are perceived as louder at frequencies between 2000 and 5000 Hz (Schmidt,
1993). Therefore the sound pressure must be adjusted to the frequency in order for
all tones to be perceived at the same loudness (isophone). In this way a chart is
created (see Figure 2.4) that shows equal loudness curves, which are also called
Fletcher-Munson curves (Fletcher, 1933). The values at 1000 Hz can also be expressed
in phon: by definition, one phon equals one decibel at 1000 Hz. Human hearing is
limited to frequencies between 20 Hz and 16,000 Hz and to loudness levels between 4
and 130 phon (Schmidt, 2005). Normal speech lies at around 50 to 70 decibels
and a painful tone at around 130 decibels (Schmidt, 2005).
Electrical changes from one single neuron cannot be recorded by an EEG
electrode, because the amplitude is too small and there is a considerable distance
between neurons and electrodes. The recorded electrical activity is rather a
summation of voltage fluctuations caused by EPSPs and/or IPSPs of many neurons
within a population.
Given that a neuron is part of a population, extracellular potentials
behave according to the orientation and the polarity of the neurons. A summation
of the field potentials happens only if the neurons are organized in parallel or serial
networks and if they have similar morphological polarization. This situation is called
an open field (see Figure 2.6 A). Special neurons in the human cortex (pyramidal
cells), which are mainly organized vertically, primarily generate the electrical
potentials one can see in the EEG.
In a closed field (see Figure 2.6 B), on the other hand, the current flow is canceled
out within the population. This happens if the neurons have a stellate
morphology with dendrites extending radially outward, or if the neurons are
randomly oriented. Interneurons, for example, show closed field potentials (Ebner,
2006).
The EEG signal recorded from the scalp is composed of frequencies between 0.5 and
80 Hz and amplitudes in a range from 1 to 100 µV (Schmidt, 2005). The spontaneous
EEG signal mostly displays noise, but nevertheless the state of arousal and the areas
of higher activity can be displayed and mapped on a scalp model. For this, it is
important to consider the orientation of the generator population. It can be oriented
perpendicular to the surface of the cortex (see Figure 2.7 B). In this case, the
dipole appears as a radial dipole and the scalp topography indicates the
location of the source (Ebner, 2006) (see Figure 2.7 D). A tangential dipole
results if the neurons are arranged tangentially to the cortical surface (see
Figure 2.7 A); in this case, the maximal negativity or positivity of the scalp
topography does not show the actual source (see Figure 2.7 C). Rather, the source is
located in between the two maxima of opposite sign on the scalp. Therefore, one
must not conclude that the largest signal in the EEG marks the location of the largest
activity in the brain. This is called the inverse problem. Consequently, interpreting
EEG scalp results requires care, especially with respect to locating the
generator.
Moreover, the further away a dipole is from the scalp, the broader the distribution
and the smaller the amplitude of the signal.
2.2.2.3 Recording
The EEG corresponds to the potential difference between two electrodes: one electrode
of interest positioned on the scalp and one reference electrode (see Figure 2.8).
Figure 2.8 – A subject set up with electrodes, ready to start the experiment
(picture taken in the testing booth of the Helen Wills Neuroscience Institute at the
University of California, Berkeley, USA).
Electrodes
Electrodes are small metal discs, mainly made of silver, but also of platinum,
gold or tin. Silver / silver chloride (Ag / AgCl) electrodes are used most often,
because this compound reduces the polarization effect, a counter voltage that arises
while the voltage on the scalp is constant or slowly changing (Ebner, 2006). These
kinds of electrodes record not only brain activity, but also interfering activity, for
instance from alternating current (AC) (see 2.2.2.5).
The electrodes are placed on the head (in a cap or glued) and the application of a
conductive paste, rich in electrolytes, lowers the impedance between electrode and
skin, preferably to below 5 kOhm. To ensure standardized recording, the positions on
the scalp are identified using the International 10 / 20 system (see Figure 2.9). The
number of recording electrodes can go up to 256. Each electrode is labeled with a
letter and a number: the letter refers to a brain area (‘F’ = frontal lobe, ‘T’ = temporal
lobe, ‘P’ = parietal lobe, ‘O’ = occipital lobe), a ‘z’ in place of the number refers
to the central line, even numbers refer to the right side of the head and odd numbers
to the left side.
Figure 2.9 – The layout of the International 10 / 20 system with 64 recording
electrodes (adapted from BioSemi, the Netherlands).
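The labeling convention can be expressed as a small lookup. The sketch below is illustrative only: the function name is hypothetical, and it covers just the four region letters named in the text, whereas real electrode caps use further prefixes that are not handled here.

```python
# Region letters as listed in the text; other prefixes are deliberately not covered.
REGIONS = {
    "F": "frontal lobe",
    "T": "temporal lobe",
    "P": "parietal lobe",
    "O": "occipital lobe",
}


def describe_electrode(label):
    """Return (brain area, side) for a label such as 'F3', 'P4' or 'Oz'."""
    letter, suffix = label[0].upper(), label[1:]
    if letter not in REGIONS:
        raise ValueError(f"region letter not covered by this sketch: {label!r}")
    if suffix.lower() == "z":
        side = "midline"  # 'z' marks the central line
    elif suffix.isdigit():
        # Even numbers lie on the right side of the head, odd numbers on the left.
        side = "right" if int(suffix) % 2 == 0 else "left"
    else:
        raise ValueError(f"unrecognized position suffix: {label!r}")
    return REGIONS[letter], side


for name in ("F3", "P4", "Oz"):
    print(name, describe_electrode(name))
```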
EEG instruments
The small amplitude of the EEG signal (1-100 µV) requires amplification. For this
purpose a differential amplifier is used: the difference between two signals is
amplified by a constant factor (usually 10,000) (Ebner, 2006).
The amplified signal is digitized or sampled, i.e. the signal is converted into a
series of numeric values (analogue-to-digital conversion, ADC). The samples,
representing the actual value of the EEG amplitude, are measured at constant time
intervals. The sampling rate (expressed in Hz) refers to the number of samples per
second. For clinical applications a usual sampling rate is about 250 Hz, whereas in
research studies the signal can be sampled at much higher rates, for instance at over
1000 Hz.
Furthermore, the sampled signal is filtered to reduce superimposed signals or
to isolate EEG frequency bands of interest. The bandwidth of the EEG signal ranges
from under 1 Hz up to over 50 Hz, with varying relative amplitude. Different filters
can be used depending on the purpose of a study. A notch filter, or band stop filter,
can be used to exclude contaminating frequencies (50 Hz or 60 Hz) caused by electrical
power. A low pass filter attenuates signal components above a specified cutoff
frequency (e.g. 35 Hz), such as high frequency artifacts from muscular activity. In
contrast, a high pass filter passes high frequencies and reduces the amplitude of low
frequencies (e.g. below 1 Hz) to remove slow artifacts (Ebner, 2006). A band pass
filter leaves a specified range of frequencies unattenuated, whereas frequencies
outside that range are attenuated.
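The effect of sampling and low pass filtering can be illustrated with a deliberately simple filter. The sketch below uses a moving average, which is only a crude low pass filter and is not the filter used in this study; the signal amplitudes, the 2 Hz test signal and the 1000 Hz sampling rate are assumptions chosen for illustration. The window length is set to exactly one period of a 50 Hz oscillation, so that this particular interference averages to zero.

```python
import math


def moving_average(signal, window):
    """Causal moving average: each output is the mean of the last `window` samples."""
    out = []
    running = 0.0
    for i, x in enumerate(signal):
        running += x
        if i >= window:
            running -= signal[i - window]  # drop the sample that left the window
        if i >= window - 1:
            out.append(running / window)
    return out


fs = 1000  # assumed sampling rate in Hz (samples per second)
t = [i / fs for i in range(fs)]  # one second of sample times

slow = [10.0 * math.sin(2 * math.pi * 2 * ti) for ti in t]    # 2 Hz "signal"
mains = [5.0 * math.sin(2 * math.pi * 50 * ti) for ti in t]   # 50 Hz interference
recorded = [s + m for s, m in zip(slow, mains)]

window = fs // 50  # 20 samples = 20 ms = one full 50 Hz period

filtered_mains = moving_average(mains, window)
filtered = moving_average(recorded, window)

print(max(abs(v) for v in filtered_mains))  # 50 Hz part is cancelled almost exactly
print(max(abs(v) for v in filtered))        # 2 Hz part keeps nearly its full amplitude
```

In practice, purpose-built notch, low pass and high pass filters are used instead of a moving average, because they allow the cutoff frequencies to be specified explicitly.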
Montage
The way a pair of electrodes is connected to the differential amplifier is called a
montage. It is crucial for a study to choose the reference carefully, because data
alteration or loss due to subtraction can occur. There are different montages:
referential montage, bipolar montage and (common) average reference.
In the referential montage, one electrode is used as a reference.
Its signal is subtracted from the signal of all other electrodes. Therefore, the
reference electrode should not record brain activity or artifacts, because otherwise
subtraction could cause information loss or modification. For instance, electrodes
placed on both earlobes, the nose or the mastoids provide a reference with only minor
activity of its own.
In the bipolar montage, electrodes are linked in succession and
potential differences between two adjacent electrodes are measured. In general, both
montages are equally effective, but they are used for different purposes according to
the location and dimension of the potential field.
A special montage is the average reference (common average reference). The
signals from all electrodes are summed, averaged and subtracted from every
electrode. However, since the potentials are irregularly distributed, large
deflections in the EEG of one region due to physiological or pathological activity can
distort the EEG (Ebner, 2006). This montage is often used in ECoG recordings.
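The three montages differ only in what is subtracted from each channel. The following minimal sketch shows this on a toy recording; the function names, the channel selection and the values (two samples per channel, in µV) are made up for illustration.

```python
def referential(signals, reference):
    """Subtract the reference channel from every other channel."""
    ref = signals[reference]
    return {
        name: [v - r for v, r in zip(values, ref)]
        for name, values in signals.items()
        if name != reference
    }


def bipolar(signals, chain):
    """Differences between successive pairs of electrodes along a chain."""
    return {
        f"{a}-{b}": [x - y for x, y in zip(signals[a], signals[b])]
        for a, b in zip(chain, chain[1:])
    }


def common_average(signals):
    """Subtract the mean over all channels from every channel."""
    n_channels = len(signals)
    n_samples = len(next(iter(signals.values())))
    mean = [
        sum(values[i] for values in signals.values()) / n_channels
        for i in range(n_samples)
    ]
    return {
        name: [v - m for v, m in zip(values, mean)]
        for name, values in signals.items()
    }


# Hypothetical toy data: three midline channels, two samples each (µV).
raw = {"Fz": [4.0, 6.0], "Cz": [2.0, 2.0], "Pz": [0.0, 1.0]}

print(referential(raw, "Pz"))
print(bipolar(raw, ["Fz", "Cz", "Pz"]))
print(common_average(raw))
```

The sketch also makes the caveat about the average reference visible: a large deflection in a single channel shifts the mean and therefore changes every re-referenced channel.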
2.2.2.5 Artifacts
Artifacts are deflections in the EEG that do not represent activity from the brain. A
distinction is drawn between biological and non-biological (technical) artifacts
(Cacioppo, 2005). If the artifacts show a typical shape and localization, they are easy
to identify, but artifacts can often modify the EEG in a minor way that is difficult to
notice. Therefore, observation and video monitoring are indispensable and make it
easier to identify the artifacts.
Sources of biological artifacts are: eyes, heart, arteries (pulse), tongue, skin
(sweat) and muscle activity. Eye movements and blinks in particular are a problem in
experimental paradigms. The bulbus oculi (globe of the eye) forms an electrical
dipole that causes measurable potentials while the eyes move (see Figure 2.11).
To avoid blinks and saccades, the tested person is instructed to blink as little
as possible and, for instance, to fixate on a cross presented centrally on the
testing screen. Additionally, an electrooculogram (EOG) is recorded from electrodes
placed on both external canthi and below one eye. Thus, vertical and horizontal eye
movements are recorded (see Figure 2.13 A) and can be removed later in the
analysis.
Other artifacts, coming from muscle activity, e.g. chewing, frowning or tense
face muscles, can highly contaminate the EEG (see Figure 2.13 B). Complex biological
artifacts, e.g. prolonged movement of the subject, are especially difficult to deal
with. The best way to reduce these artifacts is to prevent them by carefully
instructing the person to stay as relaxed as possible during
testing.
A very common technical artifact is the contamination of the signal with AC,
coming from other devices near the tested person (for instance, a cell phone) (see
Figure 2.12). Most electrical power is generated at either 50 Hz (for example in
Germany) or 60 Hz (in the United States). If the noise source cannot be located,
special filters, e.g. a notch filter, can be used during data collection or offline to
remove the superimposed activity (Cacioppo, 2005).
Figure 2.13 – Biological artifacts. (A) Blink artifacts propagated over the frontal
scalp electrodes. The three last channels represent the EOG channels: rEOG =
electrode placed next to the lateral canthus of the right eye, lEOG = electrode placed
next to the lateral canthus of the left eye, vEOG = electrode under the left eye.
Together rEOG and lEOG record horizontal eye movements, whereas vEOG records
vertical eye movements. (B) Muscle artifact probably from chewing (recorded in the
Helen Wills Neuroscience Institute at the University of California, Berkeley, USA).
Event-related potentials
Before, during and after a sensory, motor or cognitive event, specific electrical
events arise in the cerebral cortex. These effects can be measured as evoked
potentials (EPs) or event-related potentials (ERPs), which are very small signals
embedded in the ongoing EEG signal. ERPs refer to time-locked perceptual,
cognitive or response potentials, whereas evoked potentials (EPs) refer to early
sensory responses such as the brainstem auditory evoked potentials (BAEP; see below).
All ERPs feature a specific polarity, latency, localization and amplitude that
characterize the different components. ERPs reflect brain responses time-locked to an
event or stimulus in an experimental paradigm. They are obtained by averaging the EEG
traces from a series of trials, aligned to the event, for instance the
onset of a stimulus or a response. Given that the background EEG is assumed to be
random, it averages out towards zero and the EP or ERP emerges from the
EEG. The signal-to-noise ratio (SNR) indicates to what extent the signal is
compromised by noise. The SNR is defined as the ratio of signal to noise power. A
ratio higher than 1:1 indicates more signal than noise. To increase the SNR, the
number of trials needs to increase as well, because the SNR is proportional to the
square root of the number of averaged trials (Schmidt, 2005). For instance, 81 trials
improve the SNR to 9:1 (given an initial SNR of 1:1). Moreover, the number of trials
needed for ERPs or EPs also depends on the amplitude of the ERP or EP of interest. For
instance, the P300 (see 2.2.2.7) can already be visible after averaging 10 trials,
whereas the BAEP requires more than 100 trials. Because electrode positions are
standardized, the amplitude of the averaged signal at a chosen time point can
be plotted on a topographic scalp map. However, if the source of brain activity
forms a tangential dipole, the ERPs can be mapped paradoxically. Moreover, the
activity can also reflect processes executed in parallel. Therefore, before
interpreting activity as a function or process and allocating it to distinct brain
areas, the orientation and possible source of the dipoles, as well as the underlying
cognitive processes that might be present during the experimental paradigm, need to
be considered.
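The square-root relation can be checked with a small simulation. The sketch below is not the analysis pipeline of this study: the ERP template, the noise level, the random seed and all names are assumptions for illustration. The SNR is treated here as an amplitude ratio, for which the square-root law holds when the noise is independent across trials.

```python
import math
import random


def snr_after_averaging(initial_snr, n_trials):
    """Amplitude SNR after averaging n_trials trials with independent noise."""
    return initial_snr * math.sqrt(n_trials)


def average_trials(trials):
    """Sample-by-sample mean over trials that are aligned to the event."""
    n = len(trials)
    return [sum(samples) / n for samples in zip(*trials)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n_samples = 500
noise_sd = 1.0

# Hypothetical ERP template: a single positive bump with amplitude 1.
template = [math.exp(-((i - 250) / 40.0) ** 2) for i in range(n_samples)]

# 81 simulated trials: the same template plus independent Gaussian noise.
trials = [[s + rng.gauss(0.0, noise_sd) for s in template] for _ in range(81)]
erp = average_trials(trials)

# Noise that remains in the average (expected: about noise_sd / 9).
residual = [e - s for e, s in zip(erp, template)]
residual_sd = math.sqrt(sum(r * r for r in residual) / n_samples)

print(snr_after_averaging(1.0, 81))
print(residual_sd)
```

Because the remaining noise shrinks only with the square root of the number of trials, halving the noise requires four times as many trials.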
P300 Component
The P300 or P3 is a positive ERP component, which reaches its maximum around
300 ms after stimulus onset. The strongest signal can be measured at parietal
electrodes. The P3 was first reported by Sutton (Sutton, 1965) in response to
unpredictable stimuli presented in an oddball paradigm. In this kind of paradigm a
rare target stimulus is presented amongst more frequent standard background stimuli
and the P3 arises when the target stimulus is detected. A larger P3 is elicited by those
events representing a low-probability category of stimuli (McCarthy, 1981).
The P3 wave is composed of two subcomponents known as P3a and P3b. These
subcomponents reflect distinct information-processing events. The P3a is usually
observed in response to unexpected meaningful stimuli, such as novel stimuli.
Therefore, it has been proposed that the P3a originates from stimulus-driven frontal
attention mechanisms. The P3b is elicited in response to detected targets and is
based on the analysis of the stimulus characteristic (e.g. color, brightness). For
example, one red balloon in a bunch of blue balloons will grab the attention and
attract it involuntarily.
The ideas about attention that W. James proposed over 100 years ago are still a
subject of research today. The main goal in studying attention is to investigate how
attention enables and influences detection, perception and encoding of stimulus
events (Gazzaniga, 2002). Most studies were conducted in the visual and auditory
modalities. During the last decades the number of studies on visual attention
increased and displaced the emphasis on auditory attention, which was predominant in the 1950s
and 1960s (Broadbent, 1958; Cherry, 1953). Auditory attention seems to be a greater
challenge because of crucial physiological differences in structure and function.
Visual attention is linked to the position of the head and eyes, since the stimuli are
already mostly processed at the fovea (Pashler, 1998). The human cochlea, on
the other hand, is not an equivalent of the fovea. The characteristic of auditory
selective attention is that it is mostly independent of the position of the head and the
ears, which makes it a system that is ready to receive and process stimuli from all
directions regardless of the organism’s current orientation (Pashler, 1998). On the
other hand, this openness to all inputs from the environment means that efficient
selection mechanisms need to distinguish relevant from irrelevant sounds.
Colin Cherry, a British psychologist, described the classic auditory example
of this phenomenon – the so-called cocktail party effect (Cherry, 1953): a person can
focus on one particular speaker while tuning out several other simultaneous
conversations. This can only be achieved by auditory selective attention: the
perception of certain stimuli in the environment is enhanced relative to other
stimuli of lower immediate priority. In Cherry’s study, competing speech input was
provided through earphones into the two ears of a subject. The subjects were asked
to attend and verbally shadow (immediately repeat each word) a relevant input in one
ear while ignoring irrelevant information presented to the other ear; this approach is
called dichotic listening. He noticed that the subjects were only able to report the
input from the attended ear and could not report a single detail from the ignored channel.
He also observed a significant decrease in performance when the subjects attempted
to attend to both input channels simultaneously in comparison to selectively
attending to one channel.
Cherry proposed that attention focused on one ear results in better encoding
of inputs in this channel, whereas the input of unattended channels might be
attenuated or rejected. These findings led to general models of attention, which fall
into two categories: bottleneck theories (see 2.3.1.2) and other capacity-limitation
theories (see 2.3.1.3). The bottleneck theories are the most influential. It is worth noting
that all theories are based on the idea that humans have limited information
processing capacity: i.e. it is impossible to process and react to all exogenous and
endogenous inputs that continuously excite our senses.
Some features of Broadbent’s filter theory explained Cherry’s data well, but
Neville Moray showed in 1959 that high priority information in an unattended input
channel was also processed to the extent that it could break through the attentional
barrier. In his experiment, Moray found that a person’s own name in an ignored input
channel could often direct attention to this channel (Moray, 1959). These findings led
to the assumption that all information was actually analyzed equivalently regardless
of whether it was attended or ignored during testing.
Therefore, Treisman proposed a direct modification of Broadbent’s model, to
which he agreed a year later. The theories are quite similar; the main difference is
the filter. Treisman's filter likewise passes the attended input through the limited
capacity channel, but it also allows unattended messages to go through in an
attenuated form, i.e. their signal strength is lowered. Accordingly, certain unattended
messages can be processed semantically and also reach consciousness if they meet
certain criteria. The most important criteria are differing thresholds that can be variable
and also function as a filtering mechanism. For example, biologically important
signals have permanently lowered thresholds, thus, even very attenuated signals can
be facilitated and semantically analyzed. This could explain why one's own name in
an unattended message can attract attention to it. This model is, therefore, an early
selection theory, and an attenuation model of attention.
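The difference between an all-or-none filter and an attenuating filter with item-specific thresholds can be made concrete with a toy sketch. All numbers below (signal strength, attenuation factor, thresholds) are hypothetical values chosen for illustration only; neither Broadbent nor Treisman specified such quantities, and the function is not a model from the literature.

```python
def reaches_semantic_analysis(
    strength: float,
    attended: bool,
    threshold: float,
    attenuation: float = 0.2,
) -> bool:
    """Toy attenuation filter.

    Attended input keeps its full strength; unattended input is scaled by
    `attenuation` (0.0 would correspond to an all-or-none filter).
    The input is analyzed semantically if the remaining strength reaches
    the item-specific threshold.
    """
    if not 0.0 <= attenuation <= 1.0:
        raise ValueError("attenuation must lie between 0 and 1")
    effective = strength if attended else strength * attenuation
    return effective >= threshold


# Hypothetical thresholds: permanently lowered for one's own name.
OWN_NAME_THRESHOLD = 0.1
ORDINARY_WORD_THRESHOLD = 0.5

# Unattended channel, attenuating filter
print(reaches_semantic_analysis(1.0, False, OWN_NAME_THRESHOLD))       # True
print(reaches_semantic_analysis(1.0, False, ORDINARY_WORD_THRESHOLD))  # False
# Unattended channel, all-or-none filter (attenuation of 0.0)
print(reaches_semantic_analysis(1.0, False, OWN_NAME_THRESHOLD, attenuation=0.0))  # False
```

With an attenuation of 0.0 nothing from the unattended channel passes, which is why an all-or-none filter cannot account for the breakthrough of one's own name.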
Taken together, the bottleneck in the early selection theory is located around
the level of perceptual analysis (see Figure 2.16), thus, attended input is perceptually
processed and continues to higher order processing (e.g. encoding as semantic or
categorical information), whereas unattended input is either rejected categorically
(Broadbent’s theory) or attenuated so that important messages in an unattended
channel are enabled for further processing (Treisman’s theory). Inputs may thus be
selected or rejected even before the perceptual analysis of
the stimuli’s characteristics is fully completed (Gazzaniga, 2002).
The early selection models can be contrasted with the late selection models,
which propose that attended and unattended stimuli are processed equivalently by
the perceptual system and that both inputs reach further processing of semantic encoding.
After that, selection for further processing or for conscious awareness can take place.
Thus, selection takes place at higher stages of information processing about whether
the stimuli should gain complete access to awareness, be encoded in memory, or
initiate a response. J. A. Deutsch and D. Deutsch first proposed the most influential
late selection theory (Deutsch, 1963). In this model all incoming stimuli are stored in
a sensory register and fully processed even at a semantic level without any
attenuation. This perceptual analysis happens automatically, independently of
whether attention was paid or not, and it is accomplished before any selection due to
attention takes place. The information is then grouped by mechanisms that are activated by
particular features of the incoming stimuli, i.e. the importance of the stimulus. The
highest level represents a criterion by which all the other levels are compared. This
level represents a reference point enabling the appropriate output, such as a motor
response, and inhibits the output associated with other levels. Furthermore, the
general state of arousal alters the access to an output system, i.e. for a low level of
arousal (e.g. sleep), only very high-priority information will be able to alter storage
or motor response.
In summary, the early selection theories allow little automatic processing and
no semantic processing of the unattended input before the selection takes place.
Therefore, the bottleneck of these theories is located in the perceptual system. On the
contrary, in the late selection theories, the bottleneck appears to be in the response
system. Indeed, all inputs are processed rather automatically by the perceptual
system and reach the stage of semantic encoding and are therefore able to influence
the executive functions, such as decision, memory or simply making a response.
However, the theories might not be so different. One argument is that it might
be a terminological issue, because in all theories selection mechanisms operate under
similar conditions: levels of importance (Deutsch and Deutsch) or different threshold
levels (Treisman). More importantly, only the highest level of importance (Deutsch
and Deutsch) and the information with a triggered threshold (Treisman) can pass on
to further processing such as making a response. Besides, both theories propose
pattern recognition units and mechanisms selecting highly salient stimuli dependent
on bottom-up (physical features of the stimulus) and top-down (contextual features) factors.
This shows that the theories feature major differences but also similarities.
only from an overload of the perceptual system by relevant information, i.e. in case
the capacity limit is exceeded.
Nilli Lavie observed that the effect of load on distractor processing depends
mainly on the type of mental process that is loaded, because load on executive
functions such as working memory had the opposite result. She therefore proposed the
cognitive load theory. This theory is about the interaction between attention and
working memory, which is the ability to hold and manipulate information in mind for
a short time, or rather, to actively maintain stimulus-processing priorities throughout
the task (Lavie, 2005). Lavie and colleagues suggested that load on working memory
results in a reduced availability of working memory for a selective attention task (by
loading working memory in a concurrent, yet unrelated task). This in turn should
result in reduced efficiency of focusing attention on the relevant stimuli, with greater
interference by distractors. More precisely, a high cognitive load would increase the
interference by an irrelevant low-priority distractor, and a low cognitive load would
decrease distractor interference (Lavie, 2004). Therefore, load on an executive
function such as working memory has the opposite effect to perceptual load. This
idea was supported by results from de Fockert (de Fockert, 2001) showing a causal
role for working memory in the control of selective attention (see 2.4).
The effects of different types of load on distractor processing provide a better
understanding of how distractor processing is affected by capacity limits in different
mental processes. These load models also provide a more complete view of the
early- and late-selection debate: early selection depends on high perceptual load,
whereas late selection depends on cognitive control functions, available for the
selective attention task (Lavie, 2005). Thus, these capacity-limitation models also
reconcile the competing bottleneck models by combining the assumption that
perception is a limited process (early) with the view that perception is an automatic
process (late) to the extent that there is spare capacity available. This suggests that
there is no single bottleneck in the information processing system, but there are a
series of filters so that incoming stimuli can be selected at early or late stages
depending on the situation. Therefore, these are also flexible theories of selective
attention.
Hillyard controlled the global state of arousal during testing by engaging the subjects
in this difficult task. Thus, only the direction of attention varied (i.e., which
ear the subjects directed their attention to). The researchers discovered that the auditory
ERP, more precisely the N1 component, was substantially larger in amplitude for
the attended stimuli compared to ignored ones (Hillyard, 1973). Since the N1 is
known to be generated in the auditory cortex (Vaughan, 1970), Hillyard and his
colleagues interpreted their findings as an increased activity of the N1 generators, i.e.
an enhanced activation of neurons involved in automatic sensory analysis of sounds
in the auditory cortex. Consequently, they proposed that selective attention acts as a
filtering or gain mechanism that can inhibit or gate unattended stimuli at an early
stage of sensory analysis (about 100 ms). This represents a physiological version of
the psychological attenuation model of early selection (Broadbent, 1958; Treisman,
1960).
A few years later Näätänen (Näätänen, 1978) suggested that the increased
negativity observed by Hillyard could be dissociated from the N100 component.
Näätänen used a constant and longer interstimulus interval (ISI, 800 ms) than
Hillyard (250-1250 ms) and did not observe an enlargement of the N1, but when
he subtracted the ERPs to ignored tones from those to the same tones when they
were attended, he found a negative difference wave (Nd), also called processing
negativity (PN). This deflection began to emerge at around 150 ms after stimulus
onset and persisted for at least 500 ms. Näätänen proposed that this wave is an
endogenous component, representing attention-specific activity, which is different
from the activity resulting from automatic sensory analysis (Näätänen, 1978). He
concluded that the N1-effect Hillyard reported was the exogenous N1 overlapped by
the endogenous Nd and thus, not an increased activity of the N1 generators.
Näätänen also observed that the Nd is composed of two subcomponents: an early
one, which could be generated in the auditory association cortices and is independent
of the ISI and a later one of larger amplitude and longer duration at frontal sites,
which is elicited with long ISI (800-2000 ms) (Näätänen, 1981).
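The subtraction that yields the Nd can be sketched as a pointwise difference between two averaged waveforms sampled at the same time points. The amplitude values below are made-up numbers in microvolts, used purely for illustration; they are not data from Näätänen's or any other study.

```python
from typing import List, Sequence


def difference_wave(attended: Sequence[float], ignored: Sequence[float]) -> List[float]:
    """Pointwise difference: ERP to attended tones minus ERP to the same tones when ignored."""
    if len(attended) != len(ignored):
        raise ValueError("both ERPs must have the same number of samples")
    return [a - i for a, i in zip(attended, ignored)]


# Hypothetical averaged amplitudes (microvolts) at three successive time points
erp_attended = [0.0, -2.0, -1.0]
erp_ignored = [0.0, -1.5, 0.5]

nd = difference_wave(erp_attended, erp_ignored)
print(nd)  # [0.0, -0.5, -1.5]: negative values indicate a negative difference wave
```

Because the physical stimuli are identical in both averages, whatever remains in the difference can be attributed to the direction of attention rather than to the sounds themselves.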
Based on these findings, he developed the attentional trace model of selective
attention (Näätänen, 1982). This attentional trace would be an actively maintained
cortical representation of the physical features (e.g. pitch, location) of stimuli. These
features separate relevant stimuli from irrelevant ones. The model proposes that there
is an early selection in terms of a comparison between the sensory input and the
attentional trace in the auditory cortex. The earlier Nd component, which would
explain the attention effect at the N1 latency, would be generated by the comparison
of the stimulus features with the trace. If the sounds do not match the trace, they
would be rejected from further analysis. Accordingly, the late Nd could reflect a
frontal component controlling and maintaining the attentional trace.
Hillyard’s and Näätänen’s models led to a controversy and numerous studies
about the relationship of the processing negativity and the N1. The main difference
between the two models is that Hillyard’s filtering mechanism would represent
modulation of the exogenous components of the ERPs in addition to an endogenous
attention effect presented as a frontally distributed negativity. In Näätänen’s
attentional trace model on the other hand all ERP effects would be of endogenous
origin and modulation.
Furthermore, another important study provided evidence for the early-
selection theory. Based on the idea that the auditory cortex can be activated as early
as 20 to 25 ms after the onset of a sound, Marty Woldorff tried to find attentional
changes on the earliest components of the auditory ERP, brainstem auditory evoked
potentials (BAEP) and the middle-latency deflections (latency range 10-40 ms)
(Woldorff, 1987). He also used the classic dichotic listening paradigm but modified
it to facilitate early selection attention effects: the ISI was rather short and the
subjects had to perform a difficult detection task. The targets were of a low
probability and of lower intensity than the other stimuli, which increased the
attentional load and forced the participants to pay close attention to the sounds. He
did not find any evidence for an attentional modulation of the BAEP. However, he
found attentional changes of the ERPs even prior to the N1 and the P2, namely a
modulation of the mid-latency ERPs around 20-50 ms (see Figure 2.18). More
specifically, Woldorff concluded that the P50 was modulated as a function of selective
attention, since it showed an enlarged amplitude to attended sounds in
comparison to unattended sounds. In addition, he replicated the results in another
study and finally showed that neural processing of attended versus unattended
sounds can differ significantly even at 20 ms post stimulus (Woldorff, 1991). Thus,
Woldorff provided support for the early-selection theory that stimuli can be selected
or gated before perceptual processing is fully completed.
Figure 2.18 – ERPs to attended and ignored stimuli in a dichotic listening task
(grand average across all subjects).
Apart from the N1 and P2 effects, the essential finding is the attentional modulation of
the positive deflection at around 20-50 ms post stimulus (P50 effect), which
provides support for early selection theories of attention (adapted from Woldorff,
1991).
(Giard, 1994a). These results indicate that modulations related to selective attention
could occur already at the cochlear receptor, which points to the existence of top-down
control mechanisms at a very early level. Marie-Hélène Giard concluded that
selective attention could already operate as a peripheral band-pass filter at the
cochlear receptor level, prior to the transduction process (transduction of sound into a
neural signal).
Altogether these studies support the early selection theories of attention but
they do not answer the question by which mechanisms the attentional selection
happens.
An argument for two distinct attentional mechanisms also comes from an EEG
(Gazzaley, 2008) and an fMRI (Gazzaley, 2005) study. Gazzaley found that, in the
visual modality, older adults exhibit a selective deficit in suppressing task-irrelevant
information during working memory encoding. He further showed that suppression
mechanisms are delayed in time rather than lost with age.
Attention and working memory are strongly related to each other. Lavie’s
cognitive load theory suggests that in order to direct attention and to specify which
stimuli are currently relevant, the active maintenance of stimulus properties in the
working memory is required (Lavie, 2005). Therefore, a high load on working
memory should lead to less differentiation between high and low priority stimuli
(target versus distractor) and thus, increase distractor processing and thereby increase
distraction.
An fMRI study in young adults observed that the extent to which distractors
are inhibited can be determined by the availability of cognitive resources, indicating a
direct causal role for working memory in the control of selective attention (de
Fockert et al., 2001). Cognitive resources were manipulated in a dual-task protocol
in which subjects performed two unrelated tasks at the same time: an attention task and a
memory task. In the visual attention task, subjects had to classify famous written
names as pop stars or politicians while ignoring distractor faces, which could be
congruent or incongruent with the target name or anonymous (see Figure 2.21 A).
In the working memory task, subjects were asked to remember the order of 5 digits on
each trial at the beginning of the attention task. In order to manipulate the memory
load, subjects were asked to remember either a fixed order of digits (0 1 2 3 4) or a
random order of digits (0 3 1 4 2). After the attention task, a memory probe was
presented to the subjects, who were asked to report the digit that followed this probe
in the memory set (see Figure 2.21 A).
Functional magnetic resonance imaging (fMRI) was used to measure brain activity
while participants performed the two tasks. Distractor related activity was obtained
by comparing the activity during attention condition with a neutral condition in
which the distractor faces were absent (see Figure 2.21 B: face present versus face
absent).
De Fockert observed that a high working memory load increased the
distractor interference effect on the behavioural performance of subjects. A high load
also resulted in an increase of activity elicited by the distractor faces in visual areas,
especially in the extrastriate visual cortex and the fusiform gyrus (known to be
selective for face processing) (see Figure 2.21 B). These findings indicate that
distractor faces were more extensively processed under high than under low working
memory load. De Fockert concluded that working memory serves to control visual
selective attention and suggested that there might be two distinct attentional
mechanisms regulating the responses to stimuli. However, this study did not assess
to what extent the availability of cognitive resources affects the processing of
relevant information.
These results in the visual modality suggest that facilitation and inhibition
rely on distinct mechanisms that would be differentially affected by the amount of
available cognitive resources, and thus by the difficulty of a memory task in a dual-task
protocol. More precisely, facilitation would not be affected by the memory task
difficulty, whereas inhibition is most likely to decrease with increasing memory task
difficulty.
The aim of the current article was to dissociate the two competing attentional
theories (see Figure 2.20) and to test whether facilitation and inhibition can operate
independently. To do so, a dual task protocol was also used. Subjects had to perform
an auditory selective attention task and a memory task. Electrophysiological
responses to the same sounds in three conditions were compared (attended, ignored
and a control condition) to measure facilitation and inhibition. The amount of
available cognitive resources was manipulated by varying the difficulty of a
concurrent sound memorization task. Based on the idea that facilitation and
inhibition operate independently, the hypothesis was that they should not be
correlated, but rather feature different electrophysiological properties and should be
differentially affected by the memory task difficulty.
The following is largely the content of the previously published article: Bidet-Caulet, A.,
Mikyska, C., and Knight, R. T., (2010), Load effects in auditory selective attention:
evidence for distinct facilitation and inhibition mechanisms, Neuroimage, 50, (p.
277-84).
Figure 3.1 – Scheme of the dual task protocol: memory and attention task.
Subjects were presented with a sequence of 4 notes. Afterwards, they had to perform
3 different attentional tasks (detection of: (1) duration deviants in the left ear and (2)
in the right ear, (3) pure tones in both ears) while they were keeping in memory the
auditory sequence. One attention block consisted of 20 standards and 3 duration
deviants in each ear, respectively, and 3 pure tones in both ears. After the attention
block was completed, subjects performed an easy or difficult test of their memory of the
sequence (easy or difficult memory task) (adapted from Bidet-Caulet, 2008).
3.3 Procedure
Participants were seated in a sound-attenuated EEG recording room. The sounds
were delivered through earphones at an intensity level judged comfortable by the
subjects, using ‘Presentation’ software (Neurobehavioral Systems, Albany, NY,
USA). The experiment started with a familiarization with the sounds and tasks and
the participants were trained on the attention and memory tasks separately. EEG was
then recorded while subjects performed 12 blocks of the attention task (4 in each
attention condition) for each memory condition, resulting in a total of 160 attended
standards, 160 ignored standards and 160 standards in the control condition. The
blocks were run by memory condition (e.g. 12 attention blocks were run under the
condition of easy memory and so forth). The order of memory conditions was
balanced across subjects. The order of the 12 attention blocks was the same for each
memory condition, and was balanced across participants using a Latin-square design.
Throughout the experiment, subjects were instructed to perform as well and as fast as
possible and to favor accuracy in the memory task if it was difficult to perform both
tasks correctly. They were also asked to keep their eyes fixated on a centrally
presented cross and to minimize any eye movements and blinks while performing the
tasks.
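The trial counts stated above can be reproduced with a short calculation. It rests on one reading of the design, which is an assumption here: each block contains 20 standards per ear (as in the caption of Figure 3.1), the attend-left and attend-right blocks each contribute the attended-ear standards to the attended count and the opposite-ear standards to the ignored count, and the control blocks contribute the standards of both ears. The function name and its parameters are illustrative.

```python
from typing import Dict


def standards_per_memory_condition(
    blocks_per_task: int = 4,
    standards_per_ear: int = 20,
) -> Dict[str, int]:
    """Number of standards per attention condition within one memory condition."""
    lateral_blocks = 2 * blocks_per_task  # attend-left plus attend-right blocks
    return {
        # one ear per lateral block is attended, the other is ignored
        "attended": lateral_blocks * standards_per_ear,
        "ignored": lateral_blocks * standards_per_ear,
        # control blocks: standards of both ears count
        "control": blocks_per_task * 2 * standards_per_ear,
    }


print(standards_per_memory_condition())
# {'attended': 160, 'ignored': 160, 'control': 160}
```

Under this reading the three attention conditions are balanced at 160 standards each, which matches the totals given in the procedure.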
electrodes were replaced by their values interpolated from the remaining adjacent
electrodes. Averaging, locked to standard or deviant onset, respectively, was done
separately for each attention condition (attended, ignored and control) in each
memory condition (no, easy, difficult memory task). For the standard analysis, at
least 108 trials were averaged for each participant, for each condition. For deviant
analysis, trials corresponding to missed targets were excluded from
further analysis. At least 21 to 24 trials were averaged for each participant and for
each condition.
With this procedure, the average acoustic content of the sounds was the same
for all obtained event-related potentials (ERPs), only the attention orientation and the
memory task difficulty varied. ERPs were corrected with a -100 to 0 ms baseline
before standard or deviant onset, and were digitally filtered (low-pass 35 Hz). Since
the shortest ISI was 300 ms, only the -100 to 300 ms time-window was retained for
further analysis of standards. ERP scalp topographies were computed using spherical
spline interpolation (Perrin, 1989; Perrin, 1987).
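Baseline correction with a -100 to 0 ms window amounts to subtracting the mean pre-stimulus amplitude from every sample of the epoch. The sketch below illustrates this on a single channel; treating the window as including -100 ms and excluding 0 ms is an assumption, the sample values are invented, and the function is not the software routine used in the original analysis.

```python
from typing import List, Sequence


def baseline_correct(
    times_ms: Sequence[float],
    amplitudes: Sequence[float],
    start_ms: float = -100.0,
    end_ms: float = 0.0,
) -> List[float]:
    """Subtract the mean amplitude in [start_ms, end_ms) from all samples."""
    if len(times_ms) != len(amplitudes):
        raise ValueError("times and amplitudes must have the same length")
    baseline = [a for t, a in zip(times_ms, amplitudes) if start_ms <= t < end_ms]
    if not baseline:
        raise ValueError("no samples fall inside the baseline window")
    baseline_mean = sum(baseline) / len(baseline)
    return [a - baseline_mean for a in amplitudes]


# Hypothetical epoch: two pre-stimulus samples, stimulus onset, one post-stimulus sample
times = [-100.0, -50.0, 0.0, 50.0]
erp = [1.0, 3.0, 2.0, 6.0]
print(baseline_correct(times, erp))  # [-1.0, 1.0, 0.0, 4.0]
```

After correction the pre-stimulus interval averages to zero, so post-stimulus amplitudes are expressed relative to the activity just before sound onset.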
model. The method assumes that the samples are normally distributed within
different population groups, each featuring the same variance. Thus, the ANOVA
tests whether or not the means of several groups are all equal in order to
determine whether the groups are actually different or not. This is similar to a t-test.
But since there are more than two groups, multiple t-tests would be necessary and this
in turn would increase the chance of committing a type I error. Therefore, ANOVAs
are useful for comparing more than two means.
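How quickly the type I error risk grows with repeated t-tests can be illustrated numerically. The sketch assumes that the individual tests are independent, which is a simplification for pairwise comparisons that share groups; the function names are illustrative.

```python
import math


def pairwise_comparisons(n_groups: int) -> int:
    """Number of distinct pairs of groups, i.e. t-tests needed to compare all means."""
    return math.comb(n_groups, 2)


def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive across n_tests independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


n_tests = pairwise_comparisons(3)  # three groups require 3 pairwise t-tests
print(n_tests)
print(familywise_error_rate(0.05, n_tests))  # roughly 0.14 instead of 0.05
```

With three groups and a per-test level of 0.05, the chance of at least one false positive is already close to three times the nominal level.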
The result of an ANOVA only states whether the tested groups are different
or not, but it does not reveal which means differ. For this reason it is necessary to
perform so-called post-hoc tests like the permutation test (see 3.6.1.2). Another
important issue is the problem of multiple comparisons that arises from testing
multiple hypotheses at the same time: the more tests are performed, the higher
the probability of obtaining at least one false positive result. For this reason it is
important to correct for the number of comparisons by applying a procedure
such as the Bonferroni correction.
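As an illustration of the general idea behind a permutation post-hoc test, the following sketch implements a generic two-tailed sign-flip permutation test for paired data. It is not the procedure of section 3.6.1.2, whose details are not reproduced here; it enumerates all sign assignments exactly and is therefore only practical for small samples. The data values are invented.

```python
import itertools
from typing import Sequence


def paired_permutation_p(a: Sequence[float], b: Sequence[float]) -> float:
    """Exact two-tailed p-value of a paired sign-flip permutation test.

    Under the null hypothesis the sign of each paired difference is
    exchangeable, so every combination of sign flips is equally likely.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("a and b must be non-empty and of equal length")
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs))
    n_extreme = 0
    n_total = 0
    for signs in itertools.product((1, -1), repeat=len(diffs)):
        statistic = abs(sum(s * d for s, d in zip(signs, diffs)))
        n_total += 1
        if statistic >= observed - 1e-12:  # small tolerance for rounding
            n_extreme += 1
    return n_extreme / n_total


# Hypothetical amplitudes of four subjects in two conditions
condition_a = [3.0, 4.0, 5.0, 6.0]
condition_b = [1.0, 1.0, 1.0, 1.0]
print(paired_permutation_p(condition_a, condition_b))  # 0.125 (2 of 16 sign patterns)
```

With only four subjects the smallest attainable two-tailed p-value is 2/16, which shows why such tests need a sufficient number of participants or a random subset of many permutations.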
corrected P are reported). Significant effects were explored using 2-tailed paired t-tests.
The Bonferroni correction was used to correct the P-value for multiple
comparisons.
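A Bonferroni correction of P-values can be sketched as follows: each P-value is multiplied by the number of comparisons and capped at 1. The P-values in the example are invented and the function name is illustrative; the sketch shows the standard form of the correction, not the specific software routine used for the analysis.

```python
from typing import List, Sequence


def bonferroni_correct(p_values: Sequence[float]) -> List[float]:
    """Bonferroni-corrected P-values: p * number of comparisons, capped at 1."""
    n_comparisons = len(p_values)
    return [min(1.0, p * n_comparisons) for p in p_values]


# Hypothetical uncorrected P-values of three post-hoc comparisons
uncorrected = [0.01, 0.04, 0.5]
corrected = bonferroni_correct(uncorrected)
print(corrected)  # approximately [0.03, 0.12, 1.0]
```

In this example the second comparison is no longer significant at the 0.05 level after correction, which illustrates how the procedure trades sensitivity for protection against false positives.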
4 Results
We used a dual task protocol to orthogonally manipulate attention and cognitive
resources. For the attention task, we adapted the classic auditory attention protocol
by adding a third condition (control condition) in which attention was considered as
equally distributed to all sounds. We measured with electroencephalography (EEG)
the effects of three distinct levels of attention by comparing the event-related
potentials (ERPs) to the same sounds when they were attended (in the attended ear),
ignored (in the opposite, non-attended ear) or during the control condition. The
availability of cognitive resources was modulated by varying the difficulty of a
concurrent sound memorization task (3 difficulty levels: no, easy or difficult memory
task). Our hypothesis was that if attention-mediated facilitation and inhibition are
distinct mechanisms, they would be differentially affected by the difficulty of the
memory task.
Table 2 – Effect of the memory task difficulty and attention conditions on the
ERP amplitude.
Results of the two-way ANOVA on ERP mean amplitude, with memory difficulty
(no, easy and difficult) and attention condition as factors, for the three tested time-
windows. Statistical values (F, ε and P) of attention and memory difficulty main
effects and of attention by memory interaction effect are indicated with the control
condition included (attention condition factor with 3 levels: attended, control and
ignored) and with the control condition excluded (attention condition factor with 2
levels: attended and ignored). Significant effects are highlighted in grey.
Figure 4.3 – Effect of the memory task difficulty on attention effects. (A) Mean
ERP amplitudes (fronto-central group, 200-250 ms) of attention-mediated facilitation
(green) and inhibition (red) effects as a function of the memory task difficulty (no,
easy, difficult). Facilitation and inhibition effects are represented as the mean
difference between ERPs to attended and control, and to ignored and control
standards, respectively. Error bars represent 1 SEM. Stars indicate significant
differences assessed by permutation post-hoc tests of the interaction (attention by
memory) effect (*: P < 0.05; **: P < 0.01; ***: P < 0.001). (B) Scalp topographies
(top view) of the attention effects (200-250 ms): facilitation (mean difference
between ERPs to attended and control standards) and inhibition (mean difference
between ERPs to ignored and control standards). The black oval surrounds the
fronto-central electrode group used for computation of mean amplitudes and
statistical analysis represented in (A).
Figure 4.8 – Effect of the memory task difficulty on the P3 to targets. (A) Mean
P3 amplitudes (parietal group, 350-450 ms) to attended, control and ignored deviants
(depicted in green, grey and red, respectively) as a function of the memory task
difficulty (no, easy, difficult). Error bars represent 1 SEM. Stars indicate significant
differences assessed by permutation post-hoc tests of the interaction (attention by
memory) effect (***: P < 0.001). (B) Scalp topographies (back views) of the mean
ERPs to targets (attended deviants) as a function of the memory task difficulty,
between 350 and 450 ms. The black oval surrounds the parietal group of electrodes
used for P3 analysis. (C) Mean ERPs at the parietal electrode group. ERPs to targets
in the no, easy and difficult memory conditions are depicted with thick, thin and
dashed green lines, respectively. The shaded area corresponds to the 350-450 ms
period, used for P3 analysis.
5 Discussion
Auditory selective attention is a complex mechanism constantly operating in our
daily life: the perception of certain stimuli in the environment is enhanced relative
to other stimuli of lesser immediate priority (cocktail party-effect). Numerous
theories and studies were put forward to explain the phenomenon and to elucidate the
operating mechanisms and the associated anatomical structures. However, to date it
is still not known exactly how auditory selective attention is operating. Nevertheless,
the studies brought together several pieces of information about the mechanisms of
auditory selective attention.
It is now well accepted that auditory attention can modulate the sensory
analysis of sounds at multiple levels. First, selective attention can operate at an early
stage of sensory processing, modulating, as early as 30 ms after sound onset, the
automatic/exogenous ERPs generated in the auditory cortices. Second, attention can
also modulate stimuli processing at late selection stages via an attentional trace
observed as a sustained negative deflection of endogenous origin, called ‘Negative
difference’ (Nd) or Processing Negativity (PN). Additionally, auditory selective
attention seems to operate via facilitation and inhibition mechanisms, reflecting an
enhanced or reduced processing of relevant or irrelevant sounds, respectively.
In this study we investigated whether facilitation and inhibition are
distinct mechanisms that can operate independently at a late selection stage. To do
so, we varied the cognitive resources available in an auditory selective attention
task: if facilitation and inhibition are distinct mechanisms, they should be
affected differently by variations in cognitive load (here, the difficulty of a
concurrent memory task). Therefore, we used a dual-task protocol in which subjects
performed an auditory attention task and a memory task at the same time. We
compared the electrophysiological responses to the same sounds when they were
attended, ignored, or presented in a control condition in which attention was
considered equally distributed across all sounds.
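The three-way comparison described above can be illustrated with a minimal sketch. It assumes that each effect is quantified as a difference wave against the control condition (attended minus control for facilitation, ignored minus control for inhibition) and summarized as a mean amplitude in a time window. The function names, the toy amplitudes and the 200-400 ms window are hypothetical and serve only as an illustration; they are not the analysis pipeline or the data of this study.

```python
from statistics import mean


def difference_wave(condition, control):
    """Pointwise difference between two ERP waveforms sampled at the same times."""
    if len(condition) != len(control):
        raise ValueError("waveforms must have the same number of samples")
    return [c - k for c, k in zip(condition, control)]


def mean_amplitude(times_ms, wave, start_ms, end_ms):
    """Mean amplitude over all samples with start_ms <= time <= end_ms."""
    window = [v for t, v in zip(times_ms, wave) if start_ms <= t <= end_ms]
    if not window:
        raise ValueError("no samples fall inside the requested window")
    return mean(window)


# Hypothetical toy waveforms (microvolts) at one frontal electrode.
times_ms = [0, 100, 200, 300, 400]
attended = [0.0, -2.0, -3.0, -2.5, -1.0]
ignored = [0.0, -1.0, -0.5, 0.0, 0.5]
control = [0.0, -1.5, -1.5, -1.0, 0.0]

# Assumed definitions: each effect is measured relative to the control condition.
facilitation = difference_wave(attended, control)  # negative-going in this toy data
inhibition = difference_wave(ignored, control)     # positive-going in this toy data

facilitation_mean = mean_amplitude(times_ms, facilitation, 200, 400)
inhibition_mean = mean_amplitude(times_ms, inhibition, 200, 400)
```

Referencing both effects to the same control waveform keeps the two measures separable: a change in the attended-minus-control difference can then be read independently of a change in the ignored-minus-control difference, which a direct attended-minus-ignored subtraction would not allow.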
After analyzing the data, we found two frontally distributed components: a
negative one in response to attended standard sounds (facilitatory component), and a
positive one to ignored standard sounds (inhibitory component). These frontal
electrophysiological responses have distinct timing and topographies, and are
differentially modulated by the difficulty of the memory task. These results provide
7 List of abbreviations
A1 primary auditory cortex
AC alternating current
ADC analogue to digital conversion
AEP auditory evoked potential
Ag/Cl silver chloride
ANOVA analysis of variance
BAEP brainstem auditory evoked potential
BOLD blood oxygen level dependent
CGM corpus geniculatum mediale
dB decibel
ECoG electrocorticogram
EEG electroencephalogram
(r/l/v) EOG (right/left/vertical) electrooculogram
EP evoked potential
EPSP excitatory postsynaptic potential
ERP event-related potential
FDG fluorodeoxyglucose
fMRI functional magnetic resonance imaging
GABA gamma aminobutyric acid
HG Heschl’s gyrus
Hz Hertz
ICA independent component analysis
ISI inter-stimulus interval
IPSP inhibitory postsynaptic potential
kOhm kiloohm
mm millimeter
mV millivolt
mmol/l millimole per liter
N1/N100 auditory ERP at around 100 ms post stimulus (negative deflection)
Nd negative difference wave
(E)OAE (evoked) otoacoustic emission
MEG magnetoencephalogram
8 Publication
Results from this study have already been published:
Bidet-Caulet, A., Mikyska, C., and Knight, R. T., (2010), Load effects in auditory
selective attention: evidence for distinct facilitation and inhibition mechanisms,
Neuroimage, 50, (p. 277-84).
9 Acknowledgements
I would like to express my gratitude to everyone who contributed to this thesis.
In particular, I would like to thank Aurélie for introducing me to the world of science
and showing me everything she knew. It was a pleasure working with her and
learning from her. More importantly, during my time in Berkeley she became a
close friend. Thank you for the support (e.g. lab meeting, SfN, CNS), guidance,
advice, corrections, good music in the pod, French food and wine.
I would also like to thank Robert T. Knight, M.D., for giving me the opportunity to
spend a year at Berkeley and do research at the Helen Wills Neuroscience Institute.
I am also grateful to Prof. Dr. H. Stefan for supporting the cooperation with UC
Berkeley, for his corrections and advice, and for encouraging me to finish writing.
Very special thanks go to Prof. Dr. H-J Heinze for the initial idea and for the
support and initiative to bring it to life.
My deepest thanks go to my family: without you I would not be where I am now.
Thank you for all your love and support.
Nic, you always believe in me. Thank you for your love and understanding.
10 Curriculum vitae
Personal
Name Mikyska
First name Constanze Elisabeth Anna
Date and place of birth December 6, 1984 in Munich, Germany
Education
School
2004 University-entrance diploma (Abitur), Heimschule
Kloster Wald, Wald (secondary school and boarding
school)
Professional education
2000 – 2004 Apprenticeship as a tailor, Heimschule Kloster Wald,
Wald
University
2004 – 2011 Medical student at the Friedrich-Alexander University
of Erlangen-Nuremberg
09/2006 Preliminary medical examination
12/2011 Final medical licensing examination
Research experience
05/2008 – 09/2009 Visiting researcher, Helen Wills Neuroscience
Institute, University of California, Berkeley, USA
Employment
12/2009 – 10/2010 Student assistant, Epilepsiezentrum (ZEE), Department
of Neurology, University hospital Erlangen-
Nuremberg