
A Portrait of State-of-the-Art
Research at the Technical
University of Lisbon

Edited by

MANUEL SEABRA PEREIRA


Technical University of Lisbon,
Lisbon, Portugal
A C.I.P. Catalogue record for this book is available from the Library of Congress.

ISBN-10 1-4020-5689-3 (HB)


ISBN-13 978-1-4020-5689-5 (HB)
ISBN-10 1-4020-5690-7 (e-book)
ISBN-13 978-1-4020-5690-1 (e-book)

Published by Springer,
P.O. Box 17, 3300 AA Dordrecht, The Netherlands.

www.springer.com

Printed on acid-free paper

All Rights Reserved


© 2007 Springer
No part of this work may be reproduced, stored in a retrieval system, or transmitted
in any form or by any means, electronic, mechanical, photocopying, microfilming, recording
or otherwise, without written permission from the Publisher, with the exception
of any material supplied specifically for the purpose of being entered
and executed on a computer system, for exclusive use by the purchaser of the work.
TABLE OF CONTENTS

Preface ix

PART I – Emergent Areas

Nanotechnology and the Detection of Biomolecular Recognition Using Magnetoresistive Transducers 3
Paulo P. Freitas, Hugo A. Ferreira, Filipe Cardoso, Susana Cardoso,
Ricardo Ferreira, Jose Almeida, Andre Guedes, Virginia Chu,
João P. Conde, Verónica Martins, Luis Fonseca, Joaquim S. Cabral,
José Germano, Leonel Sousa, Moisés Piedade, Bertinho Silva,
José M. Lemos, Luka A. Clarke and Margarida D. Amaral

Financial Econometric Models 23


João Nicolau

Quantum Computation and Information 43


Amílcar Sernadas, Paulo Mateus and Yasser Omar

PART II – Basic Sciences

An Overview of Some Mathematical Models of Blood Rheology 65


Adélia Sequeira and João Janela

Mathematical Models in Finance 89


Maria do Rosário Grossinho

More Sustainable Synthetic Organic Chemistry Approaches Based on Catalyst Reuse 103
Carlos A.M. Afonso, Luís C. Branco, Nuno R. Candeias,
Pedro M.P. Gois, Nuno M.T. Lourenço, Nuno M.M. Mateus and
João N. Rosa


Simulation and Modeling in Computational Chemistry: A Molecular Portfolio 121
José N.A. Canongia Lopes

Experimental Particle and Astroparticle Physics 137


M. Pimenta

Frontiers on Extreme Plasma Physics 151


Luís Oliveira e Silva

Nuclear Fusion: An Energy for the Future 163


Carlos Varandas and Fernando Serra

PART III – Social Sciences, Economics and Management Sciences

Regulation Policies in Portugal 173


João Bilhim, Luis Landerset Cardoso and Eduardo Lopes Rodrigues

The Growing Relevance of Africa in Chinese Foreign Policy: The Case of Portuguese Speaking Countries 183
Ana Alves and António Vasconcelos de Saldanha

Economy Growth Theory, Fifty Years Later 197


Paulo B. Brito

PART IV – Life Sciences and Biotechnology

DNA Vaccines 219


Duarte Miguel F. Prazeres and Gabriel Amara Monteiro

Biotechnology of the Bacterial Gellan Gum: Genes and Enzymes of the Biosynthetic Pathway 233
Arsénio M. Fialho, Leonilde M. Moreira, Ana Teresa Granja,
Karen Hoffmann, Alma Popescu and Isabel Sá-Correia

Epigenetics: The Functional Memory of Ribosomal Genes 251


Wanda S. Viegas, Manuela Silva and Nuno Neves

Biotechnology of Reproduction and Development: From the Biomedical Model to Enterprise Innovation 259
Luís Lopes da Costa, António Freitas Duarte and José Robalo Silva

PART V – Engineering and Technologies

Evolution and Challenges in Multimedia Representation Technologies 275


Fernando Pereira, João Ascenso, Catarina Brites, Pedro Fonseca,
Pedro Pinho and Joel Baltazar

Bioinformatics: A New Approach for the Challenges of Molecular Biology 295
Arlindo L. Oliveira, Ana T. Freitas and Isabel Sá-Correia

Research and Development in Metal Forming Technology at the Technical University of Lisbon 311
Jorge M.C. Rodrigues and Paulo A.F. Martins

Agronomy: Tradition and Future 329


Pedro Aguiar Pinto

Towards a Clean Energy for the Future – The Research Group on Energy and Sustainable Development of IST 341
Maria da Graça Carvalho and Luis Manuel Alves

PART VI – Nature, Environment and Sustainability

Industrial Ecology: A Step towards Sustainable Development 357


Paulo Manuel Cadete Ferrão

Forests for the 21st Century? 385


João Santos Pereira, Helena Martins and José G.C. Borges

The Role of Emergent Technologies towards an Integrated Sustainable Environment 401
Elizabeth Duarte, Maria N. Pinho and Miguel Minhalma

Integrated Water Management 421


Ramiro Neves, José S. Matos, Luís Fernandes and Filipa S. Ferreira

PART VII – Public Health, Food Quality and Safety

Food Safety Crisis Management and Risk Communication 449


Virgilio Almeida

Debaryomyces hansenii, A Salt Loving Spoilage Yeast 457


Catarina Prista and Maria C. Loureiro-Dias

The New Disease and the Old Agents 465


Yolanda Vaz and Telmo Nunes

The Sharing of Urban Areas by Man and Animals 479


Armando C. Louzã

PART VIII – Health and Sport Sciences

Physical Activity and Cardiorespiratory Fitness 491


Luís Bettencourt Sardinha

Ergonomics: Humans in the Centre of Innovation 511


Anabela Simões and José Carvalhais

Development in Biomechanics of Human Motion for Health and Sports 531
Jorge A.C. Ambrósio and João M.C.S. Abrantes

PART IX – Urbanism, Transports, Architecture and Design

Urbanisation Trends and Urban Planning in the Lisbon Metropolitan Area 557
João Cabral, Sofia Morgado, José Luís Crespo and Carine Coelho

Technical, Economical and Organizational Innovation in Transport Systems 573
José Manuel Viegas

Hotel Architecture in Portugal 595


Madalena Cunha Matos

Inclusive Design: A New Approach to Design Project 605


Fernando Moreira da Silva and Rita Almendra
PREFACE

The Technical University of Lisbon (UTL) is celebrating its 75th anniversary this year. To mark this jubilee, a full program of events took place, including a two-day Symposium on the research at UTL. This Symposium addressed the state of the art in major areas of excellence at UTL.
Science, technology and innovation, and the way universities and society in general create, use and disseminate knowledge, have gained growing significance over the last decades. UTL undoubtedly embeds a relevant potential of excellence in different areas of research in basic and applied sciences, which supports its development on the basis of a “research university” model.
This book contains the edited versions of the invited lectures delivered by prominent researchers at UTL. It brings together, in review form, a comprehensive summary of high-quality research contributions across basic and applied sciences.
The contributing papers are organized around the following major areas:
– Emergent areas (Nanosciences, Quantum Computation and Information, Risk and Volatility in Financial Markets);

– Basic Sciences (Mathematics, Physics, Chemistry and Materials);

– Social Sciences, Economics and Management Sciences;

– Life Sciences and Biotechnology;

– Engineering and Technologies;

– Nature, Environment and Sustainability;

– Public Health, Food Quality and Safety;

– Health and Sport Sciences;

– Urbanism, Transports, Architecture, Arts and Design.


The transdisciplinary nature of most areas underscores a compelling sense of purpose in the work developed.

The editor is indebted to the members of the Organizing Committee, Professors Luís Tavares, Manuela Chaves, João Santos Silva, Francisco Rebelo,
Carlos Mota Soares, João Pedro Conde, João Bilhim and Clara Mendes for
their valuable suggestions and advice in the organization of the Symposium.
We extend our recognition to all lecturers for their contributing presentations.
Our appreciation is also due to the collaboration and efforts of Dr. Maria do
Céu Crespo and Ms. Lourdes Costa who contributed to the smooth running
of the Symposium. We acknowledge the dedicated support of Dr. Nelson
Camacho in the editing of the book.
Finally, the support of Banco Santander Totta is gratefully acknowledged.

Manuel Seabra Pereira


Vice-Rector for Scientific Affairs
Technical University of Lisbon, Portugal
June 2006
PART I

EMERGENT AREAS
NANOTECHNOLOGY AND THE DETECTION
OF BIOMOLECULAR RECOGNITION USING
MAGNETORESISTIVE TRANSDUCERS

Paulo P. Freitas1,2, Hugo A. Ferreira1,2, Filipe Cardoso1,2, Susana Cardoso1,2,
Ricardo Ferreira1,2, Jose Almeida1,2, Andre Guedes1,2, Virginia Chu1, João P.
Conde1,3, Verónica Martins1,3, Luis Fonseca3, Joaquim S. Cabral3, José
Germano4,5, Leonel Sousa4,5, Moisés Piedade4,5, Bertinho Silva4,5, José M.
Lemos4,5, Luka A. Clarke6 and Margarida D. Amaral6
1 INESC MN, R. Alves Redol 9, 1000-029 Lisboa, Portugal, e-mail: pfreitas@inesc-mn.pt
2 Departamento de Física, Instituto Superior Técnico, Universidade Técnica de Lisboa, Av. Rovisco Pais, 1049-001, Lisboa, Portugal
3 Departamento de Química e Engenharia Biológica, Instituto Superior Técnico, Universidade Técnica de Lisboa, Av. Rovisco Pais, 1049-001, Lisboa, Portugal
4 INESC ID, R. Alves Redol 9, 1000-029, Lisboa, Portugal
5 Departamento de Engenharia Electrotécnica e de Computadores, Instituto Superior Técnico, Universidade Técnica de Lisboa, Av. Rovisco Pais, 1049-001, Lisboa, Portugal
6 Departamento de Química e Bioquímica, Faculdade de Ciências, Universidade de Lisboa, 1749-016, Lisboa, Portugal

Abstract: An integrated electronic biochip platform for the detection of biomolecular recognition is described. It includes the detection module, where labeled target DNA/antibody molecules are magnetically arrayed towards immobilized probes (cDNA/antigen) and where DNA-cDNA hybridization or antibody-antigen interaction is detected. Magnetic nanobeads are used as labels for targets, and magnetic field sensors are used to detect bead presence. The present device holds 256 probe sites in a matrix containing one magnetic tunnel junction and one thin film PIN diode at each node. A microfluidics chamber and a credit card sized electronics board complete the microsystem. Diagnostic applications include the detection of cystic fibrosis related gene mutations (DNA chip) and the detection of Salmonella and Escherichia coli presence in water (immunoassay).

Key words: nanotechnology, nanobiotechnology, biochips, lab-on-a-chip, bioarrays, magnetic beads, magnetoresistive sensors, microfluidics.

1. INTRODUCTION

Biomolecular recognition has been playing an ever more important role in health care, the pharmaceutical industry, environmental analysis and broad
biotechnological applications. In particular, a great deal of effort is being
placed in developing high-performance and low-cost tools for the detection
of DNA-DNA hybridization in genetic disease diagnostics, mutation
detection or gene expression quantification, and for the detection of
antibody-antigen interaction in micro-organism identification and biological
warfare agent screening [1].
The idea behind a spintronic biochip or biosensor is to replace
traditionally used fluorescent markers by magnetic labels. Instead of
detecting biomolecular recognition using expensive optical or laser-based
fluorescence scanner systems, the detection is made by a spintronic element,
such as a magnetoresistive sensor, that senses the magnetic labels' stray field and provides a straightforward electronic signal at low cost. In addition,
since biological samples are usually non-magnetic, the background is greatly
diminished when compared to fluorescence methods. Other advantages of
the spintronic biochip are the fast response, high sensitivity and ease of
integration and automation, making it competitive in the analysis of a few hundred to a few thousand biological analytes (e.g. screening for mutations in
genetic diagnosis).
A typical spintronic biochip consists of an array of sensing elements
(such as magnetoresistive sensors); an array of probes (biomolecules of
known identity such as gene specific oligonucleotides or antibodies) that are
immobilized onto the surface of the sensors (through microspotting, or
electrical or magnetic arraying); a hybridization chamber (normally a
microfluidic channel arrangement); and an optional target arraying
mechanism (electric fields for charged molecules such as DNA or magnetic
field generating lines for magnetically labeled targets; see Fig. 1).
The targets (biomolecules to identify in a sample such as a DNA strand
complementary to the immobilized DNA probe, or antigens complementary
to the immobilized antibodies) are incubated with the chip for biomolecular
recognition to occur. They can be already magnetically labeled before or be
labeled after the recognition step. Magnetic labels are usually superparamagnetic or non-remanent ferromagnetic in nature, with nano- or micrometer dimensions, and can be attached to the target biomolecules.
Under an applied magnetic field these particles or beads acquire a moment
and their fringe field can induce a change in resistance of the spintronic
sensor, enabling biomolecular recognition detection.

Figure 1. Schematic of INESC-MN's spintronic biochip, which is composed of an array of spintronic transducers, an array of probe biomolecules immobilized onto the surface of the
sensors (in this case single-stranded DNA molecules are represented), and solutions of
magnetically labeled target biomolecules (DNA strands) that bind to the surface through
biomolecular recognition (DNA hybridization). At the side: the detection of biomolecular
recognition is achieved through the sensing of the magnetic stray field created by a magnetic
label using a spintronic transducer.

2. MAGNETIC NANOBEADS

In magnetoresistive bioassays, magnetic labels should comply with certain requisites: have a high saturation magnetization (made of materials
like Fe, Co, Ni and their alloys) so that the signal per particle is the
maximum possible; show material stability over time (like iron oxides); be
biocompatible and non-toxic (like iron oxides); be monodisperse and not cluster, i.e., be superparamagnetic; show low unspecific adsorption to undesired biomolecules and surfaces; and ideally, each particle should label or tag a single biomolecule. In addition, the material stability and biocompatibility requisites should apply to the encompassing matrix or coating as well.

The technology of magnetic particles for biosensing applications involves several fields of knowledge, namely, inorganic and organic chemistry,
materials science, and molecular biology. In fact, magnetic properties are as
important as suitable coating and biomolecule functionalization chemistries.

Table 1. Properties of several magnetic labels used in magnetoresistive biosensing platforms. Data were obtained by vibrating sample magnetometry at INESC-MN, unless indicated otherwise [2], [3]. ^a Magnetization per particle at an excitation field H of 1.2 kA/m. ^b Average susceptibility for 1 < |H| < 4 kA/m. ^c FeOx represents γ-Fe2O3 and Fe3O4; % values represent the magnetic content of the particles (data from supplier). ^d Magnetization and susceptibility values were taken from magnetization curves shown in [4]. ^e Magnetization values were estimated from data shown in [5] admitting a constant susceptibility from 0 to 40 kA/m.

Label            Manufacturer         Diameter (nm)   Magnetization (kA/m)^a   Susceptibility^b   Material^c
NiFe powder^d    Novamet              3300            5.00                     4.2                Ni70Fe30 (~100%)
Dynal M-280^d    Dynal Biotech        2800            0.40                     0.35               FeOx (17%)
Micromer-M       Micromod             2000            0.48                     0.22               FeOx (15%)
CM01N/7228^e     Bangs Laboratories    860            1.88                     1.57               FeOx (27.5%)
CM01N/7024^e     Bangs Laboratories    350            0.99                     0.825              FeOx (45.8%)
Nanomag-D        Micromod              250            20.10                    4.81               FeOx (75%)
Nanomag-D        Micromod              130            17.80                    4.44               FeOx (75%)
Nanomag-D-spio   Micromod              100            0.34                     0.28               FeOx (35%)
Nanomag-D-spio   Micromod               50            0.85                     0.71               FeOx (35%)

At INESC-MN, several particles with diameters ranging from 50 nm up to 2.8 µm were studied [2], [3]. Table 1 shows some of the magnetic properties of the labels tested by INESC-MN and other research laboratories. Fig. 2a) shows the magnetization curve for iron oxide particles of 250 nm in diameter. Fig. 2b) shows 2.8 µm diameter magnetite particles functionalized with Salmonella specific antibodies.
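The magnetization curve of such superparamagnetic beads is commonly described by a Langevin function. The sketch below is a minimal illustration of that behaviour, not the fitting routine used at INESC-MN; the particle moment and bead saturation magnetization passed as defaults are invented placeholder values, not data from Table 1.

```python
import numpy as np

MU0 = 4e-7 * np.pi          # vacuum permeability (T m/A)
KB = 1.380649e-23           # Boltzmann constant (J/K)

def langevin(x):
    """L(x) = coth(x) - 1/x, using the small-x expansion x/3 near zero."""
    x = np.asarray(x, dtype=float)
    small = np.abs(x) < 1e-6
    safe = np.where(small, 1.0, x)          # avoids 0/0 warnings
    return np.where(small, x / 3.0, 1.0 / np.tanh(safe) - 1.0 / safe)

def bead_magnetization(H, m_particle=1e-19, Ms=20.0, T=300.0):
    """Magnetization (kA/m) of an ensemble of superparamagnetic beads.

    H          : applied field (A/m)
    m_particle : moment of one nanoparticle core (A m^2) -- placeholder value
    Ms         : saturation magnetization of the bead (kA/m) -- placeholder value
    """
    x = MU0 * m_particle * np.asarray(H) / (KB * T)
    return Ms * langevin(x)

if __name__ == "__main__":
    H = np.linspace(-150e3, 150e3, 7)       # -150 to 150 kA/m, as in Fig. 2a)
    print(bead_magnetization(H))
```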

Figure 2. a) Magnetization vs. applied field (with Langevin fit) for 250 nm diameter magnetite labels. b) SEM image of 2.8 µm diameter Dynabeads (magnetite nanoparticles in a polymer matrix) functionalized with anti-Salmonella antibodies and bound to Salmonella cells.

3. MAGNETORESISTIVE TRANSDUCERS

Spin valve and tunnel junction sensors are being used for magnetic label
detection. Spin valve sensors were introduced in the early nineties, and are
used today as read elements in most hard disk drives.
[Figure 3 data: spin valve stack (top to bottom) TiW90(N2) 150 Å / Ta 30 Å / Mn74Ir26 100 Å (AF) / Co82Fe18 25 Å (P) / Cu 22 Å / Co82Fe18 20 Å (F) / Ni80Fe20 10 Å / Ta 70 Å; MR = 9.6%, resistance = 47 Ω, Hf = 9 Oe, Hc = 3 Oe; MR (%) plotted vs. magnetic field (Oe) from -150 to 150 Oe.]
Figure 3. Magnetoresistance vs. magnetic field for a top-pinned spin valve coupon sample.
(P) pinned, (F) free, and (AF) antiferromagnetic layers [6].

The spin valve sensor has essentially three active layers: a pinned reference ferromagnetic layer about 3 nm thick (CoFe), a Cu non-magnetic spacer, 2 nm thick, and a free soft ferromagnetic layer (NiFe or NiFe/CoFe)
3 to 4 nm thick. The electrical current flows in the plane of the layers. When
the magnetization of the two ferromagnetic layers is parallel, the spin valve
is in a low resistance state, and when the magnetization of the two layers is
antiparallel, the spin valve is in a high resistance state. The typical
magnetoresistance ratio (MR = (R⊥ − R∥)/R∥, where R∥ and R⊥ denote the parallel and antiparallel state resistances) for a spin valve is about 10%
(Fig. 3) [7].
When the bulk sample is microfabricated into a rectangular stripe with width W >> h, the height (W = 6 µm, h = 2 µm), the transfer curve becomes linear when sensing transverse in-plane applied fields. This is the sensor configuration used for a biochip application (see Fig. 4a). Fig. 4 also shows the transfer curve of an ion beam sputtered spin valve (Si/Al2O3 50 nm/Ta 2 nm/NiFe 2 nm/MnIr 6 nm/CoFe 2.5 nm/Cu 1.8 nm/CoFe 1 nm/NiFe 2.5 nm/Ta 5 nm) with a sensitivity of 0.3%/Oe in a ±15 Oe (or 1.2 kA/m) linear range.

Figure 4. a) Schematic of spin valve transducer geometry and sensing direction. b) Transfer curve for a 6 µm × 2 µm spin valve sensor used for biochip applications. Sense current is 5 mA.
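For readers who want to reproduce the sensitivity figure quoted above (0.3%/Oe over a ±15 Oe linear range), the sketch below shows one way to estimate the %/Oe slope from a measured transfer curve by a linear fit inside a chosen field window. The synthetic R(H) curve generated in the example is only a stand-in for real measurement data.

```python
import numpy as np

def sensitivity_percent_per_oe(H_oe, R_ohm, window_oe=15.0):
    """Fit MR(%) vs. field inside |H| <= window_oe and return the slope in %/Oe."""
    H = np.asarray(H_oe, dtype=float)
    R = np.asarray(R_ohm, dtype=float)
    R_min = R.min()                              # low-resistance state as reference
    mr_percent = 100.0 * (R - R_min) / R_min
    mask = np.abs(H) <= window_oe
    slope, _ = np.polyfit(H[mask], mr_percent[mask], 1)
    return abs(slope)

if __name__ == "__main__":
    # Synthetic transfer curve: 750 ohm sensor, 10% MR, linear over +/-15 Oe
    H = np.linspace(-50, 50, 201)
    R = 750.0 * (1.0 + 0.10 * np.clip((H + 15.0) / 30.0, 0.0, 1.0))
    print(f"sensitivity ~ {sensitivity_percent_per_oe(H, R):.2f} %/Oe")   # ~0.33 %/Oe
```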

For increased sensitivity, magnetic tunnel junctions were introduced in the late nineties, where the Cu spacer is replaced by a 1 nm thick insulating layer (amorphous AlOx, crystalline MgO) sandwiched between a pinned and
a free ferromagnetic electrode (like the spin valve). Here, the current flows
perpendicular to the layers. AlOx based tunnel junctions have maximum
tunneling magnetoresistance ratio (TMR) of 40% to 60%, and MgO based
tunnel junctions reach TMR values in excess of 300% [7].
Fig. 5 shows the square-loop response of an MgO based transducer with the structure glass/Ta 3 nm/CuN 30 nm/Ta 5 nm/PtMn 20 nm/CoFe 2.5 nm/Ru 0.7 nm/CoFeB 3 nm/MgO 1.2 nm/CoFeB 3 nm/Ta 5 nm.

[Figure 5 data: junction area 1 × 2 µm², R×A = 8.0 Ω·µm², TMR = 115%; resistance (Ω) and magnetoresistance (%) plotted vs. applied magnetic field (Oe) from -100 to 100 Oe.]
Figure 5. Minor-loop tunnel magnetoresistance (free layer reversal) for a 1 µm × 2 µm tunnel junction.

Fig. 6b), on the other hand, shows the linear response of an AlOx based
MTJ sensor with a sensitivity of 2 to 3%/Oe (about 10× higher than the
spin valve sensor), while Fig. 6a) shows the sensor geometry and sensing
direction of these devices when applied to the detection of magnetic labels.

Figure 6. a) Schematic of MTJ transducer geometry and sensing direction. b) Transfer curve for a 2 µm × 10 µm magnetic tunnel junction sensor based on an AlOx barrier. An external longitudinal 10 to 15 Oe field is used for sensor biasing and magnetic label magnetization.

For biosensing applications, not only must the field sensitivity be increased, but the sensor noise background and system noise must also be reduced. The thermal noise background for the spin valve and tunnel junction sensors being fabricated in our lab is near 1 nT/√Hz below 100 Hz [8]. This means that if system noise can be reduced to this level, single nanometer sized labels will be detectable under kA/m excitation fields.
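To put the 1 nT/√Hz figure in context, the following sketch converts a field noise floor into an equivalent rms voltage noise for a spin valve read with a given sense current, using the sensitivity, resistance and sense-current values quoted elsewhere in this chapter; the 100 Hz bandwidth is an assumed measurement setting.

```python
import math

def field_noise_to_voltage(noise_t_per_rthz, bandwidth_hz, sensitivity_pct_per_oe,
                           resistance_ohm, sense_current_a):
    """Convert a field noise floor (T/sqrt(Hz)) into an rms voltage noise (V)."""
    b_rms_tesla = noise_t_per_rthz * math.sqrt(bandwidth_hz)   # integrated field noise
    b_rms_oe = b_rms_tesla / 1e-4                              # 1 Oe corresponds to 0.1 mT
    dR_over_R = sensitivity_pct_per_oe / 100.0 * b_rms_oe
    return resistance_ohm * sense_current_a * dR_over_R

if __name__ == "__main__":
    # 1 nT/sqrt(Hz) floor, 100 Hz bandwidth (assumed), 0.3 %/Oe, 750 ohm, 1 mA sense current
    v_noise = field_noise_to_voltage(1e-9, 100.0, 0.3, 750.0, 1e-3)
    print(f"field-equivalent voltage noise ~ {v_noise*1e9:.0f} nV rms")
```

For these numbers the field-equivalent noise is a few hundred nV rms, well below the mV-range signals reported later in Fig. 12.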

4. BIOCHIP ARCHITECTURE

The first generations of biochips made at INESC MN/IST used arrays of linear spin valves (6 and 24 sensors) together with magnetic field generating lines (current lines), which were used to concentrate magnetically labeled biomolecules over the immobilized probes/detection sites [9], [10], [11], [12].
Fig. 7 shows one of the 24 (4 rows of 6 sensors each) detection cells of a
biochip using U-shaped spin valve sensors together with U-shaped current
lines for magnetic field assisted hybridization (see Fig. 1 and section 6).
U-shaped spin-valve sensors of 2.5 µm × 80 µm (full sensing length) were deposited on 3" Si/Al2O3 50 nm wafers by an ion beam deposition system with the structure Ta 2 nm/NiFe 3 nm/CoFe 2.5 nm/Cu 2.6 nm/CoFe 2.5 nm/MnIr 6 nm/Ta 3 nm/TiW(N2) 15 nm. The as-deposited spin valve coupon samples showed an MR of ~7.5% and, when patterned, the sensors showed an MR of 7.40 ± 0.06% (± represents the standard deviation), a sensitivity in the linear regime of 0.130 ± 0.005%/Oe, and a resistance of 750 ± 30 Ω. The spin-valves were defined inside U-shaped aluminum current line structures 300 nm thick, 10 µm wide and 120 µm in full length, with a spacing between the arms of the line of 17 µm, corresponding to an area of ~1000 µm² where magnetic labels were focused and detected (Fig. 1). Aluminum leads 300 nm thick were evaporated to contact the sensors to wire-bonding pads, and a 300 nm thick oxide layer (100 nm Al2O3 + 200 nm SiO2) was used to protect the chip against chemical corrosion and to provide a suitable surface for DNA probe functionalization. Individual chips containing an array of sensors and associated U-shaped lines were diced and wire-bonded to 40-pin chip carriers [12].

Figure 7. a) Photograph of a U-shaped spin valve sensor together with a U-shaped current line, and diagram of the spotted probe region. b) Layout of the 24 U-shaped sensing units (total chip area of 8 mm × 8 mm).

In order to increase the number of sensors and to make the present biochips fully scalable, a matrix-based biochip was designed and fabricated. The proposed basic cell consists of a thin-film amorphous silicon (a-Si:H) diode connected in series with an MTJ (see Fig. 8) [13].
The MTJ is used due to the flexibility in controlling MTJ resistance by
changing the barrier thickness and for the higher sensitivity when compared
with spin valve sensors, allowing the detection of smaller labels. The diode
was chosen rather than a three terminal device, such as a transistor, since
additional control lines are avoided.
This architecture was already used for MRAM devices [14]. In that case, the large diode dimensions (200 µm × 200 µm) needed to pass the required write currents through the diodes were the main reason preventing the use of this architecture for dense MRAMs. For biochip applications, this is no longer a major limitation, since probe sites have dimensions of a few hundred µm² (similar to the TFD dimensions), and the number of immobilized probes will not exceed a few hundred to a few thousand.

Figure 8. a) Photograph showing the recent 256 element bioarray, and b) a close-up of
diode/MTJ matrix elements.

5. BIOSENSOR INTEGRATED PLATFORM

In the present design, the biochip is further integrated into a credit card sized portable platform that incorporates all the electronics for addressing, reading out, sensing and controlling temperature, as well as a microfluidic chamber and controls. System control is done through a PDA via a wireless channel or a standard bus. In the 16 × 16 matrix prototype described above, each thin film diode (TFD) and the corresponding magnetic tunnel junction sensor are connected in series, and this series circuit is driven by a programmed current provided through a DAC.
A block diagram of the readout electronics is shown in Fig. 9. A current mirror circuit provides a current of equal value to two reference cells, Dr and Sr, placed at a specific location of the chip. The current flows through row and column multiplexers, according to the address of the cell, establishing a single closed circuit at a time. This allows the use of a single DAC and a single amplifier. The TFD has two main functions that correspond to two different modes of circuit operation: i) selecting single cells out of the matrix, by forward biasing the selected cell while all the others are reverse-biased; ii) controlling the temperature of the probe site to which the diode is allocated. For this role, the voltage-temperature characteristic of the diode is used for sensing the temperature of the diode site. These temperature sensors are calibrated through the microcontroller/Digital Signal Processor at setup time [15].
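The cell-selection logic can be summarized in a few lines of code. The sketch below is only an illustration of the row/column addressing idea (drive one forward-biased diode/MTJ series cell at a time, with all other cells reverse-biased), not the firmware of the actual board; the diode drop, drive current and MTJ resistance are invented example values.

```python
import numpy as np

N_ROWS, N_COLS = 16, 16
V_DIODE = 0.7          # assumed forward drop of the a-Si:H thin-film diode (V)
I_DRIVE = 1e-4         # programmed drive current from the DAC (A), assumed

# Assumed MTJ resistance map (ohm); in the real chip this changes with the label stray field
mtj_resistance = np.full((N_ROWS, N_COLS), 8.0e3)

def read_cell(row, col):
    """Voltage across the selected diode + MTJ series cell.

    Selecting (row, col) through the multiplexers forward-biases only this diode;
    all other cells stay reverse-biased and draw negligible current.
    """
    return V_DIODE + I_DRIVE * mtj_resistance[row, col]

def scan_matrix():
    """Address every cell once, as the row/column multiplexers would."""
    return np.array([[read_cell(r, c) for c in range(N_COLS)] for r in range(N_ROWS)])

if __name__ == "__main__":
    readings = scan_matrix()
    print(readings.shape, readings[0, 0])   # (16, 16) cells, ~1.5 V per cell here
```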
For measurement of the signal response of the magnetic tunnel junction sensor, an alternating magnetic field is created by a coil placed below the chip.

Figure 9. Electrical diagram and photograph of the data acquisition and control electronics board [15].

Figure 10. 3D model of the fluidics system, with two inlets for sample and washing fluids and one outlet. It also contains a rotary mechanical valve and a micropump, which transports all solutions to a microfluidics chamber located above the magnetoresistive biochip.

Fig. 10 shows a model of the fluidics system being fabricated. It contains a micropump that is controlled by the electronics board described above. The connection of the biochip to the fluidics device creates a 5 mm × 5 mm × 1 mm chamber over the biochip. The latter is mounted on a disposable PCB which connects to the electronics board.

6. BIOLOGICAL APPLICATIONS

6.1 Surface functionalization and probe immobilization

In order to use magnetoresistive devices in bioassays, they have to be first functionalized with the biomolecules of interest (e.g. probe DNA strands or antibody molecules). At INESC-MN, typical surface functionalization protocols involve 5 steps: activation; silanization; cross-linking; probe immobilization; and blocking (see Fig. 11).

Figure 11. Schematic of typical surface chemistry functionalization. It comprises 5 steps: activation; silanization; cross-linking; probe immobilization (here represented as an oligo); and blocking.

The functionalization or derivatization protocols followed are based on glass/silicon dioxide surfaces. Consequently, chip surfaces are passivated with a 200 nm thick SiO2 layer (see section 4). Although the chemistry is well known for these surfaces, care must be taken and mild conditions should be used whenever possible so as not to degrade the chip surface or the transducers and metal structures underneath.
The activation of the surface consists in the formation of reactive hydroxyl groups (-OH) at the surface. It is often unnecessary, as naturally occurring hydroxyl groups, formed in contact with moisture in the air, are enough for the subsequent steps. When necessary, the magnetoresistive chip is treated with a mild acid solution, such as cholic acid.
The next step is silanization, which is used to endow the surface with
reactive amino groups (-NH2) required for subsequent steps. Typically, a
trialkoxy silane derivative, such as 3-aminopropyltriethoxysilane (APTES),
is used in aqueous solution.
Cross-linking is used to enable the covalent binding of two distinct
chemical entities that are not reactive toward each other (e.g. amino and thiol
–SH groups). A cross-linker serves another important purpose: it provides a
physical spacer that gives a larger mobility and freedom to the immobilized
biomolecules. This greater accessibility is important to facilitate
biomolecular recognition. Typically, at INESC-MN, hetero-bifunctional
cross-linkers such as sulfo-EMCS are used. The spacer molecule binds the
amino groups at the surface and leaves thiol groups available to react further.
Probe immobilization can be done using several methods depending on
surface chemistry and the nature of the biomolecule to be immobilized. In
the case of the studied nucleic acid chips, 3’-end thiolated DNA
oligonucleotide strands were used and immobilized to the chip surface.
Finally, in order to reduce unspecific binding during the assays, a blocking step before hybridization is done by incubating the probe-functionalized chip with bovine serum albumin (BSA). This protein binds to the unreacted functional groups (hydroxyl, amino, and cross-linker molecules) at the chip surface, preventing target molecules from binding unspecifically.
Similar chemical protocols may also be used for the magnetic labeling of
target molecules, provided magnetic particles are prepared with a suitable
surface. Nevertheless, at INESC-MN the magnetic labeling of target DNA
strands was achieved by incubating 3’-end biotinylated targets with
streptavidin-coated magnetic carriers (biotin and streptavidin bind with a
high affinity). This latter protocol was used in cystic fibrosis magnetic field
assisted hybridization experiments.

6.2 Cystic fibrosis gene mutation detection

Spintronic biochip platforms have been used for the detection of biomolecular recognition in binding models such as biotin-streptavidin [16], immunoglobulin G–protein A, and DNA-cDNA (cystic fibrosis related) [1, 2, 10]; in the development of applications for the detection of biological warfare agents [17], [18]; and more recently in the detection of cells from pathogenic microorganisms [19].

In bioassay experiments, typically, the functionalized magnetoresistive chips are incubated with biotinylated targets. These then must diffuse in
solution until finding the complementary probe biomolecules at the chip
surface. Consequently, to achieve detectable biomolecular recognition
signals, targets are left to diffuse passively for several hours, usually
overnight. Afterwards, the chip is interrogated with streptavidin coated
magnetic particles that will bind to the available biotin molecules where
recognition (e.g. DNA hybridization) occurred [1, 2, 5, 17, 18].
Unlike other research groups worldwide, INESC-MN's approach is unique in the sense that the detection of magnetic labels and biomolecular recognition is coupled to the transport of magnetically labeled target molecules, as described in Fig. 1.
The main advantage of using on-chip transport systems is that diffusion
limitations are overcome by attracting bio-functionalized magnetic particles
to magnetic field generating line structures, like the U-shaped current line
described above (Fig. 7). Consequently, a high number of biomolecular recognition events can be achieved on minute time scales. This was
demonstrated for the hybridization of magnetically labeled cystic fibrosis
related DNA targets to complementary surface bound probes. In these cases,
hybridization was promoted and detected almost in real-time in 5 to 30 min
[10, 11, 12].
The described U-platform was then used to detect the target
oligonucleotides whose sequence is specific for genes that were found to be
either over or sub-expressed in cystic fibrosis related cell lines in comparison
with normal tissues [20].
Real-time alternating current (ac) detection signals for 16 sensors were
obtained using 1 mA sense currents, and an external in-plane ac excitation
field of 13.5 Oe rms at a frequency of 30 Hz together with a direct current
(dc) bias component of 24 Oe. In addition, currents of 25 to 40 mA rms at a
frequency of 0.2 Hz were sent through the U-shaped lines for attracting the
magnetic labels towards the U-shaped sensors [12].
The experiments usually proceeded as follows. After addition of the magnetically labeled target biomolecules, a current of 40 mA rms at 0.2 Hz was applied to the U-shaped current line for 20 min to focus the particles. Then, after turning off the current in the lines, the labels were left for 10
minutes to settle down onto the surface to further promote hybridization
between the targets and the immobilized probes. Thus, the total
hybridization time was only 30 min. The chip was then washed to remove
unbound labels; and finally, it was further washed with a higher stringency
buffer solution to remove unspecifically or weakly interacting targets.
Fig. 12 shows the results obtained for a multiprobe experiment, where 7 sensors were functionalized with 50 nucleotide long DNA strands related to the up-regulated rpl29 gene; 6 sensors were functionalized with down-regulated asah 50-mer probes; and 3 sensors were used as background
references. The chip was tested with rpl29 DNA targets labeled with 250 nm
magnetic particles in a total target concentration of 80 nM.

[Figure 12 data: bar chart of ΔV (mV rms, 0 to 1.4) at saturation, wash 1 and wash 2 for the rpl29 probe (complementary), the asah probe (non-complementary) and the no-probe (background) sensing sites.]

Figure 12. Magnetic field assisted hybridization and simultaneous detection experiments using U-shaped current lines and 2.5 µm × 80 µm U-shaped spin valves. Spintronic biochips were functionalized with 50-mer single-stranded DNA molecules that correspond to genes that were found to be up-regulated (rpl29) or down-regulated (asah) in cystic fibrosis related cell lines vs. healthy cell lines, and were tested with rpl29 magnetically labeled targets. Background-sensing sites were also included. Here, saturation represents the sensor responses to labels just before washing the chip, and wash 1 and wash 2 represent increasing stringency washing buffers to remove unspecifically and weakly bound labels [12].

It can be observed that, under the referred conditions, the ratio of complementary to non-complementary signals is about 7 to 10, meaning that this system is able to discriminate well between distinct DNA targets and has sufficient dynamic range to be used in gene expression analysis. Furthermore, it was observed that there is no significant difference between the non-complementary and background signals, showing that unspecific binding is relatively small. As a note, the complementary binding signals are about 25% of the saturation signal and correspond to about 250 nanoparticles over the sensor [21] or a maximum of 20,000 detected DNA hybridization events [1].

Figure 13. Optical micrograph of sensing sites interrogated with magnetically labeled targets
that are a) complementary or b) non-complementary to the surface immobilized probes [12].

The corresponding optical pictures for sensing sites functionalized with DNA probes complementary and non-complementary to the targets are shown in Fig. 13. The complementary site is covered with bound 250 nm nanoparticles, whereas the non-complementary site has very few or a negligible amount of these labels.
Finally, additional studies concerning the detection of mutations in the Cystic Fibrosis Transmembrane Conductance Regulator (CFTR) gene were done at INESC-MN using the fabricated spintronic biochip devices and can be found elsewhere [1], [2], [10].

6.3 Salmonella detection

The same U-shaped sensor and current line platform was used for the
detection of Salmonella cells. Here though, the surface was treated and
functionalized with antibody molecules against Salmonella epitopes at the
cell’s surface. Subsequently, the immuno-biochips were incubated with a
solution containing the pathogen, which if present bounds to the surface. The
chip is further interrogated with magnetic labels, which are coated with the
same antibody that recognizes the microorganism. This “sandwich”-like
structure then enables the detection of the pathogen using magnetoresistive
sensors located underneath (see Fig. 14).

Figure 14. Scheme of magnetoresistive immunoassay for the detection of pathogenic
microorganisms.

Fig. 15 shows the results of an experiment where two different chips were functionalized with antibodies against Salmonella or Escherichia coli pathogens. The chips were then incubated with a solution containing
Salmonella cells before being interrogated with magnetically labeled anti-
Salmonella antibodies. The results shown are for single sensor
measurements using operational conditions similar to those described above
for the detection of DNA hybridization, and more details can be found in
[19].
The goal here is to develop a tool for the analysis and monitoring of
water quality in an almost real-time scheme.

Figure 15. Real-time detection signals obtained with magnetoresistive immuno-chips. Chips
were functionalized with antibodies against a) Salmonella or b) Escherichia coli; incubated
with Salmonella cells and then interrogated with magnetic labels coated with anti-Salmonella
antibodies [19].

7. CONCLUSIONS

Magnetoresistive transducers show great potential for integration in novel biosensors and biochips. Their unique features, such as the magnetic field transduction mechanism with electronic readout, fast response, high sensitivity, scalability, automation and easy CMOS process integration, make them versatile enough to be tailored to the desired biological
applications, in areas from biomedicine and biotechnology to food and
environmental analysis.
In addition, the combination with on-chip magnetic transport systems
enables the detection of minute amounts of target biomolecules in a
reasonable time frame [22]. This further opens a window for the
development of devices to be used in point-of-care settings and in
applications where short response times are needed.
As a result, interest from laboratories and companies, and research in this field of spintronics, is continuously increasing.

REFERENCES
1. D.L. Graham, H.A. Ferreira, and P.P. Freitas, Magnetoresistance-based biosensors and
biochips. Trends in Biotechnology, 22, 455-462, 2004.
2. P.P. Freitas, H.A. Ferreira, D.L. Graham, L.A. Clarke, M.D. Amaral, V. Martins, L.
Fonseca, and J.S. Cabral, Magnetoresistive DNA chips. In Magnetoelectronics, M.
Johnson (Ed.). Academic Press, New York, 2004.
3. H.A. Ferreira, D.L. Graham, P.P. Freitas, and J.M.S. Cabral, Biodetection using
magnetically labeled biomolecules and arrays of spin valve sensors. Journal of Applied
Physics, 93, 7281-7286, 2003.
4. J.C. Rife, M.M. Miller, P.E. Sheehan, C.R. Tamanaha, M. Tondra, and L.J. Whitman,
Design and performance of GMR sensors for the detection of magnetic microbeads in
biosensors. Sensors and Actuators A, 107, 209-218, 2003.
5. J. Schotter, P.B. Kamp, A. Becker, A. Pühler, G. Reiss, and H. Brückl, Comparison of a
prototype magnetoresistive biosensor to standard fluorescent DNA detection. Biosensors
and Bioelectronics, 19, 1149-1156, 2004.
6. A. Guedes, M.J. Mendes, P.P. Freitas, and J.L. Martins, Study of synthetic ferromagnet-
synthetic antiferromagnet structures for magnetic sensor application. Journal of Applied
Physics, 99, 08B703, 2006.
7. P. P. Freitas, H. A. Ferreira, R. Ferreira, S. Cardoso, S. van Dijken, and J. Gregg,
Nanostructures for spin electronics. In Advanced Magnetic Nanostructures, D. Sellmyer,
and R. Skomski, (Eds.), Springer, Berlin, 2006.
8. J.M. Almeida, R. Ferreira, P.P. Freitas, J. Langer, B. Ocker, and W. Maass, 1/f noise in
linearized low resistance MgO magnetic tunnel junctions. Journal of Applied Physics,
99, 08B314, 2006.
9. D. L. Graham, H. Ferreira, J. Bernardo, P. P. Freitas, and J. M. S. Cabral, Single magnetic microsphere placement and detection on-chip using current line designs with
integrated spin valve sensors: biotechnological applications. Journal of Applied Physics,
91, 7786-7788, 2002.
10. D.L. Graham, H.A. Ferreira, N. Feliciano, P.P. Freitas, L.A. Clarke, and M.D. Amaral,
Magnetic field-assisted DNA hybridisation and simultaneous detection using micron-
sized spin-valve sensors and magnetic nanoparticles. Sensors and Actuators B, 107, 936-
944, 2005.
11. H. A. Ferreira, N. Feliciano, D. L. Graham, L. A. Clarke, M. D. Amaral, and P. P.
Freitas, Rapid DNA hybridization based on AC field focusing of magnetically-labeled
target DNA. Applied Physics Letters, 87, 013901, 2005.
12. H.A. Ferreira, D.L. Graham, N. Feliciano, L.A. Clarke, M.D. Amaral, and P.P. Freitas,
Detection of cystic fibrosis related DNA targets using AC field focusing of magnetic
labels and spin-valve sensors. IEEE Transactions on Magnetics, 41, 4140-4142, 2005.
13. F.A. Cardoso, H.A. Ferreira, J.P. Conde, V. Chu, P.P. Freitas, D. Vidal, J. Germano, L.
Sousa, M.S. Piedade, B. Andrade and J.M. Lemos, Diode/magnetic tunnel junction for
fully scalable matrix-based biochip. Journal of Applied Physics, 99, 08B307, 2006.
14. R.C. Sousa, P.P. Freitas, V. Chu, and J.P. Conde, Vertical integration of a spin
dependent tunnel junction with an amorphous Si diode for MRAM application. IEEE
Transactions on Magnetics, 35, 2832-2834, 1999.
15. M. Piedade, L. Sousa, J. Germano, J. Lemos, B. Costa, P. Freitas, H. Ferreira, F.
Cardoso, and D. Vidal, Architecture of a portable system based on a biochip for DNA
recognition. Proceedings of the XX Conference on Design of Circuits and Integrated
Systems (DCIS), 23-25 November, Lisboa, Portugal, 2005.
16. D.L. Graham, H.A. Ferreira, P.P. Freitas, and J.M.S. Cabral, High sensitivity detection
of molecular recognition using magnetically labelled biomolecules and magnetoresistive
sensors. Biosensors and Bioelectronics, 18, 483-488, 2003.
17. R.L. Edelstein, C.R. Tamanaha, P.E. Sheehan, M.M. Miller, D.R. Baselt, L.J. Whitman,
and R.J. Colton, The BARC biosensor applied to the detection of biological warfare
agents. Biosensors and Bioelectronics, 14, 805-813, 2000.
18. M.M. Miller, P.E. Sheehan, R.L. Edelstein, C.R. Tamanaha, L. Zhong, S. Bounnak, L.J.
Whitman, and R.J. Colton, A DNA array sensor utilizing magnetic microbeads and
magnetoelectronic detection. Journal of Magnetism and Magnetic Materials, 225, 138-
144, 2001.
19. V. C. B. Martins, L. P. Fonseca, H. A. Ferreira, D. L. Graham, P. P. Freitas, and J. M. S.
Cabral, Use of magnetoresistive biochips for monitoring of pathogenic microorganisms
in water through bioprobes: oligonucleotides and antibodies. Technical Proceedings of
the 2005 NSTI Nanotechnology Conference and Trade Show, 8-12 May 2005, Anaheim,
California, USA, 1, chapter 8: Bio Micro Sensors, 493-496, 2005.
20. L.A. Clarke, C. Braz, and M.D. Amaral, Cystic fibrosis-related patterns of gene
expression: a genome-wide microarray approach. Pediatric Pulmonology, 38,
supplement 27, 219, 2004.
21. H.A. Ferreira, F.A. Cardoso, R. Ferreira, S. Cardoso, and P. P. Freitas, Magnetoresistive
DNA-chips based on ac field focusing of magnetic labels. Journal of Applied Physics,
99, 08P105, 2006.
22. P.E. Sheehan, and L.J. Whitman, Detection limits for nanoscale biosensors. Nano
Letters, 5, 803-807, 2005.
FINANCIAL ECONOMETRIC MODELS
Some Contributions to the Field

João Nicolau
Instituto Superior de Economia e Gestão, Universidade Técnica de Lisboa, Rua do Quelhas 6,
1200-781 Lisboa, Portugal, nicolau@iseg.utl.pt

Abstract: Four recent financial econometric models are discussed. The first aims to
capture the volatility created by “chartists”; the second intends to model
bounded random walks; the third involves a mechanism where the stationarity
is volatility-induced, and the last one accommodates nonstationary integrated diffusion processes that can be made stationary by differencing.

Key words: ARCH models, diffusion processes, bounded random walk, volatility-induced
stationarity, second order stochastic differential equations.

1. INTRODUCTION

1.1 The objective and scope of this work

This paper reflects some of our recent contributions to the state of the art in financial econometrics. We have selected four main contributions in this field. We also briefly refer to some contributions to the estimation of stochastic differential equations, although the emphasis of this chapter is on the specification of financial econometric models. We give the motivation behind the models and refer the more technical details to the original papers. The structure of this chapter is as follows. In section 1.2 we review some general properties of returns and prices. In section 2 we mention
a model that aims to capture the volatility created by “chartists”. This is done
in a discrete-time setting in the context of ARCH models; also a continuous-
time version is provided. In section 3 we present three diffusion processes,
with different purposes. The first one intends to model bounded random
walks; the idea is to model stationary processes with random walk
behaviour. In the second one we discuss processes where the stationarity is volatility-induced. This is applicable to every time series where reversion
effects occur mainly in periods of high volatility. In the last one, we focus on
a second order stochastic differential equation. This process accommodates
nonstationary integrated stochastic processes that can be made stationary by
differencing. Also, the model suggests directly modelling the (instantaneous)
returns, contrary to usual continuous-time models in finance, which model
the prices directly.

1.2 Prices, returns and stylized facts

An important step in forming an econometric model consists in studying the main features of the data. In financial econometrics two of the most important variables are prices and returns (volatility is also fundamental and we shall go back to it later). Prices include, for example, stock prices, stock indices, exchange rates and interest rates. If we collect daily data, the price is usually some type of closing price. It may be a bid price, an ask price or an average. It may be either the final transaction price of the day or the final quotation. In discrete time analysis, researchers usually prefer working with returns, which can be defined by changes in the logarithms of prices (with appropriate adjustments for any dividend payments). Let $P_t$ be a representative price for a stock (or stock index, exchange rate, etc.). The return $r_t$ at time $t$ is defined as $r_t = \log P_t - \log P_{t-1}$.
General properties (stylized facts) are well known for daily returns observed over a few years of prices. The most significant are:
- The (unconditional) distribution of $r_t$ is leptokurtic and in some cases (for stock prices and indices) asymmetric;
- The correlation between returns is absent or very weak;
- The correlations between the magnitudes of returns on nearby days are positive and statistically significant.
These features can be explained by changes through time in volatility. Volatility clustering is a typical phenomenon in financial time series. As noted by Mandelbrot [19], “large changes tend to be followed by large changes, of either sign, and small changes tend to be followed by small changes.” A measurement of this fact is that, while returns themselves are uncorrelated, absolute returns $|r_t|$ or their squares display a positive, significant and slowly decaying autocorrelation function: $\mathrm{Corr}(|r_t|, |r_{t-\tau}|) > 0$ for $\tau$ ranging from a few minutes to several weeks. Periods of high volatility lead to extreme values (and thus to a leptokurtic distribution). Figure 1 shows a typical time series of returns. Any econometric model for returns should capture these general features of financial time series data.
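These stylized facts are easy to check numerically. The sketch below computes log returns from a price series and compares the first-order autocorrelation of the returns with that of their absolute values; the geometric random walk generated here is only a stand-in for real data such as the series in Figure 1 (for this Gaussian stand-in both autocorrelations are close to zero, whereas for actual daily returns, or for the ARCH simulation in section 2.1, the second one stays positive).

```python
import numpy as np

def log_returns(prices):
    """r_t = log P_t - log P_{t-1}."""
    prices = np.asarray(prices, dtype=float)
    return np.diff(np.log(prices))

def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return np.dot(x[:-lag], x[lag:]) / np.dot(x, x)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in price series: geometric random walk (real data would replace this)
    prices = 100 * np.exp(np.cumsum(0.01 * rng.standard_normal(2000)))
    r = log_returns(prices)
    print("corr(r_t, r_{t-1})     ~", round(autocorr(r, 1), 3))
    print("corr(|r_t|, |r_{t-1}|) ~", round(autocorr(np.abs(r), 1), 3))
```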

The statistical features of prices are not so obvious. In general, most of the series contain a clear trend (e.g. stock prices when observed over several years), while others show no particular tendency to increase or decrease (e.g. exchange rates). Shocks to a series tend to display a high degree of
persistence. For example, the Federal Funds Rate experienced a strong
upwards surge in 1973 and remained at the high level for nearly two years.
Also, the volatility of interest rates seems to be persistent. We will resume
some of these features in section 3.


Figure 1. Microsoft daily returns from 1986 to 2006

2. DISCRETE-TIME MODELS

2.1 The ARCH family

In a seminal paper, Engle [13] introduced the so-called autoregressive conditional heteroskedasticity model. These models have proven to be extremely useful in modelling financial time series. Also, they have been used in several applications (forecasting volatility, CAPM, VaR, etc.). The ARCH(1) is the simplest example of an ARCH process. One assumes that the distribution of the return for period $t$, given past information, is

$r_t \mid F_{t-1} \sim D(\mu_t, \sigma_t^2)$    (1)

where $D$ is the conditional distribution, $\mu_t$ is the conditional mean and

$\sigma_t^2 = \omega + \alpha (r_{t-1} - \mu_{t-1})^2, \qquad (\omega > 0,\ \alpha \ge 0)$    (2)

is the conditional variance. A large error in period $t-1$ (that is, a high value for $(r_{t-1} - \mu_{t-1})^2$) implies a high value for the conditional variance in the next period. Generally, $\mu_{t-1}$ is a weak component of the model, since it is difficult to predict the return $r_{t-1}$ based on an $F_{t-2}$-measurable stochastic process $\mu_{t-1}$. In many cases it is a positive constant. Thus, either a large positive or a large negative return in period $t-1$ implies higher than average volatility in the next period; conversely, returns close to the mean imply lower than average volatility. The term autoregressive (from ARCH) comes from the fact that the squared errors follow an autoregressive process. In fact, from $\sigma_t^2 = \omega + \alpha u_{t-1}^2$, where $u_{t-1} = r_{t-1} - \mu_{t-1}$, we have

$\sigma_t^2 - u_t^2 = \omega + \alpha u_{t-1}^2 - u_t^2$
$u_t^2 = \omega + \alpha u_{t-1}^2 + v_t, \qquad v_t = u_t^2 - \sigma_t^2$    (3)

and since $v_t$ is a martingale difference (by construction, assuming $E[|v_t|] < \infty$) one concludes that $u_t^2$ is an autoregressive process of order one. There are a great number of ARCH specifications and many of them have their own acronyms, such as GARCH, EGARCH, MARCH, AARCH, etc.
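A minimal simulation makes the volatility clustering implied by (1)-(2) visible. The sketch below simulates an ARCH(1) process with a constant (zero) conditional mean; the parameter values are arbitrary illustrations, not estimates.

```python
import numpy as np

def simulate_arch1(n, omega=0.05, alpha=0.5, mu=0.0, seed=1):
    """Simulate r_t = mu + u_t with u_t = sigma_t * eps_t and
    sigma_t^2 = omega + alpha * u_{t-1}^2 (ARCH(1))."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n)
    u = np.zeros(n)
    sigma2 = np.zeros(n)
    sigma2[0] = omega / (1.0 - alpha)          # start at the unconditional variance
    u[0] = np.sqrt(sigma2[0]) * eps[0]
    for t in range(1, n):
        sigma2[t] = omega + alpha * u[t - 1] ** 2
        u[t] = np.sqrt(sigma2[t]) * eps[t]
    return mu + u, np.sqrt(sigma2)

if __name__ == "__main__":
    r, sigma = simulate_arch1(5000)
    kurt = float(np.mean((r - r.mean()) ** 4) / np.var(r) ** 2)
    print("sample kurtosis of r:", round(kurt, 2))   # well above 3, i.e. leptokurtic
```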

2.2 One more ARCH model – the Trend-GARCH

2.2.1 Motivation

In recent literature a number of heterogeneous agent models have been developed based on the new paradigm of behavioural economics,
behavioural finance and bounded rationality (see [17] for a survey on this
subject). Basically, most models in finance distinguish between sophisticated
traders and technical traders or chartists. Sophisticated traders, such as
fundamentalists or rational arbitrageurs tend to push prices in the directions
of the rational expectation fundamental value and thus act as a stabilising
force. Chartists base their decisions mainly on statistics generated by market
activity such as past prices and volume. Technical analysts do not attempt to
measure the intrinsic value of a security; instead they look for patterns and
indicators on stock charts that will determine a stock's future performance.
Thus, there is the belief that securities move in very predictable trends and
patterns.
As De Long et al. [11] recognise, this activity can limit the willingness of
fundamentalists to take positions against noise traders (chartists). In fact, if
noise traders today are pessimists and the price is low, a fundamentalist with
a short time horizon buying this asset can suffer a loss if noise traders
become even more pessimistic. Conversely, a fundamentalist selling an asset
short when the price is high can lose money if noise traders become more
bullish in the near future. "Noise traders thus create their own space. [...]
Arbitrage does not eliminate the effect of noise because noise itself creates
risk" (De Long et al., [11]). As a consequence, technical traders or chartists,
such as feedback traders and trend extrapolators tend to push prices away
from the fundamental and thus act as a destabilising force, creating volatility.
Based on these ideas, Nicolau [26] proposed an econometric model, in a
discrete and continuous-time setting, based on a technical trading rule to
measure and capture the increase of volatility created by chartists.

2.2.2 The Trend-GARCH

In order to derive the model we now focus more closely on a buy-sell rule used by chartists. One of the most widely used technical rules is based on the moving average rule. According to this rule, buy and sell signals are
on the moving average rule. According to this rule, buy and sell signals are
generated by two moving averages of the price level: a long-period average
and a short-period average. A typical moving average trading rule prescribes
a buy (sell) when the short-period moving average crosses the long-period
moving average from below (above) (i.e. when the original time series is
rising (falling) relatively fast). As can be seen, the moving average rule is
essentially a trend following system because when prices are rising (falling),
the short-period average tends to have larger (lower) values than the long-
period average, signalling a long (short) position.
Hence, the higher the difference between these two moving averages, the
stronger the signal to buy or sell would be and, at the same time, the more
chartists detect the buy or sell signals. As a consequence, a movement in the
price and in the volatility must, in principle, be expected, whenever a trend is
supposed to be initiated. How to incorporate this information in the
specification of the conditional variance is explained below. To simplify, we
assume (as others) that the short-period moving average is just the current
(or latest) market price and the long-period one is an exponentially weighted
moving average (EWMA), which is also an adaptive expectation of the
market price. In this formulation, the excess demand function of noise traders can be given as a function of $\log S_t - m_t$:

$q_t = f(\log S_t - m_t), \qquad f'(x) > 0$    (4)

where $S_t$ denotes the market price and $m_t$ is the long-period moving average, represented here as an EWMA,

$m_t = \lambda m_{t-1} + (1-\lambda)\,\log S_{t-1}, \qquad 0 \le \lambda < 1.$    (5)

The derivative of $f$ (see equation (4)) is positive since, the higher the quantity $\log S_t - m_t > 0$, the stronger the signal to buy would be. Conversely, the lower the quantity $\log S_t - m_t < 0$, the stronger the signal to sell would be.
Based on these ideas and on Bauer [7], Nicolau [26] proposes the following model, inspired by the GARCH(1,1) specification:

$r_t = \mu_t + u_t, \qquad u_t = \sigma_t \varepsilon_t$
$\sigma_t^2 = \omega + \alpha u_{t-1}^2 + \beta \sigma_{t-1}^2 + \gamma (\log S_{t-1} - m_{t-1})^2, \qquad \alpha \ge 0,\ \beta \ge 0,\ \gamma \ge 0$    (6)
$m_t = \lambda m_{t-1} + (1-\lambda)\,\log S_{t-1}, \qquad 0 \le \lambda < 1$

where $r_t$ is the log return, $\mu_t$ is the conditional mean, and $\{\varepsilon_t\}$ is assumed to be a sequence of i.i.d. random variables with $E[\varepsilon_t] = 0$ and $\mathrm{Var}[\varepsilon_t] = 1$. The conditional variance $\sigma_t^2$ incorporates a measure of chartists' trading activity, through the term $(\log S_{t-1} - m_{t-1})^2$.
We present some properties of this model. Suppose that $S_0 = 1$. Thus,

$\log S_t = \log S_t - \log S_0 = \sum_{i=1}^{t} r_i$    (7)

On the other hand, the EWMA process has the following solution:

$m_t = m_0 \lambda^t + (1-\lambda) \sum_{k=1}^{t} \lambda^{t-k} \log S_{k-1}$    (8)

Combining equations (7) and (8), and assuming $m_0 = 0$, we have, after some simplifications,

$\log S_t - m_t = \sum_{i=1}^{t} r_i - (1-\lambda) \sum_{k=1}^{t} \lambda^{t-k} \log S_{k-1} = \sum_{i=1}^{t} \lambda^{t-i} r_i$    (9)

If the sequence $\{r_i\}$ displays very weak dependence, one can assume $\mu_t = 0$, that is, $r_t = u_t$. In this case, we have

V t2 Z  Du t21  EV t21  J log S t 1  mt 1 2


2
§ t · (10)
Z D u t21  EV 2
t 1  J ¨¨ ¦ Ot i u i ¸¸ .
©i 1 ¹

The model explicitly involves the idea of the moving average rule, which we incorporate through the equation m_t = λ m_{t−1} + (1 − λ) log S_{t−1}. This moving average representation greatly facilitates the estimation of the model and the study of the stationarity conditions. The expression Σ_{i=1}^{t} λ^{t−i} r_i can be understood as a trend component, which approximately measures trend estimates in technical trading models. When the most recent returns have the same sign, that is, when (log S_{t−1} − m_{t−1})² = (Σ_{i=1}^{t−1} λ^{t−1−i} r_i)² is high, chartists see a general direction of the price (that is, a trend), which is generally classified as an uptrend or downtrend. In these cases, chartists increase their activity in the market, buying and selling and thus increasing volatility. On the other hand, when the trend is classified as rangebound, the price swings back and forth for some periods and, as a consequence, the quantity (Σ_{i=1}^{t} λ^{t−i} r_i)² is low (the positive returns tend to compensate the negative ones). In this case, there is much less trading activity by chartists, and the volatility associated with them is low.
It can be proved, under the conditions α ≥ 0, β ≥ 0, γ ≥ 0, 0 ≤ λ < 1 and {ε_t} a sequence of i.i.d. random variables with E[ε_t] = 0 and Var[ε_t] = 1, that the process {u_t} is covariance-stationary if and only if (1 − λ²)(1 − α − β) > γ. Conditions for the existence of a unique strictly stationary solution are also studied in Nicolau [26]. The stationarity makes sense because uptrends or downtrends cannot persist over time.
To assess the mean duration of a trend component, it could be interesting to calculate the speed of adjustment around zero. The higher the parameter λ < 1, the lower the speed of reversion. A useful indicator of the speed of adjustment is the so-called half-life indicator, which, in our case, is given by the expression log(1/2)/log λ.
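To make these recursions and indicators concrete, the following minimal sketch simulates model (6) and evaluates the half-life and the covariance-stationarity condition just stated. The parameter values are illustrative placeholders only (they are not estimates from Nicolau [26]), and the conditional mean μ_t is set to zero.

# Minimal simulation sketch of the Trend-GARCH recursions in equation (6).
# Parameter values are illustrative only, not estimates from the paper.
import numpy as np

rng = np.random.default_rng(0)
T = 2000
omega, alpha, beta, gamma, lam = 1e-6, 0.05, 0.90, 0.002, 0.95

r = np.zeros(T)                                   # log returns (mu_t = 0, so r_t = u_t)
logS = np.zeros(T)                                # log prices, log S_0 = 0
m = np.zeros(T)                                   # EWMA of log prices, m_0 = 0
sig2 = np.full(T, omega / (1.0 - alpha - beta))   # crude starting level for sigma_t^2

for t in range(1, T):
    trend = (logS[t - 1] - m[t - 1]) ** 2         # chartist term (log S_{t-1} - m_{t-1})^2
    sig2[t] = omega + alpha * r[t - 1] ** 2 + beta * sig2[t - 1] + gamma * trend
    r[t] = np.sqrt(sig2[t]) * rng.standard_normal()
    logS[t] = logS[t - 1] + r[t]
    m[t] = lam * m[t - 1] + (1.0 - lam) * logS[t - 1]

half_life = np.log(0.5) / np.log(lam)                          # log(1/2)/log(lambda)
stationary = (1.0 - lam**2) * (1.0 - alpha - beta) > gamma     # covariance-stationarity check
print(f"half-life = {half_life:.1f} periods, covariance-stationary: {stationary}")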
Estimation of model (6) is straightforward. One can use, for example, pseudo maximum likelihood based on the normal distribution. A null hypothesis of interest is whether the term (log S_{t−1} − m_{t−1})² enters the specification of the conditional variance, that is, H₀: γ = 0. Under this hypothesis, λ is not identified, that is, the likelihood function does not depend on λ and the asymptotic information matrix is singular. One simple approach consists of considering Davies's bound for the case in which q parameters are identified only under the alternative hypothesis (see Nicolau [26]). An empirical illustration is provided in Nicolau [26]. Also, when the length of the discrete-time intervals between observations goes to zero, it is shown that, in
some conditions, the discrete-time process converges in distribution to the solution of the diffusion process

    dX_t = (c + φ X_t) dt + √(ω + γ (X_t − μ_t)²) dW_{1,t},   ω > 0, γ > 0        (11)
    dμ_t = θ (X_t − μ_t) dt + σ dW_{2,t},   θ ≥ 0, σ > 0.

3. CONTINUOUS-TIME MODELS

3.1 A bounded random walk process

3.1.1 Motivation

Some economic and financial time series can behave just like a random
walk (RW) (with some volatility patterns) but due to economic reasons they
are bounded processes (in probability, for instance) and even stationary
processes. As discussed in Nicolau [21] (and references therein) this can be the case, for example, of interest rates, real exchange rates, some nominal exchange rates and unemployment rates, among other series. To build a
model with such features it is necessary to allow RW behaviour during most
of the time but force mean reversions whenever the processes try to escape
from some interval. The aim is to design a model that can generate paths
with the following features: as long as the process is in the interval of
moderate values, the process basically looks like a RW but there are
reversion effects towards the interval of moderate values whenever the
process reaches some high or low values. As we will see, these processes can admit, depending on the parameters, stationary distributions, so we come to an interesting conclusion: processes that are almost indistinguishable from the RW process can, in effect, be stationary with stationary distributions.

3.1.2 The model

If a process is a random walk, the function E[ΔX_t | X_{t−1} = x] (where ΔX_t = X_t − X_{t−1}) must be zero (for all x). On the other hand, if a process is bounded (in probability) and mean-reverting to τ (say), the function E[ΔX_t | X_{t−1} = x] must be positive if x is below τ and negative if x is above τ.
Now consider a process that is bounded but behaves like a RW. What kind of function should E[ΔX_t | X_{t−1} = x] be? As the process behaves like a RW, (i) it must be zero in some interval and, since the process is bounded, (ii) it must be positive (negative) when x is ''low'' (''high''). Moreover, we expect that: (iii) E[ΔX_t | X_{t−1} = x] is a monotonic function which, associated with (ii), means that the reversion effect should be strong if x is far from the interval of reversion and weak in the opposite case; and (iv) E[ΔX_t | X_{t−1} = x] is differentiable (on the state space of X) in order to assure a smooth reversion effect. To satisfy (i)-(iv) we assume

    E[ΔX_t | X_{t−1} = x] = e^k (e^{−α₁(x−τ)} − e^{α₂(x−τ)})   with α₁ ≥ 0, α₂ ≥ 0, k < 0.

Let us fix a(x) = e^k (e^{−α₁(x−τ)} − e^{α₂(x−τ)}). With this assumption about E[ΔX_t | X_{t−1} = x] we have the bounded random walk process (BRW) in discrete time:

    X_{t_i} = X_{t_{i−1}} + e^{k_Δ} (e^{−α₁(X_{t_{i−1}}−τ)} − e^{α₂(X_{t_{i−1}}−τ)}) Δ + σ_Δ ε_{t_i},   X_0 = c        (12)

where t_i are the instants at which the process is observed (0 ≤ t₀ ≤ t₁ ≤ ... ≤ T), Δ = t_i − t_{i−1} is the interval between observations, k_Δ and σ_Δ are parameters depending on Δ, and {ε_{t_i}, i = 1, 2, ...} is a sequence of i.i.d. random variables with E[ε_{t_i}] = 0 and Var[ε_{t_i}] = 1. It can be proved (see [21]) that the sequence {X_t^Δ} formed as a step function from X_{t_i}, that is, X_t^Δ = X_{t_i} if t_i ≤ t < t_{i+1}, converges weakly (i.e. in distribution) as Δ ↓ 0 to the solution of the stochastic differential equation (SDE):

    dX_t = e^k (e^{−α₁(X_t−τ)} − e^{α₂(X_t−τ)}) dt + σ dW_t,   X_{t₀} = c        (13)

where c is a constant and W is a standard Wiener process (t ≥ t₀). The case a(x) = 0 (for all x) leads to the Wiener process (which can be understood as the random walk process in continuous time). It is still obvious that a(τ) = 0, so X_t must behave just like a Wiener process when X_t crosses τ. However, it is possible, by selecting adequate values for k, α₁ and α₂, to have Wiener process behaviour over a large interval centred on τ (that is, such that a(x) ≈ 0 over a large interval centred on τ). Nevertheless, whenever X_t escapes from some levels there will always be reversion effects towards τ. A possible drawback of model (12) is that the diffusion coefficient is constant. In the exchange rate framework and under a target zone regime, we should observe a volatility of ''∩'' shape with respect to x (maximum volatility at the central rate) (see [18]). On the other hand, under a free floating regime, it is common to observe a ''smile'' volatility (see [18]). To cover both possibilities, we allow the volatility to be of ''∪'' or ''∩'' shape by assuming a specification of the form exp{σ + β(x − μ)²}.
Depending on the sign of β we will have volatility of ''∪'' or ''∩'' form. Naturally, β = 0 leads to constant volatility. This specification, with β > 0, can also be appropriate for interest rates. We propose, therefore,

    dX_t = e^k (e^{−α₁(X_t−τ)} − e^{α₂(X_t−τ)}) dt + e^{σ/2 + (β/2)(X_t−μ)²} dW_t,   X_{t₀} = c        (14)

Some properties are studied in [21]. Under some conditions both solutions are stationary (with known stationary densities). To appreciate the differences between the Wiener process (the unbounded RW) and the bounded RW, we simulated one trajectory of each process in the period t ∈ [0, 20] with X₀ = 100. We considered k = −2, α₁ = α₂ = 2, τ = 100 and σ = 4. The paths are presented in Figure 2. In the neighbourhood of τ = 100 the function a(x) is (approximately) zero, so X behaves as a Wiener process (or a random walk in continuous time). In effect, if a(x) = 0, we have dX_t = σ dW_t (or X_t = X₀ + σ W_t). We draw two arbitrary lines to show that the bounded random walk, after crossing these lines, tends to move back toward the interval of moderate values.

[Plot omitted: simulated paths of the bounded random walk and of the Wiener process for t ∈ [0, 20], with values roughly between 80 and 130.]

Figure 2. Bounded Random Walk vs. Wiener Process
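For readers who wish to reproduce a picture like Figure 2, the following is a crude Euler-type sketch of recursion (12) next to an unconstrained random walk, using the parameter values quoted above (k = −2, α₁ = α₂ = 2, τ = 100, σ = 4). The step size, seed and the clipping of the drift step are my own choices, not part of the simulation scheme in [21].

# Sketch: bounded random walk (12) next to an unconstrained random walk,
# with k = -2, alpha1 = alpha2 = 2, tau = 100, sigma = 4 as quoted in the text.
import numpy as np

rng = np.random.default_rng(1)
k, a1, a2, tau, sigma = -2.0, 2.0, 2.0, 100.0, 4.0
dt, T = 0.005, 20.0
n = int(T / dt)

def a(x):
    # reversion function a(x) = e^k (e^{-alpha1 (x - tau)} - e^{alpha2 (x - tau)})
    return np.exp(k) * (np.exp(-a1 * (x - tau)) - np.exp(a2 * (x - tau)))

brw = np.empty(n + 1)
rw = np.empty(n + 1)
brw[0] = rw[0] = 100.0
for i in range(n):
    dW = np.sqrt(dt) * rng.standard_normal()
    # the exponential reversion is stiff far from tau, so the Euler drift step
    # is clipped purely for numerical safety in this sketch
    drift_step = np.clip(a(brw[i]) * dt, -1.0, 1.0)
    brw[i + 1] = brw[i] + drift_step + sigma * dW      # bounded random walk
    rw[i + 1] = rw[i] + sigma * dW                     # Wiener / random walk benchmark

print("BRW range:", brw.min().round(1), brw.max().round(1),
      "  RW range:", rw.min().round(1), rw.max().round(1))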

3.2 Processes with volatility-induced stationarity

3.2.1 Motivation

Short-term interest rate processes have shown at least two main facts.
Firstly, the mean-reverting effect is very weak (see, for example, Chan et al.
[9] or Bandi [5]). In fact, the stationarity of short-term interest rate processes
is quite dubious: the usual unit root tests neither clearly reject nor clearly accept the hypothesis of stationarity. Since interest rate processes are bounded by a lower (zero) and an upper (finite) value, a pure unit root hypothesis seems impossible, since a unit root process goes to +∞ or −∞ with probability one as time goes to ∞. Some authors have addressed this question. The issue is how to reconcile an apparent absence of mean-reverting effects with the fact that the interest rate is a bounded (and possibly stationary) process. While Aït-Sahalia [1] and Nicolau [21] suggest that
stationarity can be drift-induced, Conley et al. [10] (CHLS, henceforth)
suggest that stationarity is primarily volatility-induced. In fact, it has been
observed that higher volatility periods are associated with mean reversion
effects. Thus, the CHLS hypothesis is that higher volatility injects
stationarity in the data.
The second (well known) fact is that the volatility of interest rates is
mainly level dependent and highly persistent. The higher (lower) the interest
rate is, the higher (lower) the volatility. The volatility persistence can thus be partially attributed to the persistence of the level of the interest rate.
The hypothesis of CHLS is interesting since volatility-induced stationarity
can explain martingale behaviour (fact one), level volatility persistence (fact
two), and mean-reversion. To illustrate these ideas and show how volatility can inject stationarity, we present in Figure 3 a simulated path from the SDE:

    dX_t = (1 + X_t²) dW_t        (15)

It is worth mentioning that the Euler scheme

    Y_{t_i} = Y_{t_{i−1}} + (1 + Y_{t_{i−1}}²) √(t_i − t_{i−1}) ε_{t_i},   ε_{t_i} ~ i.i.d. N(0, 1)        (16)

[Plot omitted: a simulated trajectory from the SDE (15), about 1000 observations, with long spells near zero and occasional sharp excursions between roughly −8 and 10.]

Figure 3. Simulated path from the SDE (15)

cannot be used, since Y explodes as t_i → ∞ (see [24, 27]). For a method to simulate X, see Nicolau [24]. Since the SDE (15) has zero drift, we could expect random walk behaviour. Nevertheless, Figure 3 shows that the simulated trajectory of X exhibits reversion effects towards zero, which are assured solely by the structure of the diffusion coefficient. It is the volatility that induces stationarity. In the neighbourhood of zero the volatility is low, so the process tends to spend more time in this interval. If there is a shock, the process moves away from zero and the volatility increases (since the diffusion coefficient is 1 + x²), which, in turn, increases the probability that X crosses zero again. The process can reach extreme peaks in a very short time but quickly returns to the neighbourhood of zero. It can be proved, in fact, that X is a stationary process. Thus, X is a stationary local martingale but not a martingale, since E[X_t | X_0] converges to the stationary mean as t → ∞ and is not equal to X_0, as would be required if X were a martingale.

3.2.2 A definition of volatility-induced stationarity

To our knowledge, CHLS were the first to discuss volatility-induced


stationarity (VIS) ideas. Richter [27] generalizes the definition of CHLS.
Basically, their definition states that the stationary process X (the solution of the stochastic differential equation (SDE) dX_t = a(X_t) dt + b(X_t) dW_t) has VIS at the boundaries l = −∞ and r = +∞ if lim_{x→l} s_X(x) < ∞ and lim_{x→r} s_X(x) < ∞, where s_X is the scale density,

    s_X(x) = exp{ −∫_{z₀}^{x} 2 a(u)/b²(u) du }   (z₀ is an arbitrary value)        (17)

There is one disadvantage in using this definition. As shown in [25], the


VIS definition of CHLS and Richter does not clearly identify the source of
stationarity. It can be proved that their definition does not exclude mean-
reversion effects and thus stationarity can also be drift-induced.
A simple and more precise definition is given in Nicolau [25]. Consider the following SDEs:

    dX_t = a(X_t) dt + b(X_t) dW_t
    dY_t = a(Y_t) dt + σ dW_t.        (18)

We say that a stationary process X has VIS if the associated process Y does not possess a stationary distribution (actually, this corresponds to what is defined in Nicolau [25] as VIS of type 2). The intuition is simple: although the process Y has the same drift as the process X, Y is nonstationary (by definition) whereas X is stationary. Replacing the constant σ by b(x) transforms the nonstationary process Y into the stationary process X. Thus, the stationarity of X can only be attributed to the role of the diffusion coefficient (volatility), and in this case we have in fact a pure VIS process.
The following is a simple criterion to identify VIS in the case of state space (−∞, ∞): a stationary process X with boundaries l = −∞ and r = +∞ has VIS if lim_{x→+∞} x a(x) ≥ 0 or lim_{x→−∞} x a(x) ≥ 0.
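As a numerical illustration of the scale density in (17) and of the contrast between volatility-induced and drift-induced stationarity, the sketch below evaluates s_X for the zero-drift example dX_t = (1 + X_t²) dW_t and for a mean-reverting Ornstein-Uhlenbeck benchmark dX_t = −X_t dt + dW_t; the benchmark and the numerical scheme are my own choices.

# Scale density s(x) = exp(-int_{z0}^x 2 a(u)/b^2(u) du), equation (17),
# evaluated by a simple trapezoidal sum for two diffusions: the zero-drift
# VIS example dX = (1 + X^2) dW and an Ornstein-Uhlenbeck benchmark
# dX = -X dt + dW (the latter is drift-induced, not VIS).
import numpy as np

def scale_density(a, b2, x, z0=0.0, n=4000):
    u = np.linspace(z0, x, n)
    integrand = 2.0 * a(u) / b2(u)
    integral = np.sum(0.5 * (integrand[:-1] + integrand[1:]) * np.diff(u))
    return np.exp(-integral)

a_vis = lambda u: np.zeros_like(u)            # zero drift
b2_vis = lambda u: (1.0 + u**2) ** 2          # b(x) = 1 + x^2, so b^2(x) = (1 + x^2)^2

a_ou = lambda u: -u                           # mean-reverting drift
b2_ou = lambda u: np.ones_like(u)             # constant volatility

for x in (1.0, 3.0, 6.0):
    print(f"x = {x}:  s_VIS = {scale_density(a_vis, b2_vis, x):.3g},"
          f"  s_OU = {scale_density(a_ou, b2_ou, x):.3g}")
# s_VIS stays equal to 1 (bounded), while s_OU grows like exp(x^2): in the
# OU case the non-attractiveness of the boundaries comes from the drift.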

3.2.2.1 An example: Modelling the Fed funds Rate with VIS

Processes with VIS are potentially applicable to interest rate time-series


since, as has been acknowledged, reversion effects (towards a central
measure of the distribution) occur mainly in periods of high volatility. To
exemplify a VIS process monthly sampling of the Fed funds rate between
January 1962 and December 2002 was considered. As discussed in Nicolau
[25], there is empirical evidence that supports the specification

    dX_t = exp{ α/2 + (β/2)(X_t − μ)² } dW_t        (19)

where X_t = log r_t and r represents the Fed funds rate. The state space of r is (0, ∞) and that of X is (−∞, ∞); that is, X can assume any value in ℝ. This transformation preserves the state space of r, since r_t = exp(X_t) > 0. By Itô's formula, equation (19) implies a VIS specification for the interest rate:

    dr_t = (1/2) r_t e^{α + β(log r_t − μ)²} dt + r_t e^{α/2 + (β/2)(log r_t − μ)²} dW_t        (20)

It can be proved that X is an ergodic process with stationary density

    p(x) = m_X(x) / ∫ m_X(x) dx = √(β/π) e^{−β(x−μ)²},        (21)

i.e. X = log r ~ N(μ, 1/(2β)). By the continuous mapping theorem, r = exp(X) is an ergodic process. Furthermore, it has a log-normal stationary density. There is some empirical evidence that supports the above model. It is based on four facts:
1. The empirical marginal distribution of X_t = log r_t matches the (marginal) distribution that is implicit in model (19).
2. The results of Dickey-Fuller tests are compatible with a zero drift function for X, as specified in model (19).
3. Nonparametric estimates of a(x) and b²(x) do not reject specification (19).
4. Parametric estimation of model (19) outperforms common one-factor models in terms of accuracy and parsimony.
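As a small numerical companion to equation (21), the sketch below evaluates the stationary densities implied by model (19): X = log r is Gaussian and r = e^X is therefore log-normal. The values of μ and β used here are arbitrary placeholders, not the estimates obtained for the Fed funds series.

# Stationary densities implied by model (19): X = log r ~ N(mu, 1/(2*beta)),
# hence r = exp(X) is log-normal.  mu and beta below are placeholders only.
import numpy as np

mu, beta = np.log(0.06), 8.0                  # hypothetical parameter values

def density_X(x):
    # equation (21): p(x) = sqrt(beta/pi) * exp(-beta (x - mu)^2)
    return np.sqrt(beta / np.pi) * np.exp(-beta * (x - mu) ** 2)

def density_r(r):
    # change of variables r = exp(x):  p_r(r) = p_X(log r) / r
    return density_X(np.log(r)) / r

print("stationary sd of log r:", np.sqrt(1.0 / (2.0 * beta)))
print("density of r at 4%, 6%, 10%:", density_r(np.array([0.04, 0.06, 0.10])))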
The estimation of the SDE (19) is difficult, since the transition (or conditional) densities of X required to construct the exact likelihood function are unknown. Several estimation approaches have been proposed under these circumstances (see Nicolau [20] for a brief survey). To estimate the parameters of equation (19) we considered the simulated maximum likelihood estimator suggested in Nicolau [20] (with N = 20 and S = 20). The method proposed by Aït-Sahalia [3] with J = 1 (Aït-Sahalia's notation for the order of expansion of the density approximation) gives similar results. The approximation of the density based on J = 2 is too complicated to implement (it involves dozens of intricate expressions that are difficult to evaluate).
The proposed model compares extremely favourably with other one-factor continuous-time models. In Table 1 we compare the proposed model with other relevant models for interest rates. Only the proposed model was estimated by us; the remaining information was obtained from Table VI of Aït-Sahalia [2]. For comparison purposes, the proposed model was estimated using the same method applied to the other models (we considered the density approximation proposed by Aït-Sahalia [3] with J = 1, in the period January 1963 to December 1998). Table 1 indicates that the proposed model outperforms the others in terms of accuracy and parsimony.

Table 1. Log-Likelihood of some Parametric Models, 1963–1998

Model                                                                                Log-likelihood    No. of parameters
dr_t = κ(τ − r_t) dt + σ dW_t                                                             1569.9              3
dr_t = κ(τ − r_t) dt + σ √r_t dW_t                                                        1692.6              3
dr_t = r_t (κ − (σ² − κτ) r_t) dt + σ r_t^{3/2} dW_t                                      1801.9              3
dr_t = κ(τ − r_t) dt + σ r_t^ρ dW_t                                                       1802.3              4
dr_t = (β₁ r_t^{−1} + β₂ + β₃ r_t + β₄ r_t²) dt + σ r_t^{3/2} dW_t                        1802.7              5
dr_t = (1/2) r_t e^{α + β(log r_t − μ)²} dt + r_t e^{α/2 + (β/2)(log r_t − μ)²} dW_t      1805.1              3

3.3 A second order stochastic differential equation

In economics and finance many stochastic processes can be seen as


integrated stochastic processes in the sense that the current observation
behaves as the cumulation of all past perturbations. In a discrete-time
framework the concept of integration and differentiation of a stochastic
process plays an essential role in modern econometric analysis. For instance, the stochastic process {y_t; t = 0, 1, 2, ...} where y_t = α + y_{t−1} + ε_t (ε_t ~ i.i.d. N(0, 1)) is an example of an integrated process. Notice that y can be written as y_t = y_0 + tα + Σ_{k=1}^{t} ε_k, or

    y_t = y_0 + Σ_{k=1}^{t} x_k        (22)

where x_t = α + ε_t. One way to deal with such processes is to use a differenced-data model (for example, Δy_t = α + ε_t in the previous example). Differencing has been used mostly to solve non-stationarity problems associated with unit roots, although, historically, differenced-data models arose early in econometrics as a procedure to remove common trends between dependent and independent variables.
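As a short numerical illustration of the previous paragraph (with an arbitrary value of α), the level y_t wanders while the differenced series Δy_t = α + ε_t does not:

# Discrete-time illustration: y_t = alpha + y_{t-1} + eps_t is integrated,
# while the differenced series Delta y_t = alpha + eps_t is stationary.
import numpy as np

rng = np.random.default_rng(3)
alpha, T = 0.1, 500
eps = rng.standard_normal(T)
y = np.cumsum(alpha + eps)        # y_t = t*alpha + sum of shocks (taking y_0 = 0)
dy = np.diff(y)                   # = alpha + eps_t
print("sample variance of y:", y.var().round(1), " of the differences:", dy.var().round(2))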
In empirical finance, most work on integrated diffusion processes is
related to stochastic volatility models (see for example, Genon-Catalot and
Laredo [14]) and realized volatility (see for example, Andersen et al. [4] and
Barndorff-Nielsen and Sheppard [6]). However, integrated and differentiated
diffusion processes in the same sense as integrated and differentiated
discrete-time processes are almost absent in applied econometrics analysis.
One of the reasons why continuous-time differentiated processes have not
been considered in applied econometrics is, perhaps, related to the
difficulties in interpreting the 'differentiated' process. In fact, if Z is a
diffusion process driven by a Brownian motion, then all sample functions are
of unbounded variation and nowhere differentiable, i.e. dZ t / dt does not
exist with probability one (unless some smoothing effect of the measurement
instrument is introduced). One way to model integrated and differentiated
diffusion processes and overcome the difficulties associated with the
nondifferentiability of the Brownian motion is through the representation

    dY_t = X_t dt
    dX_t = a(X_t) dt + b(X_t) dW_t        (23)

where a and b are the infinitesimal coefficients (respectively, the drift


and the diffusion coefficient), W is a (standard) Wiener process (or
Brownian motion) and X is (by hypothesis) a stationary process. In this
model, Y is a differentiable process, by construction. It represents the
integrated process,

    Y_t = Y_0 + ∫_0^t X_u du        (24)

(note the analogy with the corresponding expression in the discrete-time setting, y_t = y_0 + Σ_{k=1}^{t} x_k, equation (22)), and X_t = dY_t/dt is the stationary differentiated process (which can be considered the equivalent concept to the first-differences sequence in discrete-time analysis). If X represents the continuously compounded return or log return of an asset, the first equation in system (23) should be rewritten as d log Y_t = X_t dt.
Nicolau [23] argues that (23) can be a useful model in empirical finance for at least two reasons. First, the model accommodates nonstationary integrated stochastic processes (Y) that can be made stationary by differencing. Such a transformation cannot be performed on the common univariate diffusion processes used in finance (because all sample paths from univariate diffusion processes are nowhere differentiable with probability one). Yet, many processes in economics and finance (e.g. stock prices and nominal exchange rates) behave as the cumulation of all past perturbations (basically in the same sense as unit root processes in a discrete framework). Second, in the context of stock prices or exchange rates, the model suggests directly modelling the (instantaneous) returns, contrary to the usual continuous-time models in finance, which directly model the prices. General properties of returns (stylized facts) are well known and documented (for example, returns are generally stationary in mean, the distribution is not normal, the autocorrelations are weak and the correlations between the magnitudes of returns are positive and statistically significant, etc.). One advantage of directly modelling the returns (X) is that these general properties are easier to specify in a model like (23) than in a univariate diffusion process for the prices. In fact, several interesting models can be obtained by selecting a(x) and b²(x) appropriately. For example, the choice a(x) = β(τ − x) and b²(x) = σ² + λ(x − μ)² leads to an integrated process Y whose returns, X, have an asymmetric leptokurtic stationary distribution (see the example below). This specification can be appropriate for financial time series data. Bibby and Sørensen [8] had already noticed that a similar process to (23) could be a good model for stock prices.
We observe that the model defined in equation (23) can be written as a second order SDE, d(dY_t/dt) = a(X_t) dt + b(X_t) dW_t. These kinds of equations are common in engineering. For instance, it is usual for engineers to model mechanical vibrations or the charge on a capacitor or condenser subjected to white noise excitation through a second order stochastic differential equation. Integrated diffusions like Y in equation (23) arise naturally when only observations of a running integral of the process are available. For instance, this can occur when a realization of the process is observed after passage through an electronic filter. Another example is provided by ice-core data on oxygen isotopes used to investigate paleo-temperatures (see Ditlevsen and Sørensen [12]).
To illustrate continuous-time integrated processes we present in Figure 4 two simulated independent paths of Y_t = Y_0 + ∫_0^t X_u du, where X is governed by the stochastic differential equation

    dX_t = 20 (0.01 − X_t) dt + √(0.1 + 10 (X_t − 0.05)²) dW_t        (25)

(X is also represented in Figure 4). All paths are composed of 1000 observations defined on the interval t ∈ [0, 10]. It is interesting to observe that Y displays all the features of an integrated process (with a positive drift, since E[X_t] = 0.01): absence of mean reversion, persistent shocks, mean and variance that depend on time, etc. On the other hand, the unconditional distribution of X (the return) is asymmetric and leptokurtic.

[Plots omitted. Top panels A and B: two independent simulated paths of the integrated process Y (values roughly between 99.9 and 100.15) against time. Bottom panels A and B: the corresponding differentiated processes X (values roughly between −0.3 and 0.1) against time.]

Figure 4. Simulation of two independent paths from a second order SDE
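A pair of paths like those in Figure 4 can be mimicked with a crude Euler step for X and a Riemann sum for the running integral Y. The diffusion coefficient below is read as √(0.1 + 10(x − 0.05)²), i.e. the general form b²(x) = σ² + λ(x − μ)² mentioned earlier; the step size and seed are arbitrary choices of this sketch.

# Crude Euler sketch of the integrated process of Section 3.3: X solves the
# SDE (25) and Y_t = Y_0 + int_0^t X_u du is approximated by a Riemann sum.
import numpy as np

rng = np.random.default_rng(2)
dt, n = 0.01, 1000                              # 1000 observations on t in [0, 10]
X = np.empty(n + 1)
Y = np.empty(n + 1)
X[0], Y[0] = 0.01, 100.0

for i in range(n):
    drift = 20.0 * (0.01 - X[i])                        # mean reversion towards 0.01
    vol = np.sqrt(0.1 + 10.0 * (X[i] - 0.05) ** 2)      # level-dependent volatility
    X[i + 1] = X[i] + drift * dt + vol * np.sqrt(dt) * rng.standard_normal()
    Y[i + 1] = Y[i] + X[i] * dt                         # running integral of X

print(f"sample mean of X: {X.mean():.4f} (stationary mean is 0.01)")
print(f"Y moves from {Y[0]:.2f} to {Y[-1]:.2f}")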

Estimation of second order stochastic differential equations raises new challenges, for two main reasons. On the one hand, only the integrated process Y is observable, at instants {t_i, i = 1, 2, ...}, and thus X in model (23) is a latent, non-observable process. In fact, for a fixed sampling interval, it is impossible to obtain the value of X at time t_i from the observation Y_{t_i}, which represents the integral Y_0 + ∫_0^{t_i} X_u du. On the other hand, the estimation of model (23) cannot in principle be based on the observations {Y_{t_i}, i = 1, 2, ...}, since the conditional distribution of Y is generally unknown, even if that of X is known. An exception is the case where X follows an Ornstein-Uhlenbeck process, which is analyzed in Gloter [16].
However, with discrete-time observations {Y_{iΔ}, i = 1, 2, ...} (to simplify, we use the notation t_i = iΔ, where Δ = t_i − t_{i−1}), and given that

    Y_{iΔ} − Y_{(i−1)Δ} = ∫_0^{iΔ} X_u du − ∫_0^{(i−1)Δ} X_u du = ∫_{(i−1)Δ}^{iΔ} X_u du,        (26)

we can obtain a measure of X at instant t_i = iΔ using the formula

    X̃_{iΔ} = (Y_{iΔ} − Y_{(i−1)Δ}) / Δ.        (27)

Naturally, the accuracy of (27) as a proxy for X_{iΔ} depends on the magnitude of Δ. Regardless of the magnitude of Δ in our sample, we should base our estimation procedures on the sample {X̃_{iΔ}, i = 1, 2, ...}, since X is not observable. Parametric and semi-parametric estimation of integrated diffusions is analyzed in Gloter [15, 16] and Ditlevsen and Sørensen [12]. In Nicolau [23] it is supposed that both infinitesimal coefficients, a and b, are unknown, and non-parametric estimators for them are proposed. The analysis reveals that the standard estimators based on the sample {X̃_{iΔ}, i = 1, 2, ...} are inconsistent, even if we allow the step of discretization Δ to go to zero asymptotically. Introducing slight modifications to these estimators, we provide consistent estimators. See also [22].
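For concreteness, the sketch below builds the proxy (27) from discretely observed values of Y and forms the standard kernel-type (Nadaraya-Watson) drift estimator based on it. As stressed above, this naive estimator is inconsistent even as Δ goes to zero; the corrected estimators are those developed in Nicolau [23]. The bandwidth and the values in the usage comment are arbitrary.

# Proxy (27) and a *standard* kernel drift estimator built from it.  The text
# above notes that this naive estimator is inconsistent; the corrected
# estimators are developed in Nicolau [23].  Illustration only.
import numpy as np

def proxy_returns(Y, delta):
    # X~_{i*delta} = (Y_{i*delta} - Y_{(i-1)*delta}) / delta, equation (27)
    return np.diff(Y) / delta

def naive_drift_estimate(x, X_tilde, delta, h):
    # Nadaraya-Watson estimator of the drift a(x) from the proxy sample
    increments = np.diff(X_tilde) / delta                    # (X~_{(i+1)d} - X~_{id}) / d
    weights = np.exp(-0.5 * ((X_tilde[:-1] - x) / h) ** 2)   # Gaussian kernel weights
    return np.sum(weights * increments) / np.sum(weights)

# Example usage with the Y simulated in the previous sketch (delta = 0.01):
# X_tilde = proxy_returns(Y, 0.01)
# print(naive_drift_estimate(0.01, X_tilde, 0.01, h=0.02))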

ACKNOWLEDGEMENTS

I would like to thank Tom Kundert for helpful comments. This research
was supported by the Fundação para a Ciência e a Tecnologia (FCT) and by
POCTI.

REFERENCES
1. Aït-Sahalia, Y. (1996), Testing Continuous-Time Models of the Spot Interest Rate, The Review of Financial Studies 9, 385–426.
2. Aït-Sahalia, Y. (1999), Transition Densities for Interest Rate and Other Nonlinear Diffusions, The Journal of Finance LIV, 1361–1395.
3. Aït-Sahalia, Y. (2002), Maximum Likelihood Estimation of Discretely Sampled Diffusions: A Closed-Form Approximation Approach, Econometrica 70, 223–262.
4. Andersen, T., Bollerslev, T., Diebold, F. and Labys, P. (2001), The Distribution of Exchange Rate Volatility, Journal of the American Statistical Association 96, 42–55.
5. Bandi, F. (2002), Short-Term Interest Rate Dynamics: A Spatial Approach, Journal of Financial Economics 65, 73–110.
6. Barndorff-Nielsen, O. and Sheppard, N. (2002), Econometric Analysis of Realized Volatility and its Use in Estimating Stochastic Volatility Models, Journal of the Royal Statistical Society B 64, 253–280.
7. Bauer, C. (2005), A Better Asymmetric Model of Changing Volatility in Stock Returns: Trend-GARCH, Working Paper 03-05, University of Bayreuth.
8. Bibby, B. and Sørensen, M. (1997), A Hyperbolic Diffusion Model for Stock Prices, Finance and Stochastics 1, 25–41.
9. Chan, K., Karolyi, G., Longstaff, F. and Sanders, A. (1992), An Empirical Comparison of Alternative Models of the Short-Term Interest Rate, The Journal of Finance XLVII, 1210–1227.
10. Conley, T., Hansen, L., Luttmer, E. and Scheinkman, J. (1997), Short-Term Interest Rates as Subordinated Diffusions, The Review of Financial Studies 10, 525–577.
11. De Long, J.B., Shleifer, A., Summers, L.H. and Waldmann, R.J. (1990), Noise Trader Risk in Financial Markets, Journal of Political Economy 98, 703–738.
12. Ditlevsen, S. and Sørensen, M. (2004), Inference for Observations of Integrated Diffusion Processes, Scandinavian Journal of Statistics 31(3), 417–429.
13. Engle, R. (1982), Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of the United Kingdom Inflation, Econometrica 50(4), 987–1007.
14. Genon-Catalot, V. and Laredo, J. (1998), Limit Theorems for Discretely Observed Stochastic Volatility Models, Bernoulli 4, 283–303.
15. Gloter, A. (1999), Parameter Estimation for a Discretely Observed Integrated Diffusion Process, Preprint 13/99, Univ. of Marne-la-Vallée.
16. Gloter, A. (2001), Parameter Estimation for a Discrete Sampling of an Integrated Ornstein-Uhlenbeck Process, Statistics 35, 225–243.
17. Hommes, C. (2005), Heterogeneous Agent Models in Economics and Finance, in K.L. Judd and L. Tesfatsion (eds.), Handbook of Computational Economics, Vol. 2: Agent-Based Computational Economics, North-Holland.
18. Krugman, P. and Miller, M. (1992), Exchange Rate Targets and Currency Bands, Centre for Economic Policy Research, Cambridge University Press.
19. Mandelbrot, B. (1963), The Variation of Certain Speculative Prices, Journal of Business 36, 394–419.
20. Nicolau, J. (2002), New Technique for Simulating the Likelihood of Stochastic Differential Equations, The Econometrics Journal 5(1), 91–103.
21. Nicolau, J. (2002), Stationary Processes that Look Like Random Walks – The Bounded Random Walk Process in Discrete and Continuous Time, Econometric Theory 18(1), 99–118.
22. Nicolau, J. (2003), Bias Reduction in Nonparametric Diffusion Coefficient Estimation, Econometric Theory 19(5), 754–777.
23. Nicolau, J. (2004), Non-Parametric Estimation of Second Order Stochastic Differential Equations, Working Paper 3-04, CEMAPRE.
24. Nicolau, J. (2005), A Method for Simulating Non-Linear Stochastic Differential Equations in R1, Journal of Statistical Computation and Simulation 75(8), 595–609.
25. Nicolau, J. (2005), Processes with Volatility-Induced Stationarity. An Application for Interest Rates, Statistica Neerlandica 59(4), 376–396.
26. Nicolau, J. (2005), Modelling the Volatility Created by Chartists: A Discrete and a Continuous-Time Approach, Working Paper 4-05, CEMAPRE.
27. Richter, M. (2002), A Study of Stochastic Differential Equations with Volatility Induced Stationarity, unpublished.
QUANTUM COMPUTATION AND
INFORMATION

Amílcar Sernadas1, Paulo Mateus1 and Yasser Omar2


1 CLC, Dep. Matemática, IST, UTL, Av. Rovisco Pais, 1049-001 Lisboa, Portugal; email: {acs,pmat}@math.ist.utl.pt
2 CEMAPRE, Dep. Matemática, ISEG, UTL, Rua do Quelhas 6, 1200-781 Lisboa, Portugal; email: yomar@iseg.utl.pt

Abstract After a very brief survey of the key milestones and open problems in quantum
computation and information, the research effort at IST-UTL is outlined,
namely, the goals, ongoing tasks and results of the QuantLog project. In
order to illustrate some key issues in quantum computation, the problem of
minimizing the number of qubits in quantum automata is presented in detail
at a level appropriate for non-specialists.

Keywords: Quantum Computation, Quantum Information, Quantum Logic.

1. INTRODUCTION
It seems unavoidable to use quantum resources in information processing
and communication for three kinds of reasons. First, the continuing pro-
cess of miniaturization of computer circuits will in due course lead to scales
where quantum effects must be taken into account. Second, as noticed by
Feynman [37], the fact that many quantum phenomena cannot be efficiently
simulated with classical computers suggests that we should look at those
phenomena as possible computation tools. Third, entanglement seems to be a natural resource for solving synchronization problems between distant agents.
Two key results confirmed these ideas. In 1991, Ekert proposed a per-
fectly secure quantum protocol for sharing a private classical key using pub-
lic channels [35]. In 1994, Shor proposed a polynomial-time quantum al-
gorithm for prime factorization [71]. Curiously, Shor’s algorithm also has a
great impact in security, namely in e-business, because the classical public
key systems now in use rely precisely on the fact that prime factorization can-
not be efficiently achieved with classical computers. Afterwards, research
in quantum computation and information was accelerated in several fronts:
hardware for quantum computation (still in its infancy with very small labor-
atory prototypes); hardware for quantum enhanced secure communication


(with some products already in the market); quantum algorithms (with a few
interesting results, namely in searching); quantum security protocols (with
several breakthroughs), quantum information theory (key theorems already
established), and quantum complexity theory (with some results, but key
problems still open). Section 2 contains a very brief survey of these devel-
opments.
At IST-UTL, QuantLog project (FCT FEDER POCI/MAT/55796/2004,
January 1, 2005–December 31, 2007) brought together researchers from
Mathematics, Physics and Computer Science in order to address some of the
open problems in quantum computation, information and logic. At this early
stage of the effort, some significant results should already be mentioned:
extension of classical logic for reasoning about quantum systems [54];
quantum algorithm for pattern matching in genomic sequences [50]; and
compositionality of quantum security protocols [6]. Section 3 outlines the
goals, ongoing tasks and results of the project.
Quantum automata are used in Section 4 to illustrate some key issues in
quantum computation at a level appropriate for non-specialists.
Finally, in Section 5, some of the most important open problems in the
area are revisited, including those currently being addressed at IST-UTL.

2. VERY BRIEF SURVEY


Information is encoded in physical systems and these are described by the
laws of physics. Such laws can roughly be divided into two classes: classical
physics, which describes the world at our scale, and quantum physics, which
describes the world at the atomic and sub-atomic scales.1 For most of man-
kind’s history, information was encoded in systems that obeyed the laws of
classical physics, such as stone, paper, electromagnetic waves or hard disks.
And, despite the fact that one of the most important scientific revolutions
of the early 20th century was the understanding and control over atoms and
their constituents, only in the last couple of decades did the idea to encode in-
formation directly in quantum systems, such as atoms, electrons or photons,
emerge. This led to a new type of information and a new area of science:
quantum information.
By the middle of the 20th century all the ingredients necessary to con-
sider this new type of information were available: Claude Shannon proposed
(classical) information theory in 1948 [68] and quantum mechanics was an
established and successful theory since at least the 30’s. Yet, it took a few
decades more before the advent of quantum information. What were then
the key ideas that led to it?

With hindsight, the advent of quantum information was inevitable. First,


there is a technological problem. With the current trend of miniaturization
in electronic chips, it is predicted that in a few decades the number of elec-
trons per transistor will be so little that quantum effects will have to be taken
into account: the device can no longer be described by classical physics, nor
can the information it processes. From this perspective, the quantum com-
puter appears as the natural consequence of the miniaturization of current
(classical) computers. Yet, an apparently very different problem, dissipation
of heat, also led people to consider quantum systems to process information:
since quantum dynamics is reversible in time by its very nature, Paul Benioff
proposed in the early 1980’s a quantum Turing machine [12, 13] as a way
to do computation without dissipating any energy. In fact, miniaturization is
also increasing the problem of dissipation as we put more and more devices
per unit of surface in microchips, as we can observe in the increasingly soph-
isticated cooling systems that we find in our laptops, but a quantum computer
will be naturally free from such problems.
There was also an efficiency problem. Given the huge limitations that
the use of classical computers impose on the efficient simulation of the time
evolution of quantum systems, which in general can be in many different
superpositions of states, Richard Feynman proposed a computer based on
the laws of quantum physics as a natural and efficient way to simulate the
dynamics of such systems [37]. A few years later, in 1985, David Deutsch
effectively launched quantum computation by showing that a quantum com-
puter could solve a particular (and quite academic) problem faster than any
classical computer [32]. But the most significant step probably came from
Peter Shor, who in 1994 showed that it was possible to factorize integers ef-
ficiently using a quantum algorithm [69, 71]. The factorization problem is
believed to be a very hard problem for classical computers to solve, to the ex-
tent that the security of most encrypted internet communications nowadays
is based on the assumption that our current computers cannot find the solu-
tion of the problem in useful time for sufficiently large numbers. Thus, the
construction of a quantum computer, a machine that so far exists only in re-
search laboratories in a rudimentary form that can only perform very trivial
tasks, would challenge the security of our private communications and trans-
actions online. Another extremely important contribution by Shor, and inde-
pendently by Andrew Steane, was the proof of the existence of quantum error
correcting codes, allowing for the possibility of performing quantum com-
putation in realistic scenarios where the presence of noise cannot be avoided
[70, 72]. Furthermore, and possibly also contributing to the implementa-
tion effort, there are now other models of quantum computation alternative
and fully equivalent to the standard model based on quantum circuits, ini-
tially suggested by David Deutsch in 1985 and shown to require only two-
quantum-bit gates by David DiVincenzo in 1995 [34].

[Figure 1. Some relevant complexity classes and problems.]

In 2000 Edward Farhi


and collaborators proposed to do quantum computation by adiabatic evol-
ution [9, 36], and in 2001 Robert Raussendorf and Hans Briegel proposed
a quantum computer based on (irreversible) quantum measurements [63], a
surprising idea very much in contrast with the (reversible) quantum circuit
model, and yet completely equivalent to it. Finally, in must be said that very
few significant quantum algorithms have surfaced so far: in 1996 Lov Grover
proposed a search algorithm that offers a quadratic speed-up [40, 41], and in
2003 Andrew Childs and collaborators came up with an algorithm to find
a particular node in a specific graph [28], a very academic problem but the
only quantum algorithm so far offering a demonstrated exponential speed-up
compared to its classical counterpart. Recall that, as mentioned above, it is
believed that Shor’s algorithm offers an exponential speed-up, but in fact it is
not known if there is an efficient classical solution to the factorization prob-
lem, nor do we know if NP ⊆ BQP, that is, if SAT ∈ BQP, where BQP is
the Bounded-error Quantum Polynomial time complexity class which Shor’s
algorithm belongs to (see Figure 1 for a map of some relevant complexity
classes and their known relationships and problems2 ). In any case, should
we have an operating quantum computer nowadays, its main use would be
to run Shor’s algorithm and thus be able to decrypt many private communic-
ations.

Yet, interestingly, the third motivation was precisely the incomparable


level of security that quantum information can offer us. In 1935, in an at-
tempt to criticize quantum mechanics, Albert Einstein, Boris Podolsky and
Nathan Rosen (EPR) pointed out how this theory allowed for the apparent
instantaneous generation of (non-classical) correlations between arbitrarily
distant parties, a kind of spooky action at a distance that for them meant that
quantum mechanics could not be a complete theory: it needed to be enriched
with new features to explain properly such correlations. In the very same
year, Erwin Schrödinger identified the existence of states (which he called
entangled states) offering these strange quantum correlations as the “charac-
teristic trait of quantum mechanics, the one that enforces its entire departure
from classical lines of thought” [64]. Yet, most people were unaware of
Schrödinger’s reflection and the EPR problem was the source of a long de-
bate on the foundations of quantum theory, a debate that lasted at least until
1981, when Alain Aspect and collaborators, building on previous theoretical
work by John Bell [11], performed experiments showing that quantum mech-
anics is indeed a complete theory and that Einstein and his colleagues were
wrong [10]. In 1991, Artur Ekert revisited the EPR idea of what quantum
mechanics was lacking and cunningly understood that it was equivalent to
perfect eavesdropping. He then reversed the argument to show that quantum
correlations could be used to establish a perfectly secure cryptographic key
between two distant parties [35], as eavesdropping could be detected. This
independent work by Ekert launched the new field of quantum security. Yet,
in 1984, Charles Bennett and Gilles Brassard had already proposed a per-
fectly secure quantum key distribution protocol [14], but with almost no im-
pact at the time. Bennett himself was inspired by Stephen Wiesner original
ideas in the 1970’s to use the unique properties of quantum states for secur-
ity purposes, as for instance unforgeable quantum money [74]. In the early
90’s, Bennett and his collaborators also extended the idea that entanglement
between two parties could assist in the transmission of information, both
classical — as in the dense coding scheme where a single quantum two-level
system is used to send two bits [16], and quantum — as in the teleporta-
tion protocol to transmit an unknown quantum bit without measuring it [15].
The idea of the quantum bit, or qubit, as the fundamental unit of quantum
information, was introduced in 1993 by Benjamin Schumacher [68], who at
the same time launched quantum information theory by proving Shannon’s
Noiseless Coding Theorem [68] for quantum channels [65]. A few years
later, the Holevo-Schumacher-Westmoreland Theorem [46, 66] gave us the
capacity of a noisy quantum channel for classical information and the fully
quantum analog of Shannon’s Noisy Channel Coding Theorem [68] was fi-
nally obtained in 2005, by Igor Devetak [33].

These were the key steps leading to the emergence of quantum inform-
ation as a new area of science, in fact an area that has been attracting very
significant resources over the last decade. This is not that surprising given
the revolutionary application that quantum information seems to have the po-
tential to offer. But what has been delivered so far? On the security side, the
progress has been quite spectacular, as we now have plug and play quantum
key distribution systems available on the market that work in commercial
optical fibers for up to 122 km [39], with a growing hope that such systems
will be able to operate globally in the near future, either by cable or satel-
lite. Regarding the construction of a scalable quantum computer, this is a
much harder problem, being tackled with a plethora of different technolo-
gies [59], and where some significant steps have already been made, despite
the infancy of the field: in 2001 a NMR-based machine has been able to
run Shor’s algorithm with seven quantum bits [73], and only in the end of
2005 was it possible to produce and manipulate a quantum byte of entangled
particles (in an ion trap) [42]. To build a useful quantum computer remains a
very difficult challenge and success is not guaranteed. But, in the meantime,
there are also several very important challenges at the theoretical level: to
find out which problems a quantum computer can help us solve faster, why
and its consequences for complexity theory; to extend quantum key distri-
bution protocols to more than two parties and to understand in what other
security problems quantum physics can offer us new and better solutions or,
on the other hand, better attacks to the current systems; and finally, to study
and develop new quantum logics and quantum automata to analyze these
novel algorithms and protocols.

3. RESEARCH AT IST-UTL
The interest in quantum computation and information at IST-UTL star-
ted a few years ago at the Center for Plasma Physics (CFP) and got mo-
mentum with the organization of the very successful International School3
on Quantum Computation and Information, September 2–7, 2002. A joint
(almost weekly) seminar,4 with the Center for Logic and Computation (CLC)
and the Center for Physics of Fundamental Interactions (CFIF), was started
in September 2003.
In due course, the researchers interested in the seminar put together a re-
search proposal that led to the QuantLog project5 (FCT FEDER POCI/MAT/
55796/2004, January 1, 2005–December 31, 2007). A dozen faculty mem-
bers plus some PhD students and postdocs work in the project that addresses
some of the challenging open theoretical problems in the area and explores
some important applications with emphasis on security.

The project is organized into five tasks: T0) Physics of quantum computa-
tion and information – pursuing specific goals in relevant aspects of quantum
physics (namely, entanglement in solid state systems) and providing the
foundational support for the whole project; T1) Quantum computation –
aimed at developing new quantum algorithms (namely in logic), as well as
at establishing abstract results in computational complexity. T2) Quantum
automata – directed at developing the categorical theory of quantum auto-
mata, ultimately aiming at compositional model checking of quantum al-
gorithms and protocols. T3) Logics for quantum reasoning – focused on the
development of a new quantum logic endowed with a semantics based on
superpositions of classical valuations, having in mind the specification and
verification of quantum protocols. T4) Quantum cryptography and security –
mainly devoted to applications in cryptography and security, with emphasis
on zero-knowledge proof systems.
Cooperation has been established with some leading research groups
abroad, namely at the University of Waterloo (Canada), University College,
London (UK), Kings College, London (UK), University of Berkley (USA),
and University of Pennsylvania, Philadelphia (USA). An intensive guest pro-
gram brought to Lisbon already more than twenty researchers active in the
field for short visits and talks in our QCI seminar.

3.1 Exogenous quantum logic


Since a significant part of the project team has a background in logic,
it is no surprise that the first significant contributions were in the topic of
quantum logic. Based on the simple idea (the so-called exogenous approach)
of taking superpositions of classical models as the models of the envisaged
quantum logic, a novel quantum logic (EQPL) was developed for reasoning
about the states of collections of qubits [51, 52, 54].
This novel approach to quantum reasoning is different from the main-
stream approach [27, 38]. The latter, as initially proposed by Birkhoff and
von Neumann [17], focuses on the lattice of closed subspaces of a Hilbert
space and replaces the classical connectives by new connectives represent-
ing the lattice-theoretic operations. The former adopts superpositions of
classical models as the models of the quantum logic, leading to a natural
extension of the classical language containing the classical connectives (just
as modal languages are extensions of the classical language). Furthermore,
EQPL allows quantitative reasoning about amplitudes and probabilities, be-
ing in this respect much closer to the possible worlds logics for probability
reasoning than to the mainstream quantum logics. Finally, EQPL is designed
to reason about finite collections of qubits and, therefore, it is suitable for ap-
plications in quantum computation and information. The models of EQPL

are superpositions of classical valuations that correspond to unit vectors ex-


pressed in the computational basis of the Hilbert space resulting from the
tensor product of the independent qubit systems.
Therefore, in EQPL we can express a wide range of properties of states
of such a finite collection of qubits. For example, we can impose that some
qubits are independent of (that is, not entangled with) other qubits; we can
prescribe the amplitudes of a specific quantum state; we can assert the prob-
ability of a classical outcome after a projective measurement over the compu-
tational basis; and, we can also impose classical constraints on the admissible
quantum states.
A complete axiomatization was given for EQPL in [54] (see Figure 2).
Later on, a decidable fragment was presented in [26] where completeness
was recovered with respect to a relaxed semantics over an arbitrary real
closed field and its algebraic closure.

Axioms
[CTaut]  α for each classical tautology α
[QTaut]  γ for each quantum tautology γ
[Lift⇒]  ((α1 ⇒ α2 )  (α1  α2 ))
[Eqv⊥]  (⊥≡ ⊥ ⊥)
[Ref]  ((α1  α2 )  (α1 ∧ α2 ))
[Sub∅]  [∅]
[Sub∪]  ([G1 ]  ([G2 ]  [G1 ∪ G2 ]))
[Sub\]  ([G] ≡ [qB \ G])
[RCF]  κ{| x /t , z /
u|} where κ is a valid arithmetical formula,
x , z , t and u are sequences of real variables, complex
variables, real terms and complex terms respectively
[If ]  (α  ((α  u1 ; u2 ) = u1 ))
[If⊥]  (( α)  ((α  u1 ; u2 ) = u2 ))
[Empty]  (| ∅∅ = 1)
[NAdm]  ((¬(∧A))  (| qBA = 0))

[Unit]  ([G]  (( A⊆G || GA |2 ) = 1))
[Mul]  (([G1 ]  [G2 ])  (| G1 ∪G2 A1 ∪A2 = | G1 A1 | G2 A2 ))
where G1 ∩ G2 = ∅, A1 ⊆ G1 and A2 ⊆ G2
 
[Prob]  (( α) = ( A ||α A |2 ))

Inference rules
[CMP] α1 , (α1 ⇒ α2 )  α2
[QMP] γ1 , (γ1  γ2 )  γ2

Figure 2. Axiomatization of EQPL.

Other applications and further development of the exogenous approach


to enriching logics were presented in [19, 55]. The adjective “exogenous"
is used as a counterpoint to “endogenous". For instance, in order to enrich

some given logic with probabilistic reasoning it may be convenient to tinker


with the models of the original logic. This endogenous approach has been
used extensively. For example, the domains of first-order structures are en-
dowed with probability measures in [44]. Other examples include labeling
the accessibility pairs with probabilities in the case of Kripke structures [45]
for reasoning about probabilistic transition systems. By not tinkering with
the original models and only adding some additional structure on collections
of those models as they are, the exogenous approach has the potential for
providing general mechanisms for enriching a given logic with some addi-
tional reasoning dimension. In the case at hand, the exogenous approach has
the advantage of closely guiding the design of the envisaged quantum lan-
guage around the underlying concepts of quantum physics while keeping the
classical connectives.
Current efforts in the quantum logic front of the QuantLog project are
directed at reasoning about imperative quantum programs [25], as well as
at trying to establish a clear bridge between EQPL and the Birkhoff and von
Neumann style of quantum logics via an algebraic characterization of EQPL.

3.2 Quantum pattern matching


In another direction, a quantum algorithm for pattern matching in very
long strings (like genomic sequences) was proposed in [50]. The algorithm
is based on the modified Grover search algorithm proposed in [18] for the
case of multiple solutions. It uses the techniques originally introduced by
Grover [41]: a query operator that marks the state encoding the database ele-
ment being searched by changing its phase; followed by an amplitude ampli-
fication of the marked state. The state can be detected with high probability by iterating this process √N times, where N is the size of the database.
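The phase-marking and amplitude-amplification mechanism can be seen in a few lines with a state-vector simulation of plain Grover search. This toy example (with an arbitrary database size and target index) is not the pattern matching algorithm of [50] itself, which replaces the single oracle by the operators shown in Figure 3 below.

# Toy state-vector simulation of the Grover iteration: a phase oracle marks the
# target index, and "inversion about the mean" amplifies its amplitude.  After
# about (pi/4)*sqrt(N) iterations the target is measured with high probability.
import numpy as np

N, target = 256, 123                         # arbitrary database size and target index
psi = np.full(N, 1.0 / np.sqrt(N))           # uniform superposition over N indices

iterations = int(round(np.pi / 4 * np.sqrt(N)))
for _ in range(iterations):
    psi[target] *= -1.0                      # oracle: flip the phase of the marked state
    psi = 2.0 * psi.mean() - psi             # diffusion: inversion about the mean

print(f"{iterations} iterations, P(measure target) = {psi[target] ** 2:.3f}")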
The algorithm (see Figure 3) proposed in [50] searches for as many dis-
tinct patterns as desired in a given unsorted string, and moreover returns
the position of the closest substring to a given pattern with high probability in O(√N) queries, where N is the size of the string. This means that
the time to find the closest match (a much harder problem than to find the
exact match, as we shall see) does not depend on the size of the pattern it-
self, a result with no classical equivalent. Another crucial point is that our
quantum algorithm is actually useful and implementable to perform searches
in (unsorted) databases. For this, a query function per symbol of the pattern
alphabet is needed, which will require a significant (though clearly efficient)
pre-processing, but will allow us to perform an arbitrary amount of differ-
ent searches in a static string. A compile once, run many approach yielding
a new search algorithm that not only settles the previously existing imple-

Input: w ∈ Σ∗ and p ∈ Σ∗
Output: m ∈ ℕ
Quantum variables: |ψ⟩ ∈ H({1, . . . , N})
Classical variables: r, i, j ∈ ℕ
Procedure:
1. choose r ∈ [0, √(N − M + 1)] uniformly;
2. set |ψ⟩ = Σ_{k=1}^{N−M+1} (1/√(N − M + 1)) |k⟩;
3. for i = 1 to r:
   (a) choose j ∈ [1, M] uniformly;
   (b) set |ψ⟩ = T_j⁻¹ U_{p_j} T_j |ψ⟩;
   (c) set |ψ⟩ = D |ψ⟩;
4. set m to the result of the measurement of |ψ⟩ over the basis {|1⟩, . . . , |N⟩}.

Figure 3. Quantum pattern matching algorithm.

mentation problems, but even offers the solution of a more general problem,
and with a very interesting speed-up.
In the classical setting, the best algorithm for the closest substring problem
takes O(MN) queries where M is the size of the pattern. This result follows
from adapting the best known algorithm for approximate pattern matching
[58], which takes O(eN + M) where e is the number of allowed errors. One
should not compare the closest match to (exact) pattern match, where the
problem consists in determining if a certain word (pattern) is a substring
of a text. For exact pattern matching it is shown that the best algorithm
can achieve O(M + N) [58]. However, in practical cases where data can
mutate over time, like DNA, or it is stored in noisy systems, the closest
match problem is much more relevant, since in general only approximates of
the pattern exist, but nevertheless need to be found.
The full analysis of the proposed quantum algorithm as well as the recipe
for its implementation as a quantum circuit are under way. In due course,
more complex pattern matching problems will be addressed.

3.3 Quantum process algebra in security


In yet another direction of the QuantLog project, work has been done in
the area of quantum process algebras. In [6] a quantum process algebra was
proposed for the design and verification of quantum protocols, with applica-
tions in quantum security.

Security protocols are composed of several agents running in parallel, where each agent computes information (bounded by polynomial time in the security parameter) and exchanges it with other agents. In the context of quantum processes, the computation is bounded by quantum polynomial time and the information exchanged is supported by qubits.
The problem of defining quantum security properties is addressed in [6]
using a quantum polynomial-time process algebra. This approach is highly
inspired in [48, 56]. The computational model used to define quantum poly-
nomial terms is based on the logarithmic cost random access machine [30].
A hybrid model, using both classic and quantum memory [47], is considered
and it is shown to be (polynomial-time) equivalent to a uniform family of
quantum circuits. Such machines model the computation of each agent, and
receive qubits as input and return qubits as output. Thanks to the no-cloning
theorem, quantum information cannot be copied without prior knowledge
of its state. This observation imposes some design options in the process
algebra, since it is necessary to know which agent possesses a qubit in order
to know who can retrieve some piece of information. In order to deal with
this fact, a set of agents is fixed and the qubits are partitioned among them.
Process terms are divided into local and global. An agent is modeled by
a local process while a protocol is modeled by a global process, so, a global
process corresponds to local processes running in parallel. A semantics
based on probabilistic transition systems (which can be easily translated
to Markov chains) is provided, and the probabilistic transitions are defined
using rules and assuming a uniform scheduler to resolve non-deterministic
choices.
Agent observation is defined as a probability distribution over binary words, obtained by measuring, at the end of the protocol and in the computational basis, (some of) the agent's qubits. This concept is the key ingredient to establish observational equivalence, which in the context of security protocols is based on computational indistinguishability [75]. Intuitively, two process terms are observationally equivalent for an agent if, after making all possible reductions to each process, it is impossible to distinguish (in quantum polynomial time) the qubits of the agent in the two processes. Since quantum polynomial-time machines are internalized in the process algebra language, observational equivalence is easily defined and it is shown to be a congruence relation.
One of the most successful ways for defining secure concurrent crypto-
graphic tasks is via process emulation [1, 24]. This definitional job boils
down to the following: a process realizes a cryptographic task iff it emulates
an ideal process that is known to realize such task. Hence, verification of
a protocol amounts to checking if it can emulate the ideal protocols. This
approach is fully compositional.

Current work on this front of the QuantLog project is focused on ap-


plications to designing and verifying concrete quantum security protocols, namely contract signing, as well as on finding quantum attacks on classical cryptosystems, namely zero-knowledge proof systems.

4. FROM QUANTUM BITS TO QUANTUM


AUTOMATA
Some of the basic concepts and issues of quantum computation can be
easily illustrated around the notion of quantum automaton.
But let us start first with the notion of classical automaton. Classical auto-
mata are widely used. In a typical household you will find several auto-
mata: refrigerators, washing machines, lifts, et cetera are usually controlled
by automata. A classical finite state automaton has a finite memory (that is, composed of a finite number of bits). The contents of the memory (the state) are changed according to the input fed to the automaton. At each state the automaton displays an output. More precisely, a classical automaton is a tuple (Σ, Γ, S, s_0, δ, Z) where Σ is the input alphabet (set of input symbols), Γ is the output alphabet (set of output symbols), S is the state space (finite set of states), s_0 ∈ S is the initial state, δ : S × Σ → S is the transition map (returning the next state δ(s, σ) on receiving input σ in state s), and Z : S → Γ is the output map (returning the output Z(s) at state s). For example, in the case of your washing machine, the inputs are the buttons that you press and also the ticks of the clock. The outputs are what you can observe in its display plus the commands it is able to issue to the other components of the washing machine (water valves, pumps, heaters, etc.).
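For readers who prefer code to tuples, a minimal Python sketch of such an automaton is given below; the class layout and the toy two-state example are ours and merely illustrate the definition above.

from dataclasses import dataclass
from typing import Dict, List, Tuple

State = str
Symbol = str

@dataclass
class Automaton:
    # A classical finite-state automaton (Sigma, Gamma, S, s0, delta, Z).
    delta: Dict[Tuple[State, Symbol], State]   # transition map  delta : S x Sigma -> S
    output: Dict[State, str]                   # output map      Z : S -> Gamma
    initial: State                             # initial state   s0

    def run(self, inputs: List[Symbol]) -> List[str]:
        # Feed a sequence of inputs and collect the output displayed at each visited state.
        s = self.initial
        outputs = [self.output[s]]
        for sigma in inputs:
            s = self.delta[(s, sigma)]
            outputs.append(self.output[s])
        return outputs

# A toy two-state controller: pressing "button" toggles between off and on; clock ticks change nothing.
machine = Automaton(
    delta={("off", "button"): "on", ("on", "button"): "off",
           ("off", "tick"): "off", ("on", "tick"): "on"},
    output={"off": "display: idle", "on": "display: running"},
    initial="off",
)
print(machine.run(["button", "tick", "button"]))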
These days, the memory is implemented using a finite number of (clas-
sical) bits. A bit is a (classical) system that can be only in two states: false
or true. Let us denote these two states of a bit by |0⟩ and |1⟩, respectively.
It is only natural to introduce the notion of quantum automaton by adding
to the classical concept a quantum memory. A quantum memory is to be im-
plemented by a finite number of quantum bits known as qubits. A qubit is a
quantum system that can be in any superposition of the states of a (classical)
bit. That is, a possible state of a qubit is a vector α|0⟩ + β|1⟩ where α and β are complex numbers such that |α|² + |β|² = 1. Thus, in general, the state of a qubit is not one of the two possible truth values. The state of a qubit is, in general, a “combination” of those two truth values (remember Schrödinger’s cat!).
A classical bit is usually implemented with some electronic system: for
instance, its state is true if the voltage is greater than +5 Volts, and its state
is false if the voltage is less than -5 Volts (any other voltage is considered
faulty).

A qubit can be implemented, for example, using the spin of an electron:


its state is true if the spin is +1/2, and its state is false if the spin is -1/2.
Furthermore, as a quantum system, the spin of the electron can be in any
superposition of +1/2 and -1/2.
The postulates of quantum mechanics also prescribe how we can observe
the state of a qubit. Given a qubit in the state α|0⟩ + β|1⟩, if you measure it with an appropriate apparatus (mathematically described as a Hermitian operator acting on its state space6) then the possible outcomes of the measurement are the eigenvalues of that operator. By choosing an operator with eigenvectors |0⟩ and |1⟩ corresponding to distinct eigenvalues, we can decide after the measurement if the result is false or true. This result is random: false will come out with probability |α|² and true will come out with probability |β|². Thus, quantum systems when observed are random systems.
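A minimal numerical illustration of this measurement rule (a sketch of ours, using NumPy, with arbitrary amplitudes):

import numpy as np

rng = np.random.default_rng(0)

# A qubit state a|0> + b|1>, with |a|^2 + |b|^2 = 1.
a, b = 0.6, 0.8j
state = np.array([a, b])
probs = np.abs(state) ** 2          # Born rule: outcome probabilities
probs = probs / probs.sum()         # guard against floating-point round-off

# Simulate repeated measurements in the {|0>, |1>} basis.
samples = rng.choice([0, 1], size=10_000, p=probs)
print(probs)                                 # [0.36 0.64]
print(np.bincount(samples) / len(samples))   # empirical frequencies approach the Born probabilities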
Quantum systems evolve by the application of unitary operators. For instance, a qubit in state α|0⟩ + β|1⟩ will evolve to the state β|0⟩ + α|1⟩ if subjected to the Pauli X transformation. The Hadamard transformation, when applied to α|0⟩ + β|1⟩, results in ((α + β)/√2)|0⟩ + ((α − β)/√2)|1⟩.
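The same two transformations can be written as 2 × 2 unitary matrices acting on the amplitude vector (α, β); the short sketch below (ours) simply checks the statements above numerically.

import numpy as np

X = np.array([[0, 1],
              [1, 0]])                       # Pauli X: swaps the amplitudes of |0> and |1>
H = np.array([[1, 1],
              [1, -1]]) / np.sqrt(2)         # Hadamard

alpha, beta = 0.6, 0.8
psi = np.array([alpha, beta])                # alpha|0> + beta|1>

print(X @ psi)                               # [0.8 0.6]  -> beta|0> + alpha|1>
print(H @ psi)                               # [(alpha+beta)/sqrt(2), (alpha-beta)/sqrt(2)]
print(np.allclose(H @ H, np.eye(2)))         # unitarity check: H is its own inverse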
Returning to automata, we are now ready to motivate a simple but nev-
ertheless quite useful notion of quantum automaton. Figure 4 depicts the
overall structure of such an automaton. The inputs and δ are as in the clas-
sical case. But now we also have a quantum component of the memory. At
each classical component of the state s, upon input σ the quantum component of the memory is subjected to the unitary transformation U_{sσ}. Starting at some initial state (s_0, |ψ_0⟩), after a sequence of inputs w the automaton reaches the final state (s_w, |ψ_w⟩). The random output is obtained by applying a suitable Hermitian operator A_{s_w} to |ψ_w⟩.
In short, a quantum automaton is a tuple

M = (Σ, Γ, S, H, s_0, |ψ_0⟩, δ, U, A)

where: Σ is the input alphabet; Γ ⊆ ℝ is the output alphabet;7 S is the classical state space; H is the Hilbert space of the quantum states; s_0 ∈ S is the initial classical state; |ψ_0⟩ ∈ H is the initial quantum state; δ : S × Σ → S is the classical state transition map; U = {U_{sσ}}_{s∈S, σ∈Σ}, where each U_{sσ} is the quantum state transition operator at s for input σ; and A = {A_s}_{s∈S}, where each A_s is the measurement operator at s, such that spec A_s ⊆ Γ. This rather simple notion of quantum automaton subsumes the concepts previously proposed in the literature [57].
The behavior of such a quantum automaton M is the map B_M that returns, for each sequence w of inputs, the probability distribution over Γ of the outputs obtained by measuring |ψ_w⟩ using the Hermitian operator A_{s_w}. Two quantum automata M and M′ should be considered equivalent if B_M = B_{M′}.
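A toy rendering of this definition (ours: a single classical state, one qubit, and Hadamard and Pauli X as the quantum transition operators) shows how the behaviour B_M can be computed by propagating the pair (s, |ψ⟩) and then applying the Born rule in the eigenbasis of the measurement operator.

import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
X = np.array([[0, 1], [1, 0]])

delta = {("s", "h"): "s", ("s", "x"): "s"}        # classical transitions (trivial in this toy example)
U = {("s", "h"): H, ("s", "x"): X}                # quantum transition operators U_{s,sigma}
A = {"s": np.diag([0.0, 1.0])}                    # measurement operator A_s (eigenvalues 0 and 1)

def behaviour(word, s0="s", psi0=np.array([1.0, 0.0])):
    # Return the probability of each eigenvalue of A_{s_w} after reading `word`.
    s, psi = s0, psi0
    for sigma in word:
        psi = U[(s, sigma)] @ psi                 # quantum update
        s = delta[(s, sigma)]                     # classical update
    eigvals, eigvecs = np.linalg.eigh(A[s])
    probs = np.abs(eigvecs.conj().T @ psi) ** 2   # Born rule in the eigenbasis of A_s
    return {float(v): round(float(p), 6) for v, p in zip(eigvals, probs)}

print(behaviour("h"))    # {0.0: 0.5, 1.0: 0.5}
print(behaviour("hh"))   # {0.0: 1.0, 1.0: 0.0}  (the Hadamard is its own inverse)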

Figure 4. Basic quantum automaton.

At this stage several interesting problems arise. Given M, can we find


an equivalent M• with minimal dimension of the underlying Hilbert space H•, that is, with a minimal number of qubits? The answer is yes. We can even get rid of all qubits! But the price is high: in that case M• will have a very large classical state space S•. That is, we can replace all qubits at the cost of an exponential increase in the number of (classical) bits. This is yet another instance of a well-known effect: we can always simulate quantum machinery with classical machinery, but paying a high price.
Thus, we are led to the following reformulation of the qubit minimization problem. Given M, can we find an equivalent M• with minimal dimension of the underlying Hilbert space H•, that is, with a minimal number of qubits, but allowing only a polynomial increase in the number of (classical) bits?
These problems for this kind of quantum automata (and also for more
powerful kinds of quantum automata allowing quantum outputs) are the cur-
rent focus of task T2 of the Quantlog project described in Section 3.

5. OUTLOOK
Notwithstanding the significant steps mentioned in Section 2, some key
open issues remain in the field of quantum computation and information be-
fore it revolutionizes the way we compute and communicate, namely:

– Usable hardware for quantum computation?


– Long range cable and open air quantum communication and networks?
– Which quantum systems can be efficiently simulated in a classical
computer?
– Where is BQP in the family of computational complexity classes? Is
SAT in BQP?
– Further examples (besides Childs’ graph search) of exponential gains by using quantum computation?
– Can quantum communication achieve exponential gain in communic-
ation complexity?
– Besides Shor’s quantum Fourier transform and Grover’s amplitude
amplification, other approaches to the design of quantum algorithms?
– Can quantum resources help in producing tamper-proof devices?
– Which classical cryptosystems will still be secure against quantum at-
tacks?

At IST-UTL, within the context of the QuantLog project described in


Section 3, some aspects of the non-experimental issues above are being addressed, namely: properties of entanglement in solid state systems [31]; particle statistics in quantum information [60, 61]; quantum walks and their comparison with random walks [62]; quantum algorithms for searching [49] and in logic; quantum automata and their minimization and interconnection; quantum transition systems for model checking of quantum systems [7]; quantum logic [19, 25, 26, 51–55] for model checking of quantum systems; formal methods in security [2–5, 20–23]; quantum security [6]; and quantum attacks on classical cryptosystems.

ACKNOWLEDGMENTS
The authors wish to express their gratitude to all members of the team of
the QuantLog project for helping in this survey of their activities. This work
was partially supported by FCT and EU FEDER through POCTI and POCI,
namely via POCI/MAT/55796/2004 project.

NOTES
1. Throughout this text, the word classical will be used in the sense of non-quantum.
2. See also the site http://qwiki.caltech.edu/wiki/Complexity Zoo by Scott Aaronson.
3. http://www.qubit.org/school2002/
4. http://sem.math.ist.utl.pt/qci/
5. http://clc.math.ist.utl.pt/quantlog.html
6. Hilbert space of dimension 2.
7. Recall that the eigenvalues of a Hermitian operator are real numbers.

REFERENCES
[1] Abadi M, Gordon AD. “A calculus for cryptographic protocols: The Spi Calculus”,
Information and Computation, vol. 148 no. 1, pp. 1-70, 1999. Full version available
as SRC Research Report 149, January 1998.
[2] Adão P, Bana G, Herzog J, Scedrov A. “Soundness and completeness of formal en-
cryption: The cases of key-cycles and partial information leakage”, Preprint, CLC,
Department of Mathematics, Instituto Superior Técnico, 1049-001 Lisboa, Portugal,
2005. Submitted for publication.
[3] Adão P, Bana G, Herzog J, Scedrov A. “Soundness of formal encryption in the pres-
ence of key-cycles”, S. D. C. di Vimercati, P. Syverson, and D. Gollmann (eds.), Pro-
ceedings of the 10th European Symposium on Research in Computer Security (ES-
ORICS), vol. 3679 of Lecture Notes in Computer Science, Springer-Verlag, 2005, pp.
374-396.
[4] Adão P, Bana G, Scedrov A. “Computational and information-theoretic soundness and
completeness of formal encryption”, Proceedings of the 18th IEEE Computer Security
Foundations Workshop (CSFW), IEEE Computer Society Press, 2005, pp. 170-184.
[5] Adão P, Fournet C. “Cryptographically sound implementations for communicating
processes”, Preprint, CLC, Department of Mathematics, Instituto Superior Técnico,
1049-001 Lisboa, Portugal, 2006. Submitted for publication.
[6] Adão P, Mateus P. “A process algebra for reasoning about quantum security”, Elec-
tronic Notes in Theoretical Computer Science, to appear. Preliminary version presented
at 3rd International Workshop on Quantum Programming Languages, June 30–July 1,
2005, Chicago, Affiliated Workshop of LICS 2005.
[7] Adão P, Mateus P, Reis T, Viganò L. “Towards a quantitative analysis of security pro-
tocols”, Preprint, CLC, Department of Mathematics, Instituto Superior Técnico, 1049-
001 Lisboa, Portugal, 2006. Submitted for publication.
[8] Agrawal M, Kayal N, Saxena N. “PRIMES is in P”, Annals of Mathematics, vol. 160
no. 2, pp. 781-793, 2004.
[9] Aharonov D, van Dam W, Kempe J, Landau Z, Lloyd S, Regev O. “Adiabatic quantum
computation is equivalent to standard quantum computation”, FOCS ’04: Proceedings
of the 45th Annual IEEE Symposium on Foundations of Computer Science (FOCS’04),
IEEE Computer Society, 2004, pp. 42-51.
[10] Aspect A, Grangier P, Roger G. “Experimental tests of realistic local theories via Bell’s
theorem”, Physical Review Letters, vol. 47, pp. 460, 1981.
[11] Bell JS. “On the Einstein-Podolsky-Rosen paradox”, Physics, vol. 1, pp. 195, 1964.

[12] Benioff P. “The computer as a physical system: A microscopic quantum mechanical


Hamiltonian model of computers as represented by Turing machines”, Journal of Stat-
istical Physics, vol. 22, pp. 563-591, 1980.
[13] Benioff P. “Quantum mechanical models of Turing machines that dissipate no energy”,
Physical Review Letters, vol. 48, pp. 1581-1585, 1982.
[14] Bennett CH, Brassard G. “Quantum cryptography: Public key distribution and coin
tossing”, Proceedings of IEEE international Conference on Computers, Systems and
Signal Processing, IEEE Press, 1984, pp. 175-179.
[15] Bennett CH, Brassard G, Crépeau C, Jozsa R, Peres A, Wootters W. “Teleporting
an unknown quantum state via dual classical and Einstein-Podolsky-Rosen channels”,
Physical Review Letters, vol. 70 no. 13, pp. 1895-1899, 1993.
[16] Bennett CH, Wiesner SJ. “Communication via one- and two-particle operators on
Einstein-Podolsky-Rosen states”, Physical Review Letters, vol. 69 no. 20, pp. 2881-
2884, 1992.
[17] Birkhoff G, von Neumann J. “The logic of quantum mechanics”, Annals of Mathem-
atics, vol. 37 no. 4, pp. 823-843, 1936.
[18] Boyer M, Brassard G, Høyer P, Tapp A. “Tight bounds on quantum searching”, Forts-
chritte der Physik, vol. 46 no. 1-5, pp. 493-505, 1998.
[19] Caleiro C, Mateus P, Sernadas A, Sernadas C. “Quantum institutions”, Preprint, CLC,
Department of Mathematics, Instituto Superior Técnico, 1049-001 Lisboa, Portugal,
2005. Submitted for publication.
[20] Caleiro C, Viganò L, Basin D. “Deconstructing Alice and Bob”, Electronic Notes in
Theoretical Computer Science, vol. 135 no. 1, pp. 3-22, 2005. Preliminary version
presented at ICALP’05 ARSPA Workshop.
[21] Caleiro C, Viganò L, Basin D. “Metareasoning about security protocols using dis-
tributed temporal logic”, Electronic Notes in Theoretical Computer Science, vol. 125
no. 1, pp. 67-89, 2005. Preliminary version presented at IJCAR’04 ARSPA Workshop.
[22] Caleiro C, Viganò L, Basin D. “On the expressiveness of a message sequence formalism
for security protocols”, Preprint, CLC, Department of Mathematics, Instituto Superior
Técnico, 1049-001 Lisboa, Portugal, 2005. Submitted for publication.
[23] Caleiro C, Viganò L, Basin D. “Relating strand spaces and distributed temporal logic
for security protocol analysis”, Logic Journal of the IGPL, vol. 13 no. 6, pp. 637-664,
2005.
[24] Canetti R. “Universally composable security: A new paradigm for cryptographic pro-
tocols”, 42nd IEEE Symposium on Foundations of Computer Science (FOCS), IEEE
Computer Society Press, 2001, pp. 136-145. Full version available at IACR ePrint
Archive, Report 2000/067.
[25] Chadha R, Mateus P, Sernadas A. “Reasoning about states of probabilistic sequential
programs”, Preprint, CLC, Department of Mathematics, Instituto Superior Técnico,
1049-001 Lisboa, Portugal, 2006. Submitted for publication.
[26] Chadha R, Mateus P, Sernadas A, Sernadas C. “Extending classical logic for reas-
oning about quantum systems”, Preprint, CLC, Department of Mathematics, Instituto
Superior Técnico, 1049-001 Lisboa, Portugal, 2005. Submitted for publication.
[27] Chiara MLD, Giuntini R, Greechie R. Reasoning in Quantum Theory, Dordrecht, The
Netherlands, Kluwer Academic Publishers, 2004.

[28] Childs AM, Cleve R, Deotto E, Farhi E, Gutmann S, Spielman DA. “Exponential
algorithmic speedup by a quantum walk”, STOC’03: Proceedings of the 35th Annual
ACM Symposium on Theory of Computing, ACM Press, 2003, pp. 59-68.
[29] Cook SA. “The complexity of theorem-proving procedures”, STOC’71: Proceedings
of the 3rd Annual ACM Symposium on Theory of Computing, ACM Press, 1971, pp.
151-158.
[30] Cook SA, Reckhow RA. “Time bounded random access machines”, Journal of Com-
puter and System Sciences, vol. 7 no. 4, pp. 354-375, 1973.
[31] Costa Jr AT, Bose S, Omar Y. “Entanglement of two impurities through electron scat-
tering”, Preprint, CFP, Department of Physics, Instituto Superior Técnico, 1049-001
Lisboa, Portugal, 2005. Submitted for publication.
[32] Deutsch D. “Quantum theory, the Church-Turing principle and the universal quantum
computer”, Proceedings of the Royal Society of London A, vol. 400, pp. 97-117, 1985.
[33] Devetak I. “The private classical capacity and quantum capacity of a quantum chan-
nel”, IEEE Transactions on Information Theory, vol. 51, pp. 44-55, 2005.
[34] DiVincenzo DP. “Two-bit gates are universal for quantum computation”, Physical
Review A, vol. 51, pp. 1015-1022, 1995.
[35] Ekert AK. “Quantum cryptography based on Bell’s theorem”, Physical Review Letters,
vol. 67 no. 6, pp. 661-663, 1991.
[36] Farhi E, Goldstone J, Gutmann S, Sipser M. “Quantum computation by adiabatic
evolution”, Technical Report quant-ph/0001106, ArXiv, USA, 2000.
[37] Feynman RP. “Simulating Physics with computers”, International Journal of Theoret-
ical Physics, vol. 21, pp. 467, 1982.
[38] Foulis DJ. “A half-century of quantum logic. What have we learned?”, Quantum Struc-
tures and the Nature of Reality, vol. 7 of Einstein Meets Magritte, Kluwer Acad. Publ.,
1999, pp. 1-36.
[39] Gobby C, Yuan ZL, Shields AJ. “Quantum key distribution over 122 km of standard
telecom fiber”, Applied Physics Letters, vol. 84 no. 19, pp. 3762-3764, 2004.
[40] Grover LK. “A fast quantum mechanical algorithm for database search”, STOC’96:
Proceedings of the 28th Annual ACM Symposium on the Theory of Computing, ACM
Press, 1996, pp. 212-219.
[41] Grover LK. “Quantum mechanics helps in searching for a needle in a haystack”,
Physical Review Letters, vol. 79 no. 2, pp. 325-328, 1997.
[42] Häffner H, Hänsel W, Roos CF, Benhelm J, Chek-al-kar D, Chwalla M, Körber T,
Rapol UD, Riebe M, Schmidt PO, Becher C, Gühne O, Dür W, Blatt R. “Scalable
multiparticle entanglement of trapped ions”, Nature, vol. 438, pp. 643-646, 2005.
[43] Hallgren S. “Polynomial-time quantum algorithms for Pell’s equation and the principal
ideal problem”, STOC’02: Proceedings of the 34th Annual ACM Symposium on Theory
of Computing, ACM Press, 2002, pp. 653-658.
[44] Halpern JY. “An analysis of first-order logics of probability”, Artificial Intelligence,
vol. 46, pp. 311-350, 1990.
[45] Hansson H, Jonsson B. “A logic for reasoning about time and reliability”, Formal
Aspects of Computing, vol. 6, pp. 512-535, 1995.
[46] Holevo AS. “The capacity of quantum channel with general signal states”, IEEE Trans-
actions on Information Theory, vol. 44, pp. 269, 1998.

[47] Knill E. “Conventions for quantum pseudocode”, Technical Report LAUR-96-2724,


Los Alamos National Laboratory, Los Alamos, USA, 1996.
[48] Mateus P, Mitchell J, Scedrov A. “Composition of cryptographic protocols in a prob-
abilistic polynomial-time process calculus”, R. Amadio and D. Lugiez (eds.), CON-
CUR 2003 – Concurrency Theory, vol. 2761 of Lecture Notes in Computer Science,
Springer, 2003, pp. 327-349.
[49] Mateus P, Omar Y. “Quantum pattern matching”, Preprint, CLC, Department of Math-
ematics, Instituto Superior Técnico, 1049-001 Lisboa, Portugal, 2005. ArXiv quant-
ph/0508237. Full version of [50].
[50] Mateus P, Omar Y. “A quantum algorithm for closest pattern matching”, D. Angela-
kis and M. Christandl (eds.), Proceedings of NATO ASI Quantum Computation and
Information, IOS Press, in print. Short version of [49].
[51] Mateus P, Sernadas A. “Exogenous quantum logic”, W. A. Carnielli, F. M. Dionísio,
and P. Mateus (eds.), Proceedings of CombLog’04, Workshop on Combination of Lo-
gics: Theory and Applications, Departamento de Matemática, Instituto Superior Téc-
nico, Lisboa, 2004, pp. 141-149. Extended abstract.
[52] Mateus P, Sernadas A. “Reasoning about quantum systems”, J. Alferes and J. Leite
(eds.), Logics in Artificial Intelligence, Ninth European Conference, JELIA’04, vol.
3229 of Lecture Notes in Artificial Intelligence, Springer-Verlag, 2004, pp. 239-251.
[53] Mateus P, Sernadas A. “Complete exogenous quantum propositional logic”, Technical
report, CLC, Department of Mathematics, Instituto Superior Técnico, 1049-001 Lis-
boa, Portugal, 2005. Extended abstract. Short presentation at LICS 2005, Chicago,
USA, June 26-29.
[54] Mateus P, Sernadas A. “Weakly complete axiomatization of exogenous quantum pro-
positional logic”, Information and Computation, in print. ArXiv math.LO/0503453.
[55] Mateus P, Sernadas A, Sernadas C. “Exogenous semantics approach to enriching lo-
gics”, G. Sica (ed.), Essays on the Foundations of Mathematics and Logic, vol. 1 of
Advanced Studies in Mathematics and Logic, Polimetrica, 2005, pp. 165-194.
[56] Mitchell J, Ramanathan A, Scedrov A, Teague V. “A probabilistic polynomial-time cal-
culus for analysis of cryptographic protocols (Preliminary Report)”, Electronic Notes
in Theoretical Computer Science, vol. 45, pp. 1-31, 2001.
[57] Moore C, Crutchfield JP. “Quantum automata and quantum grammars”, Theoretical
Computer Science, vol. 237 no. 1-2, pp. 275-306, 2000.
[58] Navarro G. “A guided tour to approximate string matching”, ACM Computing Surveys,
vol. 33 no. 1, pp. 31-88, 2001.
[59] Nielsen MA, Chuang IL. Quantum Computation and Quantum Information, Cam-
bridge, UK, Cambridge University Press, 2000.
[60] Omar Y. “Indistinguishable particles in quantum mechanics: An introduction”, Con-
temporary Physics, vol. 46, pp. 437-448, 2005.
[61] Omar Y. “Particle statistics in quantum information processing”, International Journal
of Quantum Information, vol. 3 no. 1, pp. 201-205, 2005.
[62] Omar Y, Paunkovic N, Sheridan L, Bose S. “Quantum walk on a line with two en-
tangled particles”, Preprint, CFP, Department of Physics, Instituto Superior Técnico,
1049-001 Lisboa, Portugal, 2004. Submitted for publication.
[63] Raussendorf R, Briegel HJ. “A one-way quantum computer”, Physical Review Letters,
vol. 86 no. 22, pp. 5188-5191, 2001.

[64] Schrödinger E. “Die gegenwärtige Situation in der Quantenmechanik”, Naturwis-


senschaften, vol. 23, pp. 807-812, 823-828, 844-849, 1935. English translation: John
D Trimmer, Proceedings of the American Philosophical Society, 124, 323-38 (1980),
Reprinted in Quantum Theory and Measurement, p. 152 (1983).
[65] Schumacher B. “Quantum coding”, Physical Review A, vol. 51, pp. 2738-2747, 1995.
[66] Schumacher B, Westmoreland M. “Sending classical information via noisy quantum
channels”, Physical Review A, vol. 56, pp. 131-138, 1997.
[67] Schwartz JT. “Fast probabilistic algorithms for verification of polynomial identities”,
Journal of the ACM, vol. 27 no. 4, pp. 701-717, 1980.
[68] Shannon CE. “A mathematical theory of communication”, Bell System Technical
Journal, vol. 27, pp. 379, 623, 1948.
[69] Shor PW. “Algorithms for quantum computation: Discrete logarithms and factoring”,
S. Goldwasser (ed.), Proceedings of the 35th Annual Symposium on the Foundations
of Computer Science, IEEE Computer Society, 1994, pp. 124-134.
[70] Shor PW. “Scheme for reducing decoherence in quantum computer memory”, Physical
Review A, vol. 52, pp. R2493, 1995.
[71] Shor PW. “Polynomial-time algorithms for prime factorization and discrete logarithms
on a quantum computer”, SIAM Journal on Computing, vol. 26 no. 5, pp. 1484-1509,
1997. Presented at FOCS’94.
[72] Steane AM. “Error correcting codes in quantum theory”, Physical Review Letters,
vol. 77 no. 5, pp. 793-797, 1996.
[73] Vandersypen LMK, Steffen M, Breyta G, Yannoni CS, Sherwood MH, Chuang IL.
“Experimental realization of Shor’s quantum factoring algorithm using nuclear mag-
netic resonance”, Nature, vol. 414, pp. 883-887, 2001.
[74] Wiesner S. “Conjugate coding”, SIGACT News, vol. 15 no. 1, pp. 78-88, 1983. Ori-
ginal manuscript written circa 1970.
[75] Yao AC. “Theory and applications of trapdoor functions”, 23rd IEEE Symposium on
Foundations of Computer Science (FOCS), IEEE Computer Society, 1982, pp. 80-91.
[76] Zippel R. “Probabilistic algorithms for sparse polynomials”, EUROSAM ’79: Pro-
ceedings of the International Symposiumon on Symbolic and Algebraic Computation,
Springer-Verlag, 1979, pp. 216-226.
PART II

BASIC SCIENCES
AN OVERVIEW OF SOME MATHEMATICAL
MODELS OF BLOOD RHEOLOGY

Adélia Sequeira1 and João Janela2


1
Dept. Matemática, IST and CEMAT/IST, Universidade Técnica de Lisboa, Av. Rovisco Pais,
1, 1049-001 Lisboa, Portugal; email: adelia.sequeira@math.ist.utl.pt
2
Dept. Matemática, ISEG and CEMAT/IST, Universidade Técnica de Lisboa, Rua do Quel-
has, 6, 1200 Lisboa, Portugal; email: jjanela@iseg.utl.pt

Abstract: Experimental investigations over many years reveal that blood flow exhib-
its non-Newtonian characteristics such as shear-thinning, viscoelasticity and
thixotropic behaviour. The complex rheology of blood is influenced by nu-
merous factors including plasma viscosity, rate of shear, hematocrit, level of
erythrocytes aggregation and deformability. Hemodynamic analysis of blood
flow in vascular beds and prosthetic devices requires the rheological beha-
viour of blood to be characterized through appropriate constitutive equations
relating the stress to deformation and rate of deformation.
The objective of this paper is to present a short overview of some macro-
scopic constitutive models that can mathematically characterize the rheology
of blood and describe its known phenomenological properties. Some numer-
ical simulations obtained in geometrically reconstructed real vessels will also be presented to illustrate the hemodynamic behaviour using Newtonian and
non-Newtonian inelastic models under a given set of physiological flow con-
ditions.

Keywords: Blood rheology, shear-thinning fluid, generalized Newtonian model, vis-


coelasticity, pressure pulse, wall shear stress.

1. INTRODUCTION
Mathematical and numerical models together with computer simulations
are playing an increasingly relevant role in biology and medicine. Applica-
tions to blood flow in the human circulatory system and to its inherent patho-
logies, are certainly one of the major mathematical challenges of the coming
decades.
Blood performs the essential functions of delivering oxygen and nutrients
to all tissues, it removes waste products and defends the body against infec-
tion through the action of antibodies. Blood is a multi-component mixture


with complex rheologic characteristics which interacts both mechanically


and chemically with vessel walls, giving rise to complex fluid-structure in-
teraction models whose mathematical analysis is still incomplete and which
are difficult to simulate numerically in an efficient manner. The blood cir-
culation in the cardiovascular system depends not only on the rheology of
blood itself but also on the driving force of the heart and the architecture and
mechanical properties of the vascular system. Hemodynamic factors such as
flow separation, flow recirculation, or low and oscillatory wall shear stress
are now recognised as playing an important role in the localization and de-
velopment of arterial diseases. They can have useful applications in medical
research, surgical planning and therapy to restore blood flow in pathological
organs and tissues. For instance, in the case of atherosclerosis numerous
investigations report that the genesis and the progression of the disease are
related with the locally complex and multi-directional flow field in the vicin-
ity of curvatures, branches and bifurcations of large and medium sized ves-
sels. The combined effects of complex arterial geometry with flow pulsatility
and rheology induce low oscillating wall shear stress, high pressure distri-
bution and an enhanced particle residence time in flow separation and flow
recirculation zones, resulting in a locally distributed mass transfer (see e.g.
[6, 34, 41]).
In contrast to vessel obstruction resulting from atherosclerotic disease,
aneurysmal disease results in vessel enlargement and in some cases rupture.
It is currently believed that the most important factors in the genesis of ab-
dominal or cerebral saccular aneurysms (found in and about the circle of
Willis) are congenital defects in the artery along with the thrust of pulsatile
blood flow at these weak branched or bifurcating points (see e.g. [9, 56]).
Clinically relevant hemodynamic parameters, including pressure, velo-
city, blood flow patterns and shear stress, can be directly or indirectly quanti-
fied. Experimental measurements of blood flow velocity and pressure fields
are very important in the diagnosis and surgical planning therapies of patients
with congenital and acquired cardiovascular diseases. In vivo and in vitro ex-
perimental methods for quantifying these hemodynamic parameters involve
both invasive and non-invasive techniques such as intra-vascular ultrasound
probes [33], electromagnetic flow probes [45] or Magnetic Resonance Ima-
ging (MRI) [52]. The corresponding collected data are accurate enough for
quantification of some aspects of the arterial diseases but are very sensitive
to disturbing factors. This results in difficult interpretations in most relevant
cases.
The development of effective and accurate numerical simulation tools to
better understand local hemodynamics can play a crucial role in this pro-
cess. Besides their employment in medical research, numerical models of
vascular flows can provide a virtual experimental environment to be used as

a training system. For instance, techniques currently used to open narrowed


atherosclerotic arteries are angioplasty (also called balloon angioplasty) and
vascular stenting, which are minimally invasive procedures performed by an
interventional radiologist to improve blood flow in the body’s arteries. In
the angioplasty procedure, the physician threads a balloon-tipped catheter (a
thin, plastic tube) to the site of a narrowed or blocked artery and then inflates
the balloon to open the vessel. The balloon is then deflated and removed
from the artery. Vascular stenting, which is often performed at the same
time as an angioplasty, involves the placement of a small wire mesh tube
called a stent in the newly opened artery. This may be necessary after some
angioplasty procedures to prevent recoil of the vessel after removal of the
balloon. Currently there is much excitement in the cardiology community
about drug–eluting stents which are a promising new treatment for coron-
ary artery disease. This ingenious therapy involves coating the outer aspect
of a standard coronary stent with a thin polymer containing medication that
is released after implantation, dramatically decreasing the chance of resten-
osis at the site of treatment. However, these medical techniques are largely empirical: they are related to the specific patient and their success depends mostly on the surgeon’s decision and practice. Currently, using intravascular ultrasound, it is possible to obtain the pressure waves downstream and upstream of the vascular constriction, as well as the velocity profiles of blood flow in the arteries, and to build a 3D model of the local arterial circulation. It
is also possible to obtain data from patients with different diseases, such as
variability in heart rate and reflex control of the circulation, both baro and
chemoreflex. These data sets can be used to validate relevant hemodynamic
flow quantities and generate metrics of disease state that will provide the
design of algorithms for patient-specific treatment strategies. The outcome of this new approach to clinical practice is the development of computer-aided surgical planning in a grid-supported virtual environment, referred to as “predictive medicine” (see [46]). These are just a few examples of med-
ical problems and procedures where computer simulations based on models
of blood rheology play a major role.
Blood is a suspension of cellular deformable components (red blood cells,
white blood cells and platelets) in plasma containing proteins, lipids, elec-
trolytes and other matter. The study of blood flow in the vascular system
is complicated in many respects and thus simplifying assumptions are of-
ten made. In the large vessels (1–3 cm in diameter), where shear rates are high enough, it is reasonable to assume that blood has a constant viscosity and a Newtonian behaviour. Numerical blood flow studies in these vessels are usually based on the Navier–Stokes equations with an appropriate constant reference viscosity. However, in smaller vessels (arteries and arterioles, or veins and venules, with diameters ranging from 0.2 mm to 1 cm) or in some diseased

conditions (like hypertension or atherosclerosis, among others) the presence of the cells induces low shear rates (0.1 s⁻¹) and blood exhibits remarkable non-Newtonian properties, like shear-thinning viscosity and viscoelasticity, mainly due to red blood cell aggregation and deformability, as reported by many authors (see details below, in Section 2). At the smallest level (capillaries) blood can no longer be modelled as a homogeneous fluid, since the dimensions of the particles are now of the same order as those of the vessels and the effect of wall permeability also becomes important.
In this work we assume that all macroscopic length and time scales are
sufficiently large compared to those of blood formed elements. Thus the
models presented here would not be appropriate in the capillary network.
For an overview of hemorheology in the microcirculation we refer the reader
to the review articles of Popel and Johnson [36], Pries and Secomb [37] and
Cristini and Kassab [17].
The word hemorheology was introduced by A. L. Copley in a survey on
rheology of blood in 1952 [15]. He defined the term as follows: ‘Hemorhe-
ology is concerned with the deformation and flow properties of cellular and
plasmatic components of blood in macroscopic, microscopic and submicro-
scopic dimensions, and in the rheological properties of the vessel structure
which directly comes in contact with blood’. Additionally, A. L. Copley
and G. Seaman [16] widened this definition saying that: ‘Hemorheology is
also the study of the interaction of blood or its components and the vascu-
lar system with added foreign materials, such as drugs, plasma expanders or
prosthetic devices. Thus hemorheology is the study of how the blood and the
blood vessels can function and interact as parts of the living organism’.
Clinical hemorheology deals with pathological hemorheological abnor-
malities and has developed based on the evidence that the change of rheolo-
gical properties of blood and its components might be the fundamental cause
of many cardiovascular diseases. Hemorheological alterations can easily be
considered as a result (or an indicator) of insufficient circulatory function.
Alternatively, alterations in hemorheological parameters may affect tissue
perfusion and be manifested as circulatory problems. Basically, pathologies
with hematological origin like leukemia, hemolytic anemia, thalassemia or
pathologies associated with the risk factors of thrombosis and atherosclero-
sis like myocardial infarction, hypertension, strokes or diabetes are mainly
related to disturbances of local homeostasis. Therefore the mathematical
and numerical study of powerful, yet simple, constitutive models that can
capture the rheological response of blood over a range of flow conditions is
ultimately recognised as an important tool for clinical diagnosis and thera-
peutic planning (see e.g. [29]).
This paper is organized as follows. In the next sections we give a short de-
scription of the main rheological properties of blood, followed by an outline

of various macroscopic constitutive models based on its mechanical prop-


erties. Finally, we present the results of some numerical simulations, using a finite element approach with non-Newtonian inelastic fluid models, to show
the importance of the rheology of blood under a given set of physiological
flow conditions.

2. BLOOD MORPHOLOGY AND VISCOMETRIC


PROPERTIES
Blood is a multi-component mixture with complex rheological character-
istics. It consists of multiple cellular elements: (i) red blood cells – RBCs
(erythrocytes), the most numerous of the formed elements (about 98%) are
tiny biconcave discoid particles, filled with a fluid, which are involved in
oxygen and carbon dioxide transport; (ii) white blood cells – WBCs (leuk-
ocytes) are much less numerous than RBCs, they have nuclei, and are clas-
sified into two groups: granulocytes and agranulocytes. Leukocytes are in-
volved in the organism’s defence against invasion by bacteria and viruses
and, as well as erythrocytes are formed from stem cells in the bone marrow;
(iii) platelets (thrombocytes), small discoid cell fragments containing various
chemicals such as serotonin, thrombin and ADP, are much smaller than erythrocytes (approximately 6 µm³ in size, as compared to 90 µm³) and form a small
fraction of the particulate matter of human blood (around 3% by volume).
Platelets get activated due to several biochemical reactions and mechanical
processes and are involved in the formation of clot cascades (see more de-
tails in Section 2.2), but they have a negligible effect on the mechanics of
normal blood, compared to erythrocytes.
The cellular elements are suspended in an aqueous polymer solution, the
plasma, containing electrolytes as well as organic molecules such as meta-
bolites, hormones, enzymes, antibodies and other proteins and representing
approximately 55% of the blood volume. Plasma transports nutrients as well
as wastes throughout the body.
White blood cells are normally roughly spherical in shape, with the dia-
meters ranging from about 7–22 µm. Rather little is known of the mech-
anical properties of the WBCs. It has been argued that they are stiffer than
RBCs, because in a collision between a red and a white cell in flowing blood,
it is the former which mainly deforms.
We focus particular attention on the red blood cells because they are the
only cells which significantly influence the mechanical properties of blood.
They do this because they are present in very high concentration (approximately 5 × 10⁶/mm³), comprising about 40 to 45% of its volume (hematocrit).
The rheology of blood is therefore primarily determined by the behaviour of
the erythrocytes at different shear rates.

2.1 Blood viscosity and viscoelasticity


When a suspension of randomly distributed particles (be they rigid, de-
formable or fluid) is flowing in an apparatus whose dimensions are large
compared to those of the particles and the space between them, the mixture
can be regarded as a homogeneous fluid. By studying the mechanical prop-
erties of such a suspension, we can see what determines its viscosity and
whether it has a Newtonian (shear stress proportional to the rate of shear) or
non-Newtonian behaviour.
As already mentioned, red blood cells are highly flexible biconcave discs (some 8.5 µm in diameter) with a very thin membrane (2.5 µm maximum thickness), filled with a saturated solution of hemoglobin, which are capable of extreme distortion without changing their surface area, as when they travel along capillaries with diameters smaller than their own. Another phenomenon closely linked to the deformability of the RBCs is the rotation of the membrane around the liquid interior in a shear flow (tank-treading movement, [7]). At sufficiently low shear rates (smaller than 10 s⁻¹) RBCs tend to aggregate, attaching side by side and forming long clusters called rouleaux, see Figure 1. Under no-flow conditions, the time scale for the formation of these aggregates is 60 s. If the shear rate is decreased even further, to 1 s⁻¹, the rouleaux form long column-like structures, inducing an additional increase of the viscosity. The time required for building a network
is even longer than for rouleaux formation. This mechanism is still incom-
pletely understood. It appears that the erythrocytes attract one another and
the process depends in particular on the influence of bridging macromolec-
ules, especially fibrinogen and globulins in the plasma. The process will
not occur in their absence and it occurs progressively faster with increas-
ing concentration of these macromolecules [5]. If shear rate is increased,
and is high enough, the rouleaux break up, RBCs deform into an infinite
variety of shapes without changing volume, they align with the flow field
and tend to slide upon plasma layers formed in between. This induces the
decrease of the blood viscosity. Deformability, orientation and aggregation
of red blood cells result in shear-thinning viscosity of blood (Figure 2). It
should be added, however, that other non-Newtonian phenomena occur in
small sized vessels, such as the Fåhraeus-Lindqvist effect [20] (cell alignment and plasma skimming), the Fåhraeus effect [19] (dynamic reduction of hematocrit in small vessels) and sedimentation, reducing the apparent vis-
cosity of blood in the microvessels (see e.g. [17, 36, 37]).
Since blood cells are essentially elastic membranes filled with a fluid, it
seems reasonable, at least under certain flow conditions, to expect blood to
behave like a viscoelastic fluid. At low shear rates RBCs aggregate and are
‘solid-like’, being able to store elastic energy that accounts for the memory

Figure 1. Profile view of erythrocytes forming rouleaux (courtesy of Prof. M.V. Kameneva, University of Pittsburgh, USA).
Figure 2. Viscosity in steady shear of normal blood, blood with hardened erythrocytes (no deformation) and blood in a Ringer solution (no aggregation), from Chien et al. [12] and Caro et al. [6].

effects in blood. Dissipation is primarily due to the evolution of the RBC


networks and, given the paucity of data on temperature effects, the internal
energy is assumed to depend only on the deformation gradient. At high
shear rates, the RBCs disaggregate forming smaller rouleaux, and later in-
dividual cells, that are characterized by distinct relaxation times. RBCs be-
come ‘fluid-like’, losing their ability to store elastic energy and the dissip-
ation is primarily due to the internal friction. Upon cessation of shear, the
entire rouleaux network is randomly arranged and may be assumed to be iso-
tropic with respect to the current natural configuration. Thurston (see [48])
was among the earliest to recognize the viscoelastic nature of blood and that
the viscoelastic behaviour is less prominent with increasing shear rate. He
investigated viscoelastic properties of blood in the linear viscoelastic regime
and measured a significant elastic component in oscillatory blood flow. He
also measured the shear rate dependence of the viscoelastic properties of
blood at a given frequency [49]. From these measurements, the non-linear
viscoelastic properties of blood are evident.
It has also been experimentally observed that aggregation, breakdown of
rouleaux and orientation of RBCs take place over different non-zero time
scales. McMillan et al. [30] investigated the transient properties of blood in
viscometric flow and measured shear stress generated by blood at different
shear rates. These authors verified a delayed relaxation of shear stress but
they could not detect any measurable first normal stress differences in blood.
Based on these results, blood can also be considered thixotropic at low shear
rates [25].
The rheological behaviour of blood is mainly governed by the concen-
tration and the properties of the red blood cells, as mentioned above. The
deformability, orientation and aggregation of RBCs induce the specific be-

haviour of blood in simple shear flow. Using viscometers, a uniform velocity


field is generated and by measuring the flow induced torque, the viscomet-
ric properties of blood can be determined. However, due to inhomogeneities of blood and its complex behaviour, the determination of the viscometric properties of blood is complicated and results from the literature should be interpreted with caution. For an extended review on the technical problems
that arise when the properties of blood are determined through viscometry,
see e.g. Cokelet [14].

2.2 Platelet activation and blood coagulation


While there has been a considerable research effort in blood rheology, the
constitutive models have thus far focused on the aggregation and deformab-
ility of the RBCs, ignoring the role of platelets in the flow characteristics.
However they are by far the most sensitive of all the components of blood
to chemical and physical agents, and play also a significant role in blood
rheology.
The hemostatic system is maintained in a state of readiness to respond
immediately to vascular injuries due to the perforation of the vessel wall, ac-
tivation of endothelium by chemicals or inflammatory processes. This high
reactivity can inevitably result in the initiation of clotting and the threat of
thrombosis. Blood platelets participate in both hemostasis and thrombosis
by adhering to damaged vessels and by getting activated releasing chemic-
als (activators responsible for the blood coagulation cascade) into the blood
plasma. They can induce other platelets to become activated and to aggregate
and, once the activated platelets bind with the sub-endothelium, the aggreg-
ate interacts with fibrin to form irreversible hemostatic plugs (thrombi). Prior
to this, however, platelet aggregates that are formed by this process can break
up (when the concentration of activators exceeds a certain value), damaging
the platelets and causing aggregation at locations different from the site of
damage. Arterial occlusion, acute myocardial infarction, venous thrombosis
and most strokes are some of the pathological processes related to platelet
activation.
Understanding these processes is an issue of major medical importance.
Numerous clinical and experimental studies recognized that thrombus form-
ation occurs not in regions of parallel flow, but primarily in regions of stagna-
tion point flows, within blood vessel bifurcations, branching and curvatures.
Moreover, internal cardiovascular devices such as prosthetic heart valves,
ventricular assisting devices and stents, generally harbor high hemodynamic
shear stresses that can cause platelet activation. Thrombotic deposition en-
countered in these devices is a major cause of their failure and of the patho-
logical effects mentioned above. A reliable model that can predict regions of

platelet activation and deposition (either in artificial devices or in arteries),


has the potential to help optimize design of such devices and also identify
regions of the arterial tree susceptible to the formation of thrombotic plaques
and possible rupture in stenosed arteries.
The mechanism of platelet activation and blood coagulation is quite com-
plicated and not yet completely well understood. Recently, Kuharsky and
Fogelson [28] have developed a model consisting of 59 first order ODEs that
combines a fairly comprehensive description of coagulation biochemistry,
interactions between platelets and coagulation proteins and effects of chem-
ical and cellular transport. They assume that all reactions occur in a thin layer
shell above the injured surface and the constants and concentrations used in
the models are only based on those available in the literature (no curve fit-
ting was done). This model, as well as previous work developed along these
lines (see e.g. [21, 27, 55]) can be considered as an important achievement
to capture many of the biochemical aspects of the problem. However, they
do not allow for the realistic hydrodynamical and rheological characteristics
of blood flow in vessels whose geometry is made complex by the presence
of wall-adherent platelets or atherosclerotic plaques. A phenomenological
model introduced by Anand and Rajagopal [1, 2] can be considered as the
first approach to address this oversight. This last paper features an extensive
bibliography on the subject.

3. BLOOD CONSTITUTIVE MODELING


In large and medium sized vessels blood can be modelled as a homogeneous incompressible Newtonian fluid, with flow behaviour described by the
time-dependent Navier–Stokes equations. These equations are derived from
the conservation of linear momentum and mass (incompressibility condition)
and, in a general form, they read

ρ ∂u/∂t + ρ(u · ∇)u = ∇ · σ + f,   ∇ · u = 0   in Ω_t ⊂ ℝ³, t ∈ (t₀, T)   (1)

where Ω_t is an open bounded set, representing the interior of a vessel at time t, with a sufficiently regular boundary ∂Ω_t composed of Γ_t^w, Γ_t^in and Γ_t^out, the vessel lateral wall, inlet boundary and outlet boundary, respectively. The convective term is

(u · ∇)u = ∑_{j=1}^{3} u_j ∂u/∂x_j.   (2)

Here u denotes the flow velocity vector, ρ is the constant fluid density and
f are the external body forces per unit volume (e.g. gravity). The Cauchy

stress tensor σ is expressed as the combination of an isotropic pressure p


and the viscous contribution

σ = −pI + 2ηD (3)

where η is a constant dynamic viscosity and D is the rate of deformation tensor defined by

D = (1/2)(∇u + (∇u)ᵀ).   (4)
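As a small numerical illustration of (3)-(4) (a sketch of ours; the velocity gradient and parameter values are merely indicative), the stress of a Newtonian fluid in simple shear can be evaluated as follows.

import numpy as np

def newtonian_stress(grad_u, eta, p):
    # Cauchy stress sigma = -p I + 2 eta D, with D = (grad_u + grad_u^T) / 2, cf. equations (3)-(4).
    D = 0.5 * (grad_u + grad_u.T)
    return -p * np.eye(3) + 2.0 * eta * D

# Simple shear u = (k*y, 0, 0): the only nonzero velocity derivative is du1/dx2 = k.
k = 100.0                        # shear rate in 1/s (illustrative value)
grad_u = np.zeros((3, 3))
grad_u[0, 1] = k
sigma = newtonian_stress(grad_u, eta=0.00345, p=0.0)   # plasma-like viscosity in Pa.s (cf. Table 1)
print(sigma)                     # off-diagonal entries sigma_12 = sigma_21 = eta*k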

The system of equations (1) must be closed by imposing appropriate ini-


tial and boundary conditions. This usually reduces to prescribing either the
velocity field or the tangential and normal components of the stress vector on Γ_t^in and Γ_t^out. We prefer to consider the flow as being driven by a pressure drop, but this must be done carefully, since only for fully developed outflow velocities does a prescribed normal component of the stress vector (together with zero tangential velocity) correspond to a prescribed pressure.
In cases where the vessel is not assumed to be rigid, these equations are generally rewritten using the ALE (Arbitrary Lagrangian-Eulerian) formulation, which is more suitable for moving domains. When considering the full fluid-structure interaction problem with the vessel walls, a model must be specified for the structure, together with suitable interface conditions at the fluid-solid interface.
As already pointed out blood is essentially a non-Newtonian fluid and the
constitutive equations for blood must incorporate the non-linear viscometric
properties of blood previously discussed. In this section we present a review
on the macroscopic constitutive models that can mathematically characterize
the rheology of blood and describe its known phenomenological properties,
especially the shear-thinning and viscoelastic behaviour. The corresponding
non-Newtonian constitutive equations are subdivided into generalized New-
tonian or inelastic models and viscoelastic models.

3.1 Generalized Newtonian models


We start from the constitutive assumption that the Cauchy stress tensor
σ only depends on the fluid mass density and the velocity gradient, mean-
ing that the current state of stress depends only on the velocity gradient at
the current time and not on the history of deformations the fluid may have
undergone in the past. If we further demand invariance under a superposed
rigid motion, using a representation theorem for isotropic symmetric tensor
functions, it can be shown that the most general form σ can assume is

σ = φ₀ I + φ₁ D + φ₂ D²   (5)

where D is the symmetric part of the velocity gradient (4) and φ₀, φ₁, φ₂ depend on the density ρ and on the three principal invariants of D, namely I_D = tr(D), II_D = ((tr D)² − tr(D²))/2 and III_D = det(D). Using the same arguments for incompressible fluids, for which the stress tensor only depends on the velocity gradient, it can be seen that the stress tensor must be of the form

σ = αI + φ₁ D + φ₂ D²   (6)
where α is a Lagrange multiplier associated with the incompressibility constraint and φ₁, φ₂ only depend on II_D and III_D. These fluids are generally known as Reiner-Rivlin fluids. If φ₂ = 0 and φ₁ is constant, we recover the classical Newtonian fluids. On the other hand, Reiner-Rivlin fluids with φ₂ ≠ 0 do not match any experimental results under simple shear. Finally, if we consider that the dependence of φ₁ on III_D is negligible, we obtain the so-called generalized Newtonian fluids. Thermodynamic considerations and the analysis of their behaviour under simple shear (and other viscometric flows) lead to the final form of σ

σ = −pI + 2η(γ̇)D   (7)

where γ̇ := √(2D : D) is the shear rate. Generalized Newtonian models differ only in the functional dependence of the non-Newtonian viscosity η on the shear rate. Each model involves a number of parameters that allow fitting to experimental data for the fluid under analysis.
Table 1 summarizes some of the most common generalized Newtonian
models that have been considered for the shear-dependent viscosity of whole
human blood (see Cho and Kensey [13]).
In these models the constants η₀ and η∞ are the asymptotic viscosities at zero and infinite shear rates, i.e.

η₀ = lim_{γ̇→0} η(γ̇),   η∞ = lim_{γ̇→∞} η(γ̇),

n is the power index and λ a parameter, determined by numerical fitting of experimental data.
Attempts to recognize the shear-thinning nature of blood were initiated by Chien et al. [10, 11] in the 1960s. Empirical models like the power-law (or the Walburn-Schneck power-law [54], with constants related to hematocrit and the content of protein minus albumin), Cross [18], Carreau [8], Carreau–Yasuda or modified models [53] were seen to agree well in their predictions and were preferred over the simple power-law model, which has an unbounded viscosity at zero shear rate. The main advantage of simpler models like the power-law is that exact solutions are available in some geometries and flow conditions, providing natural benchmarks for the numerical codes. For a recent survey and experimental tests on several inelastic constitutive models for blood, see [58].

Table 1. Comparison of various Generalized Newtonian models for blood.

Model non-Newtonian viscosity model constants for blood

Power-Law η(γ̇ ) = k γ̇ n−1 n = 0.61, k = 0.42

η(γ̇ ) − η∞ sinh−1 (λγ̇ ) η0 = 0.056P as, η∞ = 0.00345P as


Powell-Eyring =
η0 − η∞ λγ̇ λ = 5.383s

η0 − η∞ η0 = 0.056P as, η∞ = 0.00345P as


Cross η(γ̇ ) = η∞ +
1 + (λγ̇ )m λ = 1.007s, m = 1.028

η0 − η∞ η0 = 0.056P as, η∞ = 0.00345P as


Modified Cross η(γ̇ ) = η∞ +
(1 + (λγ̇ )m )a λ = 3.736s, m = 2.406, a = 0.254

η(γ̇ ) − η∞ η = 0.056P as, η∞ = 0.00345P as


Carreau = (1 + (λγ̇ )2 )(n−1)/2 0
η0 − η∞ λ = 3.313s, n = 0.3568

η(γ̇ ) − η∞ η = 0.056P as, η∞ = 0.00345P as


Carreau–Yasuda = (1 + (λγ̇ )a )(n−1)/a 0
η0 − η∞ λ = 1.902s, n = 0.22, a = 1.25
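As a quick sanity check of Table 1, the following sketch (ours; only the Cross and Carreau models are shown) evaluates the corresponding viscosity functions with the constants listed above and confirms the shear-thinning decay from η₀ towards η∞.

import numpy as np

ETA0, ETAINF = 0.056, 0.00345          # asymptotic viscosities in Pa.s (Table 1)

def eta_cross(gdot, lam=1.007, m=1.028):
    # Cross model viscosity, with the Table 1 constants for whole blood.
    return ETAINF + (ETA0 - ETAINF) / (1.0 + (lam * gdot) ** m)

def eta_carreau(gdot, lam=3.313, n=0.3568):
    # Carreau model viscosity, with the Table 1 constants for whole blood.
    return ETAINF + (ETA0 - ETAINF) * (1.0 + (lam * gdot) ** 2) ** ((n - 1.0) / 2.0)

for gdot in np.array([0.1, 1.0, 10.0, 100.0, 1000.0]):   # shear rates in 1/s
    print(f"{gdot:8.1f}  Cross: {eta_cross(gdot):.4f}  Carreau: {eta_carreau(gdot):.4f}")
# Both curves decrease from about eta_0 towards eta_inf as the shear rate grows (shear-thinning).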

Also, the belief that blood demonstrates a yield shear stress led to one of the simplest constitutive models for blood, Casson's equation (see [44]), which is valid only over a small range of low shear rates and in steady flow. The evidence for a yield stress in blood is circumstantial and there is no consensus about its value. However, none of the above homogenized models is capable of describing the viscoelastic response of blood.

3.2 Viscoelastic models


A simple way to account for the elastic effects in a non-Newtonian fluid is to consider the constitutive equation for the Maxwell fluid, given by

S + λ₁ S^∇ = 2µ₀ D,   σ = −pI + S   (8)

where S is the extra-stress tensor and the superscript ∇ denotes the upper-convected derivative of a tensor field,

S^∇ = ∂S/∂t + (u · ∇)S − S · ∇u − (∇u)ᵀ · S.   (9)

The constant λ₁ > 0 is the stress relaxation time (the larger λ₁, the slower the relaxation) and the material constant µ₀ is the (zero shear rate) viscosity coefficient.
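In steady, homogeneous simple shear the time and convective derivatives in (9) vanish, so (8) reduces to a linear algebraic system for the components of S. The sketch below is ours, with purely illustrative parameter values and the velocity-gradient convention implied by (9); it solves that system numerically and recovers the classical viscometric functions of the Maxwell model: shear stress µ₀γ̇ and first normal stress difference 2µ₀λ₁γ̇².

import numpy as np

lam1, mu0, gdot = 0.06, 0.004, 100.0     # illustrative relaxation time (s), viscosity (Pa.s), shear rate (1/s)

# Simple shear u = (gdot*y, 0, 0); here grad_u[i, j] = du_j/dx_i, consistent with the form of (9).
G = np.zeros((3, 3))
G[1, 0] = gdot
D = 0.5 * (G + G.T)

def steady_shear_operator():
    # Build the 9x9 matrix of the linear map S -> S - lam1*(S @ G + G.T @ S) acting on vec(S).
    A = np.zeros((9, 9))
    for k in range(9):
        E = np.zeros(9)
        E[k] = 1.0
        E = E.reshape(3, 3)
        A[:, k] = (E - lam1 * (E @ G + G.T @ E)).ravel()
    return A

# Solve S - lam1*(S@G + G.T@S) = 2*mu0*D, i.e. equation (8) in steady homogeneous simple shear.
S = np.linalg.solve(steady_shear_operator(), (2.0 * mu0 * D).ravel()).reshape(3, 3)
print("shear stress S_12 =", S[0, 1], " (expected mu0*gdot =", mu0 * gdot, ")")
print("first normal stress difference N1 =", S[0, 0] - S[1, 1],
      " (expected 2*mu0*lam1*gdot**2 =", 2 * mu0 * lam1 * gdot**2, ")")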

A more general class of rate-type models, the so-called Oldroyd-type models, can be defined by

S + λ₁ S^∇ = 2µ₀ (D + λ₂ D^∇)   (10)

where λ₂ is the retardation time, with 0 ≤ λ₂ < λ₁.


The computational approach makes use of a decomposition of the total
extra-stress tensor S into its non-Newtonian (polymeric) S 1 and Newtonian
(solvent) S 2 parts such that

S = S1 + S2. (11)

The corresponding stress relations become



S1 + λ1 S1^∇ = 2µ1 D,          (12)
S2 = 2µ2 D,                    (13)
where µ1 is the elastic viscosity and µ2 the Newtonian viscosity. It can be
shown that
µ0 = µ1 + µ2 and λ2 = µ2 λ1 /µ0 . (14)
If λ2 = 0 the model reduces to the upper-convected Maxwell fluid (8), while
if λ1 = λ2 = 0 it is a purely Newtonian fluid (3) with viscosity µ0 .
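To make the roles of the relaxation time λ1 and the elastic viscosity µ1 concrete, the following minimal Python sketch (added here only as an illustration; the sign convention adopted for ∇u and the parameter values are assumptions, not taken from the text) integrates the polymeric stress equation (12) in start-up of homogeneous simple shear, where the convective term vanishes.

```python
import numpy as np

# Start-up of steady simple shear for the polymeric stress S1 of Eq. (12):
#   S1 + lam1 * (dS1/dt - L @ S1 - S1 @ L.T) = 2 * mu1 * D,
# with L = grad u (convention L_ij = du_i/dx_j) and D = (L + L.T)/2.
# For a homogeneous flow the term (u . grad)S1 vanishes.
lam1, mu1 = 0.06, 0.004       # relaxation time [s], elastic viscosity [Pa s] (illustrative)
gamma_dot = 100.0             # imposed shear rate [1/s]

L = np.zeros((3, 3))
L[0, 1] = gamma_dot           # simple shear: u = (gamma_dot * y, 0, 0)
D = 0.5 * (L + L.T)

S1 = np.zeros((3, 3))         # stress-free initial condition
dt, t_end = 1e-4, 0.5
for _ in range(int(t_end / dt)):
    # explicit Euler step of dS1/dt = (2*mu1*D - S1)/lam1 + L @ S1 + S1 @ L.T
    S1 = S1 + dt * ((2.0 * mu1 * D - S1) / lam1 + L @ S1 + S1 @ L.T)

print("shear stress S1_xy:", S1[0, 1], "-> mu1*gamma_dot =", mu1 * gamma_dot)
print("normal stress N1  :", S1[0, 0] - S1[1, 1],
      "-> 2*mu1*lam1*gamma_dot^2 =", 2 * mu1 * lam1 * gamma_dot ** 2)
```

The printed steady values, µ1γ̇ for the shear stress and 2µ1λ1γ̇² for the first normal stress difference, are the classical signatures of the upper-convected Maxwell/Oldroyd-B family.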
By substituting relations (11) and (13) into the constitutive equation (10)
and taking into account the conservation of linear momentum and mass, the
equations of motion of an Oldroyd-B fluid can be written as
 
ρ ( ∂u/∂t + (u · ∇)u ) − µ2 Δu + ∇p = ∇ · S1,    ∇ · u = 0
                                                                    (15)
S1 + λ1 ( ∂S1/∂t + (u · ∇)S1 − S1 · ∇u − (∇u)^T · S1 ) = 2µ1 D

in Ω_t ⊂ IR³, with t ∈ (t0, T).


The governing equations of an Oldroyd-B model are of mixed parabolic-
hyperbolic type. To close the system initial and boundary conditions must
be given. In this case the boundary conditions are the same as for the
Navier–Stokes equations, supplemented by the specification of the stress
components at the inlet boundary. Usually the constitutive equations of
non-Newtonian viscoelastic fluids of differential or rate type lead to highly
non-linear systems of partial differential equations of this kind (parabolic-
hyperbolic for unsteady flows and elliptic-hyperbolic for steady flows) and
specific techniques of non-linear analysis, such as fixed-point arguments associated with auxiliary linear sub-problems, are required to study the behaviour
of their solutions in different geometries. The mathematical and numerical
analysis of non-Newtonian fluid models is a very rich field of research, with
many fascinating problems (see e.g. [31, 43]).
As already mentioned in Section 2.1, various attempts have been made to recognize the viscoelastic nature of blood at low shear rates. Thurston [48] pro-
posed a generalized Maxwell model that was applicable to one dimensional
flow simulations and observed later that, beyond a critical shear rate, the
non-linear behaviour is related to the microstructural changes that occur in
blood (see [49, 51]). Quemada [38] also derived a non-linear Maxwell type
model involving a first order kinetic equation used to determine a structural
parameter related to the viscosity. Phillips and Deutsch [35] proposed a
three-dimensional frame invariant Oldroyd-B type model with four constants
which could not capture the shear-thinning behavior of blood throughout the
range of experimental data. Other rate-type constitutive models for describ-
ing blood rheology have been proposed in the recent literature. Yeleswarapu
[57] has obtained a three constant generalized Oldroyd-B model by fitting
experimental data in one-dimensional flows and generalizing such curve fits
to three dimensions. It captures the shear-thinning behaviour of blood over a
large range of shear rates but it has limitations, given that the relaxation times
do not depend on the shear rate, which does not agree with experimental ob-
servations. A variant of this model, which also includes a shear-thinning
viscosity function, has been proposed and studied by Arada and Sequeira [4].
The model recently developed by Anand and Rajagopal [3] in the general
thermodynamic framework of Rajagopal and Srinivasa [40] includes relaxa-
tion times depending on the shear rate and gives good agreement with exper-
imental data in steady Poiseuille flow and oscillatory flow. Finally we also
refer to a recent shear-thinning, viscoelastic and thixotropic model related to
the microstructure of blood, derived by Owens [32]. This model is inspired
by the classical theory of transient networks for viscoelastic fluids, and its predictions compare well with experiments for simple shear flows.

4. SOME NUMERICAL SIMULATIONS


Numerical simulation is an important tool for prediction of non-
Newtonian phenomena. In the last two decades, intensive research has been
performed in this area, mainly for differential and rate-type models [31].
The hyperbolic nature of the constitutive equations is responsible for many
of the difficulties associated with the numerical simulation of viscoelastic
flows. Some factors, including singularities in the geometry, boundary layers in the flow and the dominance of the non-linear terms in the equations, result in numerical instabilities for high values of the Weissenberg number (a non-dimensional number related to the elasticity of the fluid). Numerical
schemes used for solving these complex systems of PDEs must be based on a
deep understanding of the mixed mathematical structure of the equations, in
order to prevent numerical instabilities on problems that are mathematically
well-posed. Discretizations in space are usually performed with Galerkin
methods (Petrov-Galerkin or generalized Galerkin) or by collocation, redu-
cing the problems to finite dimensional spaces. These choices are involved in
the finite element method (conforming, non-conforming, mixed or hybrid),
in the spectral method (Legendre or Chebychev expansion) or in the finite
volume method. Finite difference and fractional-step schemes are generally
used for marching in time (see e.g. [39]). All these methods lead to the solu-
tion of algebraic systems, typically very large, that are solved using direct or
iterative methods. The solution of these algebraic systems often requires the
use of preconditioners that can be regarded as operator projections, multigrid
or domain decomposition methods. The major difficulties in many numer-
ical schemes are related to the large amount of computation involved and to
the loss of convergence or stability. This is the object of active research in
the field.

4.1 Geometric reconstruction and mesh generation


Relevant blood flow simulations must be able to incorporate patient spe-
cific data. Since hemodynamics depends heavily on the geometry, it is im-
portant to run the numerical simulations on realistic geometries coming from
medical imaging. The most common medical imaging technique presently
used to obtain 3D representations of the human body is magnetic resonance
(MR). The images obtained with this technique (MRI) are density plots of su-
cessive cross-sections of the area under investigation. Many algorithms using
for instance levelset theory were developed to identify lines in these cross-
sections, resulting in images like the one shown in Figure 3. Nowadays, fast
and accurate scanning devices (e.g. magnetic resonance) are widely avail-
able for engineering and biomedical applications. The challenge is not to
collect the data but to be able to translate them into something usable in
computer simulations. On the other hand, discrete data resulting from this
image acquisition is usually converted into polygonal surface meshes that of-
ten contain millions of elements. Often this huge number of polygonal elements comes not from the geometric complexity of the organs, arteries or tissues but from an excessive density of data offered by the scanning devices. In this perspective, even though brute-force algorithms like marching cubes are still very popular, a great effort must be made to devise adaptive reconstruction algorithms that capture the essence of the geometric object at lower computational and storage costs. For further details, see Frey [22]
and references therein.
Formally, we can describe the problem of simplifying the initial brute-force mesh in the following way: starting with a bounded and closed set Ω ⊂ IR³ defined by its boundary surface Γ, and assuming an initial meshing Mref(Γ), possibly with an associated metric Href(Γ) to prescribe the size of the elements, the goal is to construct a more suitable mesh M(Γ) for calculations, i.e. with far fewer elements but the same accuracy in the geometric description. The usual procedure is as follows:
1. The initial meshing, obtained by applying some brute-force method to the original medical images, is simplified and optimized within a tolerance envelope, based on a Hausdorff distance supplied by the user (a small sketch of this distance is given after the list). This yields a geometric reference meshing Mref,G(Γ).
2. A geometric piecewise C^1 object is constructed over Mref,G(Γ), defining in this way a representation of the surface Γ.
3. The metric Href,G(Γ) is modified to account for the geometry and desired smoothing effects.
4. The mesh Mref(Γ) is adapted with respect to the modified metric, giving the final mesh M(Γ).
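As a small, self-contained illustration of the tolerance criterion in step 1 (added here for concreteness; the actual simplification algorithms of [22] work on triangulated surfaces rather than raw point sets), the following Python sketch computes a symmetric Hausdorff distance between a fine and a coarse sampling of the same curve.

```python
import numpy as np

def hausdorff(A, B):
    """Symmetric Hausdorff distance between two point clouds A, B (n x 3 arrays)."""
    D = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)   # pairwise distances
    return max(D.min(axis=1).max(), D.min(axis=0).max())

# toy check: a dense and a decimated sampling of the unit circle in the z = 0 plane
t_fine = np.linspace(0, 2 * np.pi, 400, endpoint=False)
t_coarse = np.linspace(0, 2 * np.pi, 40, endpoint=False)
fine = np.stack([np.cos(t_fine), np.sin(t_fine), np.zeros_like(t_fine)], axis=1)
coarse = np.stack([np.cos(t_coarse), np.sin(t_coarse), np.zeros_like(t_coarse)], axis=1)

# the printed value is the width of the tolerance envelope needed to accept
# the coarse sampling as a substitute for the fine one
print(hausdorff(fine, coarse))
```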
As an example we present in Figure 4 the results of this procedure, starting
with a very fine mesh of a human hand and ending up with two coarser
meshes.

Figure 3. Surface reconstruction from a series of parallel cutting surfaces obtained from magnetic resonance. Initial lines are obtained from images using level set methods.

Figure 4. Example of geometric simplification. From the original brute-force discretization to a coarse mesh still showing the main geometric features.

4.2 Finite element method and results


The finite element method is one of the most important numerical methods for approximating the solution of partial differential equations. One of its sig-
nificant advantages over other methods like finite differences, finite volumes
or spectral methods is its high flexibility in dealing with irregular shaped
boundaries. Different techniques have been used to solve the Navier–Stokes
equations with the finite element method (see e.g. Quarteroni and Valli [39],
Temam [47], Girault and Raviart [23], Gresho and Sani [24]). However, the
development of accurate, stable and efficient algorithms is still an important
mathematical research topic.
The finite element approach requires the differential problem to be written
in a weak (variational) form. Let us define two Hilbert spaces V and Q. The
weak or variational formulation of our problem is obtained by multiplying
the governing equations by test functions v ∈ V and q ∈ Q and integrating
by parts. The use of test functions can be seen as describing indirectly the
solution by its effect on them. If we prescribe as boundary condition the normal stress vector s = σ · n on the inflow and outflow boundaries Γin ∪ Γout, together with no-slip boundary conditions on the remaining (wall) part of the boundary, our problem consists in finding u ∈ V and p ∈ Q such that

∫_Ω [ ρ ( ∂u/∂t + (u · ∇)u ) · v + S : ∇v − p ∇ · v ] = ∫_Ω f · v + ∫_{Γin∪Γout} s · v,   ∀v ∈ V          (16)

∫_Ω q ∇ · u = 0,   ∀q ∈ Q.

In the case of Newtonian or generalized Newtonian fluids the extra-stress


tensor S in the second term of equation (15) is explicitly computed from the
velocity gradient, through the constitutive equations (3) or (7). When dealing
with a viscoelastic fluid the extra stress-tensor is obtained as the solution of
a transport-like equation as in (15).
The discretization in time is done by a suitable second order trapezoidal
rule / back-differentiation formula and the discretization in space uses a
standard Petrov–Galerkin method (see e.g. [39]). To apply the Galerkin
method we discretize the spatial domain and introduce two families
{Vh | h > 0} and {Qh | h > 0} of finite element subspaces of V and Q, re-
spectively, satisfying a compatibility condition, the so-called discrete inf-sup
or LBB condition. The solution is approximated by piecewise polynomial
functions on each element of the discretized domain. These polynomials
must be chosen in such a way that the discrete inf-sup condition is fulfilled,
otherwise the locking phenomenon for the velocity field or spurious pressure
modes can occur. For instance, equal-order interpolation for both the velocity and pressure unknowns does not satisfy the inf-sup condition. The most common discretization technique is P2–P1 (piecewise quadratic elements for the velocity and linear elements for the pressure) or P1 iso P2–P1, where
the velocity is linear over each of the four sub-elements obtained by joining
the midpoints of the edges of each pressure element. Since the spaces of
piecewise polynomials are of finite dimension, the substitution of the func-
tions in the weak formulation by their expansions in the basis of the discrete
spaces leads, after the numerical evaluation of the integrals, to a non-linear
system of finite dimension. The resulting system is then linearized, at each
time step, using an iterative Newton-like method. Error bounds can be de-
rived for the numerical solution of this problem, based on the size of the mesh
used to discretize the domain and on the type of finite elements (regularity
across elements and interpolation order).
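The Galerkin mechanics just described (expansion of the solution in a finite element basis, evaluation of the integrals and solution of the resulting algebraic system) can be illustrated in a deliberately simple setting. The following Python sketch (added only as an illustration, far removed from the 3D P2–P1 Navier–Stokes case above) applies P1 elements on a uniform grid to the one-dimensional problem −u'' = f with homogeneous Dirichlet conditions.

```python
import numpy as np

# P1 Galerkin finite elements for -u'' = f on (0,1), u(0) = u(1) = 0.
n = 50                                  # number of interior nodes
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)
f = np.pi**2 * np.sin(np.pi * x)        # manufactured right-hand side

# stiffness matrix K_ij = int phi_i' phi_j' dx (tridiagonal for P1 elements)
K = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h
b = h * f                               # (lumped) load vector int f phi_i dx

u = np.linalg.solve(K, b)               # the algebraic system of the Galerkin method
print(np.max(np.abs(u - np.sin(np.pi * x))))   # small error vs the exact solution sin(pi x)
```

In the Navier–Stokes setting the same steps produce a large non-linear algebraic system, which is the one linearized by the Newton-like iterations mentioned above.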
In the numerical simulations presented here we use vessels reconstructed
from an MRI of the cerebral vasculature. Figures 5–9 show the original
reconstruction (Figure 5) and two extracted pieces: a slightly curved vessel
(Figure 6) and a bifurcation (Figure 7). In Figures 8 and 9 we display the
corresponding meshes.

Figure 5. Geometric reconstruction of the cerebral vasculature.
Figure 6. Portion of the cerebral vasculature featuring a slightly curved vessel.
Figure 7. Portion of the cerebral vasculature featuring a non-planar bifurcation.

The numerical simulations were carried out in the vessel shown in Figure 6, considering pulsatile blood flow. The vessel has an average diameter of 1 cm and an approximate length of 7 cm. We compare the results obtained by modelling blood with a Newtonian and with a Carreau–Yasuda model, in order to study the non-Newtonian viscosity effects. Flow is driven by a pulsatile pressure drop between the extremities of the vessel.
In Figures 10 and 11 it is visible that both models predict approximately
the same Wall Shear Stress (WSS) distribution, with the Newtonian model
yielding slightly higher values as well as larger high WSS regions. This
different behaviour can have a considerable impact, for instance, when the models are used in clinical decisions related to some pathologies such as
the development of aneurysms or atherosclerotic plaque formation.
Figure 8. Meshing of the vessel shown in Figure 6.
Figure 9. Meshing of the bifurcating vessel in Figure 7.
Figure 10. Wall Shear Stress (WSS) for the Newtonian model at time t = 0.22.
Figure 11. Wall Shear Stress (WSS) for the Carreau–Yasuda model at time t = 0.22.

Figure 12 represents the isovalues of the velocity field along the z axis for
both Newtonian and Carreau–Yasuda models taken in three cross sections of
the vessel. We observe different quantitative behaviours, in all cross sections,
when the results for the two models are compared. The Carreau–Yasuda
velocity shows a flattened profile (larger region of higher velocity), reaching
a lower maximum value.
Figure 12. Isovalues of the velocity along the z axis in three different cross sections of
the vessel for the Newtonian model (first row) and the Carreau–Yasuda model (last row) at time
t = 0.94.

Acknowledgments
This research has been partially supported by the Center for Mathematics and its Applications – CEMAT, through FCT's funding program, and by the projects POCTI/MAT/41898/2001 and HPRN-CT-2002-00270 (Research Training Network HaeMOdel of the European Union).

REFERENCES
[1] M. Anand, K. R. Rajagopal, A mathematical model to describe the change in the
constitutive character of blood due to platelet activation, C.R. Mécanique, 330, 2002,
pp. 557-562.
[2] M. Anand, K. Rajagopal and K. R. Rajagopal, A model incorporating some of the
mechanical and biochemical factors underlying clot formation and dissolution in flow-
ing blood, J. of Theoretical Medicine, 5, 2003, pp. 183-218.
[3] M. Anand and K. R. Rajagopal, A shear-thinning viscoelastic fluid model for describ-
ing the flow of blood, Int. J. Cardiovascular Medicine and Science, 4, 2004, pp. 59-68.
[4] N. Arada, and A. Sequeira, Strong Steady Solutions for a generalized Oldroyd-B
Model with Shear-Dependent Viscosity in a Bounded Domain, Mathematical Models
& Methods in Applied Sciences, 13, no. 9, 2003, pp. 1303-1323.
[5] O. K. Baskurt and H. J. Meiselman, Blood rheology and hemodynamics, Seminars in
Thrombosis and Hemostasis, 29, 2003, pp. 435-450.
[6] C. G. Caro, J. M. Fitz-Gerald and R. C. Schroter, Atheroma and arterial wall shear:
observation, correlation and proposal of a shear dependent mass transfer mechanism
of artherogenesis, Proc. Royal Soc. London, 177, 1971, pp. 109–159.
[7] C. G. Caro, T. J. Pedley, R. C. Schroter and W. A. Seed, The Mechanics of the
Circulation, Oxford University Press, Oxford, 1978.
[8] P. J. Carreau, PhD Thesis, University of Wisconsin, Madison, 1968.


[9] I. Chatziprodromoua, A. Tricolia, D. Poulikakosa and Y. Ventikos, Haemodynamics
and wall remodelling of a growing cerebral aneurysm: A computational model, Journal
of Biomechanics, accepted December 2005, in press.
[10] S. Chien, S. Usami, R. J. Dellenback, M. I. Gregersen, Blood viscosity: Influence of
erythrocyte deformation, Science, 157 (3790), 1967, pp. 827-829.
[11] S. Chien, S. Usami, R. J. Dellenback, M. I. Gregersen, Blood viscosity: Influence of
erythrocyte aggregation, Science, 157 (3790), 1967, pp. 829-831.
[12] S. Chien, S. Usami, R. J. Dellenback, M. I. Gregersen, Shear-dependent deformation
of erythrocytes in rheology of human blood, American Journal of Physiology, 219,
1970, pp. 136-142.
[13] Y. I. Cho and K. R. Kensey, Effects of the non-Newtonian viscosity of blood on flows
in a diseased arterial vessel. Part I: Steady flows, Biorheology, 28, 1991, pp. 241-262.
[14] G. R. Cokelet, The rheology of human blood. In: Y. C. Fung and M. Anliker (Eds.), Biomechanics: Its Foundations and Objectives, Ch. 4, Prentice Hall, 1972.
[15] A. L. Copley, The rheology of blood. A survey, J. Colloid Sci., 7, 1952, pp. 323-333.
[16] A. L. Copley and G. V. F. Seaman, The meaning of the terms rheology, biorheology
and hemorheology, Clinical Hemorheology, 1, 1981, pp. 117-119.
[17] V. Cristini and G. S. Kassab, Computer modeling of red blood cell rheology in the
microcirculation: a brief overview, Annals of Biomedical Engineering, 33, n.12, 2005,
pp. 1724-1727.
[18] M. M. Cross, Rheology of non-Newtonian fluids: a new flow equation for pseudo-
plastic systems, J. Colloid Sci., 20, 1965, pp. 417-437.
[19] R. Fåhraeus, Die Strömungsverhältnisse und die Verteilung der Blutzellen im Ge-
fässsystem, Klin. Wschr., 7, 1928, pp. 100-106.
[20] R. Fåhraeus and T. Lindqvist, The viscosity of blood in narrow capillary tubes, Am. J.
Physiol. 96. 1931, pp. 562-568.
[21] A. L. Fogelson, Continuum models of platelet aggregation: formulation and mechan-
ical properties, SIAM J. Appl. Math., 52, 1992, 1089-1110.
[22] P. J. Frey, Génération et adaptation de maillages de surfaces à partir de données ana-
tomiques discrètes, Rapport de Recherche, 4764, INRIA, 2003.
[23] V. Girault and P.-A. Raviart, Finite Element Methods for the Navier–Stokes Equations, Springer-Verlag, Berlin, Heidelberg, New York, Tokyo, 1986.
[24] P. M. Gresho and R. L. Sani, Incompressible Flow and the Finite Element Method,
Vol. 2, John Wiley and Sons, Chichester, 2000.
[25] C. R. Huang, N. Siskovic, R. W. Robertson, W. Fabisiak, E. H. Smith-Berg and A. L.
Copley, Quantitative characterization of thixotropy of whole human blood, Biorhe-
ology, 12, 1975, pp. 279-282.
[26] R. Keunings, A survey of computational rheology, in: Proceedings of the XIIIth Inter-
national Congress on Rheology (D.M. Binding et al. ed.), British Soc. Rheol., 1, 2000,
pp. 7-14.
[27] A. Kuharsky, Mathematical modeling of blood coagulation, PhD Thesis, Univ. of Utah,
1998.
[28] A. Kuharsky, A. L. Fogelson, Surface-mediated control of blood coagulation: the role
of binding site densities and platelet deposition, Biophys. J., 80 (3), 2001, pp. 1050-
1074.
[29] D. O. Lowe, Clinical Blood Rheology, Vol. I, II, CRC Press, Boca Raton, Florida, 1998.
[30] D. E. McMillan, J. Strigberger and N. G. Utterback, Rapidly recovered transient flow
resistance: a newly discovered property of blood, American Journal of Physiology,
253, pp. 919-926.
[31] R. G. Owens and T. N. Phillips, Computational Rheology, Imperial College
Press/World Scientific, London, UK, 2002.
[32] R. G. Owens, A new microstructure-based constitutive model for human blood, J.
Non-Newtonian Fluid Mech., 2006, to appear.
[33] M. J. Perko, Duplex Ultrasound for Assessment of Superior Mesenteric Artery Blood
Flow, European Journal of Vascular and Endovascular Surgery, 21, 2001, pp. 106-
117.
[34] K. Perktold and M. Prosi, Computational models of arterial flow and mass transport,
in: Cardiovascular Fluid Mechanics (G. Pedrizzetti, K. Perktold, Eds.), CISM Courses
and Lectures n.446, Springer-Verlag, pp. 73-136, 2003.
[35] W. M. Phillips, S. Deutsch, Towards a constitutive equation for blood, Biorheology,
12(6), 1975, pp. 383-389.
[36] A. S. Popel and P. C. Johnson, Microcirculation and hemorheology, Annu. Rev. Fluid
Mech., 37, 2005, pp. 43-69.
[37] A. R. Pries and T. W. Secomb, Rheology of the microcirculation, Clinical Hemorhe-
ology and Microcirculation, 29, 2003, pp. 143-148.
[38] D. A. Quemada, A non-linear Maxwell model of biofluids - Application to normal
blood, Biorheology, 30(3-4), 1993, pp. 253-265.
[39] A. Quarteroni and A. Valli, Numerical Approximation of Partial Differential Equa-
tions, Springer-Verlag, Heidelberg, 1994.
[40] K. R. Rajagopal, A. Srinivasa, A thermodynamic framework for rate type fluid models,
J. of Non-Newtonian Fluid Mech., 88, 2000, pp. 207-228.
[41] G. Rappitsch, K. Perktold and E. Pernkopf, Numerical modelling of shear-dependent
mass transfer in large arteries, Int. J. Numer. Meth. Fluids, 25, 1997, pp. 847-857.
[42] M. Renardy, Existence of slow steady flows of viscoelastic fluids with a differential
constitutive equation, Z. Angew. Math. Mech., 65, 1985, pp. 449-451.
[43] M. Renardy, Mathematical Analysis of Viscoelastic Flows, CBMS 73, SIAM, Phil-
adelphia, 2000.
[44] G. W. Scott-Blair, An equation for the flow of blood, plasma and serum through glass
capillaries, Nature, 183, 1959, pp. 613-614.
[45] R. Tabrizchi and M. K. Pugsley, Methods of blood flow measurement in the arterial
circulatory system, Journal of Pharmacological and Toxicological Methods, 44 (2),
2000, pp. 375-384.
[46] C. A. Taylor, M. T. Draney, J. P. Ku, D. Parker, B. N. Steele, K. Wang, C. K. Zar-
ins, Predictive medicine: Computational techniques in therapeutic decision–making,
Computer Aided Surgery, 4(5), 1999, pp. 231-247.
[47] R. Temam, Navier–Stokes Equations, Theory and Numerical Analysis, North Holland,
Amsterdam, 1984.
[48] G. B. Thurston, Viscoelasticity of human blood, Biophys. J., 12, 1972, 1205-1217.
[49] G. B. Thurston, Rheological parameters for the viscosity, viscoelasticity and thixo-
tropy of blood, Biorheology, 16, 1979, pp. 149-162.
[50] G. B. Thurston, Light transmission through blood in oscillatory flow, Biorheology, 27,
1990, pp. 685-700.
[51] G. B. Thurston, Non-Newtonian viscosity of human blood: flow-induced changes in
microstructure, Biorheology, 31(2), 1994, pp. 179-192.
[52] R. Unterhinninghofen, J. Albers, W. Hosch, C. Vahl and R. Dillmann, Flow quantifica-
tion from time-resolved MRI vector fields, International Congress Series, 1281, 2005,
pp. 126-130.
[53] G. Vlastos, D. Lerche, B. Koch, The superimposition of steady and oscillatory shear
and its effect on the viscoelasticity of human blood and a blood-like model fluid,
Biorheology, 34(1), 1997, pp. 19-36.
[54] F. J. Walburn, D. J. Schneck, A constitutive equation for whole human blood, Biorhe-
ology, 13, 1976, pp. 201-210.
[55] N. T. Wang, A. L. Fogelson, Computational methods for continuum models of platelet
aggregation, J. Comput. Phys., 151, 1999, pp. 649-675.
[56] B. J. B. M. Wolters, M. C. M. Rutten, G. W. H. Schurink, U. Kose, J. de Hart and F. N.
van de Vosse, A patient-specific computational model of fluid-structure interaction in
abdominal aortic aneurysms, Medical Engineering & Physics, 27, 2005, pp. 871-883.
[57] K. K. Yeleswarapu, M. V. Kameneva, K. R. Rajagopal, J. F. Antaki, The flow of blood in tubes: theory and experiment, Mech. Res. Comm., 25 (3), 1998, pp. 257-262.
[58] J.-B. Zhang and Z.-B. Kuang, Study on blood constitutive parameters in different blood
constitutive equations, J. Biomechanics, 33, 2000, pp. 355-360.
MATHEMATICAL MODELS IN FINANCE

Maria do Rosário Grossinho


Instituto Superior de Economia e Gestão, Universidade Técnica de Lisboa, Rua do Quelhas,
6, 1200-781 Lisboa, Portugal

Abstract: In this paper we illustrate the interplay between Mathematics and Finance,
pointing out the relevance of stochastic calculus and mathematical model-
ling in some important aspects of modern finance. We present two types of
mathematical models: the binomial asset pricing model and continuous-time
models. We also point out some delicate points of current research.

Keywords: Mathematical Finance, stochastic calculus and modelling, options

1. INTRODUCTION
Mathematics, as the language of science, has always played a relevant role
in the development of knowledge and technology, and nowadays it occupies
a unique place in modern society. The “high-tech” character of modern busi-
ness has increased the need for advanced methods, which rely, to a large
extent, on mathematical techniques. Therefore, Mathematical Finance has
emerged as a flourishing area of modern science, applying profound know-
ledge of pure mathematics to problems of financial economics.
Modelling of risky asset prices and modern option pricing techniques are
important tools for a responsible approach to the trading, asset management
and risk control of complicated financial positions. These subjects are, how-
ever, among the most mathematically complex of all applied areas of finance.
They require in-depth knowledge of mathematical tools that can deal on the
one hand with deterministic behaviours and on the other with some degrees
of uncertainty. That is why the theory of stochastic processes perfectly suits
the needs of financial theory and strategy.
Stochastic financial mathematics is now one of the fastest developing
fields of mathematics and applied mathematics that has very close ties with
economics, and is geared towards the solution of problems appearing every
day in real financial markets. We recall here an extract from the Editorial
paper presented in the First Issue of the First volume of the journal Finance
and Stochastics that Springer-Verlag has been publishing since 1997:

Nearly a century ago, Louis Bachelier published his thesis Théorie de la
Spéculation, Ann. Sci. École Norm. Sup. 3 (1900) [1], in which he in-
vented Brownian motion as a tool for the analysis of financial markets. A.N.
Kolmogorov, in his own landmark work “Über die analytischen Methoden in der Wahrscheinlichkeitsrechnung”, Math. Annalen 104 (1931), pp. 415-458,
credits Bachelier with the first systematic study of stochastic processes in con-
tinuous time. But in addition, Bachelier’s thesis marks the beginning of the
theory of option pricing, now an integral part of modern finance. Thus the year
1900 may be considered as birth date of both Finance and Stochastics. For
the first seven decades following Bachelier, Finance and Stochastics followed
more or less independently. The theory of stochastic processes grew fast and
incorporating classical calculus became a powerful mathematical tool - called
stochastic calculus. Finance lay dormant until the middle of the twentieth cen-
tury, and then was resurrected as an offshoot of general equilibrium theory in
economics. With the work in the late 1960s and early 1970s of Black, Merton,
Samuelson and Scholes, modelling stock prices as geometric Brownian mo-
tion and using this model to study equilibrium and arbitrage pricing, the two
disciplines were reunited. Soon it was discovered how well suited stochastic
calculus with its rich mathematical structure - martingale theory, Itô calculus,
stochastic integration and PDE’s - was for a rigorous analysis of contempor-
ary finance, which would lead one to believe (erroneously) that also these
tools were invented with the application to finance in mind. Since then the
interplay of these two disciplines has become an ever growing research field
with great impact both on the theory and practice of financial markets.

In the next sections, we shall refer to some historical aspects which shed some light on the interplay between Mathematics and Finance
and then we shall introduce some important aspects of modern finance, also
presenting two mathematical models, one discrete in time and the other one
continuous in time.

2. SOME HISTORICAL ASPECTS


Louis Bachelier was the first person to model the dynamics of stock prices
based on random walks and their limit cases. This mathematical concept
was named Brownian motion, after the biologist Robert Brown, due to the
parallel between the random behaviour of asset pricing in the stock market
and the random movement of small particles immersed in a fluid which had
been observed in 1826 by Brown. Five years after Bachelier, in 1905, Albert
Einstein modeled Brownian motion in the physical context, [5]. Einstein ob-
served that, if the kinetic theory of fluids was right, every small particle of
water would receive a random number of impacts of random strength and
from random directions in any short period of time. This random bombard-
ment would explain the phenomenon observed by Brown in the previous
century.
Bachelier had remarkable intuition and insight. His work is considered
quite ahead of its time, not only because in order to be rigorously justi-
fied from the mathematical point of view it needed the development of the-
ories that only took place years later, for instance probability theory with
Kolmogorov, but also because financial markets were not so active as to take
advantage of his studies.
For those reasons, Bachelier’s work had little impact for a long time and
it was only fifty years later that his thesis really came to the limelight after
having been discovered in the MIT library by the economist Paul Samuelson,
Nobel Laureate in Economics in 1970. The impact of Bachelier’s work in
Samuelson’s opinion can be clearly seen in his remark: “Bachelier seems to have had something of a one-track mind. But what a track” [13] (see also
[14]).
Samuelson and also Black, Scholes and Merton, following Bachelier’s
work, introduced geometric Brownian motion, which avoided some drawbacks present in Brownian motion as a model, namely the fact that prices could become negative and that the changes in value were not expressed relative to the value itself. Fischer Black and Myron Scholes [2] and Robert
Merton [12], working independently, obtained as a model for the dynamics
of a European call option the so-called fundamental equation, which will be
mentioned later. On account of that achievement, the Nobel prize in Eco-
nomics was awarded to R. Merton and M. Scholes in 1997, thus also honor-
ing F. Black (who died in 1995).
The above-mentioned models are continuous in time. However, a few years later, Cox, Ross and Rubinstein introduced a model discrete in time, which is known as the binomial model. This model is not only computationally very convenient but also conceptually very clear.
Before ending this paragraph, we refer to several important mathem-
aticians whose works, besides being fundamental for Mathematics and Fin-
ance in general, are intimately related to the contents of the present text:
Norbert Wiener developed a rigorous mathematical theory of Brownian
motion in 1923 [16], using new results of measure theory together with
Fourier analysis. This explains why Brownian motion is also named
Wiener process.
Paul Lévy introduced the concept of martingales in probability theory
in the late 1930s.
Andrey Kolmogorov, among numerous major contributions made in a
whole range of different areas of Mathematics, both pure and applied,
built up probability theory in a rigorous way in the 1930s, providing
the axiomatic foundation on which the subject has been based ever
since.
Kiyosi Itô, considered the father of stochastic integration, was re-


sponsible for a new integral notion in 1944 [8], known as Itô’s integral
and developed the corresponding differential calculus and the analysis
of stochastic differential equations using stochastic calculus.
Joseph L. Doob developed martingale theory extensively, publishing a fundamental work on the subject [4] in 1953.

3. PRICING AND HEDGING – PROBLEM


FORMULATION
We shall focus on two models in order to study theoretical pricing of fin-
ancial derivative securities or, simply, derivatives. A derivative security is
a security whose value depends on the value of the basic underlying assets.
The derivative securities that we shall consider are called options.
Recalling the definition, an option is generally defined as a contract
between two parties in which one party, the holder, has the right but not the
obligation to buy or sell some underlying asset, at an agreed price K, at or
until an assigned time T , called maturity, while the second party, the writer,
has the obligation to sell or buy it if the first party wants to exercise his right.
Call options are contracts that give the holder the right to buy a given asset,
while put options, conversely, entitle the holder to sell. If that right can only
be exercised at maturity, they are called European options; if the holder has
the right to exercise the option at any time before maturity they are named
American options. Having rights without obligations has financial value. So,
option holders must purchase these rights, that is, must pay a premium.
At expiry or maturity, a call option is worthless if ST < K but has the
value ST − K if ST > K. This means in financial terms that its payoff is
(ST − K)+ = max (ST − K, 0) .
If, at maturity, ST > K, then the option buyer, named the holder, obtains
a profit equal to (ST − K) since he can buy the stock at the price K and
sell it immediately at ST . On the other hand, if at maturity ST < K, then
the buyer simply does not exercise his right and his loss is just the premium
paid CT . Starting from Brownian motion, Bachelier derived a formula for
the expectation of the payoff (ST − K)+ of a call option which gives us
information about the value of the reasonable (fair) price to be paid by the
buyer to the writer of a call option at the moment of the contract, that is, as
referred above, the premium.
So, the premium is the “price" of the derivative or, in other words, the
price of the contract. That is, the premium is the amount of money that the
buyer agrees to give to the seller of the derivative at time t = 0 in order to
have the possibility to receive the derivative at date T (the maturity time or
date of expiration) for an exercise price K fixed at t = 0. On the other hand,
the seller of the contract exposes himself to a certain amount of risk at the
maturity time against which he must protect himself. The questions raised
above constitute the two main problems that arise in option theory

What is the fair price of the contract, that is, what is the reasonable
price for the premium?

How can one protect oneself against the financial risk of a potential
loss at the date of expiration?

These problems are known as pricing and hedging, respectively.


To fix ideas, we shall consider European call options from now on. Any
other situation will be clearly pointed out.

4. THE BINOMIAL ASSET PRICING MODEL


Although historically it was created some years after the Black–Scholes model, we shall begin by referring to the binomial asset pricing model, since it combines two relevant characteristics: it is very simple and easy to understand, but it already contains all the concepts involved in option pricing. Besides, it is often used in practice.
Consider trading dates, t0 = 0 and ti such that ti+1 − ti = δt, with i =
0, ..., N. At time t1 , the asset takes the values

S1 = S1,1 = Su   with probability p,
S1 = S1,0 = Sd   with probability (1 − p),

with d < u. At time t2 , state S2 is derived from state S1 in the following
way: S1,1 = Su is followed by states S2,2 = Suu and S2,1 = Sud, while
state S1,0 = Sd is followed by states S2,1 = Sdu and S2,0 = Sdd , and
so on. A state at time tm , Sm, is a sequence of length m of coefficients u
and d multiplying S, that gives the values of the random variable. Using the
hypothesis of Cox, Ross and Rubinstein [3] that d = 1/u, one derives

Sm,j = S · u^j · u^{−(m−j)} = S · d^{−j} · d^{m−j},   with j = 0, 1, 2, . . . , m.          (1)

This allows the investor to associate a binomial tree to his investment. For
example, if we consider five periods, t = 0, . . . , 5, the corresponding tree
will be (listing the nodes reached at each time step):

t = 0:  S
t = 1:  S·u,  S·d
t = 2:  S·u^2,  S,  S·d^2
t = 3:  S·u^3,  S·u,  S·d,  S·d^3
t = 4:  S·u^4,  S·u^2,  S,  S·d^2,  S·d^4
t = 5:  S·u^5,  S·u^3,  S·u,  S·d,  S·d^3,  S·d^5

When the market is arbitrage free between succeeding states of nature, one may construct, node by node, a probability on the tree such that the discounted asset price process is a martingale. More precisely, one introduces two nonnegative numbers pu and pd, with pu + pd = 1, such that

Sm,j e^{rδt} = pu · Sm+1,j+1 + pd · Sm+1,j,

from which one obtains, in explicit form,

pu = (Sm,j e^{rδt} − Sm+1,j) / (Sm+1,j+1 − Sm+1,j) = (e^{rδt} − d) / (u − d),

which represents the risk-neutral probability between times m and m + 1 for the branch of the tree starting at the node (m, j), and which is usually assumed to be independent of the node.
Furthermore, one may compute the random values of the terminal pay-
off CN and then, through a backward induction argument, the values of the
payoff at time tm .
In fact,
CN = max {ST − K, 0} = [ST − K]+
or, specifying the different values of the terminal payoff, that is, at time
tN = T ,

CN,j = max{SN,j − K, 0} = [SN,j − K]^+,   with j = 0, 1, . . . , N.

Since we are considering a risk-neutral world,

e^{rδt} · Cm,j = pu · Cm+1,j+1 + pd · Cm+1,j,

that is,

Cm,j = e^{−rδt} ( pu · Cm+1,j+1 + pd · Cm+1,j ).          (2)
So, for the example of five periods considered above, we can construct the option tree (again listing the nodes at each time step):

t = 0:  C0,0
t = 1:  C1,1,  C1,0
t = 2:  C2,2,  C2,1,  C2,0
t = 3:  C3,3,  C3,2,  C3,1,  C3,0
t = 4:  C4,4,  C4,3,  C4,2,  C4,1,  C4,0
t = 5:  [S·u^5 − K]^+,  [S·u^3 − K]^+,  [S·u − K]^+,  [S·d − K]^+,  [S·d^3 − K]^+,  [S·d^5 − K]^+

where [x]^+ = max{x, 0}. The value C0,0 obtained by this backward induction
procedure is precisely the fair price for the premium.
With this construction one can implement a hedging strategy by using the
obtained elements to create a portfolio that replicates the option. This is a
characteristic of complete markets.
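The backward induction (2), together with the Cox–Ross–Rubinstein choice d = 1/u, translates directly into a few lines of code. The following Python sketch (added as an illustration; the numerical parameter values are ours, not taken from the text) prices a European call on an N-step tree.

```python
import math

def crr_call_price(S0, K, r, sigma, T, N):
    """European call on an N-step Cox-Ross-Rubinstein tree, cf. Eqs. (1)-(2)."""
    dt = T / N
    u = math.exp(sigma * math.sqrt(dt))      # up factor; d = 1/u (CRR hypothesis)
    d = 1.0 / u
    disc = math.exp(-r * dt)
    p_u = (math.exp(r * dt) - d) / (u - d)   # risk-neutral probability
    p_d = 1.0 - p_u

    # terminal payoffs C_{N,j} = [S0 u^j d^(N-j) - K]^+, j = 0..N
    C = [max(S0 * u**j * d**(N - j) - K, 0.0) for j in range(N + 1)]

    # backward induction: C_{m,j} = e^{-r dt} (p_u C_{m+1,j+1} + p_d C_{m+1,j})
    for m in range(N - 1, -1, -1):
        C = [disc * (p_u * C[j + 1] + p_d * C[j]) for j in range(m + 1)]
    return C[0]

# illustrative parameters; as N grows the price approaches the Black-Scholes value
print(crr_call_price(S0=100.0, K=100.0, r=0.05, sigma=0.2, T=1.0, N=500))
```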

5. CONTINUOUS-TIME MODELS
5.1 Bachelier model
In the following model, designated as Bachelier’s model, the stock prices
S = (St )t ≤T follow a Brownian motion with drift, that is,
St = S0 + µt + σ Wt , t ≤ T. (3)
In [1], it is considered that there is a (B, S) −market such that the bank ac-
count B = (Bt )t≤T remains fixed, Bt = 1. In a differential form, Bachelier’s
model can be written as
dSt = µ dt + σ dWt.
In his work, Bachelier gave the price for a European option. Considering the
standard normal density and distribution functions, respectively,

ϕ(x) = (1/√(2π)) e^{−x²/2}   and   Φ(x) = ∫_{−∞}^{x} ϕ(y) dy,          (4)

the following formula

CT = (S0 − K) Φ( (S0 − K)/(σ√T) ) + σ√T ϕ( (S0 − K)/(σ√T) )          (5)
is called Bachelier’s formula (which is in fact an updated version of several of
Bachelier’s results on options). Besides historical aspects, the main interest
of this model lies in the fact that it is arbitrage free (does not allow profits
without risk) and complete (is replicable) [15].
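Formula (5) is equally easy to evaluate. The following short Python sketch (an added illustration with made-up numbers; note that here σ is an absolute, price-unit volatility and the interest rate is zero, consistent with the fixed bank account Bt = 1) computes the Bachelier premium.

```python
import math

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)   # phi in (4)

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))          # Phi in (4)

def bachelier_call(S0, K, sigma, T):
    """Bachelier's formula (5) for a European call under arithmetic Brownian motion."""
    d = (S0 - K) / (sigma * math.sqrt(T))
    return (S0 - K) * norm_cdf(d) + sigma * math.sqrt(T) * norm_pdf(d)

# at the money the premium reduces to sigma * sqrt(T) / sqrt(2 * pi)
print(bachelier_call(S0=100.0, K=100.0, sigma=20.0, T=1.0))
```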
In the 1960s, Samuelson suggested modelling prices using what is now
designated as geometric Brownian motion

St = S0 e^{σWt + (µ − σ²/2)t}          (6)

and which, from Itô’s calculus, can be written in the differential form

dSt = St (µdt + σ dWt ) . (7)

This suggestion, which solved some weaknesses contained in Bachelier’s


model, provided a workable model for asset prices and anticipated the central
result of modern finance, the Black–Scholes option-pricing formula.

5.2 Black–Scholes model


Assuming that the function C = C(t, S) is sufficiently smooth, Fischer Black and Myron Scholes [2] and Robert Merton [12], working independ-
ently, obtained as a model for the dynamics of a European call option, de-
pending on the value of the asset S and of time t, the so-called fundamental
equation
∂C/∂t + (1/2) σ² S² ∂²C/∂S² + rS ∂C/∂S − rC = 0          (8)
with the final condition

C (T , S) = max (S − K, 0) . (9)

Two more conditions with financial meaning are also assumed. They are

C (0, t) = 0 (10)

and the asymptotic condition

C(S, t) ∼ S when S → ∞          (11)

in the sense that

lim_{S→∞} C(S, t)/S = 1.          (12)
The coefficient σ is the volatility of the asset S and r is the annual interest rate.
An explicit solution can be determined using methods of partial differential
equations that involve transforming this problem into the heat equation with
an adequate condition. The solution, that is, the price of a call option at time
t is given by

C(t, S) = S Φ(d+) − K e^{−r(T−t)} Φ(d−),
where Φ is defined in (4) and

d± = ( ln(S/K) + (T − t)(r ± σ²/2) ) / ( σ √(T − t) ),
and the Black–Scholes Option Pricing Formula is

CT = S0 Φ( (ln(S0/K) + T(r + σ²/2)) / (σ√T) ) − K e^{−rT} Φ( (ln(S0/K) + T(r − σ²/2)) / (σ√T) ).

The original proof established by Black, Scholes and Merton used the solu-
tion of the fundamental equation. However, this result can be derived by a
so-called martingale proof.
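The closed-form price above is straightforward to implement. The following Python sketch (an added illustration; the parameter values are ours) evaluates the Black–Scholes call price at t = 0, writing Φ in terms of the error function.

```python
import math

def norm_cdf(x):
    # standard normal distribution function Phi of (4)
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes_call(S0, K, r, sigma, T):
    """Black-Scholes price of a European call (the formula above with t = 0)."""
    d_plus = (math.log(S0 / K) + T * (r + 0.5 * sigma**2)) / (sigma * math.sqrt(T))
    d_minus = d_plus - sigma * math.sqrt(T)
    return S0 * norm_cdf(d_plus) - K * math.exp(-r * T) * norm_cdf(d_minus)

print(black_scholes_call(S0=100.0, K=100.0, r=0.05, sigma=0.2, T=1.0))
```

With these parameters the binomial price of the previous section converges to this value as the number of steps N increases.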

6. AMERICAN OPTIONS
In the above sections we have considered European call options. Some
considerations concerning American options will be presented next, for the
sake of completeness of concepts.

As mentioned previously, American options are different from European


options in as much as the holder has the right to exercise the option at any
time before maturity (t = T ). The majority of exchange-traded options are
American.
The freedom to exercise an American option whenever the holder wishes
places the Black–Scholes equation in the framework of a free boundary prob-
lem. From a mathematical point of view, the boundary condition is no longer
fixed at t = T , as for the European options. American options become much
more difficult to study precisely because a free boundary condition arises. In
fact, exact solutions can be found analytically only for a couple of American
options. The put-call parity relationship for American options does not ex-
ist since the exercise date is no longer fixed. We shall fix our attention on American put options, because American call options (on assets paying no dividends) are in fact equivalent to European call options, since it can be proved that the optimal exercise time is at maturity.
Analytically, one can say that the following differential inequality must be satisfied

∂P/∂t + (1/2) σ² S² ∂²P/∂S² + rS ∂P/∂S − rP ≤ 0          (13)
together with the condition
P (S, t) ≥ max {K − S, 0} . (14)
This condition arises from the need to prevent arbitrage.
6.1 Boundary value problems and American options


From a financial point of view, the contract holder will ideally, of course,
only exercise the option prior to the expiry date if the present payoff at time
t exceeds the discounted expectation of the possible future values of the op-
tion from time t to T . Otherwise, he will continue to hold on to the option.
At every time t there will be a region of values of S where it is best to ex-
ercise the option (Exercise region) and a complementary region where it is
best to keep the option (Free region). There will also be a particular value
Sf (t) which defines the optimal exercise boundary separating the two re-
gions. More precisely,
a) one region where the exercise is optimal, that is S(t) ≤ Sf(t), where we have

P(S(t), t) = K − S(t)
∂P/∂t + (1/2) σ² S² ∂²P/∂S² + rS ∂P/∂S − rP < 0

b) another region where the exercise is not optimal, S(t) > Sf(t), where

P(S(t), t) > K − S(t)
∂P/∂t + (1/2) σ² S² ∂²P/∂S² + rS ∂P/∂S − rP = 0.
Then,

P(Sf(t), t) = max{K − Sf(t), 0}

and

P(S, t) > max{K − S, 0}   if S > Sf(t).
So, where the exercise is optimal, an inequality, rather than an equality, holds.
An American put option satisfies the following conditions which corres-
pond to the financial features of the problem
– at expiration
P(S(T), T) = max{K − S(T), 0}
– on the boundary
lim_{S→∞} P(S, t) = 0,   ∀t ∈ [0, T]

– on the free boundary

P(Sf(t), t) = max{K − Sf(t), 0},   ∀t ∈ [0, T]

(∂P/∂S)(Sf(t), t) = −1,   ∀t ∈ [0, T).
Using methods of partial differential equations that involve transforming this


problem by a change of variables into the heat equation, with adequate condi-
tions, and using the linear complementarity method, similar to what is done in
the case of the obstacle problem, one arrives at a formulation of the problem
that can be treated numerically.
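As a concrete, if rudimentary, illustration of such a numerical treatment (added here as a sketch; it is a simple explicit finite-difference scheme with projection onto the obstacle, not the method of any particular reference), the following Python code steps the Black–Scholes operator backwards in time and enforces the constraint (14) at every step.

```python
import numpy as np

def american_put_fd(K=100.0, r=0.05, sigma=0.2, T=1.0, S_max=300.0, M=150, N=2000):
    """American put via explicit finite differences: march (13) backwards in time
    and project onto the obstacle max(K - S, 0), cf. (14), at every step."""
    dS, dt = S_max / M, T / N
    S = np.linspace(0.0, S_max, M + 1)
    payoff = np.maximum(K - S, 0.0)
    P = payoff.copy()                        # value at expiry
    i = np.arange(1, M)
    for _ in range(N):
        diffusion = 0.5 * sigma**2 * S[i]**2 * (P[i + 1] - 2 * P[i] + P[i - 1]) / dS**2
        convection = r * S[i] * (P[i + 1] - P[i - 1]) / (2 * dS)
        P[i] = P[i] + dt * (diffusion + convection - r * P[i])
        P[0], P[M] = K, 0.0                  # boundary values for a put
        P = np.maximum(P, payoff)            # free-boundary (early exercise) projection
    return S, P

S, P = american_put_fd()                     # illustrative parameters
print(np.interp(100.0, S, P))                # approximate value at S = K
```

Being explicit, the scheme requires a small time step for stability; implicit versions lead to the linear complementarity problems mentioned above.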

6.2 Numerical challenges


From what was said above, it can be seen that American options are a rich source of challenges both in Finance and in Mathematics. Advanced methods of functional analysis can also be used to study them, as can, for instance, variational inequalities. Numerical analysis has proved to be a powerful and very useful means of providing approximate solutions. Implicit and explicit finite difference methods applied to the corresponding discretized equations give answers to the problem with a certain degree of approximation, and their accuracy can be estimated.
The binomial method referred to previously for European call options can
also be used to compute the value of the American options. As seen above,
the different paths in the evolution of the underlying asset S are forced to
pass through a pre-established set of nodal points of the binomial tree, uS,
dS, udS, . . . and so on, which allow us to compute the value of the option
by backward induction. However, this computation can only happen at these
nodal points, which are discretely spaced in time. The holder of an American put option must decide at each node of the backward induction whether it is better to exercise the option or to keep it.
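A minimal Python sketch of this decision rule on a Cox–Ross–Rubinstein tree follows (added as an illustration; the parameters are made up). At each node of the backward induction the continuation value given by (2) is compared with the value of immediate exercise.

```python
import math

def crr_american_put(S0, K, r, sigma, T, N):
    """American put on an N-step CRR tree with an early-exercise check at each node."""
    dt = T / N
    u = math.exp(sigma * math.sqrt(dt))
    d = 1.0 / u
    disc = math.exp(-r * dt)
    p_u = (math.exp(r * dt) - d) / (u - d)
    p_d = 1.0 - p_u

    # terminal payoffs [K - S_{N,j}]^+, j = 0..N
    P = [max(K - S0 * u**j * d**(N - j), 0.0) for j in range(N + 1)]

    for m in range(N - 1, -1, -1):
        for j in range(m + 1):
            continuation = disc * (p_u * P[j + 1] + p_d * P[j])
            exercise = max(K - S0 * u**j * d**(m - j), 0.0)
            P[j] = max(continuation, exercise)    # keep or exercise, whichever is worth more
    return P[0]

print(crr_american_put(S0=100.0, K=100.0, r=0.05, sigma=0.2, T=1.0, N=500))   # illustrative
```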
Numerical methods for random number generation, interpolation, com-
putational linear algebra, linear and integer programming play an important
role in the computational playground where American options, and Math-
ematical Finance in general, operate.

7. FINAL COMMENTS
With the above simple but representative examples of modern finance we
aimed to illustrate the deep interplay between mathematics and finance and
the challenges that arise in diversified areas of research. Much more could
be said concerning these topics. Just to give a flavour, we mention that Harrison, Kreps and Pliska [6, 7] established, in the so-called Fundamental Theorem of Asset Pricing, that there exists a close link between the nonexistence of arbitrage opportunities and martingales. This was a pioneering contribution to the
study of arbitrage pricing theory. Also Harry Markowitz [11], Nobel Laureate in 1990, carried out pioneering work concerning the construction of portfolios, taking into account the benefits of diversification, so that expected returns may be optimized for a given level of market risk. In that year
he shared the Nobel Prize in Economics with William Sharpe, creator of
the Capital Asset Pricing Model (CAPM), and Merton Miller, who studied
the relationship between a company’s capital-asset structure and its market
value.
The financial world is fast-changing and requires constant updating in the management of financial resources; new financial instruments and strategies are always appearing, leading to new developments in mathematical research. Risk measures, optimal stopping, stochastic control in finance and stochastic games are some of the topics in which research is carried out nowadays. Another is incomplete markets, in which Black–Scholes style replication is impossible since a risk-neutral world is no longer assumed; therefore, any pricing formula must involve some balance of the risks involved. These facts show that
mathematical finance must inevitably look for new approaches and come up
with new theoretical developments.
We hope we have given the reader a brief but stimulating glance at the
great interdisciplinary character of mathematical finance that benefits from
the fruitful contributions of probability theory, statistics, integral and dif-
ferential equations, stochastic calculus, computational methods, numerical
analysis, linear algebra, linear programming, convex analysis. The work in
mathematical finance is one of the outstanding examples of how fundamental
research and practical applications can be combined successfully. The deep
interplay between theory and practice contributes to the huge complexity of
the subject but also to its fascinating features, never ceasing to create recip-
rocal challenges with reciprocal advantages for both researchers and prac-
tioners.

ACKNOWLEDGMENTS
The author is thankful to the researchers of the Department of Mathematics of Instituto Superior de Economia e Gestão, ISEG, who comprise the informal group Núcleo de Matemática Financeira. Fruitful
studies and helpful discussions that have taken place in regular seminars have
greatly contributed to develop and reinforce the knowledge of the subject that
underlies the present text.
The author also thanks Centro de Matemática Aplicada à Previsão e Decisão Económica, CEMAPRE, for the institutional sup-
port always readily provided in every initiative concerning the development
of studies in Mathematical Finance.
REFERENCES
[1] Bachelier, L. “Théorie de la Spéculation”, Ann. Sci. Ecole Norm. Sup. 17 (1900) 21-86
[English translation in P. Cootner (ed.) The Random Character of Stock Prices, MIT
Press, 1964, reprinted Risk Books, London 2000].
[2] Black, F. and Scholes, M. “The pricing of options and corporate liabilities”, J. Political
Econom. 81, 1973, 637-654.
[3] Cox, J.; Ross S. and Rubinstein, M. “Option pricing, a simplified approach”, J. Fin-
ancial Economics 7, 1979, 229-263.
[4] Doob, J. L. Stochastic Processes. New York, Wiley, 1953.
[5] Einstein, A. “Über die von der molekularkinetischen Theorie der Wärme geforderte Bewegung von in ruhenden Flüssigkeiten suspendierten Teilchen”, Annalen der Physik 17, 1905, 549-560.
[6] Harrison, J. M. and Kreps, D. M. “Martingales and arbitrage in multiperiod securities
markets”, J. Econ. Theory 20, 1979, 381-408.
[7] Harrison, J. M. and Pliska, S. R. “Martingales and stochastic integrals in the theory of
continuous trading”, Stoch. Proc. and Appl. 11, 1981, 215-260.
[8] Itô, K. “Stochastic Integral”, Proc. Imp. Acad. Tokyo 20, 1944, 519-524.
[9] Itô, K. “On a formula concerning stochastic differentials”, Nagoya Math. J. 3, 1951,
55-65.
[10] Kolmogorov, A. N. On Analytic Methods in Probability Theory, in A. N. Shiryaev, ed.,
Selected Works of A. N. Kolmogorov; Volume II: Probability Theory and Mathemat-
ical Statistics, Dordrecht, Kluwer, pp. 62-108, 1992. [Original: Über die analytischen Methoden in der Wahrscheinlichkeitsrechnung, Math. Ann. 104, (1931) 415-458.]
[11] Markowitz, H. “Portfolio Selection”, Journal of Finance 7(1), 1952, 77-91.
[12] Merton, R. C. “Theory of rational option pricing”, Bell J. Econom. Manag. Sci. 4,
1973, 141-183.
[13] Samuelson, P. “Mathematics of Speculative Price”, SIAM Review 15, 1973, 1-42.
[14] Samuelson, P. “Modern finance theory within one lifetime” , Mathematical Finance
– Bachelier Congress 2000, eds. Geman, H., Madan, D., Pliska, S. R., and T. Vorst;
Springer-Verlag, Heidelberg, 2002, pp. 41-46.
[15] Shiryaev, A. N. Essentials of Stochastic Finance. Facts, Models, Theory. World Sci-
entific, 1999.
[16] Wiener, N. “Differential Space”, J. Math. Phys. 2, 1923, 131-174.
MORE SUSTAINABLE SYNTHETIC ORGANIC
CHEMISTRY APPROACHES BASED ON
CATALYST REUSE

Carlos A. M. Afonso,1,2 Luís C. Branco,2 Nuno R. Candeias,2 Pedro M. P.


Gois,1,2 Nuno M. T. Lourenço,2 Nuno M. M. Mateus,2 and João N. Rosa2
1 CQFM, Departamento de Engenharia Química, Instituto Superior Técnico, Universidade Técnica de Lisboa, Complexo 1, Av. Rovisco Pais, 1049-001 Lisboa, Portugal; E-mail: carlosafonso@ist.utl.pt

2 REQUIMTE, CQFB, Departamento de Química, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, 2829-516 Caparica, Portugal

Abstract: This article mainly describes the achievements of this laboratory in the development of more attractive and sustainable approaches in synthetic organic chemistry, namely catalyst reuse by simple immobilisation in water and ionic liquids, asymmetric transformations induced by readily available chiral ionic liquids, and selective transport using ionic liquids as supported membranes.

Key words: Organic Synthesis, Asymmetric Catalysis, Catalyst Reuse, Ionic Liquids,
Water

1. INTRODUCTION

Catalysis has long been a crucial topic in organic synthesis, with a high impact on modern society, since it is a powerful tool for producing complex useful molecules, including chiral ones, in different areas such as pharmaceuticals, food, agrochemistry, materials chemistry and energy resources [1]. Catalysis frequently plays a central role in the broader topic of green chemistry, according to the 12 principles defined by Anastas and Warner [2]. Catalysis is also one of the main topics selected by James Clark for the Clean Technology Pool:

Intensive processing, alternative routes, life cycle assessment, supercritical solvents, microreactors, renewable feedstocks, telescoped reactions, non-volatile solvents, catalysis, alternative energy savers and solventless systems [3]. The development of catalytic systems for synthetic methodologies that provide high regio- and enantioselectivity, where applicable, and high efficiency (high TON and TOF) is extremely important. However, the ideal
allow high regio- and enantioselectivities when applied, and in high
efficiency (high TON and TOF) is extremely important. However, the ideal
application in process chemistry generally requires the possibility to extend
the process in order to reuse and recycle the catalytic system, without
affecting the main features of the catalyst. The main approach to catalyst
reuse is based on catalyst immobilization, usually by means of chemical
immobilization in organic or inorganic supports that allow the easy
separation of the catalyst from the reaction products and solvents. In this
case, the reaction is usually performed under heterogeneous conditions
which in many cases leads to some erosion on the catalyst performance. This
limitation can be circumvented by further extensive fine changes on
structure and combination of the support-spacer-catalyst which nevertheless
requires significant research efforts. In contrast, the catalytic system is
usually more efficient under homogeneous conditions than under heterogeneous conditions. However, catalyst reuse generally becomes more difficult. One approach is based on immobilization of the catalyst in
polymers that are soluble in the reaction medium but insoluble in another
solvent media. Other approaches are based on chemical catalyst
manipulation by anchoring groups with high affinity to alternative reaction
media such as fluorinated solvents (or fluorinated supports) [4], supercritical
CO2 (scCO2) [5], water[6] and ionic liquids (ILs) [7] or membrane
technology [8]. For all of those cases, in spite of being homogeneous,
allowing high catalyst performance during the reuse process, they have the
drawback of demanding appropriate chemical manipulation on the catalyst,
which can potentially interfere with the original catalyst efficiency. One
alternative approach is based on the simple use of the original catalytic system under homogeneous conditions, without any chemical manipulation, in such a way that the reaction products can be removed from the reaction medium. This article mainly presents the efforts made in this research laboratory to develop more environmentally friendly organic synthetic methodologies through reuse of the catalytic system by simple catalyst immobilization in non-conventional solvents, such as water and ionic liquids, and through the development of new ionic liquids, including chiral ones, in which the chiral medium acts as the chiral inducing agent.
2. DISCUSSION

2.1 Catalyst immobilization in water

Water is the solvent used in nature for in vivo biological functional group transformations. The use of water in synthetic organic chemistry also has some advantages over traditional organic solvents: it is a truly green and readily available solvent, and it can induce a remarkable enhancement of reactivity through the "hydrophobic effect" [9]. Nowadays a considerable range of synthetic transformations can be performed in pure water or with water as a co-solvent, thanks to the discovery of new reagent combinations that are inert towards water [6]. The instability of some reagents or intermediates in water is nevertheless still a limitation in some cases. One example is carbenoid chemistry, where it was assumed that Rh and Cu carbenoids, as well as free carbenes, react very fast with labile X-H bonds such as those of thiols, amines and alcohols, and also with water, giving preferentially the products of O-H insertion [10].
As part of our ongoing interest in the Rh(II) carbenoid C-H insertion of α-diazo-α-phosphoryl-acetamides 1 [11] as an efficient approach to the synthesis of valuable molecules such as β- and γ-lactams, we had observed that the C-H insertion was not affected by the use of non-anhydrous 1,2-dichloroethane or of wet ionic liquids [12]. These observations prompted us to study this transformation in water (Figure 1) [13]. Taking advantage of the diazo substrates available in our laboratory, mainly α-phosphoryl-α-diazo-acetamides 1, Gois and Candeias studied the diazo decomposition in water using Rh2(OAc)4 as the catalyst. For α-phosphono-α-diazo-acetates the reaction was less clean, the O-H insertion product being the only one detected. In contrast, for a considerable range of α-phosphono-α-diazo-amides exclusive C-H insertion was observed with Rh2(OAc)4. Interestingly, these substrates were in general insoluble under the reaction conditions, suggesting that the transformation occurs in a biphasic system, in line with the enhanced reactivity recently described by Sharpless et al. [9]. Additionally, for substrates where both O-H and C-H insertion occur with Rh2(OAc)4, the extent of C-H insertion can be increased by using a more hydrophobic catalyst such as Rh2(pfb)4 or Rh2(Ooct)4. Figure 2 rationalizes these findings in terms of the hydrophobic nature of the catalyst-substrate combination.
Figure 1. Rh(II)-catalyzed C-H vs O-H insertion of α-diazo-acetamides and acetates 1 in water (Rh(II), 1 mol%, water, 80 °C), giving β-lactams 2 (n = 0), γ-lactams 3 (n = 1) and/or O-H insertion products 4 (n = 0, 1); X = PO(OEt)2, SO2Ph, Ac, CO2Et; Y = O, N-R'.

Figure 2. General dependence of the Rh(II)-catalyzed C-H vs O-H insertion in water (Rh(II) cat., 80 °C) on catalyst and substrate structure: more hydrophobic (less polar) catalyst-substrate combinations favour C-H insertion, whereas more hydrophilic (more polar) combinations favour O-H insertion (X = PO(OEt)2, SO2Ph, Ac, CO2Et; Y = O, N-R').

The combination of exclusive C-H insertion for some diazo substrates catalyzed by Rh2(OAc)4 with the complete solubility of the catalyst in water allowed the development of a simple protocol for reusing the expensive Rh(II) catalyst, just by extracting the reaction mixture with an organic solvent. Table 1 shows the results for the model substrate 1a: the catalyst was reused efficiently 10 times, with only a small loss of catalyst into the organic phase. Similar behavior was observed for other substrates, showing that this simple system is very robust for catalyst reuse.

Table 1. Reuse of the Rh2(OAc)4 catalyst with the model substrate 1a (X = PO(OEt)2), giving β-lactam 2a: i) Rh2(OAc)4 (1 mol%), H2O, 80 ºC, 24 h; ii) extraction (Et2O); iii) new addition of 1a.

Run        Yield (%)    Rh in Et2O (%)b
1 to 9     88a          1.6c
10         90           1.1
11         (63)d        0.2

a Average yield for the combined cycles 1 to 9. b Percentage of rhodium relative to the initial amount, detected by ICP in the organic phase. c Average value for the combined cycles 1 to 9. d Conversion observed by 31P NMR (1a, 37%; 2a, 63%).
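The per-cycle rhodium losses in Table 1 translate directly into an estimate of how much catalyst remains available after repeated recycling. The following back-of-the-envelope estimate is only an illustration based on the tabulated average loss of 1.6% per cycle; it is not a value reported in the original work:

\[
f_{\mathrm{Rh}}(n) \approx (1 - 0.016)^{\,n}, \qquad f_{\mathrm{Rh}}(10) \approx 0.85,
\]

i.e. roughly 85% of the original rhodium is still expected to remain in the aqueous phase after ten cycles, consistent with the essentially constant yields observed.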
2.2 Ionic liquids (ILs) as an efficient medium for product separation and catalyst immobilization

Low-melting salts have long been used in electrochemical applications owing to their wide electrochemical window and electrolyte properties. The discovery of air-stable and water-resistant low-melting salts, later designated room-temperature ionic liquids (ILs), has generated in recent years an impressive interest in the scientific community across different research areas [14], such as electrochemistry [15], organic [7], inorganic [7], organometallic [7], polymer [16] and materials chemistry, biotransformations [7], remediation [17], fuel and solar cells [7], separation technology [14] (biphasic systems, membranes, scCO2 and pervaporation), flotation fluids, lubricants [14], nanotechnology [18] and paint additives [19]. The reasons for such a wide range of applications probably lie in a set of unique properties: high conductivity, wide electrochemical window, near non-volatility [20], high thermal stability, low flammability [14], tunable solubility in water and in common organic solvents, insolubility in scCO2 [21], high solubility of (and in some cases specific affinity for) organic, inorganic and organometallic solutes, scCO2 and other gases [22] in some ILs, and the high stability of enzymes in some IL media [23].

2.2.1 Reuse of the IL reaction medium

The use of volatile organic solvents (VOCs) in synthetic transformations raises environmental concerns for industry, mainly because of their easy release into the environment. ILs, by contrast, are potential substitutes for common VOCs because of their near non-volatility. However, the high cost of ILs and concerns about their toxicity [24] have limited their use, which is probably acceptable only if the IL can be easily reused, and even more so if the IL medium offers some advantage for the synthetic transformation.
Nucleophilic substitution at saturated carbon is certainly an important method for the formation of new C-C and C-heteroatom bonds, including in chiral versions run under biphasic conditions. One reliable and efficient approach is based on two-phase aqueous/organic systems using an efficient phase-transfer catalyst (PTC), generally an organic cation [25]. Given the ionic nature of ILs, we tested their use simultaneously as reaction medium and phase-transfer promoter in a two-liquid-phase system [26]. Indeed, Nuno Lourenço observed that the IL [C4mim]PF6 acts as a phase-transfer catalyst for the simple substitution of benzyl bromide by several nucleophiles such as PhO-, CN- and N3- in a water/dichloromethane biphasic system (Figure 3).

Figure 3. Ionic liquid as a phase-transfer catalyst for nucleophilic substitution of benzyl bromide by M+Nu- (water/CH2Cl2, rt): NaOPh 80%a (3%)b, KCN 47%a (5%)b, NaN3 90%a (37%)b. a) With [C4mim]PF6 (0.5 eq); b) in the absence of IL.

Using the IL in place of both the volatile organic solvent and the PTC is desirable. We observed high conversions under mild conditions for the substitution of alkyl chlorides, bromides and iodides by several nucleophiles such as Schiff bases, phenoxides, I-, CN- and N3-. The method is particularly attractive for azide formation, where no erosion of the yield was observed over the maximum of 15 cycles tested (Figure 4).

Figure 4. Reuse of the IL [C4mim]PF6 for azide formation from phenacyl bromide and NaN3 ([C4mim]PF6/water, rt, 1.5 h): cycle 1, 94%; cycles 1 to 15, quantitative.

2.2.2 Reuse of the catalytic system immobilised in ILs

Another example studied in this laboratory was the tetrahydropyranylation of alcohols using the efficient catalysts p-toluenesulphonic acid (TsOH), triphenylphosphine hydrobromide (TPP.HBr) and pyridinium p-toluenesulphonate (PPTS) immobilised in ILs [27]. In this case the ionic liquid [C4mim]PF6 gives a slightly faster reaction and an efficient catalytic system for a range of substrates. The catalytic system can be reused efficiently for up to 17 cycles simply by extracting the reaction mixture with an appropriate organic solvent (Table 2). Because the solvent partially extracts the catalyst and the IL, the extraction procedure and the nature of the solvent are crucial for the reuse performance of the catalytic system.


Abbreviations used for the ionic liquids based on the imidazolium cation: 1-n-butyl-3-methylimidazolium [C4mim], 1-n-octyl-3-methylimidazolium [C8mim] and 1-n-butyl-2,3-dimethylimidazolium [C4dmim].
Table 2. Reuse of the catalytic system for the tetrahydropyranylation of alcohols (reaction of an alcohol with 3,4-dihydro-2H-pyran): i) [C4mim]PF6, catalyst (10 mol%), rt, 1 h; ii) extraction (Et2O); iii) new addition of reagents.

Cycle    Catalyst    Conversion (%)      Cycle    Catalyst    Conversion (%)
1        PPTS        93.7                1        TPP.HBr     96.4
2-11     PPTS        94.8a               2-16     TPP.HBr     96.1a
12       PPTS        86.0                17       TPP.HBr     92.3

a Average conversion.

In another approach, João Rosa demonstrated that the enantioselective addition of alkynes to imines catalysed by Cu(I)-bis(oxazoline) can be performed simply by using an appropriate IL [28] instead of the toluene or water solvent systems developed by Li et al. [29]. In this case the ionic liquid [C4mim]NTf2 proved the most suitable for both the transformation and catalyst reuse (Table 3); upon extraction of the reaction mixture with n-hexane, less than 0.3% of the Cu was detected in the organic phase.

Table 3. Enantioselective addition of phenylacetylene to imines ArCH=NPh catalysed by CuOTf/(box) (5 mol%) in [C4mim]NTf2 (rt, 4 days), giving the corresponding propargylamines, and reuse of the catalytic system.

Ar               Yield (%)    e.e. (%)
4-MeC6H4         91           86
4-CF3C6H4        76           96
4-ClC6H4         92           94
4-BrC6H4         90           99
2-naphthyl       91           86
Ph               74           94
Ph (cycles 1-6)  82a          94-88

a Average yield obtained by extraction of the reaction mixture in each cycle with n-hexane; a further 34% of product was isolated in cycle 6 by additional extraction with diethyl ether.
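For reference (this definition is not part of the original table), the enantiomeric excesses quoted here and in the following tables follow the usual convention in terms of the amounts of the major and minor enantiomers:

\[
\mathrm{e.e.} = \frac{n_{\text{major}} - n_{\text{minor}}}{n_{\text{major}} + n_{\text{minor}}} \times 100\%,
\]

so that, for instance, the 99% e.e. of the 4-BrC6H4 entry corresponds to a 99.5:0.5 enantiomer ratio.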
Pedro Gois also demonstrated that the Rh2(OAc)4-catalysed C-H insertion of α-diazo-phosphonates can be performed efficiently in the IL [C4mim]PF6, allowing efficient catalyst reuse simply by extraction of the reaction mixture with an organic solvent (Table 4) [30].

Table 4. Catalyst reuse in the C-H insertion of α-diazo-phosphono-acetamide 1h (X = PO(OEt)2) catalysed by Rh2(OAc)4 immobilised in the IL [C4mim]PF6, giving 3h: i) Rh2(OAc)4 (1 mol%), [C4mim]PF6, 80 ºC, 4 h; ii) extraction; iii) new addition of 1h.

Run      Yield (%); extraction with n-hexane      Run      Yield (%); extraction with TBME
1 to 6   82a                                      1 to 5   87a

a Average yield for the combined cycles.

The Sharpless asymmetric dihydroxylation (AD) is a very powerful methodology for the synthesis of a considerable range of 1,2-diols in very high optical purity [31]. However, its use on a large scale has some limitations, mainly the high toxicity of osmium and the cost of osmium and of the chiral ligands. To circumvent these limitations, several approaches have been developed to reuse the catalytic system by anchoring the chiral ligand, the osmium, or both, on different supports under homogeneous or heterogeneous reaction conditions. One method is based on the immobilization of chiral ligands onto soluble or insoluble polymers. This approach, however, has suffered from the need for long syntheses of each chiral ligand, erosion of the enantioselectivity, and/or incomplete recovery and reuse of the osmium-ligand catalytic system owing to osmium leaching [32]. Other efficient approaches have been described: microencapsulation of the osmium catalyst in polystyrene [33] or polyurea (achiral version) [34], anchoring in poly(ethylene glycol) matrixes [35], silica (tetrasubstituted olefins) [36], ion exchangers [37], nanocrystalline magnesium oxide [38] or Amberlite containing residual vinyl groups [39], gold colloids [40], and biphasic systems containing a dendrimer-bound (achiral version) [41] or fluorous (achiral version) [42] osmium catalyst.
Our preliminary observation that the chiral ligand (DHQD)2PHAL is highly soluble in [C4mim]PF6 prompted us to test the AD reaction in ILs. Indeed, Luis Branco observed that the AD reaction could be performed efficiently in biphasic IL/water and monophasic IL/water/t-butanol solvent systems for a range of substrates [43], using K3Fe(CN)6 or N-methylmorpholine N-oxide (NMO) as co-oxidant [44]. For each substrate tested (styrene, α-methyl-styrene, 1-hexene, 1-methyl-cyclohexene, trans-stilbene, trans-5-decene and methyl trans-cinnamate) it was possible to find a solvent system that afforded yields and enantioselectivities similar to or higher than those of the traditional t-BuOH/water system. More importantly, using the IL as a co-solvent allows very efficient reuse of the catalytic system, simply by separating the ionic liquid phase containing the osmium/chiral ligand and recovering the products from the aqueous and organic phases (Table 5). With [C4mim]PF6 as co-solvent it was possible to reuse the catalytic system for 9 cycles with only a 5% yield reduction relative to the first cycle (overall average yield of 87%, TON = 1566). Additionally, in each cycle the osmium content of the organic phase containing the AD product was at the ICP detection limit (≤3%, ≤7 ppb), that of the aqueous phase was 3-6% of the initial amount, and the recovered IL phase retained more than 90% of the osmium of the previous cycle (Table 5).
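The quoted turnover number is consistent with a simple accounting of the osmium loading and the per-cycle yields. The following is only a back-of-the-envelope check using the figures given above, not a calculation taken from reference [43]:

\[
\mathrm{TON} \;=\; \frac{n_{\text{diol}}}{n_{\text{Os}}} \;\approx\; \frac{9 \times 0.87}{0.005} \;\approx\; 1566,
\]

i.e. nine cycles at an average yield of 87% with a single initial charge of 0.5 mol% osmium.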
In the course of further studies on the AD reaction, Luis Branco observed that it can also be performed efficiently using an IL as the only solvent; after screening several ILs, even better yields and enantioselectivities were obtained than previously with the IL as a co-solvent [45]. Additionally, Ana Serbanovic, from the research group of Prof. Nunes da Ponte at ITQB, demonstrated that after the AD reaction with NMO as co-oxidant the product can be extracted under appropriate scCO2 conditions and the catalytic system reused (Table 6). A further advantage of the scCO2 approach is that the osmium contamination of the product is lower than in the earlier systems based on organic solvent extraction. This approach was further optimised for the more important substrate methyl trans-cinnamate, since the corresponding diol is a precursor of the Taxol side chain (Table 6) [46].
This research clearly demonstrates that running the AD reaction in ILs provides a simple, efficient and robust method for catalyst reuse: the catalytic system is immobilised in the IL, the products are easily isolated, and osmium contamination is nearly eliminated.
Table 5. Reuse of the catalytic system for the AD of 1-hexene and of methyl trans-cinnamate (last entry), using K3Fe(CN)6 or NMO as co-oxidant and an ionic liquid as solvent or co-solvent: i) K2OsO2(OH)4 (0.5 mol%), (DHQD)2PHAL (1.0 mol%), co-oxidant, IL-based solvent, rt, 24 h; ii) extraction; iii) new addition of 1-hexene (converted into hexane-1,2-diol).

Solvent system (substrate)                    Cycle   Yield (%)  ee (%)   Os in water (%)  Os in product (%)
[C4mim]PF6/H2O (1:2)b (1-hexene)              1-8     75a        88-81    14-3             <3d
                                              9       70         83       3                <3d
[C4mim]PF6/H2O (1:2)c (1-hexene)              1-8     75a        96-76    2                <3d
                                              9       50         70       1                <3d
[C8mim]PF6/H2O (1:2)b (1-hexene)              1-8     61a        82-60    2                <3d
                                              9       39         41       3                <3d
[C4mim]PF6/H2O/t-BuOH (1:1:2)b (1-hexene)     1-10    86a        92-82    6-3              <3d
                                              11      63         75       4                <3d
[C4mim]PF6/H2O/t-BuOH (1:1:2)c (1-hexene)     1-8     80a        85-72    5-1              <3d
                                              9       60         71       1                <3d
[C8mim]PF6/H2O/t-BuOH (1:1:2)b (1-hexene)     1-7     70a        96-59    5-3              <3d
                                              8       12         33       2                <3d
[C4mim]NTf2 c (1-hexene)                      1-13    96a        96-93    -e               1.8-1.2d
                                              14      92         92       -e               1.3d
[C4dmim]NTf2 c (1-hexene)                     1-13    92a        96-89    -e               2.0-1.1d
                                              14      88         87       -e               1.3
[C8mim]PF6 c (methyl trans-cinnamate)g        1-5     75a        79-81    -e               <0.05
                                              6       79f        81       -e               <0.05

a Average yield for the combined cycles. b K3Fe(CN)6 as co-oxidant. c NMO as co-oxidant. d Extraction with Et2O. e No water extraction was used. f A further 52% of diol was isolated from the remaining IL by flash chromatography. g (DHQD)2PYR as chiral ligand.

Table 6. Reuse of the catalytic system for the AD of 1-hexene and methyl trans-cinnamate using K2OsO2(OH)4 (0.5 mol%)/(DHQD)2PYR (1.0 mol%), NMO as co-oxidant and an ionic liquid as solvent, followed by extraction with scCO2.

Substrate               Solvent system   Cycle   Yield (%)   ee (%)   Os in product (%)
1-hexene                [C4mim]NTf2      1-8     80a         96-87    0.4-0.2
                                         9       56          80       0.4
methyl trans-cinnamate  [C8mim]PF6       1-5     89a         85-77    <0.05
                                         6       85          84       <0.03

a Average yield for the combined cycles.
2.2.3 Development of new room temperature ionic liquids

In the course of our studies using ILs as reaction media for organic transformations, we noticed that all the ILs available in the laboratory based on the 1-methyl-3-alkylimidazolium cation were poor at solubilising hard Lewis acids and primary ammonium salts, including amino acids. To circumvent this limitation, we sought other ILs based on the methylimidazolium cation [mim] bearing additional functional groups, such as hydroxyl and ether groups, expecting these to improve the interaction of the [mim] cation with such solutes. The structures presented in Table 7 are indeed liquid at room temperature and moderately viscous, and solubility studies with the inorganic salts LiCl, HgCl2 and LaCl3 as representative models revealed specifically enhanced solubilities (up to 18-fold for LaCl3 and 5-fold for HgCl2) [47].

Table 7. Structure of room-temperature ionic liquids of the series [CnOmmim][X] (1-R-3-methylimidazolium cation with anion X-).

Ionic liquid      Cation R              Anion X-
[C2OHmim]Cl       (CH2)2OH              Cl-
[C2OHmim]PF6      (CH2)2OH              PF6-
[C2OHmim]BF4      (CH2)2OH              BF4-
[C2OHmim]TFA      (CH2)2OH              CF3CO2-
[C3Omim]Cl        (CH2)2OMe             Cl-
[C3Omim]PF6       (CH2)2OMe             PF6-
[C3Omim]BF4       (CH2)2OMe             BF4-
[C4OHmim]Br       (CH2)4OH              Br-
[C5O2mim]Cl       (CH2)2O(CH2)2OMe      Cl-
[C5O2mim]PF6      (CH2)2O(CH2)2OMe      PF6-
[C5O2mim]BF4      (CH2)2O(CH2)2OMe      BF4-

2.2.4 Ionic liquids as bulk and supported membranes for selective transport

Membranes, defined as permeable and selective barriers between two phases, have been successfully applied in a large diversity of separation processes, including bioseparations [48]. As a result of our long-term collaboration with the group of Crespo at REQUIMTE-DQ-FCT-UNL, and given his interest in membrane technology, he became interested in studying the possibility of using ILs as membranes for several separation processes, including pervaporation. Because ILs are almost non-volatile, their use as pervaporation membranes allows the selective recovery of solutes without contamination by the IL [49]; it is also very convenient for increasing the conversion of reversible reactions, as already illustrated for chemical [50] and enzymatic [51] esterifications. The use of ILs as bulk and supported membranes was studied by Luis Branco and by Raquel Fortunato in the Crespo group, for organic/IL and aqueous/IL systems respectively. For the aqueous/IL system, Raquel Fortunato observed with NaCl and amino acids, and elegantly demonstrated with T2O, that transport is regulated mainly by the formation and mobility of water microenvironments inside the IL, a critical water concentration inside the IL being necessary for these microenvironments to build up [52].
In clear contrast, Luis Branco demonstrated that for the organic/IL system the scenario is completely different [53]. To test the potential of ILs as liquid membranes, he performed preliminary screening transport experiments using U-tubes containing [C4mim]PF6 between two diethyl ether phases, one of which was loaded with a representative mixture of organic compounds: 1,4-dioxane, 1-propanol, 1-butanol, cyclohexanol, cyclohexanone, morpholine and methylmorpholine. Apart from moderate transport selectivity for several compound pairs, such as cyclohexanol vs cyclohexanone (2:1), a remarkable selectivity of 83:1 was observed for morpholine vs methylmorpholine. Similar behaviour was observed for ILs supported on different solid supports, e.g. nylon or PVDF, and for other structurally diverse secondary vs tertiary amine pairs. According to 1H NMR experiments, the selectivity is rationalised mainly by preferential HNR2/C(2)-H hydrogen bonding to the imidazolium cation. This preferential interaction was also recently exploited by Chiappe et al. for the inhibition of over-alkylation of primary amines [54].
The possibility of continuous operation using an IL as a supported membrane was tested with a 1:1 mixture of diisopropylamine (DIIPA) and triethylamine (TEA) as model compounds, which have very similar boiling points (84 °C and 89 °C, respectively). The process used a cell with a volume of 30 mL on each side and just approx. 500 mg of the IL [C4mim]PF6 supported in a PVDF membrane of 1.65 cm diameter, circulating diethyl ether on one side and a 1:1 mixture of DIIPA (500 mL)/TEA (500 mL) in diethyl ether on the other (Figure 5) [55]. For a total volume of 20 litres of diethyl ether, the receiving phase collected on one side over 14 days of continuous operation contained 423.5 mL of DIIPA (84.7% of the initial volume) and 74.4 mL of TEA (14.9% of the initial volume). As can be seen in Figure 5, the continuous operation gave persistently high selectivity, demonstrating the potential of supported IL membranes for continuous separation.
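Taking the volumes reported above at face value, the overall performance of this run can be summarised by the fractional recoveries and an apparent separation factor. This is a simple illustrative calculation from the quoted numbers, not a figure reported in reference [55]:

```python
# Illustrative post-processing of the reported 14-day continuous run:
# 500 mL of each amine fed; 423.5 mL DIIPA and 74.4 mL TEA collected.
feed_mL = {"DIIPA": 500.0, "TEA": 500.0}
collected_mL = {"DIIPA": 423.5, "TEA": 74.4}

# Fractional recovery of each amine in the receiving stream.
recovery = {amine: collected_mL[amine] / feed_mL[amine] for amine in feed_mL}

# Apparent (volume-based) separation factor for DIIPA over TEA.
separation_factor = recovery["DIIPA"] / recovery["TEA"]

print(recovery)                        # {'DIIPA': 0.847, 'TEA': 0.1488}
print(round(separation_factor, 1))     # ~5.7
```

In other words, the membrane transferred DIIPA roughly six times more effectively than TEA over the whole two-week run, in line with the secondary-vs-tertiary amine selectivity discussed above.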
Figure 5. Continuous separation of an initial 1:1 mixture of diisopropylamine (DIIPA) and triethylamine (TEA) using an IL supported in PVDF as membrane. A) Schematic diagram of the cell used (volume of each side = 30 mL): 1) supported liquid membrane; 2) connections to the pumps; 3) magnetic stirrer; 4) pumps; 5) feed solution containing the solutes; 6) receiving solution of fresh solvent; 7) collection bottle. B) Relative percentage of each amine in the receiving stream for eighteen samples collected (12-hour periods) during 14 days of continuous operation.

2.2.5 Asymmetric transformations induced by chiral ionic liquids based on natural chiral anions

In the course of our search for ionic liquids based on structures other than the imidazolium cation, we became interested in the guanidinium cation unit, mainly because of its stability, arising from charge delocalization over four atoms, and the possibility of introducing six different alkyl chains. Starting from commercially available N,N-dimethylphosgeniminium chloride, Nuno Mateus prepared different guanidinium chlorides in one step following a general reported method [56], which allowed the preparation of a range of ionic liquids by subsequent chloride exchange (Table 8) [57].

Table 8. ILs based on the dimethylguanidinium [dmg] cation, prepared from N,N-dimethylphosgeniminium chloride: i) HNR1R2 (2.1 eq), TEA (2.2 eq), CH2Cl2; ii) MX (M = Na, K or Li), CH2Cl2.

R1         R2         X
Me         n-Bu       PF6
Et         n-Bu       BF4, NTf2
n-Hexyl    n-Hexyl    Cl, PF6, BF4, NTf2
n-Octyl    n-Octyl    Cl, PF6, BF4
These ILs have moderate viscosity, high thermal stability and a stability complementary to that of the ILs based on the imidazolium cation. More importantly, despite their high number of carbon atoms (e.g. 27 for the tetra-n-hexyl-dimethylguanidinium cation, [(di-h)2dmg]), they are less prone to crystallise, even in the presence of anions that persistently give solids when combined with other cations. In contrast to ILs based on the imidazolium cation, this notable property of the [dmg] cation opens the opportunity to create a new generation of chiral ionic liquids, distinct from the recently reported ones [58], simply by exchanging the chloride of the [dmg]Cl salt for natural, or easily functionalised, chiral anions (Figure 6). Representative examples shown in Figure 6 include mandelate, lactate, camphorsulphonate and the carboxylates of Boc-alanine and quinic acid [59].

Figure 6. Chiral ionic liquids prepared by anion exchange of [(di-h)2dmg]Cl (R = n-hexyl) with salts M[chiral anion] of readily accessible chiral anions. Representative anions: (S)-mandelate [(S)-mand], lactate [lactic], Boc-alaninate [Boc-ala], quinate [quinic], (S)-camphorsulphonate [(S)-CSA] and N-acetyl-prolinate [Ac-prol-OH].

The potential of this chiral medium was explored by taking advantage of our ongoing research on the Rh(II)-catalysed C-H insertion of α-diazo-acetamides and on the Sharpless asymmetric dihydroxylation (AD) (Figure 7). Interestingly, for both transformations it was possible to achieve yields and enantioselectivities similar to, or higher than, those obtained using chiral ligands. Owing to the diversity of chiral anions available, and because it is ionic, moderately viscous, almost non-volatile and potentially reusable, this new chiral medium opens new opportunities not only for inducing chirality in different reactions performed in it, but also in other areas such as new materials, chiral resolution by chromatographic methods and membrane technologies.

Figure 7. Induction of chirality by chiral ionic liquids (CILs): i) C-H insertion of α-diazo-acetamides catalyzed by Rh2(OAc)4 (a), giving the corresponding lactam in 72% yield (trans/cis = 67:33) with ee = 27%; ii) Sharpless AD of terminal alkenes (b), giving the 1,2-diols in 95% yield with ee = 85% (R = n-Bu) and 92% yield with ee = 72% (R = Ph). (a) Rh2(OAc)4 (1 mol%), [(di-h)2dmg][(R)-mand] (0.3 g), diazo compound (0.15 mmol), 110 °C, 3 h. (b) Alkene (0.5 mmol), K2OsO2(OH)4 (0.5 mol%), NMO (1 eq), [(di-h)2dmg][quinic] (0.3 mL), rt, 24 h.

3. CONCLUSIONS

In this laboratory we continue to develop potentially more environmentally friendly synthetic methodologies, with special emphasis on catalyst reuse. We hope that our contribution to this research area has been useful to the scientific community. Our research will continue along these lines, perhaps with a stronger focus on the use of our recent guanidinium-based chiral ionic liquids as a means of inducing chirality in different synthetic applications.
ACKNOWLEDGMENTS

We thank Fundação para a Ciência e Tecnologia and FEDER for financial support, and the Solchemar company (http://www.solchemar.com) for providing some of the ionic liquids.

REFERENCES
1. Catalytic Asymmetric Synthesis, 2nd ed.; Ojima, I., (Ed.); pp 357, VCH: Weinheim,
2000.
2. Paul Anastas and John Warner in Green Chemistry: Theory and Practice; Oxford
University Press, 2000.
3. James Clark in Green Separation Processes: Fundamentals and Applications, Afonso,
C. A. M.; Crespo, J. P. S. G. (Eds.), Wiley-VCH, Weinheim, 2005.
4. H. Matsubara, I. Roy in Green Separation Processes: Fundamentals and Applications,
Afonso, C. A. M.; Crespo, J. P. S. G. (Eds.), Wiley-VCH, Weinheim, 2005.
5. A. B. Osuna, A. Serbanovic, M. Nunes da Ponte in Green Separation Processes:
Fundamentals and Applications, Afonso, C. A. M.; Crespo, J. P. S. G. (Eds.), Wiley-
VCH, Weinheim, 2005.
6. C.-J. Li, Chem. Rev. 2005, 105, 3095.
7. J. Dupont, R. F. de Souza, P. A. Z. Suarez, Chem. Rev. 2002, 102, 3667.
8. I. F. J. Vankelecom, L. E. M. Gevers in Green Separation Processes: Fundamentals and
Applications, Afonso, C. A. M.; Crespo, J. P. S. G. (Eds.), Wiley-VCH, Weinheim,
2005.
9. R. Breslow, Acc. Chem. Res. 2004, 37, 471. S. Narayan, J. Muldoon, M. G. Finn, V. V.
Fokin, H. C. Kolb, K. B. Sharpless, Angew. Chem. Int. Ed. 2005, 44, 3275.
10. D. J. Miller, C. J. Moody, Tetrahedron 1995, 51, 10811.
11. P. M. P.Gois, C. A. M. Afonso, Eur. J. Org. Chem. 2004, 3773.
12. P. M. P. Gois, C. A. M. Afonso, Eur. J. Org. Chem. 2003, 3798. P. M. P. Gois, N. R.
Candeias, C. A. M. Afonso, J. Mol. Catal. A-Chem. 2005, 227, 17.
13. N. R. Candeias, P. M. P. Gois, C. A. M. Afonso, Chem. Commun. 2005, 391.
14. R. D. Rogers K. R. Seddon, (Eds); Ionic Liquids; Industrial Applications for Green
Chemistry; ACS Symposium Series 818, ACS, Washington DC, 2002. P. Wasserscheid,
T. Welton, Ionic Liquids in Synthesis, VCH-Wiley, Weinheim, 2002.
15. M. C. Buzzeo, R. G. Evans, R. G. Compton, ChemPhysChem 2004, 5, 1106.
16. P. Kubisa, Prog. Polym. Sci. 2004, 29, 3.
17. H. Zhao, S. Xia, P. Ma, J. Chem. Technol. Biotechnol. 2005, 80, 1089.
18. M. Antonietti, D. Kuang, B. Smarsly, Y. Zhou, Angew. Chem. Int. Ed. 2004, 43, 4988.
19. B. Weyershausen, K. Lehmann, Green Chem. 2005, 7, 15.
20. M. J. Earle, J. M. S. S. Esperança, M. A. Gilea, J. N. C. Lopes, L. P. N. Rebelo, J. W.
Magee, K. R. Seddon, J. A. Widegren, Nature, 2006, 439, 831.
21. C. Cadena, J. L. Anthony, J. K. Shah, T. I. Morrow, J. F. Brennecke, E. J. Maginn, J. Am. Chem. Soc. 2004, 126, 5300.
22. W. Wu, B. Han, H. Gao, Z. Liu, T. Jiang, J. Huang, Angew. Chem. Int. Ed. 2004, 43,
2415.
23. S. Garcia, N. M. T. Lourenço, D. Lousa, A. F. Sequeira, P. Mimoso, J. M. S. Cabral, C. A. M. Afonso, S. Barreiros, Green Chem. 2004, 6, 466.
24. B. Jastorff, K. Mölter, P. Behrend, U. Bottin-Weber, J. Filser, A. Heimers, B. Ondruschka, J. Ranke, M. Schaefer, H. Schröder, A. Stark, P. Stepnowski, F. Stock, R. Störmann, S. Stolte, U. Welz-Biermann, S. Ziegert, J. Thöming, Green Chem. 2005, 7, 362.
25. M. J. O'Donnell, In Catalytic Asymmetric Synthesis, 2nd ed.; Ojima, I., (Ed.); Chapter 10,
VCH: Weinheim, 2000.
26. N. M. T. Lourenço; C. A. M. Afonso, Tetrahedron, 2003, 59, 789.
27. L. C. Branco, C. A. M. Afonso, Tetrahedron 2001, 57, 4405.
28. J. N. Rosa, A. G. Santos, C. A. M. Afonso, J. Mol. Catalysis-A 2004, 214, 161.
29. C. Wei, C.-J. Li, J. Am. Chem. Soc. 2002, 124, 5638.
30. P. M. P. Gois, C. A. M. Afonso, Tetrahedron Lett. 2003, 44, 6571.
31. R. A. Johnson, K. B. Sharpless, In Catalytic Asymmetric Synthesis, 2nd ed.; Ojima, I.,
Ed.; VCH: Weinheim, 2000; pp 357.
32. A. Mandoli, D. Pini, M. Fiori, P. Salvadori, Eur. J. Org. Chem. 2005, 1271; and
references cited therein.
33. T. Ishida, R. Akiyama, S. Kobayashi, Adv. Synth. Catal. 2005, 347, 1189; and references
cited therein.
34. S. V. Ley, C. Ramarao, A.-L. Lee, N. Østergaard, S. C. Smith, I. M. Shirley, Org. Lett.
2003, 5, 185.
35. B. S. Lee, S. Mahajan, K. D. Janda, Tetrahedron Lett. 2005, 46, 4491; and references
cited therein.
36. A. Sevrens, D. E. De Vos, L. Fiermans, F. Verpoort, P. J. Grobet, P. A. Jacobs, Angew.
Chem. Int. Ed. 2001, 40, 586.
37. B. M. Choudary, K. Jyothi, S. Madhi, M. L. Kantam, Adv. Synth. Catal. 2003, 345, 1190;
and references cited therein.
38. B. M. Choudary, K. Jyothi, M. Roy, M. L. Kantam, B. Sreedhar, Adv. Synth. Catal.
2004, 346, 1471.
39. J. W. Yang, H. Han, E. J. Roh, S. Lee, C. E. Song, Org. Lett. 2002, 4, 4685.
40. H. Li, Y.-Y. Luk, M. Mrksich, Langmuir 1999, 15, 4957.
41. W.-J. Tang, N.-F. Yang, B. Yi, G.-J. Deng, Y.-Y. Huang, Q.-H. Fan, Chem. Comm.
2004, 1378.
42. Y. Huang, W.-D. Meng, F.-L. Qing, Tetrahedron Lett. 2004, 45, 1965.
43. L. C. Branco, C. A. M. Afonso, Chem. Comm. 2002, 3036.
44. L. C. Branco, C. A. M. Afonso, J. Org. Chem. 2004, 69, 4381.
45. L. C. Branco, A. Serbanovic, M. N. da Ponte, C. A. M. Afonso, Chem. Comm. 2005,
107. L. C. Branco, A. Serbanovic, M. N. da Ponte, C. A. M. Afonso, PT 10293703 and
PCT/PT2005/000007.
46. A. Serbanovic, L. C. Branco, M. N. da Ponte, C. A.M. Afonso, J. Organometallic
Chemistry, 2005, 690, 3600.
47. L. C. Branco, J. N. Rosa, J. J. M. Ramos, C. A. M. Afonso, Chem Eur. J., 2002, 8, 3671.
48. J. G. Crespo, I. M. Coelhoso, R. M. C. Viegas, Membrane Contactors: Membrane
Separations, in Encyclopedia of Separation Processes, Academic Press, San Diego,
2000, pp. 3303-3311.
49. T. Schäfer, C. M. Rodrigues, C. A. M. Afonso, J. G. Crespo, Chem. Commun. 2001,
1622.
50. P. Izák, N. M. M. Mateus, C. A.M. Afonso, J. G. Crespo, Separation and Purification
Technology, 2005, 41, 141.
51. L. Gubicza, N. Nemestóthy, T. Fráter, K. Bélafi-Bakó, Green Chem. 2003, 5, 236.


52. R. Fortunato, C. A. M. Afonso, M. A. M. Reis, J. G. Crespo, Journal of Membrane
Science 2004, 242, 197. R. Fortunato, M. J. González-Muñoz, M. Kubasiewicz, S.
Luque, J. R. Alvarez, C. A. M. Afonso, I. M. Coelhoso, J. G. Crespo, Journal of
Membrane Science, 2005, 249, 153.
53. L. C. Branco, J. G. Crespo, C. A. M. Afonso, Chem Eur. J. 2002, 8, 3865.
54. C. Chiappe, D. Pieraccini, Green Chem. 2003, 5, 193.
55. L. C. Branco, J. G. Crespo, C. A. M. Afonso, Angew Chem. Int. Ed., 2002, 41, 2771.
56. T. Schlama, V. Gouverneur, A. Valleix, A. Greiner, L. Toupet, C. Mioskowski J. Org.
Chem. 1997, 62, 4200.
57. N. M. M. Mateus, L. C. Branco, N. M. T. Lourenço, C. A. M. Afonso, Green Chem. 2003, 5, 347. N. M. M. Mateus, L. A. A. F. C. Branco, C. A. M. Afonso, PT 10293703 and EP 1 466 894 A1.
58. G. Tao, L. He, N. Sun, Y. Kou, Chem. Comm. 2005, 3562; and references cited therein.
59. L. C. Branco, P. M. P. Gois, N. M. T. Lourenço, V. B. Kurteva, C. A. M. Afonso, Chem.
Comm. 2006, 2371.
SIMULATION AND MODELING IN
COMPUTATIONAL CHEMISTRY: A
MOLECULAR PORTFOLIO

José N. A. Canongia Lopes

Centro de Química Estrutural, Instituto Superior Técnico, Universidade Técnica de Lisboa, Av. Rovisco Pais, 1049-001 Lisboa, Portugal; e-mail: jnlopes@ist.utl.pt

Abstract: This communication describes the scientific research work of the author at Centro de Química Estrutural, Instituto Superior Técnico, between 1996 and 2006, in the area of computational methods applied within the framework of statistical mechanics. The first simulation methods to be introduced are those based on Monte Carlo algorithms, namely the extended version of the Gibbs Ensemble Monte Carlo (GEMC) method. Several examples of the application of the method to the study of fluid phase equilibria in model systems are discussed. The rest of the communication is dedicated to Molecular Dynamics techniques and their application to the study of molecular systems. The case of ionic liquids, a class of compounds that has attracted a great deal of attention from the scientific and technological communities in recent years, is particularly addressed. The diversity of the molecular systems studied using computer simulations (which can be described as the molecular portfolio of the author) is a measure of the growing importance of these methods at the forefront of scientific research.

Key words: Computational Chemistry, Statistical Mechanics, Monte Carlo, Molecular Dynamics.

1. INTRODUCTION

This article describes the author’s simulation and modeling work at


Centro de Química Estrutural of Instituto Superior Técnico, IST, of
Universidade Técnica de Lisboa, UTL, from 1996 to the beginning of 2006.
The presentation of the different research topics follows approximately a

chronological order, although the structure of the article is based mostly on the way those topics are interconnected. In other words, this communication can be understood as a collection of abstracts of scientific papers (published or in preparation) describing in a logical way the author's scientific activity in the area of computational chemistry and statistical mechanics of molecular systems. Each abstract is followed by a figure that tries to capture the issues under discussion.

2. GIBBS ENSEMBLE MONTE CARLO SIMULATIONS

The work in the area of simulation methods within the framework of


statistical mechanics started in 1996 at Imperial College, London, during a
post-doctoral fellowship under the supervision of Professor Dominic
Tildesley.
The first computational method to be studied was the Gibbs Ensemble
Monte Carlo (GEMC) algorithm, a method particularly suited to the study of
phase equilibria [1]: The Gibbs ensemble simulation method was extended
to multiphase equilibria by increasing the number of boxes that can be used
concurrently in the simulation. Atoms were moved within each box and pairs
of boxes were selected at random for the volume and particle exchange
moves. The equivalence between the Gibbs ensemble with an arbitrary
number of boxes and the corresponding canonical ensemble was established.
Simulations of two-component, three-phase equilibria and three-component,
four-phase and three-phase equilibria were demonstrated for simple model
systems, and the model phase diagrams were determined.

Figure 1. Equimolar mixture of 600 Lennard-Jones atoms in liquid-liquid-vapor equilibrium. Snapshot of an NVT-GEMC simulation with 10^6 steps.
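The multi-box bookkeeping described above (random particle displacements, volume-exchange moves and particle-transfer moves between randomly chosen pairs of boxes) can be summarised in a short sketch. The code below is only an illustrative skeleton of an extended NVT Gibbs ensemble step, assuming a naive pairwise Lennard-Jones energy routine and ignoring refinements such as periodic boundary conditions, cut-offs and energy bookkeeping optimisations; it is not the code used in reference [1].

```python
import math
import random

def lj_energy(coords):
    """Total Lennard-Jones energy (epsilon = sigma = 1) of one box; no cut-off, no PBC."""
    e = 0.0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            r2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
            inv6 = 1.0 / r2 ** 3
            e += 4.0 * (inv6 * inv6 - inv6)
    return e

def accept(delta):
    """Metropolis test on a generalised 'cost' delta: accept with prob min(1, exp(-delta))."""
    return delta <= 0.0 or random.random() < math.exp(-delta)

def displacement_move(boxes, beta, dr=0.1):
    """Trial displacement of one particle in one randomly chosen box."""
    b = random.randrange(len(boxes))
    if not boxes[b]:
        return
    i = random.randrange(len(boxes[b]))
    old = boxes[b][i]
    e_old = lj_energy(boxes[b])
    boxes[b][i] = [x + random.uniform(-dr, dr) for x in old]
    if not accept(beta * (lj_energy(boxes[b]) - e_old)):
        boxes[b][i] = old                                   # reject: restore position

def volume_move(boxes, volumes, beta, f_max=0.05):
    """Exchange volume between a randomly selected pair of boxes (total volume fixed)."""
    a, b = random.sample(range(len(boxes)), 2)
    dv = random.uniform(-f_max, f_max) * min(volumes[a], volumes[b])
    va_new, vb_new = volumes[a] + dv, volumes[b] - dv
    if va_new <= 0.0 or vb_new <= 0.0:
        return
    sa, sb = (va_new / volumes[a]) ** (1 / 3), (vb_new / volumes[b]) ** (1 / 3)
    old_a, old_b = boxes[a], boxes[b]
    e_old = lj_energy(old_a) + lj_energy(old_b)
    boxes[a] = [[c * sa for c in p] for p in old_a]         # rescale all coordinates
    boxes[b] = [[c * sb for c in p] for p in old_b]
    du = lj_energy(boxes[a]) + lj_energy(boxes[b]) - e_old
    delta = (beta * du
             - len(old_a) * math.log(va_new / volumes[a])
             - len(old_b) * math.log(vb_new / volumes[b]))
    if accept(delta):
        volumes[a], volumes[b] = va_new, vb_new
    else:
        boxes[a], boxes[b] = old_a, old_b                   # reject: restore both boxes

def transfer_move(boxes, volumes, beta):
    """Transfer one particle between a randomly selected (donor, acceptor) pair of boxes."""
    donor, acceptor = random.sample(range(len(boxes)), 2)
    if not boxes[donor]:
        return
    n_d, n_a = len(boxes[donor]), len(boxes[acceptor])
    e_old = lj_energy(boxes[donor]) + lj_energy(boxes[acceptor])
    i = random.randrange(n_d)
    particle = boxes[donor].pop(i)
    side = volumes[acceptor] ** (1 / 3)
    boxes[acceptor].append([random.uniform(0.0, side) for _ in range(3)])
    du = lj_energy(boxes[donor]) + lj_energy(boxes[acceptor]) - e_old
    delta = beta * du - math.log(n_d * volumes[acceptor] / ((n_a + 1) * volumes[donor]))
    if not accept(delta):
        boxes[acceptor].pop()
        boxes[donor].insert(i, particle)                    # reject: undo the transfer

def gemc_step(boxes, volumes, beta):
    """One step of the extended (multi-box) NVT Gibbs ensemble."""
    u = random.random()
    if u < 0.8:
        displacement_move(boxes, beta)
    elif u < 0.9:
        volume_move(boxes, volumes, beta)
    else:
        transfer_move(boxes, volumes, beta)

if __name__ == "__main__":
    random.seed(1)
    n_boxes, n_part, v0 = 3, 20, 125.0
    boxes = [[[random.uniform(0.0, v0 ** (1 / 3)) for _ in range(3)]
              for _ in range(n_part)] for _ in range(n_boxes)]
    volumes = [v0] * n_boxes
    for _ in range(2000):
        gemc_step(boxes, volumes, beta=1.0 / 0.9)           # reduced temperature T* = 0.9
    print([len(b) for b in boxes], [round(v, 1) for v in volumes])
```

The key point of the extension is simply that the volume and transfer moves act on a randomly chosen pair among an arbitrary number of boxes, which is what allows three or more coexisting phases to be equilibrated simultaneously.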

The extended Gibbs Ensemble Monte Carlo method was then applied to
the systematic study of phase diagrams and their relation with the parameters
that govern the interactions between atoms. This is the field of the so-called
Global Phase Diagrams and this line of research produced at this stage two
papers [2,3] describing the “building-up of Phase Diagrams”.
In the first of those papers, a three-box version of the Gibbs ensemble
Monte Carlo method was used to determine the phase diagram type of
several binary mixtures of one-centre Lennard-Jones particles. The method
was used to establish a direct link between the intermolecular potential
modeling the interactions in a given system and its fluid phase diagram,
without the knowledge of the corresponding equation of state governing its
pVT behavior. As an example of the application of the method, closed-loop
behavior in an isotropic system could be found using a set of Lennard-Jones
parameters exhibiting a cross-interaction diameter with a negative deviation
from the Lorentz-Berthelot combination rule.
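For readers unfamiliar with the notation, the cross-interaction referred to above is usually parametrised as a deviation from the standard Lorentz-Berthelot combining rules for the Lennard-Jones potential. The symbol δ below is a generic deviation parameter introduced here only for illustration; the exact notation of reference [2] may differ:

\[
u_{ij}(r) = 4\varepsilon_{ij}\left[\left(\frac{\sigma_{ij}}{r}\right)^{12} - \left(\frac{\sigma_{ij}}{r}\right)^{6}\right],
\qquad
\sigma_{12} = (1+\delta)\,\frac{\sigma_{11}+\sigma_{22}}{2},
\qquad
\varepsilon_{12} = \sqrt{\varepsilon_{11}\varepsilon_{22}},
\]

with δ < 0 corresponding to the negative deviation of the cross-interaction diameter mentioned above.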

Figure 2. Four types of vapor-liquid equilibria (Scott & van Konynenburg classification)
generated by the GEMC method.

The second article, published in Pure and Applied Chemistry [3], began
with a bird’s-eye view of the history of phase equilibrium diagrams for
mixtures, their classification and interpretation. Running throughout the
discussion are the fertile ideas of van der Waals. The Scott and van
Konynenburg classification was revisited, and various types of phase
diagrams were generated by computer simulation, using the Gibbs Ensemble
Monte Carlo Method for one-centre 12-6 Lennard-Jones molecules. The
work was hopefully made more attractive and appealing to students by a
judicious choice of architectural and engineering equivalents.

Figure 3. Cross-section of the international terminal at Waterloo Station, London, “exhibiting


a positive azeotrope”. © Nicholas Grimshaw & Partners Ltd. (1993)

The implementation of the extended GEMC method produced other more


recent and diversified studies, ranging from the possibility of phase
separation under osmotic conditions [4], to the formation of nano-structures
in isotropic systems [5], to the application of the extended GEMC concept to
ternary systems [6].
In the first case the Gibbs Ensemble Monte Carlo method was used to simulate osmotic equilibria for Lennard-Jones mixtures. When simulations were performed with two independent boxes separated by a selective membrane (exchange algorithm), one containing solvent and the other a mixture of solute and solvent, significantly negative osmotic pressures (π) developed. Those simulations were extended to include a third box and the possibility of modeling three coexisting phases. The new simulations showed that the two-phase equilibria with negative values of π were metastable and that the system spontaneously separated into three phases: pure solvent, dilute solute plus solvent, and dense solute plus solvent, with a resulting osmotic pressure that was normally small and positive.
In the second study [5], several binary systems composed of one-centre
Lennard-Jones particles with identical size and interaction parameters but
different cross diameter parameters were simulated using the extended
version of the Gibbs ensemble Monte Carlo method, in an attempt to predict
the fluid phase behavior in such systems. No liquid–liquid phase separation
was found in the mixtures but all cases studied exhibit a structured liquid
phase in equilibrium with its vapor. The domain shapes and patterns
observed both in three- and two-dimensional simulations (stripe and bubble
phases) were interpreted within the framework of the phenomenology of
modulated phases, also known as microphase separated fluids. The
appearance of stripe patterns occurs when the cross-interaction diameter is larger than the particle diameter, while the opposite is true for the bubble phases.

Figure 4. Osmotic equilibria in solutions of Lennard-Jones atoms.

The last article [6] discussed some of the issues related to the
classification and representation of ternary diagrams and how these can be
incorporated into the more general framework of global phase diagrams. It
was shown that the representation of binary mixtures in the form of T–x
diagrams constitutes an insightful and logical approach to the problem of a
systematic classification of fluid phase behavior in terms of bifurcations (the
yin and yang) and that the concept can be extended to the classification of
ternary systems.

Figure 5. Microphase separation in a bidimensional simulation of a mixture of Lennard-


Jones atoms using the GEMC method.
Figure 6. Application of the extended GEMC method to ternary systems: ternary diagram
showing the boundaries of liquid-liquid immiscibility.

3. MOLECULAR DYNAMICS SIMULATIONS

The change from atomic to molecular systems, and from Monte Carlo to Molecular Dynamics methods, was a natural step that occurred more or less in the middle of the period covered by the studies just presented (around 2000). Surprisingly, it was old experimental work, from the time of the author's PhD thesis at IST/UTL under the supervision of Professor Jorge Calado, that started this new line of research. From those times (and also from more recent ones) remained an interest in the experimental study of isotope effects on the thermodynamic properties of pure substances and their mixtures. Unfortunately, the relation between Molecular Dynamics simulations and isotope effects is not a straightforward one: the former are based on classical equations of motion, which do not contain the quantum mechanical treatment necessary to interpret the latter correctly. Nevertheless, and without the need for more complicated schemes incorporating quantum corrections into the computational methods, it was possible to use the statistical theory of isotope effects in condensed phases (a theoretical tool that includes the required quantum treatment) in combination with data obtained from Molecular Dynamics simulations. Another interesting turn of events was that most of the Molecular Dynamics work that followed was done in collaboration with Professor Agílio Pádua, presently at Blaise Pascal Université in Clermont-Ferrand, France, and a former fellow PhD student at IST/UTL.
The first problem to be studied in this way was the isotope effect on the solubility of methane in aqueous solution [7].
Figure 7. One methane molecule (center) surrounded by 511 water molecules. Molecular
Dynamics simulation to obtain data on the solubility of isotopically substituted methane in
water.

The isotope effects on the Henry's law coefficients of methane in aqueous solution (H/D and 12C/13C substitution) were interpreted using the statistical mechanical theory of condensed-phase isotope effects. The missing spectroscopic data needed for the implementation of the theory were obtained either experimentally (infrared measurements), by computer simulation (Molecular Dynamics), or estimated using Wilson's GF matrix method. The order of magnitude and sign of both solute isotope effects can be predicted by the theory. Even a crude estimate based on data from previous vapor pressure isotope effect studies of pure methane at low temperature can explain the inverse effect found for the solubility of deuterated methane in water.
Another similar problem (also involving isotope effects) was addressed using a statistical mechanical theory solved by computational (numerical) methods [8].
Vapor pressure isotope effects (VPIEs) in monatomic systems (neon to xenon, either between pure isotopes or in their binary mixtures) were evaluated using an integral equation theory for a Lennard-Jones fluid with the Duh-Haymet-Henderson closure. The most relevant quantity obtained in this way is the average of the Laplacian of the potential energy of the system, also known as the mean force constant. The results correctly predict the different rare-gas VPIEs, which span several orders of magnitude. Using a simple two-parameter corresponding-states principle, the method was capable of predicting VPIEs simply from the isotopically independent Lennard-Jones parameters of each rare gas and the masses of its isotopes. Each type of VPIE (in pure isotopes or in mixtures) maps onto one of two reduced-variable equations. The first variable is a reduced form of the reduced partition function ratio (a measure of the VPIE between pure isotopes), while the second is a reduced form of the liquid activity coefficient at infinite dilution (a measure of VPIEs in isotopic binary mixtures). Several issues related to the temperature and density dependence of the mean force constant are also addressed in this work.
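The central role of the mean force constant can be made explicit with the standard first-quantum-correction expression of the statistical theory of condensed-phase isotope effects, written here in schematic form for a monatomic species; sign and notation conventions vary between authors and may differ from those used in reference [8]:

\[
\ln\frac{p}{p'} \;\approx\; \frac{\hbar^{2}}{24\,(k_{B}T)^{2}}
\left(\frac{1}{m}-\frac{1}{m'}\right)\left\langle \nabla^{2}U \right\rangle ,
\]

where m and m' are the masses of the lighter and heavier isotopes, p and p' the corresponding vapor pressures, and ⟨∇²U⟩ the mean Laplacian of the potential energy (the mean force constant) in the condensed phase.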

Figure 8. Schematic representation of the work involving the determination of the vapor
pressure isotope effect in the rare gases (neon, argon, krypton and xenon) using integral
equations theory.

After the study of isotope effects, in which Molecular Dynamics simulations served only to obtain auxiliary data, investigations using Molecular Dynamics as the central predictive or interpretative tool rapidly took center stage, covering a wide variety of molecular systems: polycyclic aromatic compounds (heptacyclene isomers [9]), organometallics (ferrocene and its derivatives [10]), inorganic crystals (apatites [11-13]), ramified oligomers and polymers of controlled generation (dendrimers [14]) and, last but not least, low-melting-temperature salts (room-temperature ionic liquids [15-19]).
In the case of the heptacyclene molecules, the energetics of the thermal dimerization of acenaphthylene to give Z- or E-heptacyclene was investigated. The standard molar enthalpies of formation of the monoclinic Z- and E-heptacyclene isomers at 298.15 K were determined by micro-combustion calorimetry, and the corresponding enthalpies of sublimation were obtained by Knudsen effusion and Calvet-drop microcalorimetry. These results, together with the reported enthalpies of formation of solid and gaseous acenaphthylene and with the entropies of acenaphthylene and of both heptacyclene isomers obtained by quantum mechanical calculations, led to the conclusion that at 298.15 K the thermal dimerization of acenaphthylene is considerably exothermic and exergonic in both the solid and gaseous states, suggesting that the non-observation of the reaction under these conditions has a kinetic origin. A full determination of the molecular and crystal structure of the E dimer by X-ray diffraction was also reported for the first time. Finally, molecular dynamics computer simulations of acenaphthylene and of the heptacyclene solids were carried out and the results discussed in the light of the corresponding structural and enthalpy-of-formation data obtained experimentally.

Figure 9. Acenaphthylene and the Z and E isomers of heptacyclene.

For metallocenes of the ferrocene family, a new force field for their
molecular modeling was constructed [10]. The model was based on the
OPLS-AA/AMBER framework. Ab initio calculations were performed to
obtain several terms in the force field not yet defined in the literature. These
included geometrical parameters, torsion energy profiles and distributions of
atomic charges that blend smoothly with the OPLS-AA specification for
alkyl chains. Validation was carried out by comparing simulated and
experimental data for five different ferrocene-based complexes in the
crystalline phase.

Figure 10. Snapshot of a Molecular Dynamics simulation of a ferrocene crystal.

The structural and thermodynamic properties of crystalline calcium apatites, Ca10(PO4)6(X)2 (X = OH, F, Cl, Br), were investigated by a molecular dynamics technique using an all-atom Born-Huggins-Mayer potential [11,12]. The accuracy of the model at room temperature and atmospheric pressure was checked against crystal structural data, with maximum deviations of ca. 4% for the haloapatites and 8% for hydroxyapatite. The standard molar lattice enthalpy of the crystalline apatites was calculated and compared with previously published experimental results, the agreement being better than 2%. The molar heat capacity at constant pressure, Cp,m, in the range 298-1298 K, was estimated from the plot of the molar enthalpy of the crystal as a function of temperature. High-pressure simulation runs, in the range 0.5-75 kbar, were performed in order to estimate the isothermal compressibility coefficient, κT, of these compounds. The deformation of the compressed solids is always elastically anisotropic, with BrAp exhibiting a markedly different behavior from that of HOAp and ClAp. High-pressure p-V data were fitted to the Parsafar-Mason equation of state with an accuracy better than 1%.
Molecular dynamics simulations of molten hydroxyapatite were also performed, for the first time, in the range 2000 K < T < 3000 K and at pressures up to 20 GPa [13]. The all-atom Born-Huggins-Mayer potential energy function employed had previously been used to study the thermodynamic properties of the solid compound (see above). High-temperature simulation runs were used to generate the pVT surface of the melt, from which properties such as the isobaric thermal expansion coefficient, αp, and the isothermal compressibility, κT, could be evaluated. The heat capacity at ambient pressure, Cp, in the range 2000-3000 K, was estimated from the plot of the molar enthalpy of the melt as a function of temperature. The intermolecular atom-atom distribution functions, at several temperatures and pressures, were also investigated. A universal EoS proposed by Parsafar et al. was shown to give a good account of the MD data, with a precision better than 0.5%. Likewise, the Parsafar-Mason regularity, which assumes a linear dependence of (Z-1)V² on ρ², was established for molten hydroxyapatite.
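In explicit form, the regularity just mentioned states that, along an isotherm of the dense fluid, with Z = pV_m/RT the compressibility factor and ρ = 1/V_m the molar density, and A and B temperature-dependent coefficients (a schematic statement using generic symbols rather than the exact notation of reference [13]):

\[
(Z-1)\,V_m^{2} \;=\; A(T) \;+\; B(T)\,\rho^{2}.
\]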

Figure 11. Two views of a chloro-apatite crystal showing the spatial arrangement of the
chloride anions (isolated atoms) in hexagonal channels.
The structure and dynamics of poly(amidoamine) (PAMAM) dendrimers have been of great interest both scientifically and industrially, but such important features as the distributions of atoms, channels and strain inside these molecules remain unresolved. The work on these systems [14] involves systematic investigations of the atomistic structure of PAMAM dendrimers as a function of their generation. Structural properties such as the radius of gyration, shape tensor, asphericity, fractal dimension, monomer density distribution, solvent-accessible surface area, molecular volume and end-group distribution functions were evaluated from extensive molecular dynamics simulations.
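Two of the descriptors listed above, the radius of gyration and the asphericity, follow directly from the eigenvalues of the mass-weighted gyration tensor of a configuration. The snippet below is a small, self-contained illustration of that calculation on an arbitrary set of coordinates; it is not taken from the code used in reference [14].

```python
import numpy as np

def gyration_descriptors(positions, masses=None):
    """Radius of gyration and asphericity from the eigenvalues of the gyration tensor."""
    r = np.asarray(positions, dtype=float)          # shape (N, 3)
    m = np.ones(len(r)) if masses is None else np.asarray(masses, dtype=float)
    com = np.average(r, axis=0, weights=m)          # centre of mass
    d = r - com
    # mass-weighted gyration tensor: S_ab = sum_i m_i d_ia d_ib / sum_i m_i
    S = np.einsum("i,ia,ib->ab", m, d, d) / m.sum()
    lam = np.sort(np.linalg.eigvalsh(S))            # eigenvalues, ascending
    rg = np.sqrt(lam.sum())                         # radius of gyration
    asphericity = lam[2] - 0.5 * (lam[0] + lam[1])  # zero for a spherical distribution
    return rg, asphericity

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    coords = rng.normal(size=(1000, 3))             # isotropic test cloud of points
    rg, asph = gyration_descriptors(coords)
    print(f"Rg = {rg:.3f}, asphericity = {asph:.3f}")  # asphericity ~ 0 for this cloud
```

In an MD analysis these quantities would simply be averaged over the stored trajectory frames for each dendrimer generation.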
In the case of ionic liquids, the first studies concentrated on the
development of force-fields specially adapted to this class of compounds
[15-18].

Figure 12. Fourth generation PAMAM dendrimer.

A new force field for the molecular modeling of ionic liquids of the
dialkylimidazolium cation family was constructed [15,16]. The model is
based on the OPLS-AA/AMBER framework. Ab initio calculations were
performed to obtain several terms in the force field not yet defined in the
literature. These include torsion energy profiles and distributions of atomic
charges that blend smoothly with the OPLS-AA specification for alkyl
chains. Validation was carried out by comparing simulated and experimental
data on fourteen different salts, comprising three types of anion and five
lengths of alkyl chain, in both the crystalline and liquid phases. The present model can be regarded as a step toward a general force field for ionic liquids of the imidazolium cation family: one built in a systematic way, easily integrated with OPLS-AA/AMBER, and transferable between different cation-anion combinations.
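The torsion terms referred to above follow, in OPLS-AA-type force fields, a cosine-series functional form. The expression below is a generic statement of that form (not the specific parameter values fitted in references [15,16]):

\[
V(\phi) \;=\; \tfrac{1}{2}V_{1}\,[1+\cos\phi] \;+\; \tfrac{1}{2}V_{2}\,[1-\cos 2\phi]
\;+\; \tfrac{1}{2}V_{3}\,[1+\cos 3\phi] \;+\; \tfrac{1}{2}V_{4}\,[1-\cos 4\phi],
\]

where φ is the dihedral angle and V1-V4 are the coefficients adjusted to reproduce the ab initio torsion energy profiles.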

Figure 13. Initial configuration of a Molecular Dynamics simulation of a 1-ethyl-3-


methylimidazolium nitrate ionic liquid below its melting temperature.

After the dialkylimidazolium cations, the force field development


concentrated on two anions that started to be widely used in ionic liquids
formulations [17]: a set of force field parameters was proposed for the
molecular simulation of ionic liquids containing the anions
trifluoromethanesulfonate and bis(trifluoromethylsulfonyl)imide, also known as
triflate and bistriflylimide, respectively. The new set can be combined with
existing force fields for cations in order to simulate common room-
temperature ionic liquids, such as those of the dialkylimidazolium family,
and can be integrated with the OPLS-AA or similar force fields. Ab initio
quantum chemical calculations were employed to obtain molecular
geometry, torsional energy profiles, and partial charge distributions in the
triflate and bistriflylimide anions. One of the torsions in bistriflylimide,
corresponding to the dihedral angle S-N-S-C, has a complex energy profile
which is precisely reproduced by the present parameter set. A new set of
partial electrostatic charges is also proposed for the pyrrolidinium and tri-
and tetra-alkylammonium cations. Again, these parameters can be combined
with the OPLS-AA specification for amines in order to simulate
alkylammonium salts. The force-field models were validated against crystal
structures and liquid-state densities.
An article describing the force fields for other cations
(monoalkylimidazolium, phosphonium and pyridinium) and anions
(bromide, dicyanamide) is currently under preparation [18].
The second line of research concerning ionic liquids refers to the


obtention of their structural properties using simulation data [19,20].
Nanometer-scale structuring in room-temperature ionic liquids was
observed using molecular simulation [19]. The ionic liquids studied belong
to the 1-alkyl-3-methylimidazolium family, with hexafluorophosphate or bis(trifluoromethanesulfonyl)amide as the anion. They
were represented, for the first time in a simulation study focusing on long-
range structures, by an all-atom force field of the AMBER/OPLS_AA family
containing parameters developed specifically for these compounds. For ionic
liquids with alkyl side chains longer than or equal to four carbon atoms,
aggregation of the alkyl chains in nonpolar domains was observed. These
domains permeated a tridimensional network of ionic channels formed by
anions and by the imidazolium rings of the cations. The nanostructures were visualized in a conspicuous way simply by color-coding the two types of domain (in the original work, red for polar and green for nonpolar). As the
length of the alkyl chain increases, the nonpolar domains became larger and
more connected and caused swelling of the ionic network, in a manner
analogous to systems exhibiting microphase separation. The consequences of
these nanostructural features on the properties of the ionic liquids were
analyzed.

Figure 14. Nanostructures in ionic liquids. The polar and non-polar domains are color-coded dark and light grey, respectively. In the last representation the bonds between atoms are shown in order to enhance the distinction between the two domains.
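The colour-coded domain analysis behind Figure 14 amounts to tagging every atom as belonging either to the charged (polar) or to the aliphatic (nonpolar) part of the system. A minimal sketch of that bookkeeping is shown below; the atom-type names and the classification rule are illustrative assumptions, not the labelling scheme used in the original study.

```python
# Minimal sketch of the bookkeeping behind a polar/nonpolar colour coding such as
# the one in Figure 14. Atom types and the classification rule are illustrative
# assumptions, not the scheme used in the original work.

POLAR_PREFIXES = ("N", "S", "O", "F", "P")   # anion atoms and ring nitrogens (assumed)
RING_ATOMS = {"CR", "CW", "NA"}              # hypothetical imidazolium ring atom types

def classify(atom_type: str) -> str:
    """Assign an atom to the 'polar' (ring + anion) or 'nonpolar' (alkyl tail) domain."""
    if atom_type in RING_ATOMS or atom_type.startswith(POLAR_PREFIXES):
        return "polar"
    return "nonpolar"

atoms = ["NA", "CR", "CW", "C1", "CT", "CT", "HC", "P", "F"]
for a in atoms:
    print(a, "->", classify(a))
```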

In a second work, the dihedral distribution of the alkyl side chains of the imidazolium cations of ionic liquids was discussed by comparing spectroscopic (Raman) and ab initio data with results obtained by molecular dynamics simulations [20]: a molecular force field for the computer simulation of ionic liquids was validated a posteriori by comparison with Raman spectroscopic data published after the force field had been formulated. Specifically, the terms in the force field describing the conformational aspects of dialkylimidazolium cations, which

were specifically developed for these compounds using high-level ab initio calculations, were those affecting the distribution of conformers in simulated ionic liquids. These distributions were compared with analyses of the liquid-phase Raman spectra, and the features of a series of dihedral torsions along the alkyl side chains of 1-alkyl-3-methylimidazolium cations in several ionic liquids were discussed.

4. CONCLUSION

The diversity of computer simulations presented in this paper (the portfolio) demonstrates, on the one hand, the ease of implementing this kind of calculation for new systems and, on the other hand, the predictive and interpretative power of these methods to address issues that are not easy to study by purely experimental or theoretical techniques.

REFERENCES
1. J. N. Canongia Lopes and D. J. Tildesley, Multiphase equilibria using the Gibbs
ensemble Monte Carlo method. Molecular Physics, 92, 187-195, 1997.
2. J. N. Canongia Lopes, Phase equilibria in binary Lennard-Jones mixtures: phase diagram
simulation. Molecular Physics, 96, 1649-1658, 1999.
3. J. C. G. Calado and J. N. Canongia Lopes, The building-up of phase diagrams. Pure and
Applied Chemistry, 71, 1183-1196, 1999.
4. J. N. Canongia Lopes and D. J. Tildesley, Three-phase osmotic equilibria using the
Gibbs ensemble simulation method. Molecular Physics, 98, 769-772, 2000.
5. J. N. Canongia Lopes, Microphase separation in mixtures of Lennard-Jones particles.
Physical Chemistry Chemical Physics, 4, 949-954, 2002.
6. J. N. Canongia Lopes, On the classification and representation of ternary phase
diagrams: The yin and yang of a T-x approach. Physical Chemistry Chemical Physics, 6,
2314-2319, 2004.
7. Z. Bacsik, J. N. Canongia Lopes, M. F. C. Gomes, G. Jancsó, J. Mink and A. A. H.
Pádua, Solubility isotope effects in aqueous solutions of methane. Journal of Chemical
Physics, 116, 10816-10824, 2002.
8. J. N. Canongia Lopes, A. A. H. Pádua, L. P. N. Rebelo and J. Bigeleisen, Calculation of
vapor pressure isotope effects in the rare gases and their mixtures using an integral
equation theory. Journal of Chemical Physics, 118, 5028-5037, 2003.
9. R. C. Santos, C. E. S. Bernardes, H. P. Diogo, M. F. M. da Piedade, J. N. Canongia
Lopes and M. E. M. da Piedade, Energetics of the Thermal Dimerization of
Acenaphthylene to Heptacyclene. Journal of Physical Chemistry A, 110, 2299-2307,
2006.
10. J. N. Canongia Lopes, P. Cabral do Couto and M. E. Minas da Piedade, An OPLS-based
All-Atom Force Field for Metallocenes, in preparation.

11. F. J. A. L. Cruz, J. N. Canongia Lopes, J. C. G. Calado and M. E. M. da Piedade, A molecular dynamics study of the thermodynamic properties of calcium apatites. 1.
Hexagonal phases. Journal of Physical Chemistry B, 109, 24473-24479, 2005.
12. F. J. A. L. Cruz, J. N. Canongia Lopes and J. C. G. Calado, Molecular dynamics study of
the thermodynamic properties of calcium apatites. 2. Monoclinic phases. Journal of
Physical Chemistry B, 110, 4387-4392, 2006.
13. F. J. A. L. Cruz, J. N. Canongia Lopes and J. C. G. Calado, Molecular Dynamics
Simulations of Molten Calcium Hydroxyapatite. Fluid Phase Equilibria, in press.
14. P. Paulo, J. N. Canongia Lopes and S. B. Costa, A molecular dynamics study of PAMAM dendrimers, in preparation.
15. J. N. Canongia Lopes, J. Deschamps and A. A. H. Pádua, Modeling ionic liquids using a
systematic all-atom force field. Journal of Physical Chemistry B, 108, 2038-2047, 2004.
16. J. N. Canongia Lopes, J. Deschamps and A. A. H. Pádua, Modeling ionic liquids of the
1-alkyl-3-methylimidazolium family using an all-atom force field. R. D. Rogers and K.
R. Seddon eds., Ionic Liquids IIIA: Fundamentals, Progress, Challenges, ACS
Symposium Series 901, ACS, Washington D. C., U. S. A., 134-149, 2005.
17. J. N. Canongia Lopes and A. A. H. Pádua, Molecular force field for ionic liquids
composed of triflate or bistriflylimide anions. Journal of Physical Chemistry B, 108,
16893-16898, 2004.
18. J. N. Canongia Lopes and A. A. H. Pádua, Molecular force field for ionic liquids. 3.
Monoalkylimidazolium, Phosphonium and Pyridinium Cations; Bromide and
Dicyanamide Anions, in preparation.
19. J. N. Canongia Lopes and A. A. H. Pádua, Nanostructural Organization in Ionic Liquids.
Journal of Physical Chemistry B, 110, 3330-3335, 2006.
20. J. N. Canongia Lopes and A. A. H. Pádua, Using spectroscopic data on imidazolium
cation conformations to verify a molecular force field to ionic liquids. Journal of
Physical Chemistry B, accepted for publication.
EXPERIMENTAL PARTICLE AND
ASTROPARTICLE PHYSICS

M. Pimenta
LIP – Laboratório de Instrumentação e Física Experimental de Partículas, Av. Elias Garcia,
nº 14 – 1º, 1000-149 Lisboa, Portugal, Instituto Superior Técnico, Universidade Técnica de
Lisboa, Av. Rovisco Pais, 1 – 1096 Lisboa, pimenta@lip.pt

Abstract: The Rutherford and Hess experiments, in the first years of the XX century, mark the beginning of experimental Particle and Astroparticle Physics.
In the following one hundred years many discoveries were made, and we now have a much deeper understanding of the elementary constituents and of the fundamental interactions of the Universe. However, we also know, at the beginning of the XXI century, that most of the Universe (~95%) is filled by mysterious entities: we have almost no idea about dark matter and dark energy.
The present and future experimental program of Particle and Astroparticle Physics is ambitious and we may hope for considerable progress in the years to come.
The participation of the Portuguese LIP teams in this program is briefly described.

Key words: Particle Physics, Astroparticle, Cosmic Rays, LHC, Auger, Dark Matter, Dark
Energy.

1. EXPERIMENTAL PARTICLE AND ASTROPARTICLE PHYSICS

Experimental Particle and Astroparticle Physics were born at the beginning of the XX century. The Rutherford / Geiger / Marsden experiment (1909-1912) can still be considered the paradigm of a Particle Physics experiment: a beam of particles collides with a target and one


observes what comes out, and how! With such a simple scheme the structure of the atom was discovered. The beam was an α particle from a radioactive source; the target a thin gold sheet; and the detector a fluorescent screen. Nowadays the primary beams, mainly protons or electrons, are accelerated by devices many kilometres long, and the detectors are huge and complex. But the principle still applies.
At the same time (1912-1914) Victor Hess, in a series of balloon flights, proved the existence of particles coming from the skies. In fact, the intensity of the mysterious radiation that ionizes the air increases with altitude and does not depend on whether it is day or night. Some years later this radiation was named "Cosmic Rays" by Millikan.
Throughout the XX century many discoveries were made. Just two striking examples: in 1933 Anderson discovered, in a cosmic ray experiment, the existence of a particle with the same mass and absolute charge as the electron, but curving in a magnetic field in the opposite direction. Antimatter, predicted just a few years before by Dirac, was discovered! In the 80's, at CERN, the electroweak bosons, Z0, W+ and W-, were first discovered and then copiously produced. A big step towards the dream of unifying all the interactions was made.
In the last years of the XX century, for some physicists, Stephen Hawking among them, the game seemed almost over. We already know that atoms are made of electrons, protons and neutrons, that protons and neutrons are made of quarks, and that there are three families of quarks which can combine among themselves (and with their anti-partners, the antiquarks) to form a myriad of so-called elementary particles (baryons and mesons). Besides the quarks, the framework of constituents of the Standard Model also includes three lepton families (of electron-type particles and neutrinos) and the gauge bosons responsible for the fundamental interactions.
Three of these fundamental interactions (electromagnetic, weak and strong) can be described by elegant theories, based on symmetry principles with the same mathematical structure, and could thus be unified in a single and final theory. The unification of gravitation with the other three interactions is not easy (no one yet knows how to build a realistic quantum gravity theory) but this aim is actively pursued, namely by string theorists.
But there are a few problems…
First of all, the introduction of the particle masses in the theory. Particles do have mass, and their values cover an extremely wide range (from zero for the photon, or just fractions of an electron volt for the neutrinos, to hundreds of giga-electron volts for the top quark).
Peter Higgs, in the early 60's, proposed a mechanism to "give" effective masses to massless particles. This effective mass would be the result of the

interaction of the massless particles with a background field (the Higgs field). A new boson should then exist - the Higgs boson - and experimental particle physicists have been, and will be in the next years, desperately looking for it.
But, if there was enormous progress in the XX century in the understanding of the matter we are made of, two new and mysterious entities have appeared: Dark Matter and Dark Energy.
Dark matter was indirectly discovered by studying the rotation curves of stars around the centres of their galaxies. The rotation velocity of a star does not decrease with its distance to the centre of the galaxy, in contradiction with the prediction of Newton's law under the assumption that the distribution of the galaxy's mass follows its luminosity profile. The most popular solution is to postulate the existence of "Dark Matter", that is, fundamental particles that have neither electromagnetic nor strong interactions, but still interact gravitationally. The theoretical candidates are many and the experimental effort considerable but, until now, there has been no direct discovery of dark matter particles.
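The rotation-curve argument can be made quantitative with elementary Newtonian dynamics: for a circular orbit v(r) = sqrt(G M(<r)/r), so if the luminous mass were all there is, the velocity should fall off as r^(-1/2) beyond the luminous disc. The sketch below, with an illustrative enclosed mass (an assumption, not data from any particular galaxy), shows the Keplerian fall-off that the observed flat rotation curves contradict.

```python
import math

G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
M_lum = 1.0e41         # illustrative luminous mass enclosed (kg), ~5e10 solar masses
kpc = 3.086e19         # metres per kiloparsec

def v_keplerian(r_m, M):
    """Circular speed (km/s) if the enclosed mass M were fixed (luminous matter only)."""
    return math.sqrt(G * M / r_m) / 1e3

for r_kpc in (10, 20, 40):
    print(f"r = {r_kpc:2d} kpc : Keplerian v = {v_keplerian(r_kpc * kpc, M_lum):5.0f} km/s")
# Observed rotation curves instead stay roughly flat (~200 km/s) out to large radii,
# which requires M(<r) to keep growing with r -- the dark-matter halo.
```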
Dark Energy was recently introduced into the "jargon" of Particle Physics to account for the discovery that the expansion of the Universe is accelerating! In a Universe containing only "normal" and "dark" matter, at large scales the attractive gravitational interaction takes over and no accelerated expansion is possible. Therefore some unknown entity with a repulsive gravitational effect is needed - that is the Dark Energy.
In summary, and taking into account all the experimental results obtained so far, the Universe is composed of 4% "normal matter", which we know about, 23% "dark matter", which we have ideas about, and 73% "dark energy", which we dream about.
The challenge for experimental Particle and Astroparticle Physics is
therefore enormous!
In Particle Physics the next big step is the Large Hadron Collider (LHC), designed to collide two counter-rotating beams of 7 TeV protons. The LHC is being built at CERN, Geneva, and its start-up is foreseen for 2007/2008. The LHC will explore an energy domain ten times higher than that available at the present most powerful accelerators, and thus may shed light on the subjects we have just mentioned: the Higgs boson, dark matter, dark energy.
In Astroparticle Physics the race for higher and higher energies is nowadays led by the Pierre Auger Observatory (PAO). PAO aims to study the highest energy particles produced somewhere in the Universe and detected at Earth in the form of Cosmic Rays (with energies 10^8 times higher than the energy of the LHC beams). The origin of these particles is one of the mysteries of our time and the solution of this enigma may considerably advance our knowledge of the Universe. However, these particles are

extremely rare. Cosmic rays above 10^20 electron volts arrive at Earth at a rate of about one per square kilometre per century, and therefore the detection area must be huge. PAO is in the final stage of constructing an array of 1600 water tanks and 24 fluorescence detectors covering an area of about 3000 square kilometres in Malargüe, Argentina. A second similar observatory is foreseen to be built in a few years in Colorado, USA.
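For a comparison with accelerator energies, the relevant quantity for the first interaction in the atmosphere is the centre-of-mass energy of a fixed-target collision. A back-of-envelope evaluation using standard kinematics (not taken from the chapter) is sketched below.

```python
import math

m_p = 0.938e9          # proton rest energy in eV
E_lab = 1.0e20         # laboratory energy of an extreme cosmic-ray proton in eV

# Fixed-target collision on a nucleon at rest: s ~ 2 * m_p * E_lab (ultra-relativistic limit)
sqrt_s_cosmic = math.sqrt(2 * m_p * E_lab)      # in eV
sqrt_s_lhc = 14.0e12                            # LHC design: two 7 TeV beams head-on

print(f"sqrt(s) of a 1e20 eV cosmic ray on a fixed nucleon: {sqrt_s_cosmic/1e12:.0f} TeV")
print(f"sqrt(s) at the LHC:                                 {sqrt_s_lhc/1e12:.0f} TeV")
```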
The LHC and PAO are just examples of the many different experiments that are running or will be running in the near future.
Indeed, from mines in Japan to space, from the desert in Namibia to the South Pole, many physicists are working hard in this beginning of the XXI century.
The reason is simple: the Universe is still there to be discovered!

2. THE PARTICIPATION OF PORTUGUESE TEAMS

The Portuguese participation in the experimental program of Particle and Astroparticle Physics is mainly the responsibility of LIP. LIP (Laboratório de Instrumentação e Física Experimental de Partículas) was created twenty years ago, when Portugal joined CERN as a full member state, and its main objective is scientific and technological research in the fields of Experimental High Energy Physics and Astrophysics, Associated Instrumentation and Computation.
The scientific and technological activity of LIP takes place in the framework of international collaborations aiming at the realisation of experiments, mainly at the European Laboratory for Particle Physics (CERN), in Geneva, Switzerland. LIP carries out part of its work in Portugal (experiment conception, fundamental research on radiation physics and detectors, technological development and data analysis), executing at CERN or elsewhere all that concerns the direct preparation of the experiments and data acquisition.
In close connection with the LHC effort, LIP increased its involvement in computing for the LHC, mainly in developing the GRID computing paradigm. Together with its participation in the EU DataGrid project, LIP joined the EU CrossGrid project as a full partner, where it plays a central role. LIP is a full partner in the international consortium that submitted the proposal "Enabling Grids for E-Science and Industry in Europe" (EGEE) to the 6th Framework Program of the European Union. EGEE is proposed as a project funded by the European Union and is one of the largest European projects in Information Technology. LIP will contribute to the area of Quality Assurance and TestBed Support.

LIP is involved in space projects in the framework of the NASA/AMS experiment and has signed several contracts with ESA in the area of Radiation Environment, Effects and Components Degradation.
LIP has recently joined the Pierre Auger Observatory, now nearing completion in Argentina, covering a surface of 3000 km^2 and aiming to increase by more than one order of magnitude the present statistics on Extreme High Energy Cosmic Rays. In the area of Astroparticles LIP also has a small participation in the SNO (Sudbury Neutrino Observatory) Collaboration, one of the most important solar neutrino experiments.
LIP is a member of the European Particle Physics Outreach Group, which aims to promote education in basic sciences. An experimental set-up for collecting cosmic rays has been developed and is currently being installed in secondary schools.
In the following paragraphs the LIP participation in some of the projects referred to above is briefly described.

2.1 Participation in the LHC Project

The Large Hadron Collider (LHC) is being built at the European Organization for Nuclear Research (CERN) in Geneva, Switzerland. The LHC will provide proton beam collisions with a centre-of-mass energy of up to 14 TeV at a challenging luminosity of 10^34 cm^-2 s^-1. The collider will also provide heavy-ion collisions at a centre-of-mass energy higher than 1000 TeV and a luminosity of 10^28 cm^-2 s^-1.
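The quoted luminosity translates directly into interaction rates through R = L x sigma. The sketch below uses an approximate, assumed inelastic proton-proton cross-section (~80 mb) and the nominal ~40 MHz bunch-crossing rate to give an order of magnitude for the rates that the trigger systems described below must face.

```python
# Event rate at design luminosity: R = L * sigma.
# The inelastic pp cross-section used here (~80 mb) is an approximate, assumed value.
L = 1.0e34                    # design luminosity, cm^-2 s^-1
mb_to_cm2 = 1.0e-27           # 1 millibarn in cm^2
sigma_inel = 80.0 * mb_to_cm2

rate = L * sigma_inel         # inelastic interactions per second
bunch_crossing_rate = 40.0e6  # nominal ~40 MHz bunch-crossing rate at the LHC

print(f"Inelastic pp interaction rate: {rate:.1e} per second")
print(f"Average pile-up per crossing:  {rate / bunch_crossing_rate:.0f}")
```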
Four main experiments will be placed around the accelerator: the Compact Muon Solenoid (CMS), A Toroidal LHC Apparatus (ATLAS), the LHC beauty experiment (LHCb) and A Large Ion Collider Experiment (ALICE). LIP is a member of the CMS and ATLAS Collaborations.
CMS and ATLAS are general purpose experiments which aim to study very high energy collisions of proton beams. The investigation of the most fundamental properties of matter, in particular the study of the nature of electroweak symmetry breaking and the origin of mass, is the scope of these experiments.
The LIP activity at CMS has two main components:

1) The development of hardware and software for the calorimeter trigger and for the readout system of the electromagnetic calorimeter. The project is carried out in collaboration with INESC.
2) The search for extra dimensions in proton-proton collisions at the LHC.

The calorimeter trigger system of the CMS/LHC experiment at CERN is a high-performance electronics and computing system which processes the detector data on-line, about one hundred thousand calorimeter channels, to select electrons, photons, taus and missing-energy events, as well as samples of jet events. The trigger system performs the first selection step in the search for new physics reactions. The Electromagnetic Calorimeter (ECAL) is an electron and photon detector composed of eighty thousand PbWO4 crystals. The extremely fine granularity and the excellent energy resolution make this instrument very well suited for the measurement of electrons and photons at the LHC. The readout system is responsible for collecting data from the 80000 channels.
The LIP activities at ATLAS have been centred on the design, procurement and testing of optical components and on the construction of the hadronic tile calorimeter (TILECAL). LIP is strongly involved in the TILECAL Detector Control System (DCS), with responsibility for the development of the supervisor station user interface and the integration of the TILECAL controls (cooling, HV, LV, laser, cesium and minimum bias) in the general DCS. LIP is now involved in the commissioning of the detector, at the level of both hardware and software, including the commissioning with cosmic muons. Super-LHC scenarios with radiation levels up to 10 times higher than at the LHC are an extra motivation to continue the LIP activity on optics ageing. Taking advantage of our experience in scintillating fibres and their aluminization, LIP participates in the R&D for the construction of the scintillating fibre detector of the ATLAS luminosity monitor. In the area of physics simulation and software development we will focus on the W boson mass and width and on top quark physics. In terms of precision measurements, we will continue the study of several systematic contributions to the W boson mass uncertainty using fast simulation and later full simulation. In terms of searches for new physics, LIP has been studying the ATLAS sensitivity to the measurement of top decays through Flavour Changing Neutral Currents, both in t-tbar and single top events, and to the measurement of the forward-backward asymmetry in top decays, both of them sensitive to physics beyond the Standard Model of Particle Physics. In terms of hadronic calibration, the development of methods able to reach an accuracy of ~1% on the jet energy scale through the decay of the W into jets is planned.

2.2 Participation in the COMPASS experiment

The COMPASS experiment is dedicated to the study of the structure of matter, namely the gluon polarization ΔG/G (from open charm photoproduction and high-pT physics), the transverse-spin structure and

fragmentation functions. With a hadron beam, COMPASS aims to study some spectroscopy issues, such as the production of new mesons and baryons, namely exotics, hybrids and doubly charmed particles. COMPASS uses high intensity beams, that is, a polarized muon (or hadron) beam impinging on a longitudinally or transversely polarized target (or a silicon microstrip target), followed by a two-stage spectrometer: a first stage with a large angular acceptance, followed downstream by a second one with a reduced acceptance, designed to detect particles up to more than 100 GeV/c. In the original design, as stated in the accepted Proposal, each spectrometer is equipped with a magnet surrounded by trackers, a set of electromagnetic and hadronic calorimeters, muon filters and a Cherenkov detector (RICH) for particle identification. The data acquisition system is based on a parallel read-out of the front-end electronics plus a distributed set of event-builders, specially designed to cope with huge data volumes. The LIP group has taken full responsibility for the COMPASS Detector Control System (DCS). In fact, the system did not fulfill the requirements imposed by the Collaboration concerning reliability, versatility and speed. The huge task of its complete redesign is being addressed. Important improvements, mainly concerning flexibility and speed, have already been made. A major implementation concerning the software packages and their relationship, leading to a new approach, is under way.

2.3 Participation in the Pierre Auger Observatory

The study of Ultra High Energy Cosmic Rays (UHECR) is a recent and promising field. Indeed, the existence of cosmic rays with energies above 10^20 eV is now established, opening a new channel in astronomy and astrophysics. The subject is however still largely open, motivating the research of a large and growing community of both experimental and theoretical physicists. Experimentally, the main issue stems from the very low arrival rate on Earth of these events (about 1 event per km^2 per century), which makes the statistics collected up to now clearly insufficient for the characterization of the cosmic ray spectrum in this energy range, and in particular to establish whether the expected GZK cutoff is present or not. The need for a next generation of experiments, increasing the statistics by orders of magnitude, is thus consensual. From a theoretical point of view, both the acceleration of particles to such energies and the fact that they may travel relatively large distances before reaching Earth are to some extent unexpected, making UHECR a privileged probe of the evolution of the Universe and of the fundamental laws of nature. An important step forward is the Pierre Auger Observatory (PAO), now under construction in Argentina (completion in 2007) and expected to collect 30-50 events per year with

energy above 10^20 eV - in one year, more than the present total available statistics.
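As a rough consistency check, the quoted flux and detector area already give the order of magnitude of the expected event count:

```python
# Rough consistency check of the quoted numbers: ~1 event per km^2 per century
# above 1e20 eV, collected over the ~3000 km^2 Auger-South surface array.
flux = 1.0 / 100.0          # events per km^2 per year above 1e20 eV
area = 3000.0               # km^2 covered by the surface detector array
duty = 1.0                  # the water-Cherenkov tanks operate essentially continuously

events_per_year = flux * area * duty
print(f"Expected events above 1e20 eV per year: ~{events_per_year:.0f}")
# Consistent with the 30-50 events/year quoted in the text once exposure,
# detection efficiency and the steeply falling spectrum near 1e20 eV are folded in.
```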
Three main lines of activity are presently developed by the LIP team, involving experimental and theoretical aspects: 1) phenomenological studies and data analysis in the areas of high-energy hadronic interactions and searches for New Physics; 2) detailed simulations of the Extensive Air Shower (EAS) development in the atmosphere using the CORSIKA program, and of particular aspects of the detector performance using the GEANT4 toolkit; 3) R&D in view of the enhancement of Auger-South and the preparation of Auger-North.

2.4 Participation in the GRID experiment

GRID computing is a recent computing paradigm which consists in the aggregation of autonomous, heterogeneous and distributed computational resources into a single infrastructure. The GRID 'hides' the specificities of the resources that constitute it, allowing easy and transparent access to powerful means of calculation. The underlying motivation is to solve complex computational problems that demand very large computing capacity. The LHC experiments fit in this context since they require an enormous number of processors. The ATLAS and CMS experiments in the context of the LHC, in which LIP participates, have chosen the Grid paradigm as a solution for the integration of the computational resources belonging to the participating institutions. A national Grid initiative was recently launched by the Portuguese government, which will enable the establishment of the national LHC grid computing infrastructure. This infrastructure will be based on a federated Tier-2 centre formed by two clusters, in Lisbon and Coimbra. Most of these efforts will be carried out in coordination with the international projects WLCG and EGEE (Enabling Grids for E-Science). Both projects are closely related to the provisioning of technologies and services for LHC computing.
LIP will also be involved in two new EU grid projects. The int.eu.grid project is a follow-up of CrossGrid, in which LIP participated previously. The int.eu.grid project aims to build an infrastructure for demanding parallel, interactive and compute-intensive applications using LCG/EGEE middleware and taking advantage of the EGEE clusters provided by the project partners. The project will offer resources for a wide range of applications, including High Energy Physics. LIP will participate in the infrastructure coordination and management. The EELA project aims to extend the EGEE middleware technologies and infrastructure into Latin America, introducing the advantages of Grid computing and enabling researchers on the other side of the Atlantic to better collaborate with their

European colleagues. LIP will contribute to the infrastructure operation, certification authorities setup and virtual organizations support.

2.5 Participation in the Outreach

Education and public outreach have become a crucial necessity in Particle Physics today. To this end CERN, the European Organization for Particle Physics, has set up a working group of scientists and educators, the European Particle Physics Outreach Group, with the aim of assessing and providing the tools to boost particle physics education in Europe. The main activities in Portugal include:

- the particle physics education and public outreach activities in Portugal, including the organization of the Portuguese part of the program "High-School Teachers at CERN";
- the development of tools for the promotion of Particle Physics in Portugal, in particular small particle detectors that can be used in exhibitions;
- the development of dissemination material, in particular animated films about Particle Physics, CDs with presentations, booklets and Web-related tools;
- the measurement of cosmic rays with equipment installed in high schools;
- the Project CRESCERE - new phase for 2006-2008, which proposes to high-school students and teachers the remote realization of experiments with detectors located at LIP/IST (one cosmic ray station).

ACKNOWLEDGEMENTS

This article benefits from the work of many Portuguese physicists who
are involved in the adventure of Experimental Particle and Astroparticle
Physics.

GENERAL REFERENCES
1. Rutherford, E., Philos. Mag. 21, 604 (1911).

2. See for instance the book by Emilio Segrè, From Atoms to Quarks, San Francisco, W.H. Freeman (1980).
3. Anderson, C.D., Phys. Rev. 43, 491 (1933).
4. See for instance the book by P. Watkins, Story of the W and Z, Cambridge University Press, New York (1986).
5. See for instance the book by F. Halzen and A.D. Martin, Quarks and Leptons, John Wiley & Sons, New York (1984).
6. Higgs, P.W., Phys. Rev. 145, 1156 (1966).
7. See for instance the book by D. Perkins, Particle Astrophysics, Oxford University Press, Oxford (2003).
8. http://public.web.cern.ch/Public/Welcome.html
9. http://www.auger.org/
10. https://edms.cern.ch/cedar/plsql/cedarw.site_home
11. http://www.lip.pt/experiments/cms/PtCMS.html
12. http://www.lip.pt/experiments/atlas/
13. http://www.lip.pt/experiments/compass/
14. http://eppog.web.cern.ch/eppog/

LIP PUBLICATIONS – 2005


1. Clear-PEM: A dedicated PET camera for improved breast cancer detection, 2005-01-01,
M. C. Abreu, P. Almeida, F. Balau, N. C. Ferreira, S. Fetal, F. Fraga, M. Martins, N.
Matela, R. Moura, C. Ortigão, L. Peralta, P. Rato, R. Ribeiro, P. Rodrigues, A. I. Santos,
A. Trindade and J. Varela, Radiation Protection Dosimetry (2005), Vol. 116, No. 1–4,
pp. 208–210
2. The data acquisition system of the neutron time-of-flight facility n_TOF at CERN, 2005-
01-01, U. Abbondanno et al. (n-TOF Collaboration), Nucl. Instrum. and Meth. A 538,
2005, 692-702
3. Flavour changing strong interaction effects on top quark physics at the LHC , 2005-01-
01, P. M. Ferreira, O. Oliveira and R. Santos, Phys.Rev.D72 (2005) 075010;hep-
ph/0507128
4. Microscopic black hole detection in UHECR: the double bang signature, 2005-01-15, V. Cardoso, M.C. Espirito-Santo, M. Paulos, M. Pimenta and B. Tomé, Astroparticle Physics 22 (2005) 399.
5. A new measurement of J/PSI suppression in Pb-Pb collisions at 158 GeV per nucleon,
2005-01-21, NA50 Collaboration (P. Bordalo, G. Borges, C. Quintans, S. Ramos, H.
Santos R. Shahoyan et al.), Eur Phys J C39 (2005) 335, URL http://arxiv.org/abs/hep-
ex/?0412036
6. Electron Energy Spectra, Fluxes, and Day-Night Asymmetries of B-8 Solar Neutrinos
from the 391-Day Salt Phase SNO Data Set, 2005-02-01, SNO Collaboration (J. Maneira
et al.), Phys. Rev. C72, 055502 (2005),URL http://arxiv.org/pdf/nucl-ex/0502021
7. http://na50.web.cern.ch/NA50/papers.html
8. Search for the Phi(1860) pentaquark at COMPASS, 2005-03-15, COMPASS
Collaboration (P. Bordalo, C. Quintans, S. Ramos, M. Varanda et al.), Eur Phys J C41
(2005) 469
9. http://wwwcompass.cern.ch/compass/publications/papers/locked/pq_EPJC.pdf

10. Measurement of the spin structure of the deuteron in the DIS region, 2005-03-21,
COMPASS Collaboration (P. Bordalo, C. Quintans, S. Ramos, M. Varanda et al.), Phys
Lett B612 (2005) 154
11. http://wwwcompass.cern.ch/compass/publications/papers/locked/2005_plb612_154.pdf
12. Rare Gas Liquid Detectors, Book chapter, published, 2005-04-10, M. I. Lopes and
V. Chepel, in “Electronic Excitations in Liquified Rare Gases”, Edited by W. Schmidt
and E. Illenberger, American Scientific Publishers, 2005, pp.331-385
13. Determination of the b quark mass at the M_Z scale with the DELPHI detector at LEP,
2005-04-27, J. Abdallah et al., The DELPHI Collaboration, CERN PH-EP 2005-020
(accepted by EPJC)
14. http://delphiwww.cern.ch/pubxx/papers/public/paper0343.ps.gz
15. Charged Particle Multiplicity in Three-Jet Events and Two-Gluon Systems, 2005-05-14,
J. Abdallah et al., The DELPHI Collaboration, Eur. Phys. J. C44 (2005) 311-331
16. http://delphiwww.cern.ch/pubxx/papers/public/paper0344.ps.gz
17. Single Intermediate Vector Boson production in e+e- collisions at sqrt(s) = 183 - 209
GeV, 2005-05-20, J. Abdallah et al., The DELPHI Collaboration, Eur. Phys. J. C45
(2006) 273-289
18. http://delphiwww.cern.ch/pubxx/papers/public/paper0337.ps.gz
19. A new low-energy bremsstrahlung generator for GEANT4, 2005-06-01, L. Peralta, P.
Rodrigues, A. Trindade and M. G. Pia, Radiation Protection Dosimetry (2005), Vol. 116,
No. 1–4, pp. 59–64
20. A Model for Mars Radiation Environment Characterization, 2005-06-01, A. Keating, A.
Mohammadzadeh, P. Nieminen, D. Maia, S. Coutinho, H. Evans, M. Pimenta, J.-P.
Huot, and E. Daly, IEEE Trans. on Nuc. Sci., 2005.
21. Study of double-tagged gamgam events at LEPII, 2005-06-30, J. Abdallah et al., The
DELPHI Collaboration, CERN PH-EP 2005-029 (accepted by EPJC)
22. http://delphiwww.cern.ch/pubxx/papers/public/paper0348.ps.gz
23. A Search for Periodicities in the B-8 Solar Neutrino Flux Measured by the Sudbury
Neutrino Observatory, 2005-07-01, SNO Collaboration (J. Maneira et al.), Phys. Rev.
D72, 052010 (2005)
24. http://arxiv.org/pdf/hep-ex/0507079
25. Percolation Effects in Very High Energy Cosmic Rays, 2005-07-01, J. Dias de Deus,
M.C. Espirito Santo, M. Pimenta, C. Pajares, hep-ph/0507227
26. Optical Calibration Hardware for the Sudbury Neutrino Observatory, 2005-07-01, J.
Maneira, B.A. Moffat, R.J. Ford, F.A. Duncan, K. Graham, A.L. Hallin, C.A.W. Hearns,
P. Skensved, D.R. Grant, Nucl. Inst. & Meth. A 554 (2005) 255-265
http://arxiv.org/pdf/nucl-ex/0507026
27. HERA-B framework for online calibration and alignment, 2005-07-11, J.M. Hernandez
et al., Nucl.Instrum.Meth.A546:574-583,2005
28. http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6TJM-4G4MHBK-
1&_coverDate=07%2F11%2F2005&_alid=384948301&_rdoc=1&_fmt=&_orig=search
&_qd=1&_cdi=5314&_sort=d&view=c&_acct=C000057391&_version=1&_urlVersion
=0&_userid
29. Performance of a Chamber for Studying the Liquid Xenon Response to Gamma-Rays and
Nuclear Recoils, 2005-07-23, F. Neves, V. Chepel, V. Solovov, A. Pereira, M. I. Lopes,
J. Pinto da Cunha, P. Mendes, A. Lindote, C.P. Silva, R. Ferreira Marques and A. J.P.L.
Policarpo, IEEE Trans. on Nucl. Sci.
30. Charm, beauty and charmonium production at HERA-B, 2005-08-01, HERA-B
collaboration, European Physical Journal C 43 (1-4): 179-186 Aug 2005

31. Bose-Einstein Correlations in W+W- events at LEP2, 2005-09-13, J. Abdallah et al., The DELPHI Collaboration, Eur. Phys. J. C44 (2005) 161-174
32. http://delphiwww.cern.ch/pubxx/papers/public/paper0342.ps.gz
33. Flavour Independent Searches for Hadronically Decaying Neutral Higgs Bosons, 2005-
09-20, J. Abdallah et al., The DELPHI Collaboration, Eur. Phys. J.C44 (2005) 147-159
34. http://delphiwww.cern.ch/pubxx/papers/public/paper0330.ps.gz
35. Primary scintillation yield and alpha/beta ratio in liquid xenon, 2005-10-01, V. Chepel,
M.I. Lopes and V. Solovov, Radiat. Phys. and Chem. 74, 2005, 160-167
36. A PET imaging system dedicated to mammography, 2005-10-01, J. Varela, Accepted for
publication in Radiation Physics and Chemistry (in press)
37. Masterclass spreads the word for physics, 2005-10-01, Michael Kobel (EPOG), CERN
Courier vol.45, n.8 (2005)
38. http://www.cerncourier.com/main/article/45/8/18
39. Production of Ξc0 and Ξb in Z decays and lifetime measurement of Ξb, 2005-10-06, J. Abdallah et al., The DELPHI Collaboration, Eur. Phys. J. C44 (2005) 299-309
40. http://delphiwww.cern.ch/pubxx/papers/public/paper0338.ps.gz
41. Prototype Study of the cherenkov imager of the AMS experiment, 2005-10-09, P. Aguayo
et al., Nuclear Instruments and Methods in Physics Research
42. The production of rho, omega and phi vector-mesons by protons and sulphur ions with
incident momentum of 200 GeV/c per nucleon, 2005-10-11, NA38 Collaboration (MC
Abreu, P. Bordalo, G. Borges, R. Ferreira, J. Guimarães, C. Quintans, S. Ramos, H.
Santos, R. Shahoyan et al.), Eur. Phys. J C 44 (2005), 375
43. New approach to the calculation of the refractive index of liquid and solid xenon, 2005-
10-11, A. Hitachi, V. Chepel, M. I. Lopes, V. N. Solovov, Journal of Chemical Physics
123, 2005, 234508
44. A Determination of the Centre-of-Mass Energy at LEP2 using Radiative 2-fermion
Events, 2005-10-18, J. Abdallah et al., The DELPHI Collaboration, CERN PH-EP 2005-
050 (subm. to EPJC)
45. http://delphiwww.cern.ch/pubxx/papers/public/paper0350.ps.gz
46. Performance Simulation Studies of the Clear-PEM DAQ/Trigger System, 2005-11-01, P.
Rodrigues, P. Bento, F. Gonçalves, C. Leong, P. Lousã, J. Nobre, J. C. Silva, L. Silva, J.
Rego, P. Relvas, I. C. Teixeira, J. P. Teixeira, A. Trindade and J. Varela, Accepted for
publication in Transaction on Nuclear Science
47. Percolation and cosmic ray physics above 10^17 eV, 2005-11-01, P. Brogueira, J. Dias de Deus, M.C. Espirito Santo, M. Pimenta, hep-ph/0511279
48. Conceptual Design of the CMS Trigger Supervisor, 2005-11-01, Ildefons Magrans de
Abril, Claudia-Elisabeth Wulz, and João Varela, IEEE TRANSACTIONS OF
NUCLEAR SCIENCE, VOL. 1, NO. 11, NOVEMBER 2005
49. The sensitivity of cosmic ray air shower experiments for excited lepton and leptoquark
detection, 2005-11-01, M.C. Espirito Santo, A. Onofre, M. Paulos, M. Pimenta, J. C.
Romão, B. Tomé, hep-ph/0508100
50. The Clear-PEM Electronics System, 2005-11-01, E. Albuquerque, P. Bento, C. Leong, F.
Gonçalves, J. Nobre, J. Rego, P. Relvas, P. Lousã, P. Rodrigues, I. C. Teixeira, J. P.
Teixeira, L. Silva, M. Medeiros Silva, A. Trindade and J. Varela, Accepted for
publication in Transaction on Nuclear Science
51. Design and Evaluation of the Clear-PEM Scanner for Positron Emission Mammography, 2005-11-01, PET-Mammography Consortium (LIP authors: M. C.
Abreu, B. Carriço, M. Ferreira, P. R. Mendes, R. Moura, C. Ortigão, L. Peralta,

R.Pereira, R. Ribeiro, P. Rodrigues, J. C. Silva, P. Sousa, , A. Trindade and J. Varela),


Accepted for publication in Transaction on Nuclear Science
52. Novel Single Photon Detectors for UV Imaging, 2005-11-11, P. Fonte, T. Francke, N.
Pavlopoulos, V. Peskov, I. Rodionov, Nucl. Instrum. and Meth. in Phys. Res. A553
(2005) 30
53. http://dx.doi.org/10.1016/j.nima.2005.08.002
54. Determination of heavy quark non-perturbative parameters from spectral moments in
semileptonic B decays, 2005-11-16, J. Abdallah et al., The DELPHI Collaboration, Eur.
Phys. J. C45 (2006) 35-59
55. http://delphiwww.cern.ch/pubxx/papers/public/paper0340.ps.gz
56. Gluon polarization in the nucleon from quasi-real photoproduction of high-pT hadron
pairs, 2005-11-22, COMPASS Collaboration (P. Bordalo, C. Quintans, S. Ramos, M.
Varanda et al.), Phys Lett B 633 (2006) 25
57. http://wwwcompass.cern.ch/compass/publications/papers/locked/2006_plb633_025.pdf
58. Results of the first performance tests of the CMS electromagnetic calorimeter, 2005-
11-23, The CMS Electromagnetic Calorimeter Group, Eur Phys J C 44, s02, 1-10 (2006)
59. CHARGE AND CP SYMMETRY BREAKING IN TWO HIGGS DOUBLET MODELS,
2005-12-01, A. Barroso, P.M. Ferreira and R. Santos, Phys.Lett.B632 (2006) 684-687;
hep-ph/0507224
60. O Projecto CRESCERE (em 4 línguas) - Cientista por um dia, cientista para a Vida! [The CRESCERE Project (in 4 languages) - Scientist for a day, scientist for Life!], Book, published, 2005-12-01, P. Abreu, H. Bilokon, F. Fabbri, A. Maio, outreach booklet (16pp) distributed to secondary schools
61. Measurement of the J/Psi Production Cross Section in 920 GeV/c Fixed-Target Proton-
Nucleus Interactions, 2005-12-01, HERA-B Collaboration, hep-ex/0512029
62. http://www-library.desy.de/preparch/hep-ex/0512/0512029.ps.gz
63. Advances in detectors for single crystal neutron diffraction, 2005-12-01, J.C. Buffet, J.F.
Clergeau, R.G. Cooper, J. Darpentigny, A. De Laulany, C. Fermon, S. Fetal, F.
Fraganex, B. Guérard, R. Kampmann, A. Kastenmueller, G.J. Mc Intyre, G. Manzin, F.
Meilleur, F. Millier, N. Rhodes, L. Rosta, Nuclear Instruments and Methods in Physics
Research Section A: Accelerators, Spectrometers, Detectors and Associated
EquipmentVolume 554, Issues 1-3, 1 December 2005, Pages 392-405
64. Preliminary results on position reconstruction for ZEPLIN III, 2005-12-01, A. Lindote,
H. M. Araujo, J. Pinto da Cunha, V. Chepel, F. Neves, M. I. Lopes et al. Nucl. Instrum.
and Meth A.
65. CHARGE BREAKING BOUNDS IN THE ZEE MODEL, 2005-12-01, A. Barroso and
P.M. Ferreira, Phys.Rev.D72 (2005) 075010
66. The effect of temperature on the rate capability of glass timing RPCs, 2005-12-15, D.
González-Díaz, D. Belver, A. Blanco, R. Ferreira Marques, P. Fonte, J. A.Garzón, L.
Lopes, A. Mangiarotti, J. Marín, Nucl. Instrum. and Meth. in Phys. Res. A555 (2005) 72
67. http://dx.doi.org/10.1016/j.nima.2005.09.005
68. Preliminary results on position reconstruction for ZEPLIN III, 2005-12-15, A. Lindote,
H. M. Araujo, J. Pinto da Cunha, V. Chepel, M. I. Lopes et al., Nucl. Instrum. and Meth
A.
69. RPC-PET: a new very high resolution PET technology, 2005-12-31, A.Blanco,
N.Carolino, N. Chichorro, C.Correia, M. P. Macedo, L. Fazendeiro, R. Ferreira Marques,
P.Fonte, IEEE Trans. Nucl. Sci
70. EM Reconstruction Algorithm with Resolution Modelling Applied to an RPC-PET
Prototype, 2005-12-31, L. Fazendeiro, N. C. Ferreira, A. Blanco, P. Fonte, R. Ferreira
Marques, IEEE Trans. Med. Imag.

71. Very high position resolution gamma imaging with Resistive Plate Chambers, 2005-12-
31, A. Blanco, N. Carolino, C.M.B.A. Correia, L. Fazendeiro, Nuno C. Ferreira,
M.F.Ferreira Marques, R. Ferreira Marques, P. Fonte, C. Gil, M. P. Macedo, Nucl.
Instrum. and Meth. in Phys. Res. A
72. Accurate timing of gamma photons with high-rate Resistive Plate Chambers, 2005-12-
31, L.Lopes, A.Pereira, P.Fonte, R.Ferreira Marques, Nucl. Instrum. and Meth. in Phys.
Res. A
FRONTIERS ON EXTREME PLASMA PHYSICS

Luís Oliveira e Silva


GoLP/Center for Plasma Physics and Department of Physics, Instituto Superior Técnico,
Universidade Técnica de Lisboa, 1049-001 Lisbon, Portugal, luis.silva@ist.utl.pt

Abstract: The technological revolutions in ultra intense lasers and in computing power are driving the exploration of new physics regimes dominated by relativistic effects, nonlinear and complex phenomena, and extreme conditions. The evolution in laser intensity and in computing peak power is discussed, and some of the advances for the next decade that will leverage this evolution are outlined, along with opportunities for research in this field at IST.

Key words: Intense Lasers, Advanced Computing, Plasma Accelerators, Nuclear Fusion.

1. INTRODUCTION

Laser technology has gone through a major revolution with the introduction of the concept of Chirped Pulse Amplification (CPA) in 1985, by Gerard Mourou and co-workers, opening the way to laser systems of previously unimaginable intensity. What were considered "giant laser pulses" in the 60's can nowadays be achieved with tabletop systems. The most sophisticated systems can reach powers in the PetaWatt (PW) range, and intensities in excess of 10^22 W/cm^2.
The interaction of these light pulses with matter is dominated by nonlinear phenomena, associated with the relativistic dynamics of electrons, extreme conditions in density (10^3 times the solid density) and pressure (~ Gbar).
Research in this field of extreme scenarios, highly nonlinear and complex, has been called "Relativistic Engineering", "The X-Games of Contemporary Science" or "High Energy Density Science" [1]. The dramatic advances and the tremendous prospects rely on the combination of ultra intense lasers, sophisticated diagnostic techniques and large-scale numerical


simulations. After reviewing these technological advances, I will discuss their expected impact on novel particle accelerators, either using lasers or intense particle beams, and on nuclear fusion using lasers. These advances are borne out by the recent demonstration of monoenergetic electron beams in the 100s of MeV range [2, 3], with accelerating distances three orders of magnitude shorter than in conventional accelerators, as well as by the production of proton beams with energies close to 200 MeV [4], and their potential impact in cancer therapy [5]. Furthermore, the HiPER project [6] can lead to the installation in Europe of the most powerful laser system in the World devoted to the study of the fast ignition concept for nuclear fusion, as well as to fundamental studies in laboratory astrophysics and QED vacuum properties. Since Instituto Superior Técnico (IST) is closely associated with this project, it is very likely that our contribution to this field will be important, with an excellent opportunity for IST to play a central role in this area of research.

2. ULTRA INTENSE LASERS AND HIGH PERFORMANCE COMPUTING

The strong interest in the field of High Energy Density Science stems from the revolution in ultra intense lasers and in available computing power, along with the development of sophisticated diagnostics which provide a picture of unprecedented detail of many experimental and astrophysical conditions. The new developments in laser and computer technology have provided not only a quantitative boost in laser and computer specifications but have also opened new avenues of research, by providing the tools to explore novel physical scenarios, which can only be attained due to the qualitative evolution in laser intensity and computing peak performance: not only are the top technological specifications significantly different, but what can be done with the technology at these specifications is qualitatively new.

2.1 New physics at the focus of a new generation of lasers

Since its invention in 1960, the laser has evolved into many different configurations, which explore different properties of the "20th Century light". This progress can be easily depicted in the evolution of the laser power as a function of time (cf. Figure 1), where the introduction of different technologies (e.g. Q-switching, mode-locking, CPA) has successively pushed the laser power to higher values. Nowadays, several laser systems in

the World have powers in the 10s to 100s TeraWatt (TW), and plans to build
multi-PW systems are underway in Europe, in the United States, and in Asia
(Japan, China, South Korea).

Figure 1. Evolution of the laser power (Courtesy G. Figueira, IST)

Until the 1980s, the increase of laser power was sustained by the development of very large systems, with the dimensions of a basketball court, where high powers were achieved by increasing the energy of long laser pulses (on the nanosecond time scale) with very large glass amplifiers. The invention of CPA changed this scenario.
In CPA, a very short (~ fs) and very low energy (~ nJ) laser pulse is first stretched in time to a few 100s of ps, keeping a well defined frequency coding, or chirp, and is then amplified by conventional techniques (regenerative amplifiers, glass amplifiers) up to the Joule level. After the amplification process, the laser pulse is compressed to a duration very close to the original pulse duration, taking advantage of the fact that there is a precise coding of the frequency content of the laser pulse, thus facilitating the recombination of the different frequency components into the original configuration, with a very short duration pulse. After compression, the laser pulse can have a power as high as a PW, and it can be focused to spot sizes of a few micrometers, thus leading to intensities in excess of 10^22 W/cm^2. CPA allows ultra intense multi-TW laser systems to be installed on just a few optical tables, such as the system installed at IST (Figure 2), which

delivers 10 TW in 150 fs laser pulses. In the focus of this laser, we reach the
highest intensities south of Bordeaux, i.e. the brightest spot in Portugal!
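The quoted intensities follow from an elementary focusing estimate, I ≈ P/(π w^2) for power P focused onto a spot of radius w; the spot sizes used below are assumptions for illustration only, not measured values for any particular system.

```python
import math

def focused_intensity(power_W, spot_radius_um):
    """Peak intensity in W/cm^2 for power focused onto a spot of the given radius (flat-top estimate)."""
    radius_cm = spot_radius_um * 1e-4
    return power_W / (math.pi * radius_cm**2)

# Illustrative numbers (spot sizes are assumptions):
print(f"10 TW in a 5 um spot  : {focused_intensity(10e12, 5.0):.1e} W/cm^2")
print(f"1 PW in a 1.5 um spot : {focused_intensity(1e15, 1.5):.1e} W/cm^2")
```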
The dramatic increase of laser power has also been accompanied by the
possibility of exploring new physics, and new applications. While in the
early days of the laser, applications were essentially associated with
metrology, the increase in laser power opened new regimes in the interaction
of radiation with matter (e.g. nonlinear optics), and applications in new
radiation sources, and new developments in atomic physics have been
obtained since the 1960s. As the laser power increases, the interaction of
radiation with matter becomes increasingly complex and nonlinear. For
powers in the sub-TW range, the laser field is strong enough to fully ionize a
gas target or a cluster target, thus forming a plasma, or to heat electrons in
thin solid targets to the keV range. The oscillation of the electrons in the
electromagnetic field of these lasers is relativistic, and many novel nonlinear
physical phenomena associated with the interaction of light at these
intensities with plasmas show up, leading to the nonlinear optics of free
electrons or, relativistic nonlinear optics. It is the richness of the new physics
at the focus of these laser systems that the field of High Energy Density
Science aims to harness and to explore.

Figure 2. The Laboratory for Intense Lasers at IST (Courtesy G. Figueira, IST)

2.2 A qualitative revolution in computing power

The exploration of these extreme scenarios requires sophisticated diagnostics and, most of all, detailed numerical simulations. Theoretical

models have limited applicability in these nonlinear and complex conditions, and only numerical models, functioning as numerical experiments, can guide theoretical developments, provide a testbed for new ideas, and give further detailed information about the experiments.
There is a long tradition of plasma simulations, but only now, with the available computing power, is it possible to perform numerical simulations that mimic the exact experimental parameters. Unlike in the experiments, in the
simulations the scientists have access to an almost unlimited number of
diagnostics, with all the information about the fields and the particles. The
possibility to perform such detailed simulations relies on the evolution of
computer technology expressed by Moore’s law, depicted in Figure 3 in
terms of the available computing power.
More important than the computing power in high-end very expensive
machines (shown in blue), the most dramatic and revolutionary change
comes from the inexpensive low-end machines, whose power is already
comparable to some of the supercomputers in the early 1980s. Furthermore,
the widespread availability of computer technology, including developments in open-source software and in the interconnect/network infrastructure, has triggered the development of cluster computing for scientific applications, based on the Beowulf concept, bringing incredible amounts of computing power even to small research groups in the universities. In the field of High Energy Density Science, the availability of such computing power has led to the development of e-Science infrastructures, capable of running one-to-one simulations of physical scenarios involving ultra intense lasers in plasmas, and of performing data exploration and scientific visualization of huge data sets (TeraByte range).
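The TeraByte figure can be understood with simple bookkeeping for a three-dimensional particle-in-cell run; the grid and particle counts below are illustrative assumptions, not the parameters of any specific simulation.

```python
# Order-of-magnitude bookkeeping for a 3D particle-in-cell run (illustrative numbers).
cells = 2000 * 500 * 500            # grid cells in a one-to-one laser-plasma simulation
particles_per_cell = 8
doubles_per_particle = 7            # x, y, z, px, py, pz, charge/weight
bytes_per_double = 8

particles = cells * particles_per_cell
particle_bytes = particles * doubles_per_particle * bytes_per_double
field_bytes = cells * 6 * bytes_per_double     # E and B field components on the grid

total_TB = (particle_bytes + field_bytes) / 1e12
print(f"Particles in the run:        {particles:.1e}")
print(f"Memory for one snapshot:     ~{total_TB:.2f} TB")
# Writing many such snapshots for data exploration and visualization quickly
# produces TeraByte-scale data sets.
```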

Figure 3. Evolution of the computing power in the high-end machines (blue) and in the low-end machines (brown)

3. CHALLENGES FOR THE NEXT DECADE

All the implications of the revolution in laser power and in computing power for High Energy Density Science are only now being identified [1], but it is already possible to point out sub-domains where the impact on Science and Technology during the next decade will be tremendous. The focused intensities reached in these laser systems can recreate, in the laboratory, many astrophysical scenarios, thus allowing for the detailed study of the conditions in the centre of the giant planets, in supernova remnants, or in gamma ray bursters. The combination of ultra intense lasers (in the infra-red domain) with ultra intense free electron lasers (such as the XFEL at DESY, Hamburg, or the LCLS at Stanford) also opens the way to the study of warm dense matter, and the possibility to develop novel x-ray sources for biology and medicine. However, the most important challenges for the next decade, which can provide paths to radically new science and technology, are associated with particle accelerators and with nuclear fusion with lasers.

3.1 Plasma based accelerators

Particle accelerators are at the core of some of the fundamental discoveries of the 20th Century, having contributed to test our understanding
of the building blocks of the Universe. Moreover, particle accelerators are
critical in many applications in biology, medicine, or materials research,
either when the accelerated particles are used directly to interact with matter,
or when the accelerated particles are used to produce radiation (e.g. bright x-
rays). Particle accelerators are the microscopes of modern science.
Unfortunately, the conventional accelerator technology is reaching
saturation, because the maximum accelerating field sustained by present-day
accelerators (~ 10s MV/cm) is very close to the fields for break-up of the
materials in the vessel chamber/structure of the accelerators, and the cost of
state-of-the-art accelerators can only be supported by very large consortia of
many countries.
The most promising way to overcome the technological limits in
conventional accelerators is to use plasmas to sustain ultra high gradient
fields. In a plasma, the electrons are already dissociated from the ions. Therefore, plasmas can sustain very high electric fields, scaling as E_max ~ (n0 [cm^-3])^(1/2) V/cm, where n0 is the electron plasma density. For a plasma with n0 = 10^19 cm^-3, produced in a typical low pressure gas chamber, the maximum accelerating electric field can be ~ 3 GV/cm (an energy gain of ~3 GeV per centimetre), two orders of magnitude higher than in conventional accelerators.
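The scaling quoted above is the standard cold wave-breaking estimate, E_max[V/cm] ≈ 0.96 (n0[cm^-3])^(1/2); a quick evaluation reproduces the figures in the text.

```python
import math

def wave_breaking_field(n0_cm3):
    """Cold wave-breaking field in V/cm: E ~ 0.96 * sqrt(n0[cm^-3]) (standard plasma formula)."""
    return 0.96 * math.sqrt(n0_cm3)

for n0 in (1e18, 1e19, 1e20):
    E = wave_breaking_field(n0)
    print(f"n0 = {n0:.0e} cm^-3 : E_max ~ {E/1e9:.1f} GV/cm "
          f"(~{E/1e9:.1f} GeV energy gain per cm for an electron)")
```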

Figure 4. Propagation of an intense electron beam (shown in orange) in a gas target, ionizing
the gas target and generating a relativistic periodic structure (or wake field) sustaining ultra
high accelerating gradients.

The possibility of exciting such fields in plasmas using ultra intense lasers [7] or particle beams [8] was first proposed by Dawson and co-workers in the early 1980s, but only nowadays is it possible to efficiently drive relativistic plasma waves with lasers or particle beams.
A major milestone for laser-plasma accelerators was achieved in 2004, when three experimental groups confirmed previous simulation results [2] by demonstrating the acceleration of monoenergetic bunches of electrons with 100s of MeV [3]. Similar energies had previously been obtained by the same groups, but not with accelerator-type energy spectra, i.e. electron beams with low emittance and very low energy spread. Further theoretical work [9] has shown that monoenergetic GeV electron beams are within reach of present-day laser systems. Several groups around the World are pursuing this goal. Some of these groups are also exploring configurations where the intense laser is guided in a preformed plasma channel. This optical fiber for ultra high intensities will allow for even higher energies, which can be as high as 10 GeV for a 2 PW laser in a 10 cm channel [9]. One of the key

advantages of this technique is that it will allow for the production of high
quality high energy electron beams, with fs duration, high charge (~ nC), at a
high repetition rate (~ 10 Hz), with “cheap” compact systems. This will lead
to the widespread use of GeV electron beams and their associated light
sources, in a myriad of scenarios, and to the development of novel
applications based on this technology. Furthermore, the impact on
conventional accelerators can also be significant since laser-plasma
accelerators will be able to provide the injector for these accelerators, at
much higher energies and with significantly lower costs.
The same intensities available with lasers are already available in the intense electron beams of conventional accelerators, such as the 30 GeV Stanford Linear Accelerator Center (SLAC) beam. In this case, the relativistic electron beam ionizes the gas target and drives a wake field in the blow-out regime, by pushing the plasma electrons out of its propagation path. The resulting ion channel pulls the electrons back to the centre, thus forming the oscillations shown in Figure 4. The resulting plasma oscillation moves very close to the speed of light (with the velocity of the electron beam), and provides an excellent accelerating structure either for the trailing part of the electron beam, or for a co-propagating beam. This is exactly the same physical picture that explains the underlying physics of laser-plasma accelerators [10].
In recent experiments at SLAC, researchers have been able to accelerate electrons from 30 GeV to 45 GeV in just a few 10s of centimeters [11], a much shorter distance than the 3 kilometers required to accelerate the beam from 0 to 30 GeV in the main beamline. The challenge for the next decade is to implement the afterburner concept, thus doubling the energy of a 50 GeV electron beam to 100 GeV, making it possible to reach the energy frontier and to probe "Higgs physics".
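The numbers quoted in this paragraph already imply the gradient gain; the sketch below, with an assumed plasma length of a few tens of centimetres, makes the comparison with the conventional linac explicit.

```python
# Implied gradient in the SLAC energy-doubling experiments (lengths are approximate assumptions).
energy_gain_GeV = 45.0 - 30.0       # GeV gained by trailing electrons in the plasma
plasma_length_m = 0.3               # "a few 10s of centimeters" of plasma (assumed ~30 cm)
linac_length_km = 3.0               # conventional linac length quoted for 0 -> 30 GeV

plasma_gradient = energy_gain_GeV / plasma_length_m          # GeV/m
linac_gradient = 30.0 / (linac_length_km * 1e3)              # GeV/m

print(f"Plasma wakefield gradient: ~{plasma_gradient:.0f} GeV/m")
print(f"Conventional linac:        ~{linac_gradient*1e3:.0f} MeV/m")
print(f"Ratio:                     ~{plasma_gradient/linac_gradient:.0f}x")
```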

3.2 Fast ignition of fusion targets and HiPER

Historically, intense lasers have been associated with research in inertial confinement fusion (ICF), or nuclear fusion with lasers. The standard configurations for ICF require hundreds of intense laser beams focused on a target/pellet of just a few millimeters. The challenges for ICF are as demanding as for magnetic confinement fusion (based on tokamaks or stellarators), because plasmas are "beasts" subject to instabilities. Taming these instabilities is a tremendous challenge, and facilities like the National Ignition Facility (Livermore, California) or the Laser MegaJoule (Bordeaux, France) aim to achieve ignition using the standard ICF configurations.
The fast ignition concept [12], first proposed by Tabak and co-workers,
provides an alter-native route to ICF that relies on the use of ultra intense
lasers. In the fast ignition of fusion targets, the ignition of the fuel is
Frontiers on Extreme Plasma Physics 159

achieved by sending a relativistic electron beam to the core of the target,


without requiring further compression of the fuel. This electron beam, carry-
ing a current of several 100s MA, is generated by a multi-PW laser,
interacting with the coronal plasma, and it acts like the spark in a gas engine.
Such scenario requires laser systems different from the ones presently
available, strongly relying in CPA technology, its most re-cent developments
(OPCPA), and new amplifier materials for PW class lasers. All these tech-
nologies are to be available in the HiPER laser [5].
This European project for the next decade, now at the financing stage,
will be the focus of research in fast ignition of fusion targets, since facilities
in Japan and in the United States have not been designed to achieve nuclear
fusion conditions along this path. When built, it will achieve ignition, i.e. the
energy released will be higher than the energy injected in the system. It will
also provide a research facility to explore extreme scenarios only achieved in
the most violent astrophysical conditions, and to develop novel plasma
accelerators for electrons and for protons [4, 5].

4. THE FUTURE AT IST

The team involved at IST has been collaborating with the leading groups
in the world, in theory, simulations and experiments, and is actively involved
in, and sometimes even leading, European networks of excellence. The expertise at
IST thus covers theory, simulation, and experiments, as shown in Figure 5, in
a combination not easily found in other university laboratories around the
world.
It is a team of young researchers, strongly networked, which has been
able to attract bright students not only from Portugal but also from other
countries in Europe (e.g. UK, Italy, Switzerland). Furthermore, the
Laboratory for Intense Lasers is going to be relocated to a new custom-built
laboratory, and a new computing cluster will be available at IST very soon.
The people, the will, and the infrastructures to conduct research in this
exciting field at the highest level are present; the future at IST can thus be as
bright as the laser in the Laboratory for Intense Lasers!

Figure 5. The full combination of methodologies is available at IST for research in the field
of High Energy Density Science: a strong tradition in Theoretical Plasma Physics, a software
and hardware infrastructure for High Performance Computing, and a Laboratory for Intense
Lasers, hosting a multi-terawatt laser system.

ACKNOWLEDGEMENTS

I would like to acknowledge very useful discussions with all the
members of the Group for Lasers and Plasmas (GoLP), and with Professors
W. B. Mori, R. Bingham, and T. Katsouleas.

REFERENCES
1. R. Davidson et al, Frontiers in High Energy Density Science: the X-Games of Contemporary
Science. Nat’l Academies Press, 2002.
2. F. Tsung et al, Near-GeV energy laser wakefield acceleration of self-injected electrons in
a centimeter-scale plasma channel. Physical Review Letters 93, 185002, 2004.
3. S. Mangles et al, Monoenergetic beams of relativistic electrons from intense laser-
plasma interactions. Nature 431, 535-538, 2004; C. Geddes et al, High-quality electron
beams from a laser wakefield accelerator using plasma-channel guiding. Nature 431,
538-541, 2004; J. Faure et al, A laser-plasma accelerator producing monoenergetic
electron beams. Nature 431, 541-544, 2004.
4. L. O. Silva et al, Proton shock acceleration in laser-plasma interactions. Physical
Review Letters 92, 015002, 2004.

5. J. Fuchs et al, Laser-driven proton scaling laws and new paths towards energy increase.
Nature Physics 2, 48-54, 2006.
6. M. Schirber, For nuclear fusion, could two lasers be better than one? Science 310, 1610-
1611, 2005; M. Dunne, A high-power laser fusion facility for Europe. Nature Physics 2,
2-5, 2006.
7. T. Tajima and J. M. Dawson, Laser electron accelerator. Physical Review Letters 43,
267, 1979.
8. P. Chen et al, Acceleration of electrons by the interaction of a bunched electron beam
with a plasma. Physical Review Letters 54, 693, 1985.
9. W. Lu et al, Generating multi-GeV electron bunches using laser wakefield acceleration
in the blowout regime. Submitted to Nature Physics, 2006.
10. S. F. Martins et al, Three-dimensional wakes driven by intense relativistic beams in gas
targets. IEEE Transactions on Plasma Science 33, 558, 2005.
11. T. Katsouleas, private communication.
12. M. Tabak et al, Ignition and high gain with ultra powerful lasers. Physics of Plasmas 1,
1626, 1994.
NUCLEAR FUSION: AN ENERGY FOR THE
FUTURE

Carlos Varandas and Fernando Serra


Centro de Fusão Nuclear – Associação EURATOM/IST, Instituto Superior Técnico,
Universidade Técnica de Lisboa, Avenida Rovisco Pais, 1049-001 Lisboa, Portugal,
cvarandas@cfn.ist.utl.pt; fserra@cfn.ist.utl.pt

Abstract: Nuclear Fusion is a potential energy source that is clean, environmentally friendly,
safe, practically inexhaustible and economically attractive. The research carried
out since 1970 in this area, in the more developed countries, culminates now
in the construction of the first nuclear fusion experimental reactor, ITER.
This tokamak shall prove the scientific and technical feasibility of fusion
energy, and simultaneously test all the technologies required to operate a
Nuclear Fusion reactor. Afterwards, it will be necessary to transform fusion
energy into electricity and to ensure the continuous operation of an electrical
power plant based on this technology.

Key words: Nuclear Fusion, Energy, Tokamak.

1. INTRODUCTION

The present energy paradigm of our society, based on the massive use
of fossil fuels, has to be changed rapidly, due both to direct problems (increase
of oil prices, shortage of reserves and political instability in the main oil-producing
countries) and, above all, to the serious implications for the climate
and environment of large emissions of greenhouse gases into the
atmosphere. On the other hand, the world population keeps growing and
improving its standard of living, facts that lead to a significant increase of
energy consumption [1].


The new energy policy has to be based on better efficiency of
production and consumption, on increased use of renewable energies
and on support for the development of new energy sources [1].
In the present status of scientific knowledge, Nuclear Fusion has the
potential to become a large-scale energy source that is clean, safe, environmentally
friendly, practically inexhaustible and economically attractive. Clean,
because there are no emissions of greenhouse gases into the
atmosphere. Safe, because the reactions can be stopped almost
instantaneously if a failure occurs in the reactor, as the fuels (deuterium and
tritium, two hydrogen isotopes) are injected into the reactor while they are
being burned. Environmentally friendly, as there is no transport of radioactive
elements outside the power plant and because the nuclear fusion reactions do
not produce radioactive waste. Practically inexhaustible, because deuterium
and tritium can be obtained, respectively, from water and lithium, two
elements that are abundant and widely distributed in the Earth’s crust.
Economically attractive, as the kilowatt-hour generated by fusion reactions
is cheaper than that generated from fossil fuels.

2. NUCLEAR FUSION

Nuclear Fusion consists in the coalescence of two light nuclei (isotopes
of hydrogen), leading to the formation of more stable nuclei
[2]. The energy released results from the difference in the binding energy of
the nuclei and corresponds to the mass reduction of the reactants, according
to the famous Einstein equation E = mc². This amount of energy is about 10
million times larger than the energy released in the chemical reaction of
burning fossil fuels.
Nuclear Fusion reactions take place in the Sun and other stars. Man is
trying to reproduce them, in a controlled way, in laboratories to produce energy
that might be used in the generation of electricity. The easiest reaction
to achieve on Earth uses deuterium (D) and tritium (T):

D + T → 4He + n + 17.6 MeV (1)
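
The 17.6 MeV in reaction (1) follows directly from the mass defect of the reaction via E = mc². The short sketch below (not part of the original text; it uses standard tabulated atomic masses) reproduces that value and the order-of-magnitude comparison with chemical reactions made above.

    # Sketch: energy released in D + T -> 4He + n from the mass defect, E = mc^2.
    u_MeV = 931.494                      # energy equivalent of 1 atomic mass unit [MeV]

    m_D, m_T   = 2.014102, 3.016049      # deuterium, tritium masses [u]
    m_He4, m_n = 4.002602, 1.008665      # helium-4, neutron masses [u]

    delta_m = (m_D + m_T) - (m_He4 + m_n)    # mass defect [u]
    E_MeV = delta_m * u_MeV                  # ~17.6 MeV, as in Eq. (1)
    print(f"energy released: {E_MeV:.1f} MeV")

    # A typical chemical reaction releases only a few eV per molecule, so the
    # fusion yield is indeed millions of times larger, as stated in the text.
    print(f"ratio to a ~3 eV chemical reaction: ~{E_MeV * 1e6 / 3:.1e}")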

However, the nuclei must reach temperatures (Ti) corresponding to 10
to 20 keV (about 100 million degrees) in order to overcome the repulsive
forces between the two nuclei so that fusion may occur. At these
temperatures D and T are ionized, forming a plasma, the fourth state of
matter.
In order that a fusion reactor might be economically attractive, it is
necessary that the plasma density (n) and the energy confinement time (τ)
are high enough that the energy released by the fusion reactions overcomes
the thermal energy lost from the plasma plus the energy spent to operate the
reactor. The Lawson criterion establishes a minimum value for the triple
product n Ti τ:

n Ti τ > 3 × 10^21 m^-3 keV s (2)
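
As a purely illustrative check of criterion (2) (the plasma parameters below are assumed, ITER-like values, not figures taken from this chapter), one can evaluate the triple product and the temperature conversion quoted above:

    # Illustrative evaluation of the Lawson-type criterion (2) with assumed parameters.
    n   = 1e20      # plasma density [m^-3]           (assumed, ITER-like)
    T_i = 10.0      # ion temperature [keV]           (~100 million degrees)
    tau = 3.0       # energy confinement time [s]     (assumed, ITER-like)

    triple_product = n * T_i * tau        # [m^-3 keV s]
    threshold = 3e21                      # approximate D-T requirement [m^-3 keV s]
    print(f"n*Ti*tau = {triple_product:.1e} m^-3 keV s "
          f"({'meets' if triple_product >= threshold else 'below'} the threshold)")

    # The temperature quoted in the text: 10 keV divided by Boltzmann's constant
    # (k_B ~ 8.62e-8 keV/K) is indeed about 1.2e8 K, i.e. ~100 million degrees.
    k_B = 8.617e-8                        # [keV/K]
    print(f"10 keV ~ {10.0 / k_B:.1e} K")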

There are three main methods of confinement which might force the
deuterium and tritium nuclei to fuse together: gravitational, magnetic and
inertial confinement (Figure 1). These use, respectively, the
gravitational forces associated with the huge masses of the stars; a magnetic
field that constrains the motion of the plasma charged particles to orbits around
its field lines; and powerful lasers or particle beams which compress
fuel pellets to densities up to 1000 times those of solids.

Figure 1. Types of confinement: gravitational (left), magnetic (centre) and inertial (right)

3. SCIENTIFIC RESEARCH AND TECHNOLOGICAL DEVELOPMENT IN NUCLEAR FUSION

Significant progress has been achieved since the middle of the last century towards
the main goal of the research and development activities in nuclear fusion:
the construction of a commercial reactor to generate a significant amount of
electrical energy (Figure 2). This progress was especially remarkable in
the magnetic confinement experiments and, in particular, with the tokamak
configuration (Figure 3). The first D-T fusion reactions were obtained in the
tokamaks TFTR (1994) and JET (Figure 3), where, in 1997, 16 MW of
fusion energy were obtained during 2 seconds, with Q ≈ 0.6 (Q corresponds
to the ratio between the fusion energy and the external energy delivered to
heat the plasma) [3].

4. ITER AND THE PORTUGUESE PARTICIPATION

ITER (“International Thermonuclear Experimental Reactor”) (Figure 4)
will be built in Cadarache (France), within the scope of a wide international
cooperation involving EURATOM (European Union and Switzerland),
Japan, Russia, South Korea, China and India. This Mega-Project is foreseen
to last 35 years and the investment costs in the construction phase will
amount to about 4000 million Euros. The main objectives of ITER are: (i) to
demonstrate the scientific and technical viability of Nuclear Fusion energy,
by producing 500 MW during 300 s, with Q = 10 to 20; and (ii) to test the
simultaneous and integrated operation of all the technologies required for a
Nuclear Fusion reactor.
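
Since Q is simply the ratio of fusion power to external heating power, the figures quoted for JET and ITER imply, respectively, roughly 27 MW and 50 MW of external heating; the short sketch below (a simple rearrangement of the numbers in the text, not an official ITER calculation) makes this explicit.

    # Q = P_fusion / P_external_heating, using the JET and ITER figures quoted above.
    def external_heating(p_fusion_MW, Q):
        """Heating power implied by a given fusion power and gain factor Q."""
        return p_fusion_MW / Q

    print(f"JET (1997):  16 MW, Q ~ 0.6  ->  ~{external_heating(16, 0.6):.0f} MW of heating")
    print(f"ITER target: 500 MW, Q = 10  ->  ~{external_heating(500, 10):.0f} MW of heating")
    print(f"ITER target: 500 MW, Q = 20  ->  ~{external_heating(500, 20):.0f} MW of heating")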

Figure 2. Evolution of the tokamak dimensions (left) and of the Lawson parameter (right).

Figure 3. Schematic view of a tokamak, where the current is induced in the plasma by a
primary transformer winding (left), and a picture from JET (right)

These technologies cover the areas of ultra-high vacuum, cryogenic
systems, superconducting magnets, materials, tritium handling, radio-frequency
systems and particle beams for heating and non-inductive current
drive, remote handling, real-time control and data acquisition and processing
systems, a wide variety of plasma diagnostics, and systems for the remote use
of diagnostics and of experimental data.
Portugal has participated in the EURATOM Fusion Programme since 1990,
through a Contract of Association with Instituto Superior Técnico, with a
team of 85 people. Several projects are being carried out, covering the
operation and scientific exploitation of the tokamak ISTTOK (CFN, IST),
participation in the use of the JET facilities (Culham, United Kingdom) by the
EFDA (“European Fusion Development Agreement”) Associates,
collaboration in the ITER Project, participation in the programmes of
ASDEX-Upgrade (Garching, Germany), TJ-II (Madrid, Spain), TCV
(Lausanne, Switzerland) and MAST (UKAEA, United Kingdom), studies on
theory and modeling, and R&D activities on control and data acquisition. The
expertise acquired until now makes it possible to foresee a significant participation
in ITER, with special emphasis in the areas of diagnostics integration,
reflectometry, data acquisition, and real-time plasma control.

Figure 4. Schematic view of ITER



Figure 5. Fast track approach to a fusion power plant



5. NUCLEAR FUSION ELECTRICAL POWER PLANT

On the way to the construction of a fusion power plant it is necessary to
solve some other problems not included in the specific purposes of ITER.
Among them we underline: (i) the operation in the ignition regime,
characterized by Q = ∞ (Figure 3), that is, the situation in which the plasma is
sustained only by the energy produced by the fusion reactions; (ii) the
efficient behaviour of the lithium breeding blankets, used for tritium
generation inside the reactor; (iii) the conversion of fusion energy into
electricity; and (iv) the continuous operation of the reactor. The first two
problems imply R&D in physics and technology related to nuclear fusion.
The third one should not be a major obstacle, as we may apply the same
technology used today in thermal power plants. The last problem implies the
development of new materials, able to withstand, in continuous operation, the
high fluences of energetic neutrons of a fusion reactor. Following a study
carried out in the European Union, a time schedule was approved [5] (Figure
5) which foresees a period of about 40 to 50 years until a fusion power plant
might be built (Figure 6).

Figure 6. Schematic view of a Nuclear Fusion Electrical Power Plant

ACKNOWLEDGEMENTS

This work has been carried out within the framework of the Contract of
Association between EURATOM and IST and the contract of Associated
Laboratory between IST and FCT. The content of this article is the sole
responsibility of the authors and does not express any views or commitments
of EURATOM or FCT.

REFERENCES
1. Conference “As Energias do Presente e do Futuro”, in www.cfn.ist.utl.pt
2. Garry McCracken and Peter Stott, “Fusion: The Energy of the Universe”, Elsevier Academic
Press, 2005.
3. www.jet.efda.org
4. www.iter.org
5. Chris Llewellyn Smith, “20th IAEA Fusion Energy Conference”, www.iaea.org
PART III

SOCIAL SCIENCES, ECONOMICS AND MANAGEMENT SCIENCES
REGULATION POLICIES IN PORTUGAL

João Bilhim1, Luis Landerset Cardoso2 and Eduardo Lopes Rodrigues3


Centro de Administração e Políticas Públicas, Instituto Superior de Ciências Sociais e
Políticas, Universidade Técnica de Lisboa, Pólo Universitário do Alto da Ajuda, Rua
Professor Doutor Almerindo Lessa, 1349-055, Lisboa, Portugal

1 j.bilhim@iscsp.utl.pt, 2 landerset@iscsp.utl.pt, 3 lopes.rodrigues@iscsp.utl.pt

Abstract: This paper analyses different public policies in the regulation area that may be
applied to competition as a reality that criss-crosses the whole of the Portuguese
economy.
It also approaches, from the same perspective but with a different focus, the
issue of concentration in the media sector, as an economic and sociological
reality of growing, unambiguous importance in our present world.
It is concluded that the State should proceed with effective regulation policies
that may handle market failures without, however, generating State failures.

1. INTRODUCTION

The Constitution of the Portuguese Republic (hereafter CPR), in its
article no. 81, paragraph f, assigns the State, under the title “Priority Charges”,
the duty to “ensure the efficient working of markets, so as to guarantee
balanced competition among companies, oppose any form of monopolistic
organization and repress abuse of dominant position and other practices that
may endanger general interest.”
Meanwhile, the 6th Constitutional revision, of 2004, foresees in CPR
article no. 39 that, within the scope of regulation of the media, the
independent administrative entity created for that effect must ensure the right
to information and press freedom, non-concentration of ownership of the
media, independence before political and economic powers, respect for
personal freedom and guarantees, respect for the norms regulating media
activities, as well as the expression and debate of lines of opinion.


Clear as it is that it is up to Public Administration to create and
fulfil the constitutional imperatives demanded of the State, this
norm, by itself, is clearly revealing of the extensive and complex domains of
scientific research that lie ahead of Public Administration (Bilhim, 2000).
In fact, efficient market working, balanced competition, and
monopolistic forms of organization, including abuse of dominant position and
other practices that may harm the general interest, provide an almost limitless
source of raw material, filled with questions to be answered by the
scientific research projects that abound in the main university centres for
public policies in the developed world.
All these projects have revealed two main inspiring theoretical beacons:
on the one hand, the developments ensuing from the first anti-trust federal
legislation, i.e., the Sherman Act (1890), and on the other hand, the
paradigm of undistorted competition, which acquired universal relevance
within a framework of international relations and public administration
existing in several States, and which has operated since the Treaty of Rome
(1957), displaying various positive aspects, of which European
Construction is only the most obvious.
As for the media sector, pluralism in the mass media is an essential
prerequisite for democracy and cultural diversity, as plural mass media
constitute the basis for any public policy in the sector.

2. STATE OF THE ART

Scientific research on the public policies referred to in the introduction places us
at the centre of the whole multidisciplinary problematic associated with a
paradigm shift from a State that renders services to one that regulates, and has led
to a proliferation of concepts, methodologies and conceptual models that
always beget critical controversy.
Etymologically, the concept of regulation revolves around two main
ideas: the first linked to the establishment and implementation of rules or
norms, and the second to the re-establishment of the balanced working of the
system. This systemic approach will be constant throughout this paper.
This problematic displays great complexity every time one tries to
integrate, in a coherent vision, different action levels, from the local, to the regional,
to the state (in the Westphalian sense), to that of the European Communities
and the European Union, moving on to International Organizations of
intergovernmental genesis and nature, the most representative cases being
the World Trade Organization (WTO) and the Organization for
Economic Cooperation and Development (OECD).

In contrast, and with great theoretical expression, there is a strong
tendency, displayed by several American authors affiliated with various
universities (Yale, Harvard, Chicago and Michigan, among others), to
subscribe to the imperative that the State should define regulation policies
within a diversified taxonomy, even if all of them share the common
denominator of efficiency.
However, from a pragmatic perspective of problem resolution and
insufficiency of resources, it is more and more fundamental to correctly
understand the interaction between the dynamics of competition processes
and the global welfare of the Portuguese economy and society, as well as its
insertion in long cycles of sustainable development.
Among the most promising approaches, in terms of scientific research, to
characterize this competitive order, one should mention the
methodologies that aim to typify the several competition situations depending
on the possible behaviour of holders of market power, i.e., their ability to
increase prices and/or reduce supply, for a significant time and in a profitable
way.
Among pioneering works one must mention Frank Knight (1921), who
found near-perfect competition in situations of mobility of knowledge
about the quality of traded goods and about the prices charged in free-entry
environments.
In our days, Frederic Scherer from Harvard University (J.F. Kennedy
School of Government), Massimo Motta from the European Institute of
Florence and Pompeu Fabra University in Barcelona, and Manfred Neumann
from Erlangen-Nürnberg University have stood out in the analysis of
competition situations and in the elaboration of priority elements for any
Competition Policy.

2.1 Pluralism and diversity in the media

The problematic of the inter-relation, on the one hand, among the various
elements of diversity and, on the other hand, of the cause-effect relation of its
components, leads to several crossroads that are as many target problems for
academic research. Some of those crossroads concern the relation between the
plurality of media owners and the contents offered by their titles, the
diversity of sources and contents made available, diversity among content
producers and their own diversity, the increase in the number of titles and
its relation to diversity, and even the diversity of contents offered in the face of
the variety of products consumed.
As a central issue, we place the relation between media ownership plurality
and the diversity of contents offered. Assuming beforehand that a significant part
of academic research in this area does not regard as essential the fact that
we stand before cross-ownership or other forms of concentration, there is,
however, a common sense that there are correlations between
media ownership structure, including cases of acquisition in situations of
cross-ownership, and contents. Other approaches point towards the existence
of factors outside the media ownership structure that also play relevant roles in
contents, while others advocate that these relations may be
disregarded.
A second theoretical approach to this issue remarks that media
concentration, namely cross-ownership, constitutes one of the concrete
aspects bearing on diversity.
Theoreticians who defend this approach generally perceive that there is
an unequivocal relation between media plurality and content diversity,
stressing that emphasis must be put on arraying various fronts to
assure this relation.
Lastly, one should note the existence of a third theoretical approach,
largely centred on David Demers (2000), who, while analysing the
“competition paradox”, points to the fact that increased concentration,
namely cross-ownership, does not entail a negative impact on the diversity of
sources or viewpoints.
The present research project is placed within the identification of the content that
structures the aforementioned competitive order mentioned by authors such
as Neumann (2001). This is the competitive order that displays the highest
synergy with the competition paradigm consecrated by the Constitution of
the Portuguese Republic.
In fact, efficient market working and balanced competition among
companies are, unlike what one might extract from classics such as Stuart Mill
(1859) or authors close to the Austrian school, such as Hayek (1973), neither
spontaneously created by the market, nor a by-product of the existence of
boundless contractual freedom.
Quite the contrary, any market left on its own, or contractual freedom
exclusively in the hands of full, intrinsic sovereignty, leads to the boundless
growth of market power which, within a short while, will distort and
eliminate competition itself.
Key questions that have guided the research work bearing on the conception,
design, implementation and evaluation of regulation public policies in
Portugal, within a context of international benchmarking, are the following:

- On what terms does Market Power promote competition?
- Which limits should be set to Market Power to ensure the absence of Entry Barriers?
- Which imbalances arise in Balanced Competition?
- Perfect Competition or Feasible Competition?
- How to ensure the atomism of role players, product and service homogeneity, as well as the mobility of knowledge best suited to the Portuguese competition geography?
- Which correlation exists among Dominant Positions, relevant markets and competitiveness?
- Which behaviours may be deemed abuse of dominant position?
- Which methodologies lead to realistic prognoses of company concentrations and mergers?
- Under which terms is prior control of these operations justifiable?
- Which tests should be applied to accept or refuse such operations?
- Which public measures may be termed State Aids within a paradigm of undistorted competition?
- Which competitors’ reaction functions arise within an environment of State Aids?
- How to analyse plurality, both from the viewpoint of property or title ownership and from that of content?
- How to assess whether concentration of property in the media market has consequences that go beyond the economic domain, also affecting consumers who, in a democratic society, are embodied as citizens?
- Which monitoring systems should be implemented to evaluate the impact of concentration on pluralism and diversity?

2.2 Media concentration

The term ‘concentration’ indicates a process, whereas ‘concentration
level’ refers to entrepreneurial structure. Thus, when using the term
concentration, both fields are touched.
Media concentration has, historically, a typology designated as
horizontal concentration, where the phenomenon is analysed within a single
sector of the media market – for instance newspapers (in isolation) or radio
(also in isolation). The phenomenon of concentration is also to be found
vertically. That situation takes place when a company has control and
management of several links of the business chain under consideration. A
third form is diagonal, also known as cross concentration, which is observed when the
provider of media services holds activities in diverse areas such as
publishing and/or broadcasting.
In the media sector, the concept of concentration is twofold:
concentration of company ownership and concentration of editorial
content, the latter often following the former.

From a socio-political point of view, large transactions involving media
companies raise concern over their potential impact on democratic debate, due to
hypothetical control of information.
The opinion diversity associated with a healthy democracy is a natural
concern when one of the main operators in a media sector gains importance
in another sector.
The analysis of the impact of media ownership concentration on
information and opinion diversity may be approached as concentration of editorial
content, where the search for profit may reduce quality, namely through
standardization, which may have a negative impact on pluralism.
Daniel Junqua (1999: 131-132) analyses this issue from the viewpoint
that concentrations derive from an industrial logic and are the consequence of
management concerns directed to the search for profit, eventually leading to
the “disappearance of publications regarded as non-profitable or even
insufficiently profitable and to the regrouping under the same structure of a
greater or smaller number of titles.” The author argues that, in this sector,
“spaces of expression and freedom thus close and pluralism diminishes.”
In one of the most recent documents on media pluralism, namely the
report presented to the Swedish delegation to the 7th European
Ministerial Conference on mass media policies, held in Kiev (Ukraine) on
10th and 11th March 2005, it is mentioned that “competition among
different media types is on the rise, which leads to an increase of ownership
concentration overall in the media markets.”
Also mentioned at the same conference was the fact that it is accepted
knowledge that diversity needs to be ensured in the media sector, and it was
added that without diversity there is the risk of losing important aspects of
the democratic process, while the most important challenges for the media
revolve around concentration and pluralism.
The positioning of the Council of Europe in the aforementioned meeting
points to the fact that, with pluralism in the media sector as the key issue of
debate, there is a majority feeling that diversity, both in terms of contents
and of ownership, is of vital importance to the democratic process.
Thus, media diversity is essential for news coverage and the process
of opinion making. Lastly, regarding pluralism, it is stated that “a vast media
sector, representing a wide scope of opinion, is of fundamental importance for
democracy, where the ability of citizens to form their own opinion about the
society they live in is essential.”
The media contribute to pluralism by making the highest possible diversity
available. In this way, pluralism appears as a more general, and therefore more
inclusive, notion than ‘diversity’, deriving from the fact that it (pluralism) is
not solely linked to contents, but to the totality of social structures (Cavallin,
2000: 126).
Different Social Sciences have approached democracy, pluralism and
diversity through diverse viewpoints and criteria. In a comprehensive
way, Georges Burdeau (1966) noted that the pluralism existing in Western
societies was viewed, on the one hand, through the sociological variety of the
political environment as a natural phenomenon, while, on the other hand,
personal identity was an eminently respected value. In this way, pluralism
simultaneously presents social and spiritual components. This relation with
the individual places the analysis of diversity back as one of the components
of pluralism and of its relation to citizenship.
Media access and diversity are essential indicators for the assessment of
competition and performance within the media (McQuail, 1992: 65-80).
Competition in the media may be assessed from the proposition of “free and
egalitarian access”, both by sellers and buyers in the media market, as
components of the development of media diversity.
Competition is not only regarded as a guarantee of product quality, but
also as an agent of innovation and societal pluralism, turning competition
into a motor of Western societies in economic, political and social
terms. These considerations are likewise valid for the media sector.
The same research project also points to several conceptual types of
diversity. The most common approach to media diversity is made in terms
of “reflective diversity”, which evaluates how the range of the population’s
preferences is proportionally displayed by the media. This reflective
diversity means equal access for the public: if each individual or group has
equal access to the media to express their preferences, or even to contribute to
media contents, then one may say that the media are in a situation of reflective
diversity (Idem: 40-41).
The second way to position diversity in the media is developed from a
normative viewpoint. This reflects the idea that the media have penetrating
features in social phenomenology that may considerably influence the
population. Thus, to avert the appearance of preconceptions in public
opinion, media contents should express different opinions on a basis of
equality and in an evident way. This type of diversity is open diversity: it is
the extent to which diversified preferences and opinions receive
egalitarian treatment in their media representation (Idem: 40-41). In a
nutshell, this open diversity may be regarded as equality of access of ideas to
the communication system.

3. THREAT TO CONTENT DIVERSITY AND MARKET FORCES

Shapiro and Varian (1999) consider two dominant structures in the
market of information products. On the one hand, that of companies with
market control that do not necessarily offer the best products but that have
a competitive edge over their smaller competitors, thanks to their
size and economies of scale. These companies with dominant capacity
opt for strategies to reduce their average costs through increased volumes
obtained via the re-sale or re-utilization of the same product. On the other
hand, there are companies that opt for a differentiation strategy that adds
value to information so as to distinguish it from the competition. This is often
achieved through the personalization of products, based on the creation of
specific sources of information and design.

4. CONSUMPTION

The introduction of the consumer as the final link of the value chains of the
media industry has turned him or her into a pole of interaction between the
operative structure and the customer. This singularity results from a set of
socio-cultural changes originating in the last generation of the second millennium.
While in the 1970s the focal point in terms of consumption was purchasing
power, in the 1990s, with the steady growth of the importance of consumption
in the economy of companies and countries, a vision close to
the concept of citizenship arose. The consumer is regarded as a role player,
influencing policy, the environment and culture with her/his decisions
(Boisdevesy, 1996).

5. CONCLUSION

Within these research projects there is a common denominator of specific
handling of competition regulation and of the media sector. Therefore, in
matters concerning competition regulation, the research projects that have been
developed at ISCSP, in the Centre for Public Administration and Policies,
agree that the scope of public policies, which are to be well placed in
international benchmarking, involves suppressing market failures without
generating State failures (Noll and Owen, 1994). This is, however, a conclusion
that demands persistent continuity in research, delimiting contexts,
specifying the premises prevailing upon each competition restriction and
exploring the several hypotheses that are present.

As for the media sector, and in view of its great tendency to mutate, both
economically and technologically, the conclusion points to the creation of
regulatory solutions, based on a harmonious interplay among regulation,
self-regulation and co-regulation, which, without capture of the regulator, may
ensure the effectiveness and competitiveness of the media system.

REFERENCES
1. BILHIM, João, Ciência da Administração, Universidade Aberta, Lisboa, 2000.
2. BOISDEVÉSY, Jean-Claude, Le Marketing Relationnel, Paris, Les Éditions
d’Organisation, 1996.
3. BURDEAU, Georges, La démocratie, Paris, Les Éditions du Seuil, 1966.
4. CAVALLIN, Jens, “Public policy uses of diversity measures”, in PICARD, Robert G.,
Measuring Media Content, Quality and Diversity, Turku School of Economics and
Business Administration, Business Research and Development Centre, Media Group,
2000.
5. COUNCIL OF EUROPE, 7th European Ministerial Conference on Mass Media Policy,
Kiev (Ukraine), 10th and 11th March 2005.
6. DEMERS, David, “Why Do Media Merge? The Paradox of Competition”, Global Media
News, vol. 1, no. 1, 2000.
7. JUNQUA, Daniel, La presse, le citoyen et l’argent, Paris, Gallimard/Le Monde, 1999.
8. MCQUAIL, Denis, Media Performance, Mass Communication and the Public Interest,
London: Sage, 1992.
9. NEUMANN, M., Competition Policy, Edward Elgar, Cheltenham, UK, 2001.
10. NOLL, Roger G. and OWEN, Bruce M., “The anti-competitive uses of regulation”,
in KWOKA and WHITE (eds.), Harper Collins, 1994.
11. SCHERER, F. M., “Public Regulation Economics, Then and Now”, in SCHERER (col.),
Competition Policy, Domestic and International, Elgar, 2000.
12. SHAPIRO, Carl and VARIAN, Hal, Information Rules: A Strategic Guide to the Network
Economy, Harvard Business School Press, 1999.
THE GROWING RELEVANCE OF AFRICA IN
CHINESE FOREIGN POLICY: THE CASE OF
PORTUGUESE SPEAKING COUNTRIES [1]

Ana Alves and António Vasconcelos de Saldanha


Instituto do Oriente - Instituto Superior de Ciências Sociais e Políticas (ISCSP),
Universidade Técnica de Lisboa, R. Almerindo Lessa, Pólo Universitário do Alto da Ajuda,
1349-055 Lisboa, Portugal, ioriente@iscsp.utl.pt

Abstract: In recent years there has been growing evidence of a renewal of Chinese
interest in Africa, not for ideological reasons, as in the past, but rather due to
economic motivations. In this new framework, the African Portuguese speaking
countries (Angola, Mozambique, Guinea-Bissau and Cape Verde) are
interesting case studies. This paper argues that PRC foreign policy towards
Africa has replaced the ideological approach with an economic one, exporting
its development model to the third world. In the particular case of the African
Portuguese speaking countries, China’s strategy takes advantage of a common
denominator between those countries and a Chinese territory that was formerly
under Portuguese administration: Macau.

Key words: Portugal, Africa, China, Angola, Mozambique, Macao, Foreign Policy,
Portuguese Speaking Countries, Economy, WTO.

1. CHINESE OUTWARD INVESTMENT FLOWS

China’s accession to the WTO in 2001 accelerated a trend that had
already been taking place since the previous decade: China becoming one of
the largest recipients of Foreign Direct Investment (FDI). In 2003 China
received a total of USD 53.5 billion [2], according to the World
Investment Report (UNCTAD) figures, which represents half of the total
FDI flowing to Asia, around 1/3 of the investment going to developing
countries and almost 10% of world investment flows [3]. In that same
year China overtook the US as a destination for FDI, becoming the world’s


largest FDI recipient (excluding Luxembourg). This issue has nourished
growing interest in the last few years among academia and the media around the
globe, diverting the world’s attention from another trend emerging in the
meanwhile: that China is itself becoming a very strong investor abroad.
According to the UNCTAD 2004 report, China is expected to become very soon
the world’s 5th largest outward foreign direct investor, displacing Japan [4].
Indeed, financial strength and exposure to international business have
encouraged China to venture abroad. The more it becomes integrated in the
world economy and Chinese firms become subject to international
competition through imports and inward investment, the more these firms
are pushed to invest abroad, improving their international competitiveness.
In 2002 the Chinese government officially launched a programme to
internationalize public and private enterprises, usually referred to as
the «going global strategy». At the end of 2003, 3,439 mainland enterprises
had established 7,470 companies in 139 countries around the world, with
43% of the registered investors being State-Owned Enterprises and 48.4%
of the total investment abroad going to the mining sector, mainly oil and gas
[5].
Meanwhile, a number of Chinese enterprises are emerging rapidly
in the course of internationalized competition. «While some have established
distribution networks for their own brands in the international market, others
have acquired or merged with multinational companies to embark on a truly
internationalized development strategy» [6]. Some Chinese enterprises, such
as PetroChina, Sinopec, home-appliance maker Haier and energy giant
Huaneng Group, are becoming well known abroad by setting up joint
ventures or acquiring shareholdings in countries like the United States,
Australia, the Philippines, Malaysia and Indonesia [7]. Other remarkable
examples include the merger of TCL with the television division of
Thomson of France, the acquisition of SsangYong of South Korea by
Shanghai Automotive, and the acquisition of IBM’s PC division by Lenovo.
As regards the geographical distribution of Chinese investment
abroad, statistics from the Ministry of Commerce suggest that about 60% of
the approved Chinese outward investment from 1979 to 2002 was destined
for Asia. North America was the second most popular destination, followed
by Africa and Latin America. However, this framework has been changing
recently [8]. In 2003, according to the ‘China Outbound Investment
Statistics Report’ [9], Asia was still the main destination, accounting for half
of the total Chinese outward FDI (USD 1.5 billion), with Hong Kong
grabbing the biggest piece (USD 1.15 billion), followed by South Korea,
Thailand, Macao, Indonesia and Cambodia. North America lost 2nd
place to Latin America, which absorbed 36.5% (USD 1.04 billion) of the
total Chinese investment abroad in 2003, followed by Europe with 5.1%
(USD 150 million, mostly in Denmark, Russia and Germany), and Africa
with 2.6% (USD 75 million; USD 135 million in 2004), the major
recipients being Nigeria, Mauritius and South Africa. North America ranks 5th
with 2% (USD 58 million) and, last, Australia and New Zealand with 1.1%
(USD 34 million).
Observing these recent trends, it is possible to say that Latin America and
Africa have been attracting growing attention from Chinese investors, even
though Asia remains by far the major destination. So what presently attracts
Chinese interest to these regions? The obvious answer is that,
like some Asian countries, Latin America and Africa are advantageous regions
with vast natural resources and low income levels.
The abundance of raw materials plus the fragile political and economic
environment, particularly in the African continent, are indeed strong magnets
for the world’s fastest-growing economy, which desperately needs access to
new sources of natural resources to fuel its growth and also needs markets to
which it can export, more than its consumer goods, the technology and expertise that
sustain its development model. In fact, China’s success makes it a top
example for these countries, as its recent history fits their background:
underdeveloped economies devastated by political and natural disasters.
The handy help of China comes at a particularly favourable time, as the
African continent remains by and large marginalized in the world economy.
Over half of its population lives on under US$1 a day per person. According to
the World Bank [10], in Sub-Saharan Africa the GDP per capita declined
from USD 525 in 1970 to USD 336 in 1997. In this framework, where there
is no room for generating domestic wealth, international financial aid and
foreign investment become vital. Furthermore, Western donors and
Multinational Corporations (MNC), the main sources of financial aid and
investment in the black continent, have been fleeing the region since the end
of the 90’s, exhausted by the null results of their long-lasting efforts in the face
of constant wars, political arbitrariness and widespread corruption, now
aggravated by the pressures of economic globalization.

2. EVOLUTION OF PRC FOREIGN POLICY TOWARDS AFRICA (1949-2005)

China’s focus on Africa is not, however, a brand new fact, as it has been
paying great attention to the black continent since the founding of the PRC
in 1949. Nonetheless, the nature of this interest has been changing over time.
During most of the Cold War period the reasons for the Chinese interest in
Africa were mainly political. Even economic and technological aid pursued
strict ideological purposes, contending with the USSR for the international
leadership of socialism and also trying to gain support to recover the seat at
the United Nations Security Council. In this framework, for decades
China’s foreign diplomacy towards Africa was openly intended to «support
the African (…) people in their struggle to oppose imperialism and old and
new colonialism», as stated in the first of the five principles enunciated by
premier Zhou Enlai during his tour of Africa in January 1964 [11].
In the 70’s most African countries achieved independence and the
Chinese discourse made a small adjustment, stressing the condition of
underdevelopment as the key issue linking Africa and China. In
this new framework, Mao Zedong, during a visit by President Kenneth
Kaunda of Zambia, made an appeal, within his ‘three worlds’ classification,
for the third world (Asia, Africa and Latin America) to unite.
In the 80’s, as the Cold War started fading away and China consolidated
the economic reforms initiated by Deng Xiaoping in 1978, Beijing started
looking at the African continent with a much more acute economic interest.
This was the turning point from which China’s foreign policy towards
Africa became more and more economically oriented. The new Chinese foreign
policy endeavour for Africa became even more visible in the second half of
the 90’s. During his visit to six African countries in May 1996, Jiang Zemin
proposed the development of a long-term and more structured cooperative
relationship between China and the African countries.
This idea gave birth to the China-Africa Cooperation Forum (CACF),
whose opening ceremony took place in Beijing in October 2000. The strong
commitment of the Chinese central government to the creation of the forum
clearly illustrates the growing importance of economic affairs in China’s
relations with Africa at the beginning of the 21st century. The ideological
issue was definitively left behind, replaced by the purpose of exporting
its development model to other third world countries.

3. THE CHINA-AFRICA COOPERATION FORUM

The Forum is to be held every three years, with China and Africa taking
turns in hosting the event. The first CACF meeting was attended by
representatives of China and of 45 African countries (out of 53). The fact
that the Chinese president Jiang Zemin, the Vice President Hu Jintao and the
premier Zhu Rongji attended this first meeting confirms the importance of
this venture to China. The Forum stresses the need to develop a new
international political and economic order and to strengthen cooperation and
trade between China and the African countries based on equality and mutual
benefit. The meeting issued the ‘CACF Beijing Declaration’ and the
‘Programme for China-Africa Cooperation in Economic and Social
Development’. A few ministerial commissions were established in order to
coordinate the implementation of the agreements reached.
The second meeting took place in Addis Ababa (Ethiopia) in December
2003, with China represented at the highest level by Premier Wen Jiabao.
The delegates agreed on the ‘Addis Ababa Action Plan (2004-2006)’, which
aims at promoting cooperation in a wide range of fields, namely political
affairs, peace and stability issues, multilateral cooperation, and economic and
social development [12].
The economy has indeed been the strongest wing of the forum since its
founding. In the last four years Chinese cooperation has become widely visible
all around Africa, particularly in agriculture, infrastructure construction,
trade, investment, development aid, human and natural resources
development and debt relief.
Since the creation of this forum China has reduced and exempted a total of
USD 1.3 billion (10.5 billion RMB) of debts owed by 31 African countries,
and trade between the two parties has rapidly increased since then. From USD 10.6
billion in 2000 it grew to USD 14.0 billion in 2003 and to almost USD 30
billion in 2004. China’s exports to Africa amounted to USD 13.82 billion
(a rise of 35.7% over the previous year) and its imports from Africa to
USD 15.65 billion (an increase of 87.1%) [13], meaning a trade deficit of
almost USD 2 billion for China. In January 2005 a tariff exemption
policy came into force for 190 goods coming from 25 least developed African states,
enacting a promise made two years earlier by Premier Wen Jiabao; this is
expected to further boost bilateral trade, which in the first 3 months of
2005 alone amounted to USD 7.6 billion [14]. According to the Chinese Ministry of
Commerce, half of Chinese exports to Africa are finished products,
textiles, machinery, electronic products and hi-tech products. As for Chinese
imports from Africa, 87% are primary products like crude oil, iron ore, steel
and diamonds [15]. Despite the fast increase in bilateral flows, Africa still
represents only 2.6% of China’s total foreign trade volume [16], meaning
there is a huge potential for growth.
Chinese investment in Africa is also increasing rapidly, amounting to a total
of USD 135 million in 2004 [17], the year of largest investment in Africa.
By the end of 2004 there were 715 Chinese-funded enterprises operating in
Africa, ranging from trade, processing, manufacturing, transportation and
agriculture to resources development. China has also been strongly
committed to training African human resources [18] and to providing important
assistance in healthcare and natural disasters. The main destinations of Chinese
cooperation and investment flows are Nigeria, Uganda and South Africa, and
more recently Angola, all countries with large natural resources.
Chinese investment in the African continent, as underlined by
President Hu Jintao during his visit to Gabon in January 2004, is mostly
directed to infrastructure, agriculture and natural resources development.
China not only provides the funding and the necessary equipment but also,
sometimes, the labour force. It has been building roads (Rwanda),
telecommunications networks (Ethiopia), airport terminals (Algeria), and even
government facilities and convention centres as gifts (Gabon, Ivory
Coast…). It is becoming very common, even for a simple tourist, to find
this kind of Chinese trace everywhere in Africa nowadays.
China is slowly emerging as a major player in the black continent [19],
with an important political advantage over its Western counterparts: it does not
impose human rights or democratisation conditions. Not recognizing the Taipei
government is the only prerequisite.

4. BRIDGING CHINA, MACAU AND THE AFRICAN PORTUGUESE SPEAKING COUNTRIES: FORUM FOR ECONOMIC COOPERATION AND TRADE BETWEEN CHINA AND THE PORTUGUESE SPEAKING COUNTRIES

The existence of the China-Africa Cooperation Forum does not prevent
Beijing from developing additional strategies to deepen its relations with
some specific countries. That is the case of the African Portuguese speaking
countries: Angola, Mozambique, Cape Verde and Guinea Bissau. Only São
Tomé & Príncipe is outside the arrangement, since it has not yet re-established
diplomatic ties with Beijing, being one of the few countries in the world that
recognizes the government of Taiwan.
Those four countries, which are also part of the CACF, have a very old
connection to China, which is Macau, as they had in common Portuguese
rule for centuries. That reality left behind a bond based on the Portuguese
language and culture, as well as on administrative structures and the legal
framework. In the early years of Macau’s self rule (1999-2002) the potential
of this common heritage was mostly avoided, as it recalled the colonial past.
The Portuguese African colonies were the last to achieve independence in
that continent, right after the military coup of April 1974 which brought
an end to the dictatorship in Portugal. China, though, did not accept the
devolution of Macau at that time, since it did not consider Macau to be a
colony but a territory under foreign rule, as Hong Kong was. Sino-Portuguese
negotiations to return Macau to Chinese sovereignty then followed the
pattern of the Sino-British ones. Under the principle of ‘one country, two
systems’, Macau’s handover took place two years after Hong Kong’s, on
19 December 1999.

The new executive government of Macau was very cautious in handling
the Portuguese heritage in the following three years, fearful that it could
somehow damage its relationship with the Central Government. However, it was
the central government itself that later gave directions to the Chief
Executive to reinforce and take advantage of the Portuguese identity of
Macau, which bonds it to several countries around the world. In fact, it was
on the initiative of the Chinese Ministry of Commerce that Macau
became the headquarters of the Forum for Economic Cooperation and Trade
between China and the Portuguese Speaking Countries in October 2003.
The new framework after 1999 gave China the ability to maximize the
potential of Macau as a special channel for cooperating and investing in those
countries.
This new trans-regional forum aims at promoting mutual development by
enhancing economic cooperation and trade between China and the Portuguese
speaking countries [20]. Eight countries were present at the founding
meeting, which took place in Macau from the 12th to the 14th of October
2003: China, Portugal, Brazil, Angola, Mozambique, Cape Verde, Guinea
Bissau and East Timor, plus S. Tome & Principe as an observer and Macau
as part of the Chinese delegation. Each country had a delegation composed
of governmental and entrepreneurial representatives. China was represented
by the Vice Premier, Wu Yi, and the Vice Minister for Commerce, An Min.
The Ministerial meeting approved an action plan based on the interchange of
information, improvement of the investment environment in accordance with
international rules, organisation of fairs, promotion of joint ventures, and
diversification of cooperation areas into agriculture, infrastructure construction,
and the development of natural and human resources. It was also established that
the forum would take place every three years and that the permanent secretariat
would be based in Macau. The Organising Committee is jointly presided over by
the Chinese Minister for Commerce and Macau’s Chief Executive, Edmund Ho.
They are supported by the deputy Minister of Commerce and Macau’s
Secretary for Economy and Finance, assisted by people from the Ministries of
Commerce and Foreign Affairs of China, the State Council’s bureau for Hong
Kong and Macau affairs, and from Macau’s executive government.
The permanent secretariat (PS), responsible for coordinating and developing the
agreements, is composed exclusively of staff from Macau’s executive government.
The PS is assisted in this enterprise by a support office specially
created for this purpose in 2004, which has been developing many activities:
promoting high-level visits, entrepreneurial meetings, trade, investment and
economic and technological cooperation, and the training of human resources,
and advertising the Forum (in international fairs, on an internet site since 15 March
2005…). At the beginning of 2005 all countries agreed to create, in the near
future, a special bank within the forum.

For both historical and practical reasons Macau is, indeed, a privileged
platform to realise this endeavour as underlined by most of the official
representatives of the eight countries that attended the first meeting [21].
Macau has since the 16th century been in touch with the Portuguese
speaking world, accumulating human connections and through them trade
networks that survived the return to China. Portuguese language and culture
have also left behind Institutional ties as Macau retained the membership of
the ‘Union of Portuguese speaking cities’ (UCCLA), the ‘Meteorological
Organisation of Portuguese speaking countries’ and of the ‘Association of
Portuguese Speaking Universities’ (AULP). More recently (2003) Macau
has applied for observer status within the ‘Community of the Portuguese
Speaking Countries’ (CPLP). This fact is full of significance if we take into
consideration that during the Portuguese rule, and despite many efforts,
China never allowed Macau’s entry to CPLP.
On the other hand, Macau has acted as an open gate for China for several
centuries having today a special relation with the Central Government as part
of the ‘second system’ with a large economic and political autonomy.
Macau owes, in fact, much of its importance within the PRC to this
pivotal role between mainland China and the Portuguese speaking countries.
Furthermore, its location on the fringes of Guangdong province (where most
of the factories are located), its open economic regime and low tax rates
(15%), a well developed service sector, a large pool of qualified human
resources and higher education institutions of international standard make
it a very attractive platform for international business.
Portuguese language and culture, even if not as well disseminated in
Macau as in other former Portuguese colonies, have indeed acquired
great importance as a means of bridging China with those countries. There
are presently twelve universities in Macau offering degrees in a
wide range of fields. Most of them have established cooperation protocols
with their counterparts in Portuguese speaking countries for interchange, and
also with mainland universities, mostly for Portuguese language teaching.
This trend has become even more noticeable after the handover, much of
it under the initiative of the Chinese authorities.

5. THE IMPORTANCE OF THE AFRICAN PORTUGUESE SPEAKING COUNTRIES FOR CHINA

Despite the large potential of this new cooperation Forum there is still a
long way to go. Both parties still know little about each other’s markets,
medium and small enterprises, legal frameworks, native languages and cultures,
etc. In addition, the internationalisation of the private sector is a very
recent phenomenon in mainland China and it is at a very embryonic stage in
the African Portuguese speaking countries (PSC). We can say, then, that at
the moment there is a strong will on both sides to increase economic
interaction, but the structures are still very incipient.
In reality, the market represented by this forum has only relative
importance to China, as in terms of commercial volume it represents an
almost negligible share of China’s total foreign trade [22]. Nevertheless,
the total volume of trade between China and the PSC grew from 11,000
million USD in 2003 to more than USD 18,000 million in 2004, an
increase of 64% [23]. The volume of Chinese imports from the PSC largely
exceeds that of exports: USD 13,728 million and 4,543 million,
respectively.
If we take into consideration that Brazil accounts for more than half of
that trade volume and represents 176 million of the total of 220 million
people who speak Portuguese in the world, added to the fact that it has a
strong and long established partnership with China (China being Brazil’s
second biggest commercial partner, after the USA), we may conclude that
Brazil was not the main target when China began to organise this Forum.
Nor was Portugal, which owes its importance more to being part of the
European Union than to its role as a strategic commercial partner of China.
What matters, indeed, are the African Portuguese speaking countries and
East Timor: despite their small populations and fragile economies,
they represent a large pool of underexplored natural resources, ranging
from fisheries, agriculture, forestry and tourism to natural gas, coal, mining
and oil. Bearing this in mind, and the fact that China surpassed Japan in
2003 to become the world’s second largest oil importer after the United States, the
increasing interest of Beijing in African countries becomes clearer.

6. CHINESE ECONOMIC COOPERATION WITH AFRICAN PORTUGUESE SPEAKING COUNTRIES

As we saw, two-way trade between China and the African Portuguese
speaking countries has been increasing at a fast rate in recent years, with
Angola being the largest trade partner of China.
Angola and the PRC established diplomatic ties on January 12, 1983.
One year later they signed a trade agreement, and in 1988 a mixed economic
and trade commission was set up. But it was only after the end of the civil
war in 2002 that Chinese cooperation gained momentum. In fact, the bilateral
trade volume, which was approximately 1,150 million USD in 2002, grew to
4,900 million USD in 2004 [24]. Chinese exports consist mainly of
textiles, shoes and electrical equipment, but bilateral trade is dominated by
Chinese oil imports.
Mozambique [25] comes next. Bilateral trade volume more than doubled in
two years: from 48.5 million USD in 2002 [26] to 119 million USD in 2004
[27]. Guinea Bissau [28] ranks third, with 4.5 million USD in 2002 growing
to 6.2 million USD in 2004. Cape Verde [29] is at the bottom, since it is a
very limited market with less than half a million people. Bilateral trade
was 2.75 million USD in 2004. In Guinea-Bissau, as in Mozambique and
Cape Verde, the trade balance is dominated by Chinese exports.
In 2002, the combined exports of China to these countries, mostly composed
of light consumer goods, totaled 94 million USD, while imports, dominated
by raw materials – mainly crude oil – accounted for 1,110 million USD, meaning
a heavy trade deficit for China [30]. In 2004 total bilateral trade with these
countries amounted to the astonishing figure of 18,200 million USD,
testifying to the impulse given by the forum created in Macau in 2003.
Chinese investment in those countries is harder to track down. Neither China
nor the African countries concerned have structured statistics on the issue.
The scarce numbers available are mainly found in articles published in local
newspapers, which are not always reliable. Based on the figures we
could obtain, we would first like to draw attention to the fact that, in 2002,
Chinese investment in Angola (150 million USD) seems to have been almost
equivalent to that destined for Brazil (157 million USD). The other
countries seem to be receiving much less (10 million USD to Cape Verde, 4
million USD to Guinea Bissau and no figures for Mozambique).
The reason why these countries are attracting public and private Chinese
capital is, as mentioned before, that all these countries,
despite being underdeveloped economies, are quite rich in natural resources.
Angola is once again number one: it has petroleum, diamonds, gold,
uranium, phosphates, etc.; Mozambique has coal, natural gas, titanium and
semi-precious stones, besides food crops (cashews, corn, cotton, sugar,
copra…) and fisheries; Guinea Bissau has bauxite, phosphates and
offshore petroleum, as well as food crops and fisheries; and Cape Verde, the
poorest, has salt, limestone, food crops and fisheries.
Aside from private and public investment, China has also been granting
important financial aid to these countries, which translates into
additional negotiating power and increasing leverage over those
countries. Not surprisingly, Angola has been the largest recipient. Apart
from forgiving all of Angola’s expired debt, China lent 300 million
USD in 2003 for the reconstruction of the railroad in Luanda, and in March
2004 China’s Exim Bank signed a credit line of a further 2 billion USD, on very
advantageous conditions [31], to rebuild infrastructure destroyed during the
war. On May 16, 2005, 12 individual credit accords were signed within
that loan, destined to support projects in the fields of agriculture, energy and
water, education and mass media. Not only Chinese enterprises but also
the Chinese labour force have been leading reconstruction projects all over
the country: from schools and hospitals to roads, social housing and
telecommunications. Angola successfully hosted the second entrepreneurial
fair of the forum in March this year.
The same path can be observed in the other three countries, although on a
smaller scale. China has forgiven their debt, has been signing bilateral
agreements on economic and technological cooperation, and has been
financing infrastructure construction (roads, telecommunications, hospitals,
housing, dams, hotels…), investing in agriculture and fisheries, and in natural
resources development. By these means, Beijing has quickly turned into a
major cooperation partner of these countries and is becoming a very influential
agent in the region.

7. CONCLUSION

The present is best described as a time of fast growing globalisation, in which
wealth tends to be concentrated in regional blocks connecting the strongest
economies of the northern hemisphere. However, trans-regional cooperation
is also becoming a very important trend in this new framework. Even when they
merely serve the synergies generated by the interaction of less developed or
underdeveloped economies, trans-regional agreements are becoming an
important instrument for spreading development. It is within this framework
that China has been maximizing the potential of south-to-south cooperation.
Expansion of trade, investment and high tech cooperation are indeed
enhancing economic ties between China and Africa and boosting bilateral
relations, confirmed by increasingly frequent high level exchanges. As a
consequence, China and African countries are also increasingly enjoying
good cooperation in the international arena. China is slowly positioning itself,
with the support of Africa and other developing countries, to act internationally as the
protector of the common interests of developing countries and the agent of a
new, hopefully fairer, international economic order. Successful examples of
this endeavour are the formation of the G20 during the round of WTO talks in
Cancun (this year presided over by China); the creation of the China Africa
Business Council within the United Nations Development Programme in March
2005; and the Asia-Africa Summit that took place on April 22-23, 2005, to
revitalize the spirit of, and celebrate the 50th anniversary of, the Bandung
Conference, issuing a declaration on the ‘Asia Africa New Strategic Partnership’
[32]. This leadership venture is of particular importance to China if we take
into consideration the international agenda for 2005: the UN will
discuss in September the first five years of implementation of the
Millennium Development Goals (which especially affect the African continent),
and the UK presidency of the European Union, starting July 1st, has established
Africa as its main priority.
As far as the Portuguese speaking African countries are concerned, China has
been clearly trying to position Macau as an interface for this purpose. The fact
that it was the Chinese central government itself that took the initiative to
create and base in Macau a more exclusive forum with those Portuguese
speaking countries has a strong meaning: China definitely wishes to
strengthen its relations with those markets and sees Macau as the best
instrument to promote it. Macau’s handover in 1999 gave China the ability
to maximize the potential of this platform as a special channel to cooperate
and invest in those countries.
However, Macau’s role should not be overstated, as the territory is not
the sole platform for China to cooperate, invest and trade with those four
markets. The role of Macau is, in reality, much more symbolic: it works as
an attraction pole right on the border of China for the countries that share
Portuguese language and culture, and it also works as a guarantee of the
integrity of Macau within the region, thereby resisting assimilation
by Zhuhai or Hong Kong.
Macau has thus been given an instrumental role by Beijing within a wider
Chinese strategy of economic internationalisation towards a continent where
economic opportunities are still plentiful.

NOTES
1. This paper is part of a wider research on ‘Macau as a linkage platform between the PRC
and African Portuguese speaking countries’ financed by the Portuguese Foundation for
Science and Technology.
2. If we add the FDI inflows of HK SAR, that sum rises to 67.066 billion USD
3. Data collected from UNCTAD FDI Database online (www://stats.unctad.org/fdi/eng)
4. The United States remains far and away the world’s biggest FDI source, followed by Germany,
United Kingdom and France.
5. «Outbound Investment Increasing», China Daily October 22, 2004 in
http://www.china.org.cn/english/BAT/110044.htm
6. Hong Kong Trade and Development Council «’Going Out’ via Hong Kong: Mainland
Enterprises' Fast Track to the International Market» 4 May, 2005,
http://www.tdctrade.com/econforum/tdc/tdc050501.htm
7. Ministry of Commerce of the PRC, «China Eying Regional Development», February 10,
2004, Official site.
8. Idem
9. Jointly issued by the Chinese Ministry of Commerce and the National Bureau of
Statistics, cited in: «China pours more money overseas», China Daily, October 22, 2004
10. www.worldbank.org
11. ‘Chinese Leaders on Sino-African Relations’, www.china.com.cn/english/features/
China-Africa/ 82054.htm
12. ‘China-African Forum reaches action plan’, www.china.com.cn/english/international/
82640.htm
13. Liang Guixuan (Minister Counselor) «Perspectives on China-Africa Trade And
Economic Cooperation», Presentation at the 4th Tswalu Dialogue, 2005/05/09,
Embassy of China in South Africa, http://www.chinese-embassy.org.za/eng/zfgx/
zgyfzgx/t194633.htm
14. «China-African trade tended to be balanced», Network center of MOFCOM, 17-05-
2005.
15. Idem
16. Liang Guixuan, idem
17. «China-African trade tended to be balanced», Network center of MOFCOM, 17-05-
2005.
18. China offers 1,500 scholarships every year to African students to study in China and
also sends many national experts in various fields to African countries.
19. ‘China emerges as a major player in African Politics’, in: Alexander’s Gas and Oil
Connections, vol 9, issue # 5, News and Trends: Africa, March 10th, 2004. online
version: www.gasandoil.com/goc/news/nta41000.htm
20. A few days after launching the Cooperation Forum China signed a ‘Closer
economic partnership arrangement’ (CEPA) with Macau executive government.
This agreement aims to strengthen the economic interaction between Macau and
China by further facilitating the entrance of Macau’s enterprises, goods and
services in the mainland. Full of significance is the fact that the Chinese Vice
Minister for Commerce, An Min, said that China wishes to double the trade volume with these
countries in the next five years and expects CEPA to serve this purpose as well by
attracting capital from the Portuguese speaking countries.
21. As said in public declarations during the Forum meeting, namely, by the Chinese Vice-
premier, Wu Yi; the Chinese Vice Minister for Commerce, An Min; Macau’s Chief
Executive, Edmund Ho Au-wah; and the representative of the Portuguese
Minister of Economy, Franquelim Alves. Not surprisingly, the African representatives’
speeches emphasised instead the Chinese assistance during the anti-colonial wars.
22. Total foreign trade in 2003 was 800 billion USD, while total trade with the PSC was only
an estimated 11,000 million USD.
23. Official site of the Cooperation Forum between China and Portuguese speaking
countries.
24. Official statistics cited by Vicente Pinto de Andrade, «Angola na rota da Ásia»,
Courrier Internacional, Nº 2 Abril 2005, p. 28.
25. Maputo and Peking established diplomatic relations on June 25, 1975, signed a trade
agreement and an agreement on reciprocal investment protection and set up a joint
economic and trade commission in 2001.
26. Numbers in ‘Fórum para a cooperação Económica e Comercial entre a China e os países
de língua portuguesa (Macau)’, novos caminhos para a cooperação’ by Rudolfo Ascenso.
27. Vicente Pinto de Andrade, ob. cit., 2005
28. The PRC and Guinea-Bissau established diplomatic ties on March 15, 1974, but on May 26,
1990, the Bissau government set up diplomatic relations with Taipei, leading to a rupture with
Peking. Diplomatic relations were restored on April 23, 1998.
29. The PRC established diplomatic relations with the Republic of Cape Verde on April 25,
1976. In 1998 the two countries signed an agreement on the encouragement and mutual
protection of investment. The following year they signed an agreement for trade and
economic cooperation.
30. Numbers given in an article written by the President of the Institute for the Promotion of
Commerce and Investment of Macau, Lee Peng Hong, ‘Plataforma Económica e
Comercial entre a China e os países de língua portuguesa’, in Ponto Final, 24 de
Outubro de 2003, online version, www.pontofinalmacau.com/print.php?sid=2055
31. Repayable over 17 years with an interest rate of 1.5%. The contract is guaranteed by
earnings from a contract for the sale of oil equivalent to 10,000 barrels a day.
32. This Council comprises China and five African countries: Cameroon, Ghana,
Mozambique, Nigeria, and Tanzania.
ECONOMIC GROWTH THEORY,
FIFTY YEARS AFTER

Paulo B. Brito
UECE and Dep. Economia, ISEG, UTL, Rua Miguel Lupi 20, 1249-078 Lisboa, Portugal;
email: pbrito@iseg.utl.pt

Abstract: Fifty years have passed since the publication of Solow’s [21] paper on economic
growth theory. With a non-specialized reader in mind, we present the
main ensuing phases of the theory and the way our own research relates to it.
The history of growth theory is conventionally divided into two phases: until
the early 1970s, the research is labeled exogenous growth theory, and, from the
late 1980s until the present, the new growth or endogenous growth theory has
been developed. We present the main models of both theories, the stylized
facts of growth and a broad view of the compliance of the theory with them. At
last, we report some avenues that we have been exploring, as well as their motivation
and results. This research addresses the following topics: existence of multiple
BGP’s, indeterminacy, non-monotonous transitions and an exploration of the
integration of spatial and growth theories using PDE’s.

Keywords: Exogenous growth theory, endogenous growth theory, multiple BGP’s, inde-
terminacy, non-monotonous transitions, spatial growth.

1. INTRODUCTION
Fifty years have passed since the publication of Solow’s [21] paper on economic growth
theory. It was a seminal contribution, together with the [9] and
[12] versions of the [17] model, not only to modern economic growth theory
but also to modern macroeconomic theory.
Growth theory developed a clear object: to explain the level of the GDP
per capita and the rate of growth of countries. It is, possibly, the most im-
portant single variable in economics. To an even higher degree than several
other important core variables in economics (the rate of interest, the rate of
inflation, the rate of exchange), it synthesizes the evolution of a large number of
underlying phenomena acting in the real world. Growth theory and growth
empirics have been working in tandem in order to characterize the most im-
portant phenomena and single out the variables which are more relevant in
explaining the actual growth history of countries.

Growth theory developed a clarity of purpose and a solidity of theoretical
structure that have made it invade the traditional fields of
macroeconomics.
The array of issues studied in the field is so vast that we do not even try to
survey it. In this paper, we present a synthesis for a non-specialized public,
list some unsolved problems and report both the aims and the results of
our own research.
The models presented in [9, 12, 21] started a voluminous strand of lit-
erature that is today termed exogenous growth theory. It is concerned
with the convergence of the aggregate production towards a steady state or a
balanced growth path (BGP). The long run growth rate is either zero or exo-
genously determined by population or by technical progress growth, seen
as exogenous processes. In the middle of the 1980’s, [14, 19, 20] star-
ted the theory of endogenous growth (or new growth theory), giving birth
to its three most significant approaches. In this new generation of mod-
els, the long run growth rate is endogenous and positive and depends on the
existence of externalities, accumulation of human capital and research and
development. There has been an impressive increase in the research both at
the theory level and on its empirical testing. The two way communication
between growth theory and empirics of growth is a distinctive feature of the
modern approach, when compared with the exogenous growth theory phase.
The great amount of research, in several directions, may be appreciated in
the recent handbook [1].
In this paper, we will present the most important landmarks of the theory
of economic growth, emphasizing the formal structure of the models, in or-
der to locate the research that we have been conducting. We focus on the
topics that we think are more relevant for a country with the characteristics of
Portugal: non-monotonous growth may explain in which circumstances short
run divergence does not mean long run divergence; multiple growth paths
may explain under which circumstances the existence of a small short run
divergence could lead to long run divergence; and the interaction between
different geographical areas and its relationship with spatial agglomeration
and convergence.
In Section 2 a broad view on some concepts is presented. In Sections 3
and 4 the seminal exogenous growth models and endogenous growth models,
respectively, are described synthetically. In Section 5 some of our research
in growth theory is reported.

2. PRELIMINARIES
The object of growth theory is to offer explanations for the time evolution
of GDP per capita. Formally, GDP per capita, Y (t) = F (Z(t), ϕ),
depends on two types of magnitudes. Firstly, it depends on a set of state


variables that can be exogenous or endogenous and that adjust along time
Z(t) = (K(t), H (t), A(t), X(t)), (1)
where K(.) is the stock of physical capital, H (t) is the stock of human cap-
ital, A(t) measures technical progress, ideas, research and development, and
X(t) denotes natural resources, institutions and demography. It also depends on
a vector of parameters, ϕ, sometimes called the fundamentals, which include
preferences, technology, demographics, and other parameters that explain
long term growth.
Balance growth path (BGP) and convergence are two important concepts
in the theory. The BGP is a trajectory such as
$$Z(t) = z\,e^{\gamma_z t} \qquad (2)$$
where the rates of growth, γz of all components of Z(t) are monotonously
related, and z is a level variable. The BGP involves a decomposition between
long run growth and transitional dynamics (which have a frequency spanning
several years). The level trajectories, not necessarily along the BGP, are
written as
$$Z(t) = z(t)\,e^{\gamma_z t} \qquad (3)$$
where, generally, $\lim_{t\to\infty} z(t) = z$. The most important aspects of the the-
ory are related with the determination of the long run growth rate and its
dependence on the fundamentals, γz (ϕ), with the factors affecting the level
of the variables in the long run, z = z(ϕ), and the transition or convergence
towards the BGP, dz(t) = z(t) − z.
With the application to actual growth experiences in mind, we say that
absolute β-convergence exists if the fundamentals are the same but the short
run growth rates differ as a result of the difference Z(0) − Z(t), for dif-
ferent countries. There is relative β-convergence if the fundamentals differ
between countries. In the first case, differences in growth rates have only a
transient nature, while in the second case they reflect both transient and long
run differences.

3. EXOGENOUS GROWTH THEORY


3.1 Founding contributions
The Solow–Swan model. The [21] and [22] models explain level growth
(z) by produced capital accumulation, but long run growth (γ ) by non-
produced factors such as population and technology growth seen as exogen-
ous processes. It is built from the following assumptions:
First, it models a closed economy, which makes it suitable for modeling
autarkic economies or the world economy. Second, the economy produces a
single good that can be used for investment and consumption purposes. The
technology of production uses labor and capital inputs and is neoclassical.
Formally, the production function is
$$Y^a = AF(K^a, N) \qquad (4)$$
where $Y^a$ and $K^a$ are the aggregate output and the aggregate capital stock,
respectively, and N is the labor input. As there is full participation and em-
ployment is equal to total population, labor input is equal to population. A
is an exogenous productivity parameter which captures all the other factors
changing production, with the exclusion of the stocks of labor and capital.
The production function is linearly homogeneous, meaning that an equiproportional
increase in the two inputs will increase output in the same proportion,
and is increasing and concave in the two inputs, meaning that a marginal
increase in one input will increase output, but by less the larger that
input already is. That is, the technology is neoclassical. Then we can write
$Y^a = NY$ and $Y = Af(K)$, where Y and K are the per capita output and
stock of capital and f (.) is an increasing and concave function.
A pervasive production function with those properties is the Cobb-
Douglas function $Y^a = A(K^a)^{\alpha} N^{1-\alpha}$, where α ∈ (0, 1) represents the share
of capital in national income. Then $Y = AK^{\alpha}$.
The third assumption is that consumer behavior is represented by a Keynesian savings
function in which savings are a fixed proportion of total income, S = sY,
where S is per capita savings and s ∈ (0, 1) is the marginal propensity to
save. There are some other assumptions: (i) aggregate gross investment is
equal to aggregate net investment plus depreciation, $I^a = \dot K + \delta K$, where
$\dot K \equiv \frac{dK(t)}{dt}$ and δ > 0 is the depreciation rate; and (ii) population grows
exponentially, $\dot N = nN(t)$, where n > 0 is constant. The model is closed
with an equilibrium condition in the goods market, which is represented by
the equality between aggregate savings and investment, $S^a = I^a$.
Then, we get the celebrated [21] equation for the accumulation of the per
capita capital stock,
K̇(t) = sAf (K(t)) − (n + δ)K(t). (5)
As GDP per capita is monotonously related to the per capita capital stock
then, if we solve equation (5), we would get the behavior of Y as qualitatively
similar to K.
If the production function is neoclassical, the marginal productivity
of capital verifies the Inada conditions: $\lim_{K\to 0} f'(K) = +\infty$ and
$\lim_{K\to\infty} f'(K) = 0$. Then a unique steady state $\bar K > 0$ will exist and,
as $sAf'(\bar K) < n + \delta$, it will be asymptotically stable.
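As a purely numerical illustration (ours, not part of the original text), the following Python sketch iterates equation (5) for a Cobb-Douglas technology $f(K) = K^{\alpha}$ with assumed parameter values and shows convergence to the same steady state from initial stocks below and above it.

```python
import numpy as np

# Assumed illustrative parameters (Cobb-Douglas f(K) = K**alpha)
s, A, alpha, n, delta = 0.25, 1.0, 0.33, 0.01, 0.05

# Steady state of equation (5): s*A*K**alpha = (n + delta)*K
K_bar = (s * A / (n + delta)) ** (1.0 / (1.0 - alpha))

def solow_path(K0, T=200.0, dt=0.1):
    """Euler discretization of equation (5): Kdot = s*A*f(K) - (n + delta)*K."""
    K = K0
    for _ in range(int(T / dt)):
        K += dt * (s * A * K**alpha - (n + delta) * K)
    return K

print(f"steady state: {K_bar:.3f}")
for K0 in (0.2 * K_bar, 3.0 * K_bar):
    print(f"K(0) = {K0:6.3f}  ->  K(200) = {solow_path(K0):6.3f}")
```

Both trajectories approach the same steady state, which is the asymptotic stability property stated above.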
Some predictions of this theory, concerning economic growth, can be de-
rived:
First, the main factors of growth are savings applied in the accumulation
of physical capital, population growth and technology. That is, capital accu-
mulation is explained by the change in the parameters s, n and A. Second,
the mechanics of growth is the following: savings, which is non-consumed
income, finances investment expenditures, which are applied in accumulat-
ing physical capital; the expansion in capacity generates increases in pro-
duction, incomes and, therefore, further savings. This does not set in motion
an infinite increase in the capital stock because of decreasing marginal pro-
ductivity: increases in the capital stock, though increasing output, will do so at
decreasing rates, which leads to the stalling of the process when it is dom-
inated by the attrition generated by depreciation and the dilution effect of
population growth. Third, as

$$K(t) = k(t) \iff \gamma = \gamma_k = 0 \qquad (6)$$

then the long run rate of economic growth is zero 1 . The aggreg-
ate variables grow at the population growth rate. Fourth, the previous
factors affecting capital accumulation only have level effects. That is,
$\bar K = \bar K(s, A, n + \delta)$, with the first two effects being positive and the third negative.
An increase in the population growth rate increases the aggregate variables
but decreases the steady state GDP per capita.
At last, there is, in the wording of the new growth economics, β-
convergence. For the case of a Cobb-Douglas production function, let
$\beta = -(1-\alpha)(n+\delta)$. Then, as
$$\lim_{t\to\infty} \frac{\partial \ln \dot K}{\partial \ln K} = \lim_{t\to\infty} \beta(t) = \beta < 0, \qquad (7)$$

irrespective of the initial stock of capital, the capital stock will converge
to the steady state with a speed of convergence measured by β. Applying
this model to actual growth experiences, the following explanation pops up:
if countries are qualitatively similar then the differences in growth rates are
related to the distance of each country relative to their (common) steady
state GDP per capita. This behavior adheres, in part, to the transitional char-
acteristics of growth experiences, but misses the explanation of the long run
growth trend, as we will see next.
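A small worked example (ours; the parameter values are assumed for illustration) of the speed-of-convergence formula $\beta = -(1-\alpha)(n+\delta)$ and of the implied half-life of the gap to the steady state, $\ln 2/|\beta|$:

```python
import numpy as np

# Assumed illustrative parameters
alpha, n, delta = 0.33, 0.01, 0.05

beta = -(1.0 - alpha) * (n + delta)      # asymptotic speed of convergence, equation (7)
half_life = np.log(2.0) / abs(beta)      # years needed to close half of the gap

print(f"beta = {beta:.4f} per year")
print(f"half-life of the convergence gap: {half_life:.1f} years")
```

With these values the gap is halved only after roughly 17 years, which illustrates why convergence in this class of models is a slow process.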

Kaldor stylized facts. [11] was an important contribution in trying to
recast the economic growth research agenda to respond to actual growth data.
That paper pointed out the following stylized facts emerging from growth
data:
K1 GDP per capita, Y , grows across time, and its rate of growth does not
tend to zero,
K2 the per capita capital stock K is increasing across time;
K3 the rate of return on capital, r, is practically constant. This fact raises
some doubts, as r seems to display a decreasing trend, at least for de-
veloping countries;
K4 the capital-output ratio K/Y is roughly constant;
K5 the labor and capital shares in national income are nearly constant;
K6 per capita growth rates vary widely between countries.

Problems with the Solow–Swan model. Two major shortcomings of the
theory were highlighted: the first was related to its adherence to Kaldor’s
stylized facts and the second was theoretical and was related to the possible
existence of dynamic inefficiency.
The stylized fact with which the Solow–Swan model seems to be clearly at
odds is fact K1: the model predicts zero growth rates while most economies
experience, except for rare occasions, positive growth rates. One way to
overcome this shortcoming is to introduce technical progress in the model.
This carries the implicit assumption that discoveries and R&D are largely
unresponsive to economic incentives.
Assume that the production function has labor-augmenting technical pro-
gress, that is, $Y^a = F(K^a, AN)$, and keep the same properties as in
Solow’s model. Technical progress is identified with the growth of labor
productivity, so that $A(t) \approx e^{xt}$, where x > 0 is the rate of technical pro-
gress. Then, the per capita capital stock can be separated into two compon-
ents, a long run trend and a transitional component. The long run component
displays a positive rate of growth

$$K(t) = k(t)\,e^{xt}, \qquad (8)$$

where k(t) is the solution of the equation

k̇(t) = sAf (k(t)) − (n + δ + x)k(t). (9)

It can be concluded that, although the growth rate is asymptotically positive,
$\gamma = \gamma_k = x > 0$, it is purely exogenous. In addition, the basic conclusions
as regards convergence are the same as in the original Solow model.
The existence of dynamic inefficiency means that there is overaccumu-
lation of capital. In Solow’s model the net rate of return on capital in the
long run, $r(K) = Af'(K) - \delta$, could have any sign. If $r(K) < 0$ the eco-
nomy would over-accumulate capital. As the maximum amount of net output
would be produced for the golden rule capital stock, $K^g$ such that $r(K^g) = 0$, then
if $r(K) < 0$ we will have $K > K^g$ as a result of the concavity properties of
the production function. Therefore, the economy could consume more with
less capital. This is a consequence of the absence of the explicit considera-
tion, in this model, of the incentives for accumulating capital.

3.2 Optimal economic growth


[17] is another seminal paper on the modern theory of economic growth,
but was, apparently, too far ahead of its time to be incorporated into the main-
stream of growth modeling. The previous issues raised by the [21] model
and the availability of new results in the mathematical theory of optimal
control (see [16]) stimulated a reconsideration of that model that proved to
be foundational for modern economic theory.
That is what [9] and [12] and many others did afterwards. Their model
became the work horse of both growth theory and macroeconomics.2
The Ramsey–Cass–Koopmans (RCK) model has two basic components.
First, the assumptions regarding the structure of the economy and the tech-
nology are basically the same as in the Solow model. Second, instead of hy-
pothesizing an ”ad-hoc” consumption function, the consumption behavior is
derived by explicitly introducing its microeconomic foundations. The rep-
resentative agent determines the optimal consumption path by maximizing an in-
tertemporal utility function subject to a sequence of budget constraints (and
a terminal constraint, in the modern presentation of the theory).
The assumptions regarding preferences are the following: (i) the intertem-
poral utility function is additive and time-independent, with the exception of
a discount factor. That is, preferences are stationary, time-consistent and dis-
play impatience; (ii) the instantaneous utility function is increasing, concave
and displays no satiation 3 .
As there are no externalities, no imperfect markets and no distortions in-
troduced by government, the same formal apparatus represents two econom-
ies, which were proved to be equivalent: a centralized economy in which
there is a benevolent dictator or a decentralized equilibrium economy.
Using the same interpretation, the optimal pair of per capita consump-
tion and capital accumulation (C(t), K(t)) are derived by maximizing the
intertemporal utility function
$$\max_{C(t)} \int_0^\infty u(C(t))\,e^{-(\rho+n)t}\,dt \qquad (10)$$

where ρ > 0 is the rate of time preference, subject to the instantaneous
budget constraint, which is similar to equation (5),
K̇(t) = Af (K(t)) − C(t) − (n + δ)K(t). (11)
given the initial stock K(0) = K0 and assuming that K(t) is asymptotically
bounded.
The optimal consumption and capital stock (C ∗ (t), K ∗ (t)) should be ad-
missible, that is, they verify equation (11) and the initial condition, in addi-
tion to the Euler equation
$$\dot C = \frac{C}{\sigma(C)}\,\bigl(R(K) - (\rho + n + \delta)\bigr) \qquad (12)$$
and the transversality condition
$$\lim_{t\to\infty} u'(C(t))\,K(t)\,e^{-(\rho+n)t} = 0. \qquad (13)$$
In equation (12), $\sigma(C) \equiv -\frac{u''(C)\,C}{u'(C)}$ is the elasticity of substitution in con-
sumption and $R(K) = Af'(K)$ is the gross rate of return on capital. Equa-
tion (12) represents an arbitrage condition between benefits and costs of
changing consumption,
$$-\frac{du'(C)}{dt}\,\frac{1}{u'(C)} + \rho = R(K(t)) - (n + \delta). \qquad (14)$$
The benefits are measured by the rate of change in marginal utility plus the
rate of time preference and the costs are measured by the net rate of return
of capital.
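As a worked special case (our addition), assume the isoelastic utility function $u(C) = \frac{C^{1-\theta}-1}{1-\theta}$ that is introduced below for the AK model, so that $\sigma(C) = \theta$ is constant. The Euler equation (12) then reduces to
$$\frac{\dot C}{C} = \frac{1}{\theta}\bigl(Af'(K) - \delta - \rho - n\bigr),$$
so per capita consumption grows whenever the net return on capital, $Af'(K) - \delta$, exceeds the effective discount rate $\rho + n$.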
The model has some features similar to the Solow model. First, as in
the Solow model, it displays only transitional dynamics. However, now the
solution trajectory is saddle-point stable converging to a unique equilibrium
point.4
Along the optimal adjustment to the steady state stock of capital, K, we
have convergence as

$$K^*(t) - \bar K \approx (K(0) - \bar K)\,e^{\lambda t}, \qquad (15)$$

because
$$\lambda = \frac{\rho}{2} - \left[\left(\frac{\rho}{2}\right)^{2} - \frac{\bar C}{\sigma(\bar C)}\,A f''(\bar K)\right]^{\frac{1}{2}} < 0, \qquad (16)$$
which is a measure of the speed of convergence. Second, as the per capita
variables converge to a steady state, the dynamics that are present are purely
transitional. There is no long run growth: $\gamma = \gamma_k = 0$. Third, the theory
also offers only an explanation for the level dynamics, as affected by
preferences and technology parameters, $\bar K = g(\rho, \sigma, A)$. But, now, the
preference parameters have a richer economic intuition, as they are related
to impatience and to the elasticity of intertemporal substitution, which means
that, if the utility function is concave, consumers prefer smoother trajectories
of consumption.
However, the model is immune to one of the shortcomings of the Solow
model, as in the steady state we have $r(\bar K) = R(\bar K) - \delta = \rho + n > 0$,
which means that it is dynamically efficient. Agents will not choose a path
of capital accumulation leading to overaccumulation.
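The following Python sketch (ours; the Cobb-Douglas technology $f(K) = K^{\alpha}$, the constant $\sigma(C) = \theta$, and all parameter values are assumptions made only for illustration) computes the steady state implied by equation (12) and the stable eigenvalue of equation (16):

```python
import numpy as np

# Assumed illustrative parameters, not taken from the text
A, alpha = 1.0, 0.33          # technology f(K) = K**alpha
rho, n, delta = 0.04, 0.01, 0.05
theta = 2.0                   # sigma(C) = theta for isoelastic utility

# Steady state of (12): A*f'(K) = rho + n + delta
K_bar = (A * alpha / (rho + n + delta)) ** (1.0 / (1.0 - alpha))
C_bar = A * K_bar**alpha - (n + delta) * K_bar        # from the budget constraint (11)

# Stable eigenvalue, equation (16)
Af2 = A * alpha * (alpha - 1.0) * K_bar ** (alpha - 2.0)   # A*f''(K_bar) < 0
lam = rho / 2.0 - np.sqrt((rho / 2.0) ** 2 - (C_bar / theta) * Af2)

print(f"steady state capital      : {K_bar:.3f}")
print(f"steady state consumption  : {C_bar:.3f}")
print(f"stable eigenvalue lambda  : {lam:.4f} (speed of convergence {abs(lam):.4f})")
```

The negative eigenvalue confirms saddle-path stability: the economy converges to the steady state along the stable manifold.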

4. ENDOGENOUS GROWTH THEORY


Endogenous growth theories build upon the Ramsey–Cass–Koopmans
model but extend it in different ways, all allowing for the existence of a pos-
itive long run growth rate. We emphasize three main ingredients in this body
of research: explicit optimization, constant returns to scale at the aggregate
level and the existence of a produced good that can grow without bound.
First, all models involve an explicit consideration of the consumers’ inter-
temporal optimization problem, either in decentralized economies or central-
ized economies. This is common to the RCK model. Second, the aggregate
production functions, which could involve externalities or not, have constant
returns to scale, over the produced inputs, and the instantaneous utility func-
tion is homogeneous. These are necessary conditions for the existence of a
balanced growth path (BGP). In this light, the RCK model can be reinter-
preted as having a BGP with a zero growth rate, and it became clear that
this feature is a consequence of the presence of decreasing returns in the
produced output. Third, in some models technical progress modeling has
two basic features: innovation is an endogenous process and can be seen as
generating an increase in the quantities or the qualities of the inputs used in
production, and, as all the types of capital which are produced face sooner or
later some form of limitation, then unbounded growth can only be sustained
by growth in ideas. However, although ideas face no known boundary, their
production is only sustainable if some form of imperfect competition exists
in their production. The main argument is that, differently from capital in-
puts, technology is non-rival, implying that an inventor cannot prevent others
from using an idea at a cost lower than the cost of invention.

4.1 New stylized facts


Another characteristic of modern growth theory is that it pays close
attention to its empirical validity. An immense effort both in the areas of
growth accounting and on the empirics of growth led to a new body of styl-
ized facts (see a survey in [4]).
From a large list of new stylized facts, we stress the following:
N1 β-convergence is not observable when samples involving many coun-
tries are considered;
N2 however, β-convergence tends to hold in samples containing groups
of homogeneous countries, in terms of GDP per capita (e.g. EU coun-
tries). In this case the rate of convergence is only of the order of 2%,
that is, it is a slow process;
N3 the following variables tend to have a higher correlation with the rate
of GDP growth: (i) with a positive correlation, the initial human cap-
ital (measured in terms of variables for education and health), insti-
tutions (in particular the rule of law) and the weight of investment
over GDP, and (ii) with a negative correlation, the fertility rate and the
weight of government expenditure over GDP.
N4 scale effects seem to be absent, that is, there is no correlation between
dimension of a country and the rate of growth;
N5 there is a spatial concentration of the economic activity in small areas,
within most countries.
The last two facts, if considered together, seem to be at odds. They allow
for the conclusion that there are no economies of scale at the international level, but that
agglomeration could be the cause or the consequence of increasing returns
at the national level, thus generating economies of scale within economies.
Next we will present some of the reference models for the new growth
theory.

4.2 Benchmark models


The AK model (see [18]) is the simplest endogenous growth model. It
introduces constant returns to scale as a mechanism for generating long
run growth and can be seen as a benchmark for models including several
sectors and more complex mechanisms. Conventionally, the taxonomy of
new growth theory assembles these models into three categories: models
of growth driven by externalities, by human capital accumulation and by
Schumpeterian mechanisms. [19] was chronologically the first paper in the
new growth theory and introduces externalities as the main source of long
run growth. [23] and [14], present a theory on human capital accumulation
as an alternative way of generating unbounded growth. This model conveys
a plausible explanation for the lack of correlation between aggregate popula-
tion and GDP growth. Schumpeterian models (see a complete presentation in
[2]) specify in more detail the production of technology and its consequences
in growth.
We will present next a general view of the first two categories as they are
more related to our work.
The AK model. The AK model can be seen as a version of the RCK
model in which the technology is linear and the utility is a homogeneous
function. Formally, it generates optimal consumption and capital paths
(C(t), K(t)) as a solution of the centralized problem
$$\max_{C(t)} \int_0^\infty \frac{C(t)^{1-\theta} - 1}{1-\theta}\,e^{-(\rho+n)t}\,dt, \quad \rho > 0,\ \theta > 0 \qquad (17)$$
subject to
K̇(t) = AK(t) − C(t) − (n + δ)K(t). (18)
given the initial stock K(0) = K0 . This model has an explicit solution in
which optimal per capita consumption is proportional to the (optimal) per
capita stock of capital C ∗ (t) = βK ∗ (t) and

$$K^*(t) = K_0\,e^{\gamma t}. \qquad (19)$$

The rate of long run growth is determined endogenously as
$$\gamma = \frac{r - \rho}{\theta} > 0, \qquad r = A - (n + \delta) \qquad (20)$$
where the growth rate is positive if the transversality condition holds.
The optimal capital accumulation is characterized by the following prop-
erties: First, there is a balanced growth path in which there is unbounded
growth, differently from the RCK model. Second, the rate of long term
growth is positive as the average net productivity is larger than the
rate of time preference. Its level is lower if the utility is more concave,
that is, if the elasticity of intertemporal substitution is smaller. Therefore, both
productivity and preference parameters affect growth. At last, it displays
no transitional dynamics, as the damping mechanism of the RCK
model, the existence of decreasing marginal returns, is not present.
Summing up, the AK model adheres to some stylized facts (the positive
growth rates) but not to others (the slow convergence to the BGP both for
countries individually and between countries).
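A minimal numerical sketch of equations (19)–(20) (ours; the parameter values are assumptions chosen only for illustration):

```python
import numpy as np

# Assumed illustrative parameters
A, n, delta = 0.15, 0.01, 0.05    # linear technology Y = A*K, population growth, depreciation
rho, theta = 0.03, 2.0            # impatience and curvature of utility
K0 = 1.0

r = A - (n + delta)               # net return on capital, equation (20)
gamma = (r - rho) / theta         # endogenous long run growth rate
print(f"net return r = {r:.3f}, growth rate gamma = {gamma:.3f}")

# The optimal path (19) is a pure exponential: there is no transitional dynamics
for t in (0.0, 10.0, 25.0, 50.0):
    print(f"t = {t:5.1f}   K*(t) = {K0 * np.exp(gamma * t):7.3f}")
```

Changing either the productivity parameter A or the preference parameters (ρ, θ) changes γ permanently, which is the sense in which growth is endogenous in this model.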

Growth and externalities. The [19] model is an equilibrium model with
externalities, introducing a new interaction between technology and the insti-
tutional setting. The equivalence with a centralized economy does not hold,
as it does in the RCK and the AK models. In general, the equilibrium is not Pareto
optimal.
Externalities in production mean that the output of an individual firm is a
function of both the firm’s own capital and the aggregate capital. All else
constant, the productivity of an individual firm depends on the number
of firms in its neighborhood. We can think of several causes that produce
this effect: existence of a common pooling of knowledge, of processes of
production, of peer pressure, of infrastructures, etc.
Formally, the production function is $Y = f(K, \bar K)$, where K is the own
capital and $\bar K$ is the aggregate capital. It is assumed that there are decreas-
ing returns as regards the own capital but constant or increasing returns as
regards the aggregate capital (both own and aggregate).
tions regarding the structure of the economy and preferences are as in the
RCK model and the AK model. Again the representative agent is assumed
to perform jointly productive and consumption/saving activities.
The general equilibrium for this economy is represented by the functions
$(C(t), K(t), \bar K(t))$ such that: (1) the representative consumer solves his inter-
temporal problem, and (2) the micro-macro consistency condition, $\bar K(t) =
K(t)$, holds.
If we assume an isoelastic utility function and a Cobb-Douglas pro-
duction function with externalities, the representative agent determines
$(C^*(t), K^*(t))$ by maximizing the utility function
$$\max_{C(t)} \int_0^\infty \frac{C(t)^{1-\theta} - 1}{1-\theta}\,e^{-(\rho+n)t}\,dt, \quad \rho > 0,\ \theta > 0 \qquad (21)$$
subject to
$$\dot K(t) = Af(K, \bar K) - C(t) - (n + \delta)K(t) \qquad (22)$$
where aggregate capital, $\bar K$, is given.
If we assume that the production function has the form
$$f(K, \bar K) = K(t)^{\alpha} \bar K(t)^{\beta}, \qquad (23)$$
where 0 < α < 1 and β > 0, and that the condition for the existence of a
BGP, α + β = 1, holds, then we get the equilibrium BGP as
$$K(t) = \bar K(t) = k(t)\,e^{\gamma t}, \qquad \gamma = \frac{r - \rho}{\theta} > 0, \qquad (24)$$
where k(t) = k is constant.
The conclusion is that, even in the case in which we have decreasing re-
turns to private capital (as in the RCK model), the existence of external-
ities could generate unbounded growth (as in the AK model).
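As a worked illustration (our addition, under the Cobb-Douglas specification (23) with α + β = 1), the private return perceived by the representative agent is the marginal product with respect to own capital evaluated at the symmetric equilibrium $K = \bar K$:
$$R(K) = \alpha A K^{\alpha-1}\bar K^{\beta}\Big|_{\bar K = K} = \alpha A, \qquad r = \alpha A - (n+\delta),$$
so the equilibrium growth rate in (24) is $\gamma = (\alpha A - (n+\delta) - \rho)/\theta$. A planner who internalized the externality would instead perceive the full marginal product $A$ and would therefore choose a higher growth rate; this is the sense in which the equilibrium is not Pareto optimal.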
The model still has two shortcomings. First, in the simple version presen-
ted, it does not generate transitional dynamics; second, in some versions the long run rate
of return r depends on scale variables, which is counterfactual.

Growth and human capital accumulation. [23] presented an extension
of the RCK model by including human capital accumulation. [14], by ex-
tending it in several directions, in particular by introducing human capital
externalities, placed it as a cornerstone of the new growth theory.
Next we consider a version of the Uzawa-Lucas (UL) model in which
there are no externalities. The main new assumptions are the following:
first, the labor input is disaggregated into the contributions of population and
human capital, L = HN, where human capital, H, is a measure of efficiency
per worker which is assumed to depend on education; second, there are two
sectors, manufacturing and education; third, education is introduced as a new
productive sector, whose output consists of increases in human capital; fourth, both
sectors have constant returns to scale technologies. It is common to assume
that the only input used by the education sector is human capital. Therefore,
as human capital is used in both sectors, we have to determine the inter-
sectorial allocation.
As we considered the absence of externalities, we have again an equi-
valence between centralized and decentralized economies. The centralized
version determines the paths $(C(t), u(t), H(t), K(t))$ that solve the problem:
$$\max_{C(t),\,u(t)} \int_0^\infty \frac{C(t)^{1-\theta} - 1}{1-\theta}\,e^{-(\rho+n)t}\,dt, \quad \rho > 0,\ \theta > 0 \qquad (25)$$
where u(t) ∈ (0, 1) denotes the part of human capital which is used in man-
ufacturing, subject to
$$\dot K = Af(K(t), u(t)H(t)) - C(t) - (n + \delta)K(t)$$
$$\dot H = A_h(1 - u(t))H(t) - \delta_h H(t),$$
where $A_h$ is the average productivity of the education sector, $\delta_h$ is the depre-
ciation rate of human capital and f (.) is again linearly homogeneous.
The solution for the stocks of physical and human capital is of the form
$$K(t) = k(t)\,e^{\gamma t}, \quad H(t) = h(t)\,e^{\gamma t}, \quad \gamma = \frac{r(\varphi) - \rho}{\theta} > 0, \quad \lim_{t\to\infty} \frac{k(t)}{h(t)} = \frac{k}{h}(\varphi), \qquad (26)$$
where ϕ is a vector of parameters standing for the fundamentals related to
technology, productivity ($A$ and $A_h$) and preferences. Differently from the
AK and the Romer models, this model generates both endogenous growth
and transitional dynamics, because the allocation of human capital between
the two sectors only changes the ratio between physical and human capital
incrementally. If the initial ratio between those stocks is not consistent with
the BGP, they will adjust towards it gradually.
We learned the following: in a model with a technology similar to the
RCK model, if we introduce a produced input with a linear technology we
can generate both transitional and long run growth dynamics. In addition,
this behavior seems to be consistent with stylized fact N3. However, the
transitional dynamics can only be monotone. This means that, if two coun-
tries start with different initial stocks of physical and human capital but have
the same fundamentals, ϕ, then their relative ranking will not change along
the transition to the BGP: the initially poorest country will be poorer all the
time, but will always have higher rates of growth.

5. OUR RESEARCH
Of course, those models have been extended in several directions. How-
ever, we think that some issues, particularly those related with facts K6 and
N2 and the consistency between facts N2, N4 and N5, pose some problems
which are still unsolved.
First, the divergence between growth experiences could be related to the
non-uniqueness or the indeterminacy of the BGP. The BGP is non-unique if,
given certain values for the initial conditions and the structural parameters
of the economy, the long run growth rate could be not only quantitatively but
also qualitatively different across countries. The BGP is indeterminate if, for
given initial conditions for the state variables (usually physical and human
capital), any expectations for the future path of consumption and investment
can be sustained, implying that the BGP is asymptotically stable and not
saddle path stable.
The second issue is related with the growing evidence that growth pro-
cesses may be non-monotonic. This can be seen from two different per-
spectives: first, the relative rankings of countries with similar structural properties
seem to change from time to time; and second, the rate of growth seems to be
a mean-reverting process. We ask the question: under which conditions can
growth be non-monotonic?
The third issue is related with the interaction of growth in time and space.
In addition to the apparent contradiction between the stylized facts of con-
vergence between countries and of agglomeration within countries, there is
some uneasiness in the use of open economy growth models for explaining
the interaction between agglomeration and integration in space.5
Next we summarize some of our attempts to tackle these issues.

5.1 Multiple balanced growth paths and indeterminacy


In [8] we model a two-sector economy, by generalizing both Romer and
UL models through the incorporation of externalities in the two sectors, such
that a BGP exists.
While those classes of models incorporate externalities only in the manu-
facturing sector – the [19] class of models considers capital externalities and the [14]
class considers human capital externalities – we assume that there
are also externalities in education. This seems an obvious assumption when
we look at the functioning of the actual schooling system. We assume that
externalities depend on the factors’ intensity at the aggregate level. At last,
we assume that the conditions for existence of a BGP hold, that is, the
two sectorial production functions display constant returns to scale in both
private and aggregate inputs. Population is assumed to be constant and nor-
malized to one.
Denoting the physical and human capital inputs in the manufacturing and
education sectors by $K_i$ and $H_i$, $i = 1, 2$, respectively, we consider the
equilibrium model in which the representative agent solves the problem
$$\max_{C,\,K_1,\,K_2} \int_0^\infty \frac{C(t)^{1-\theta} - 1}{1-\theta}\,e^{-\rho t}\,dt \qquad (27)$$

subject to
$$\dot K = A_1 K_1^{\beta_{11}} H_1^{\beta_{21}} (\bar K/\bar H)^{b_1} - C$$
$$\dot H = A_2 K_2^{\beta_{12}} H_2^{\beta_{22}} (\bar K/\bar H)^{b_2},$$
$$K = K_1 + K_2,$$
$$H = H_1 + H_2,$$

where $B = [\beta_{ij}]$, $i, j = 1, 2$, is a stochastic matrix 6 and $b = [b_j]$. The
representative consumer takes the aggregate variables as given, but the con-
sistency condition implies that

$$\bar K(t) = K(t), \qquad \bar H(t) = H(t). \qquad (28)$$

The main results are the following. First, the long run growth rate is, again,
given formally by $\gamma = \frac{r - \rho}{\theta}$, where the long run rate of return on capital
belongs to the set
$$\mathbf{r} = \{r : r^{\mu(\beta,b,\theta,\rho)} = \alpha_0(\beta,\theta)\,r + \rho\,\alpha_1(\beta)\} \qquad (29)$$

which can have zero, one or two elements. The existence of two BGP’s
means that r has two elements, for a wide choice of factor intensity para-
meters (β, b). Local or global indeterminacy could exist. In this case local
indeterminacy means that the linearized system of first order
conditions does not have an eigenvalue with positive real part. A sufficient
condition for local indeterminacy is that the production of the manufacturing
good should be more intensive in human capital than in physical capital (which
is a realistic assumption given the shares of the two factors of production in
national income in actual economies). At last, we conclude that for coun-
tries which are heterogeneous regarding factor intensities in production
and impatience, the rates of growth may be permanently differ-
ent.
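A numerical sketch of how the fixed-point equation (29) can have zero, one or two solutions; the functions μ, α0 and α1 in [8] depend on (β, b, θ, ρ), so the scalar values used below are purely illustrative assumptions and not the parameterization of that paper.

```python
import numpy as np

rho = 0.03

def bgp_returns(mu, a0, a1, r_max=2.0, n_grid=200001):
    """Approximate the solutions of r**mu = a0*r + rho*a1 on (0, r_max] via sign changes."""
    r = np.linspace(1e-6, r_max, n_grid)
    gap = r**mu - (a0 * r + rho * a1)
    change = gap[:-1] * gap[1:] < 0.0
    return 0.5 * (r[:-1][change] + r[1:][change])      # midpoints of bracketing intervals

# Illustrative (mu, alpha0, alpha1) triples -- assumed values, not those of [8]
for mu, a0, a1 in [(0.5, 0.2, 1.0), (0.5, 0.8, 0.1), (2.0, 0.1, -1.0)]:
    roots = bgp_returns(mu, a0, a1)
    print(f"mu={mu}, alpha0={a0}, alpha1={a1}: {len(roots)} BGP candidate(s)",
          [f"{x:.3g}" for x in roots])
```

Depending on the assumed values, the equation yields one, two or no admissible rates of return, which is the multiplicity of balanced growth paths discussed above.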
5.2 Non-monotonous convergence


Some growth experiences indicate that countries may not seem to con-
verge in the short run, but would converge in the long run. This property
does not hold when there is multiplicity, as in the model presented in the last
section. Another avenue worth exploring is the possibility of generating non-
monotonous transitions. This means that an apparent divergence
may be only transitory.
In order to get this behavior in models following [14] we need to introduce
a third sector, as these models only generate monotonous convergence. We
have done this in two papers, in which [6] explores the long run properties
and [7] deals with the dynamics. They are applied to an economy with a
housing sector, in addition to manufacturing and education, but the argument
is more general.
We consider an economy with three sectors, in which the production func-
tion for each sector uses inputs produced in all sectors and the goods are im-
perfect substitutes. This implies that the adjustment of their relative prices
is not instantaneous. We assume that there is a manufacturing sector whose
product is used in final consumption and the other two sectors produce in-
vestment goods. At last, the necessary conditions for the existence of a BGP
are also posited. In particular, we assume that all the production functions
display constant returns to scale and that there are no externalities.
Then, we have a centralized economy represented by the following control
problem
$$\max_{C,\,K_{11},\ldots,K_{33}} \int_0^\infty \frac{C(t)^{1-\theta} - 1}{1-\theta}\,e^{-\rho t}\,dt \qquad (30)$$

where $K_{ij}$ is the input produced in sector j = 1, 2, 3 which is used in
production by sector i = 1, 2, 3, subject to the constraints
$$\dot K_1 = A_1 \prod_{j=1}^{3} K_{1j}^{\beta_{1j}} - C$$
$$\dot K_2 = A_2 \prod_{j=1}^{3} K_{2j}^{\beta_{2j}}$$
$$\dot K_3 = A_3 \prod_{j=1}^{3} K_{3j}^{\beta_{3j}}$$
$$K_i = \sum_{j=1}^{3} K_{ij}$$
where $B = [\beta_{ij}]$ is a stochastic matrix for the technology parameters and
$A = [A_i]$.
We obtained the following results. First, the long run growth rate depends on
the factor shares and the productivity parameters through the rate of return,
which should be equal across sectors,
$$\gamma = \frac{r(A, B) - \rho}{\theta} \qquad (31)$$

and is positive when the transversality condition holds. Second, there is
transitional dynamics and the stable manifold is, for most choices of
parameters, of dimension two. Third, the eigenvalues depend only on the
matrix of factor shares, B, which means that the structure of factor intensities
plays a major role in the dynamic short run adjustment of the economy. At
last, a necessary condition for the existence of non-monotone transition
dynamics is that det(B) > 0. This means that non-monotonicity occurs in
the case in which each sector tends to use its own product more intensively as a
factor of production. This condition is consistent with a generalized version
of the Stolper–Samuelson theorem.
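A small numerical illustration of the det(B) > 0 condition (the share matrices below are made up for the example and are not taken from [6, 7]): a row-stochastic matrix in which each sector uses its own product most intensively has a positive determinant, while one dominated by cross-use of the other sectors' goods can have a negative determinant.

```python
import numpy as np

# Row-stochastic factor-share matrices (rows sum to one); illustrative only
B_own = np.array([[0.6, 0.2, 0.2],     # each sector uses its own product most intensively
                  [0.2, 0.6, 0.2],
                  [0.2, 0.2, 0.6]])
B_cross = np.array([[0.2, 0.8, 0.0],   # production dominated by the other sectors' goods
                    [0.8, 0.1, 0.1],
                    [0.3, 0.3, 0.4]])

for name, B in [("own-intensive", B_own), ("cross-intensive", B_cross)]:
    print(f"{name}: row sums {B.sum(axis=1)}, det(B) = {np.linalg.det(B):+.3f}")
```

Only the first matrix satisfies the necessary condition det(B) > 0 for non-monotone transition dynamics.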

5.3 Optimal growth and distribution


In [5] we explore partial differential equations (PDE) as a tool for ad-
dressing growth in time and space. The use of PDEs is pervasive in other
fields of science, but is oddly absent from economics (with the exception of
asset pricing). The motivation for this line of research is threefold. First,
the open economy models are problematic when we try to extend them to
endogenous growth contexts. The existence of a BGP is not robust because
the arbitrage condition between the (endogenous) domestic and the (exogen-
ous) world rates of return on capital, $r(k) = r^*$, has to hold instantaneously.
Second, they do not capture stylized fact N5: that is, the simultaneous ex-
istence of β-convergence between countries and agglomeration at a national
scale. And finally, there is the aforementioned dissatisfaction with the way the integ-
ration between growth and spatial dynamics has been performed.
Our model has the following features. First, there is an infinity of house-
holds distributed across space, where c = c(t, x) and k = k(t, x) de-
note the densities of consumption and capital at moment t and location x.
Identical preferences and technology, with neoclassical properties, are as-
sumed throughout space. Second, labor is immobile but capital moves across
space. Capital flows against the spatial gradient of the capital density: re-
gions which are poor in capital, have an higher marginal productivity of
capital and therefore attract capital. Formally, we get the subject an instant-
214 P.B. Brito

aneous budget constraint

∂k ∂ 2k
= 2 + Af (k(t, x)) − c(t, x) (32)
∂t ∂x
At last, there is a Bergson-Samuelson central planner who determines the optimal distribution in time and space of consumption in order to maximize the average intertemporal utility functional

$$\frac{1}{2x} \int_{-x}^{x} \int_0^{\infty} u(c(t,y))\, e^{-\rho t}\, dt\, dy. \qquad (33)$$

The planner does so without any spatial weighting mechanism, because that would violate some conditions for the existence of an aggregate utility functional.
Our model is an optimal control problem for a system of PDEs, in which the central planner determines the optimal path of consumption densities c*(t, x) by maximizing (33) subject to (32) plus terminal and boundary conditions. The first order conditions are represented by the static condition u'(c(t, x)) = q(t, x), where q(t, x) is the co-state variable, and by the system of parabolic PDEs

$$\frac{\partial q}{\partial t} = -\frac{\partial^2 q}{\partial x^2} + q(t,x)\left(A f'(k(t,x)) - \rho\right), \qquad (34)$$

$$\frac{\partial k}{\partial t} = \frac{\partial^2 k}{\partial x^2} + A f(k(t,x)) - c(t,x) \qquad (35)$$

plus the boundary conditions for time and space7.
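To make the role of the diffusion term concrete, the following sketch simulates only the state equation (32) with an explicit finite-difference scheme; the Cobb-Douglas technology f(k) = k^alpha, the fixed consumption share and the periodic boundary are assumptions made for illustration, and the consumption rule is ad hoc rather than the optimal policy obtained from (34)-(35).

```python
import numpy as np

# Explicit finite differences for dk/dt = k_xx + A f(k) - c (equation (32)),
# with f(k) = k**alpha and an ad hoc consumption rule c = s * A * f(k).
A, alpha, s = 1.0, 0.3, 0.75
nx, dx, dt, T = 100, 0.1, 0.002, 20.0          # dt < dx**2 / 2 for stability

x = np.linspace(-nx * dx / 2, nx * dx / 2, nx)
k = 1.0 + 0.5 * np.sin(2 * np.pi * x / (nx * dx))   # spatially heterogeneous k(0, x)

for _ in range(int(T / dt)):
    k_xx = (np.roll(k, -1) - 2 * k + np.roll(k, 1)) / dx**2   # periodic boundary
    c = s * A * k**alpha
    k = k + dt * (k_xx + A * k**alpha - c)

# Diffusion of capital toward capital-poor locations smooths the initial profile.
print(f"spatial dispersion of k after T={T}: max - min = {k.max() - k.min():.4f}")
```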
We prove that, if both the production function and the utility function are strongly concave then, even in the case in which the initial distribution of capital is spatially heterogeneous, the optimal distributions of both capital and consumption converge asymptotically to a spatially homogeneous distribution. Also, and this result is full of implications, if both functions are mildly concave (in fact close to linear) then we potentially have a spatial instability mechanism which resembles the Turing bifurcation. That is, the spatial contact could set into motion a self-generating spatial pattern formation mechanism, which has only a transient character. This seems to offer both a modeling strategy and a formal metaphor for the convergence-agglomeration paradox.
NOTES
1. We use our previous representation of X for level per capita variables and x for detrended per
capita variables.
2. For the state of the art research after the first wave see [3].
3. That is, $u'(c) > 0$ and $u''(c) < 0$, and $u$ is an Inada function: $\lim_{c\to 0} u'(c) = \infty$ and $\lim_{c\to\infty} u'(c) = 0$.
4. The transversality condition, equation (13), plays an important role in the determination of the
solution. It introduces a terminal condition, which together with the initial condition for the stock of
capital allows for the determination of an initial optimal level of consumption C ∗ (0), usually only ana-
lytically. Therefore, the optimal trajectory, for any t ≥ 0, can be determined uniquely. Its nature as a
sufficient condition, not as a necessary condition, was understood much later (see [15]).
5. This is the object of the new economic geography, see [13, 10].
6. It is a matrix where βij ∈ [0, 1] and the lines or columns sum to one.
7. Technically, this system looks like an ill-posed problem. However, this is not the case, because
the forward PDE for capital is matched by an initial condition for capital and the backward PDE for the
co-state variable is matched by a transversality condition.

REFERENCES
[1] Ph. Aghion and S. N. Durlauf, editors. Handbook of Economic Growth, volume 1A & 1B. Elsevier, 2005.
[2] Ph. Aghion and P. Howitt. Endogenous Growth Theory. MIT Press, 1998.
[3] K. Arrow and M. Kurz. Optimal Investment and Growth. MIT Press, 1971.
[4] R. J. Barro and X. Sala-i-Martin. Economic Growth. MIT Press, 2nd edition, 2004.
[5] P. Brito. The dynamics of growth and distribution in a spatially heterogen-
eous world. Working Papers of the Department of Economics, ISEG-UTL
http://ideas.repec.org/p/ise/isegwp/wp142004.html, November 2004.
[6] P. Brito and A. M. Pereira. Housing and Endogenous Long-Term Growth. Journal of
Urban Economics, 51:246–71, 2002.
[7] P. Brito and A. M. Pereira. Convergent cyclical behavior in endogenous growth with
housing. Unpublished, 2003.
[8] P. Brito and A. Venditti. Local and global indeterminacy in two-sector endogenous
growth models. Unpublished, 2002.
[9] D. Cass. Optimum growth in an aggregative model of capital accumulation. Review of
Economic Studies, 32:233–40, 1965.
[10] M. Fujita and J. Thisse. Economics of Agglomeration. Cambridge U. Press, 2002.
[11] N. Kaldor. Capital accumulation and economic growth. In Friedrich A. Lutz and
Douglas C. Hague, editors, Proceedings in a Conference Held by the International
Economics Association. Macmillan, 1963.
[12] T. Koopmans. On the concept of optimal economic growth. In The Econometric
Approach to Development Planning. Pontificiae Acad. Sci., North-Holland, 1965.
[13] P. Krugman. Space: the last frontier. Journal of Economic Perspectives, 12(2):161–
174, Spring 1998.
[14] R. E. Lucas. On the mechanics of economic development. Journal of Monetary Eco-
nomics, 22(1):3–42, 1988.
[15] Ph. Michel. Some Clarifications on the Transversality Condition. Econometrica, 58:
705–23, 1990.
[16] L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze, and E. F. Mishchenko. The
Mathematical Theory of Optimal Processes. Interscience Publishers, 1962.
[17] F. P. Ramsey. A mathematical theory of saving. Economic Journal, 38(Dec):543–59, 1928.
[18] S. Rebelo. Long run policy analysis and long run growth. Journal of Political Eco-
nomy, 99(3):500–21, 1991.
[19] P. Romer. Increasing Returns and Long-Run Growth. Journal of Political Economy,
94(5):1002–37, October 1986.
[20] P. Romer. Endogenous Technological Change. Journal of Political Economy, 98(5):
S71–S102, October 1990.
[21] R. Solow. A contribution to the theory of economic growth. Quarterly Journal of
Economics, 70(1):65–94, 1956.
[22] T. Swan. Economic Growth and Capital Accumulation. Economic Record, 32:334–61,
1956.
[23] H. Uzawa. Optimal Growth in a Two-Sector Model of Capital Accumulation. Review
of Economic Studies, 31:1–24, January 1964.
PART IV

LIFE SCIENCES AND BIOTECHNOLOGY
DNA VACCINES
Construction, Production and in vivo Testing

Duarte Miguel F. Prazeres and Gabriel Amaro Monteiro

Centro de Engenharia Biológica e Química (CEBQ), Instituto Superior Técnico, Universidade Técnica de Lisboa, Av Rovisco Pais, 1049-001 Lisbon, Portugal; e-mail: miguelprazeres@ist.utl.pt
Abstract: This chapter describes the R&D activities which have been carried out over the last 9 years at the CEBQ, with the specific objective of tackling some of the new scientific and technological challenges associated with the development of DNA vaccines. Following a brief introduction to the DNA vaccine topic, the research under way is described and some significant results are highlighted.

Key words: DNA vaccines, plasmid DNA, vaccinology, production, purification.
1. INTRODUCTION

The use of vaccines constitutes one of the major successes of modern Medicine. Innumerable vaccines have been developed and introduced in the wake of Edward Jenner's pioneering work on smallpox immunization in 1790 [1] (Figure 1). These efforts, led by renowned scientists like Louis Pasteur [2], Robert Koch [3] or Jonas Salk [4], have reduced the mortality of several infectious diseases and contributed to the increase in life expectancy. No other medical procedure has enabled the complete eradication of a disease - the last case of smallpox was detected in 1977 - or is capable of competing with vaccination in terms of cost/benefit: a risk analysis study has attributed a median cost per life year saved of $0 to pediatric vaccines and vaccine strategies [5, 6]. Currently, twenty-six infectious diseases can be prevented by vaccination [6]. Although immunization is currently estimated to save the lives of 3 million children every year [6], new (AIDS, avian flu)

and re-emerging infectious diseases (tuberculosis) continue to threaten public health, thus pushing scientists towards new vaccination strategies [7].

Figure 1. Rate of introduction of commonly used vaccines (based on André [2]).
1.1 Vaccines

Disease-causing organisms induce an immune response in infected hosts via proteins called antigens. Depending on its strength and promptness, this response may or may not be sufficient to cure the disease or reduce its symptoms. The host response is mediated both by antibodies (humoral response) and immune cells (cellular response). Memory cells are also produced which remain in the blood stream, ready to mount a quick protective immune response against subsequent infections with the particular disease-causing agent which induced their production [8]. The goal of vaccination is to prepare and train the host immune system to fight against specific organisms. Generically speaking, vaccines mimic an infection and thus improve the memory, strength and promptness of the immune system. According to today's definition, a vaccine is a “suspension of live (usually attenuated) or inactivated microorganisms (e.g., bacteria or viruses) or fractions thereof administered to induce immunity and prevent infectious
disease or its sequelae” [9]. The production of vaccines thus relies on the isolation of antigens capable of eliciting the immune response, but without causing illness. As described in the above definition, these antigens can be present in whole organisms which are attenuated or inactivated to reduce virulence (Table 1). Inactivated vaccines may use whole infectious organisms or parts thereof, such as flagella (acellular vaccines), or specific molecules such as proteins or polysaccharides (subunit vaccines, toxoids), which are antigenic but unable to cause disease. The manufacture of these vaccines is usually a complex, multi-step, costly process which requires dedicated and certified facilities. In the majority of cases the vaccines require storage at low temperatures (4 ºC).

Table 1. Vaccine families.

Vaccine type                        Example (disease causing microorganism)
Attenuated - infectious organism    Measles (Measles virus); Mumps (Mumps virus)
Attenuated - related organism       BCG (Mycobacterium tuberculosis)
Inactivated - whole                 Salk poliomyelitis (Polio virus); Cholera (Vibrio cholerae)
Inactivated - acellular             Flu (Haemophilus influenzae B); Pertussis (Bordetella pertussis)
Inactivated - subunit               Hepatitis B (Hepatitis B virus); Meningitis (Neisseria meningitidis)
Inactivated - toxoid                Diphtheria (Corynebacterium diphtheriae); Tetanus (Clostridium tetani)
DNA*                                Malaria (Plasmodium falciparum) [11]; AIDS (HIV) [12]; Tuberculosis [13]
* under development

1.2 DNA vaccines

Traditional vaccines rely on the administration into hosts of antigens produced ex vivo (e.g. proteins, polysaccharides). However, the finding in 1993 that mice injected with a plasmid encoding a viral antigen could develop both humoral and cellular responses [10] paved the way for the development of a new type of vaccines [5, 7]. These rely on: i) the administration of the genes which encode the antigens of interest into the host, ii) the expression of the antigens in vivo by the host cells and iii) the triggering of the immune response (Figure 2). At a meeting convened by the World Health Organization in May 1994, the name DNA vaccine was selected among others (genetic immunization, gene vaccines, polynucleotide vaccines) to designate this new vaccination technology [7].
Figure 2. DNA vaccines and their mode of action: i) administration of the DNA vaccine, ii) in vivo antigen expression, iii) immune response (antibodies and immune cells).
Over the past years, the induction by DNA vaccines of both humoral and cellular immune responses against numerous experimental infectious and non-infectious diseases has been demonstrated in several mammalian models (Table 1) [11-13]. Advantages which have been ascribed to this new generation of vaccines include [7]:

i) no risk of infection
ii) responses raised may be long-lived
iii) multiple antigenic sequences can be rapidly screened
iv) the use of combination vaccines is facilitated
v) potential for rapid and generic production and manufacture
vi) good stability at high and low temperatures and
vii) low cost.

The development of a DNA vaccine is a complex, multidisciplinary


process, which requires basic and applied research and presents many R&D
opportunities to scientists and engineers.

2. RESEARCH AT CEBQ

The “Centro de Engenharia Biológica e Química (CEBQ)” is a research centre based at Instituto Superior Técnico, the Engineering school of the Technical University of Lisbon. In 1997, an R&D team (Nucleic Acids Bioengineering) was formed within the Bioengineering Research Group (BERG) of this Centre, with the specific objective of tackling some of the new scientific and technological challenges associated with the development of DNA vaccines (Figure 3).

Figure 3. The Nucleic Acids Bioengineering team shown within the frame of the Centro de Engenharia Biológica e Química (CEBQ).

2.1 Mission statement

The mission of the lab is to perform high quality research and to train young researchers (undergraduates, M.Sc. and Ph.D. students, post-docs) in the area of DNA vaccine development. Scientific rigor and teamwork are key values which are promoted within the team. An integrated approach is pursued in which the different aspects of the DNA vaccine development cycle (Figure 4) are examined and covered, at least to some extent. Particular attention is given to the design and construction of DNA vaccine prototypes and to the development of manufacturing processes. This provides not only a global vision of the subject, but also creates synergies among the team members. Key goals are to publish the work in high-impact scientific journals and to generate Ph.D. and M.Sc. theses.
Figure 4. The development cycle of DNA vaccines at the Bioengineering Research Group (disease, gene, vaccine design, cell culture, purification, formulation, administration to the patient/host).

2.2 Research topics

The research topics presently addressed by the Nucleic Acids Bioengineering team are schematized in Figure 5. DNA vaccine prototypes are being developed and tested in mice models against two model diseases, sheep Maedi-Visna and African trypanosomiasis. This work is supported both by more fundamental biomolecular engineering studies, which focus on stability and cell trafficking issues, and by bioprocess engineering directed towards the development of manufacturing processes and quality control analysis [14-46].

2.3 Team members, collaborations and funding

The team includes faculty members from the Chemical and Biological Engineering Department, Ph.D. and M.Sc. students and graduate and postgraduate researchers. The background of the team members (around 20) is varied, ranging from chemical and biological engineering to biochemistry and biology. This diversity is well suited to the multidisciplinary nature of the research activities undertaken.
Figure 5. Research topics currently addressed by the Nucleic Acids Bioengineering team: structure (design and construction, stability, trafficking, efficiency), immunization (adjuvants and delivery, challenging) and manufacturing (production, purification, QC/monitoring) of DNA vaccines and pDNA vectors.

The research topics which do not constitute traditional core activities of the team are carried out in close collaboration with renowned Portuguese institutes and laboratories. Such is the case of the development of DNA vaccine prototypes against sheep Maedi-Visna (collaboration with Dr. Miguel Fevereiro, Laboratório Nacional de Investigação Veterinária) and African trypanosomiasis (Prof. Jorge Atouguia, Instituto de Higiene e Medicina Tropical; Prof. Jeane Rodgers, Glasgow University). Other collaborations are in place with Universidade da Beira Interior (Prof. João Queiroz), Universidade do Minho (Prof. João Marcos) and Instituto Gulbenkian de Ciência (Prof. Álvaro Tavares). Funding is essentially secured by the Portuguese Ministry of Science and Education, either via annual funding to CEBQ or via specific research projects (on average 60 k€/year). Ph.D. students and post-docs typically hold grants from the same Ministry.
3. RESULTS

3.1 Overview and impact

The following sub-sections exemplify the results that have been obtained in the context of the different subtopics of research on DNA vaccines.
3.2 Improvement of DNA vaccine structural stability

Nuclease degradation of DNA vaccines after delivery and during trafficking to the nucleus constitutes a barrier to gene expression and consequently to the elicitation of immune responses. This barrier may be circumvented by shielding the DNA vaccines from the nuclease-rich cell environment with adjuvants or by using nuclease inhibitors. A different alternative that has been explored by the team is to make DNA vaccines more nuclease-resistant a priori [39]. The replacement of labile sequences in DNA vaccine vectors has been shown to improve their resistance towards attack by endosomal/lysosomal, cytoplasmic and blood plasma nucleases. The storage stability at 4 ºC of modified DNA vaccine vectors was also significantly improved by the replacement of the labile sequences (Figure 6).
Figure 6. Stability of DNA vaccine vectors (original: pVAX1/lacZ; modified: pVAX1-SV40 and pVAX1-Synt) stored at 4 ºC. Left) control material at the time of storage; Right) DNA vaccine vectors stored for 5 months. The supercoiled (SC) isoform of the modified DNA vaccine vectors is more stable when compared with the equivalent isoform in the original vector, which readily denatures into the open circular (OC) and linear (L) forms [39].
3.3 Development of processes for DNA vaccine manufacturing

A patented process based on hydrophobic interaction chromatography (HIC) has been developed and used for the production and purification of plasmid DNA vectors (Figure 7). The process is robust, reproducible and amenable to scale-up, and it has been extensively used in our laboratory to produce milligram quantities of plasmid vectors for gene therapy [24, 38] and DNA vaccines, including a prototype for immunization against rabies [25, 28]. DNA vaccines prepared by this process conform to FDA and EMEA specifications in terms of impurities (endotoxins, genomic DNA, RNA, proteins). This process has also been scaled down and transformed into a kit version which is extensively used in our laboratory for the isolation of small amounts of DNA vaccines and plasmid DNA vectors [43].

Figure 7. Process flow-sheet for the large scale production of DNA vaccines: cell culture → cell recovery → lysis → KAc pp → IsopOH pp → (NH4)2SO4 pp → HIC → dialysis → DNA vaccine (KAc - potassium acetate, IsopOH - isopropanol, HIC - hydrophobic interaction chromatography, pp - precipitation) [24-25, 28, 38].

Growth of the Escherichia coli cells that host the DNA vaccine (the target plasmid) is carried out in a fermenter with a suitable medium. After cell recovery by centrifugation, the cells are lysed by alkaline lysis. A precipitation step with isopropanol is included mainly to concentrate the DNA vaccine (purification is not significant at this step). The subsequent step is a precipitation with ammonium sulphate that significantly reduces the protein and endotoxin content, and also acts as a conditioning step for the subsequent HIC step. HIC is carried out with a suitable support derivatised with hydrophobic ligands. A typical chromatogram shows a first sharp peak of DNA vaccine followed by a broader peak of weakly retained contaminants: RNA, genomic DNA and proteins (Figure 8).
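As a simple illustration of how the flow-sheet in Figure 7 can be reasoned about quantitatively, the sketch below lists the unit operations in order and compounds per-step recoveries into an overall process yield; the recovery values are hypothetical placeholders, not measured yields of the patented process.

```python
# Unit operations of the flow-sheet in Figure 7 with hypothetical recoveries.
steps = [
    ("cell recovery (centrifugation)", 0.98),
    ("alkaline lysis + KAc precipitation", 0.90),
    ("isopropanol precipitation (concentration)", 0.92),
    ("(NH4)2SO4 precipitation (protein/endotoxin removal)", 0.85),
    ("hydrophobic interaction chromatography (HIC)", 0.80),
    ("dialysis", 0.95),
]

overall = 1.0
for name, recovery in steps:
    overall *= recovery
    print(f"{name:<52s} step {recovery:5.0%}   cumulative {overall:5.0%}")
```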
Figure 8. Purification of DNA vaccines by hydrophobic interaction chromatography (relative absorbance at 280 nm versus time, in minutes; the first sharp peak corresponds to the DNA vaccine and the later, broader peak to weakly retained impurities).

The quality of the final DNA vaccines is assessed by a range of analytical techniques, some of which were developed within the group: HPLC analysis [36], agarose gel electrophoresis, restriction analysis, transformation and transfection experiments, Southern slot blotting, quantitative real-time PCR [37], LAL and protein assays.

3.4 In vivo testing of a candidate DNA vaccine

A series of DNA vaccine prototypes against African trypanosomiasis has been constructed by cloning antigenic candidate gene sequences from Trypanosoma brucei into a DNA vaccine vector (see Section 3.2). Following production of the DNA vaccine (Section 3.3), CD1 and Balb/c mice were immunized, and humoral and cellular immune responses were determined by ELISA and cytokine assays. Challenge assays were then performed by experimentally infecting mice with Trypanosoma brucei parasites. Preliminary results show that a certain degree of protection is obtained by DNA vaccine immunization (Figure 9, [unpublished results]).
Figure 9. Challenge of CD1 mice immunized with a DNA vaccine prototype against Trypanosoma brucei (% survival versus time post infection, in days). The results show that a certain degree of protection is conferred to immunized animals (closed symbols) when compared with animals which were not immunized (open symbols) [unpublished results].

4. FUTURE PROSPECTS

R&D and training in the area of DNA vaccine development will continue over the next years. The work on specific DNA vaccine prototypes is expected to generate significant results (publications, theses). The Structure and Manufacturing topics will continue to be developed, providing support to the development of the DNA vaccine prototypes.

ACKNOWLEDGEMENTS

The authors would like to acknowledge all those who have directly
contributed to the success of the research described in this overview. Special
thanks are due to Professor Joaquim Cabral, the head of the Bioengineering
group, for his support and to the Portuguese Ministry of Science and
Technology for funding.
REFERENCES
1. Mullin D. "Prometheus in Gloucestershire: Edward Jenner: 1749-1823" Journal of
Allergy and Clinical Immunology, 112, pp. 810-814, 2003.
2. Bordenave G. "Louis Pasteur (1822-1895)", Microbes and Infection, 5, pp. 553-560,
2003.
3. Gradmann C. "Robert Koch and the White Death: From Tuberculosis to Tuberculin ",
Microbes and Infection, 8, pp. 294-301, 2006.
4. Pearce JMS "Salk and Sabin: Poliomyelitis Immunisation", Journal of Neurology, Neurosurgery and Psychiatry, 75, p. 1552, 2004.
5. Liu MA. "Overview of DNA Vaccines", Annals of the New York Academy of Sciences,
772, pp. 15-20, 1995.
6. André FE. "Vaccinology: Past Achievements, Present Roadblocks and Future Promises",
Vaccine, 21, pp. 593-595, 2003.
7. Robinson, HL, Ginsberg, HS, Davis, HL, Johnston, SA, Liu, MA. The Scientific Future
of DNA for Immunization. Washington, American Society of Microbiology, 1997.
8. Kuby J. Immunology. 4th edition, by Goldsby RA, Kindt TJ, Osborne BA. New York, W. H. Freeman, 2000.
9. Stern AM, Markel H. "The History of Vaccines and Immunization: Familiar Patterns,
New Challenges", Health Affairs, 24, pp. 611-621, 2005.
10. Ulmer JB, Donnelly JJ, Parker SE, Rhodes,GH, Felgner PL, Dwarki VJ, Gromkowski
SH, Deck RR, DeWitt CM, Friedman A, Hawe LA, Leander KR, Martinez D, Perry HC,
Shiver JW, Montgomery DL, Liu MA. "Heterologous protection against influenza by
injection of DNA encoding a viral protein", Science, 259, pp. 1745-1749, 1993.
11. Moorthy VS, Imoukhuede EB, Milligan P, Bojang K, Keating S, Kaye P, Pinder M,
Gilbert SC, Walraven G, Greenwood BM, Hill AVS. "A Randomised, Double-blind,
Controlled Vaccine Efficacy Trial of DNA/MVA ME-TRAP Against Malaria Infection
in Gambian Adults", PLOS Medicine, 1, pp. 128-136, 2004.
12. Singh DK, Liu ZQ, Sheffer D, Mackay GA, Smith M, Dhillon S, Hegde R, Jia FL,
Adany I, Narayan O. "A Noninfectious Simian/human Immunodeficiency Virus DNA
Vaccine That Protects Macaques Against AIDS", Journal of Virology, 79, pp. 3419-
3428, 2005.
13. Yoshida S, Tanaka T, Kita Y, Kuwayama S, Kanamaru N, Muraki Y, Hashimoto S,
Inoue Y, Sakatani M, Kobayashi E, Kaneda Y, Okada M. "DNA Vaccine Using
Hemagglutinating Virus of Japan-liposome Encapsulating Combination Encoding
Mycobacterial Heat Shock Protein 65 and Interleukin-12 Confers Protection Against
Mycobacterium tuberculosis by T Cell Activation", Vaccine, 24, pp. 1191-1204, 2006.
14. Ferreira GNM, Cabral JMS, Prazeres DMF. "A Comparison of Gel Filtration
Chromatographic Supports for Plasmid Purification", Biotechnology Letters, 11, pp. 417-
420, 1997.
15. Prazeres DMF, Schluep T, Cooney CL. "Preparative Purification of Supercoiled Plasmid
DNA Using Anion-exchange Chromatography", Journal of Chromatography A, 806, pp.
31-45, 1998.
16. Ferreira GNM, Cabral JMS, Prazeres DMF. "Purification of Supercoiled Plasmid DNA
Using Chromatographic Processes", Journal of Molecular Recognition, 11, pp. 250-251,
1998.
17. Prazeres DMF, Monteiro GA, Ferreira GNM, Cooney CL, Cabral JMS. "Large-scale
Production of Pharmaceutical-grade Plasmid DNA for Gene Therapy: Problems and
Bottlenecks", Trends in Biotechnology, 17, pp. 169-174, 1999.
18. Ferreira GNM, Cabral JMS, Prazeres DMF. "Development of Process Flow Sheets for
the Purification of Plasmid Vectors for Gene Therapy Applications", Biotechnology
Progress, 15, pp. 725-731, 1999.
19. Diogo MM, Queiroz JA, Monteiro GA, Prazeres DMF. "Separation and Analysis of
Plasmid Denatured Forms Using Hydrophobic Interaction Chromatography", Analytical
Biochemistry, 275, pp. 121-124, 1999.
20. Monteiro GA, Ferreira GNM, Cabral JMS, Prazeres DMF. "Analysis and Use of
Endogenous nuclease Activities in Escherichia coli Lysates During the Primary Isolation
of Plasmids for Gene Therapy", Biotechnology and Bioengineering, 66, pp. 189-194,
1999.
21. Ferreira GNM, Monteiro GA, Prazeres DMF, Cabral JMS. "Downstream Processing of
Plasmid DNA for Gene Therapy and DNA Vaccination Applications", Trends in
Biotechnology, 18, pp. 380-388, 2000.
22. Ferreira GNM, Cabral JMS, Prazeres DMF. "Studies on the Batch Adsorption of Plasmid
DNA onto Anion-exchange Chromatographic Supports", Biotechnology Progress, 16,
pp. 416-424, 2000.
23. Ferreira GNM, Cabral JMS, Prazeres DMF. "Anion Exchange Purification of Plasmid
DNA Using Expanded Bed Adsorption", Bioseparation, 9, pp. 1-6, 2000.
24. Diogo MM, Queiroz JA, Monteiro GA, Martins SAM, Ferreira GNM, Prazeres DMF.
"Purification of a Cystic Fibrosis Plasmid Vector for Gene Therapy Using Hydrophobic
Interaction Chromatography", Biotechnology and Bioengineering, 68, pp. 576-583,
2000.
25. Diogo MM, Ribeiro SC, Queiroz JA, Monteiro GA, Tordo N, Perrin P, Prazeres DMF.
"Scale-up of Hydrophobic Interaction Chromatography for the Purification of a DNA
Vaccine Against Rabies", Biotechnology Letters, 22, pp. 1397-1400, 2000.
26. Ribeiro SC, Monteiro GA, Martinho G, Cabral JMS, Prazeres DMF. "Quantitation of
Plasmid DNA in Aqueous Two-phase Systems by Fluorescence Analysis",
Biotechnology Letters, 22, pp. 1101-1104, 2000.
27. Monteiro GA, Cabral JMS, Prazeres DMF. “Essential Guides for Isolation/Purification
of Nucleic Acids” In: Encyclopedia of Separation Science, Wilson, ID, Poole, CF,
Adlard TR, Cooke M (eds.), London, Academic Press, Appendix I, pp. 4560-4568,
2000.
28. Diogo MM, Ribeiro SC, Queiroz JA, Monteiro GA, Tordo N, Perrin P, Prazeres DMF.
"Production, Purification and Analysis of an Experimental DNA Vaccine Against
Rabies", Journal of Gene Medicine, 3, pp. 577-584, 2001.
29. Prazeres DMF, Monteiro GA, Ferreira GNM, Diogo MM, Ribeiro SC, Cabral JMS.
"Purification of Plasmids for Gene Therapy and DNA Vaccination", Biotechnology
Annual Review, 7, pp. 1-30, 2001.
30. Ferreira GNM, Prazeres DMF, Cabral JMS. "Processo Cromatográfico para Purificação
de DNA Plasmídico Utilizando Suportes Superporosos", Portuguese Patent, no
PT102341, 2002.
31. Diogo MM, Queiroz JA, Prazeres DMF. "Studies on the Retention of Plasmid DNA and
Escherichia coli Nucleic Acids by Hydrophobic Interaction Chromatography",
Bioseparation, 10, pp. 211-220, 2002.
32. Ribeiro SC, Monteiro GA, Cabral JMS, Prazeres DMF. "Isolation of Plasmid DNA from
Cell Lysates by Aqueous Two-phase Systems", Biotechnology and Bioengineering, 78,
pp. 376-384, 2002.
33. Diogo MM, Prazeres DMF, Queiroz JA. "Hydrophobic Interaction Chromatography of
Homo-oligonucleotides on Derivatized Sepharose CL-6B - Application of the
Solvophobic Theory", Journal of Chromatography A, 944, pp. 119-128, 2002.
34. Prazeres DMF, Diogo MM, Queiroz JA "Processo Para Produção de DNA Plasmídico",
Portuguese Patent, no PT102491, 2003.
35. Diogo MM, Pinto N, Prazeres DMF, Queiroz JA. "Hydrophobic Interaction
Chromatography of Homo-oligonucleotides on Derivatized Sepharose CL-6B - Using
and Relating Two Different Models for Describing the Effect of Salt and Temperature on
Retention", Journal of Chromatography A, 1006, pp. 137-148, 2003.
36. Diogo MM, Queiroz JA, Prazeres DMF. "Assessment of Purity and Quantification of
Plasmid DNA in Process Solutions Using High-performance Hydrophobic Interaction
Chromatography", Journal of Chromatography A, 998, pp. 109-117, 2003.
37. Martins SAM, Prazeres DMF, Cabral JMS, Monteiro GA. "Comparison of Real-time
Polymerase Chain Reaction and Hybridization Assays for the Detection of Escherichia
coli Genomic DNA in Process Samples and Pharmaceutical-grade Plasmid DNA
Products", Analytical Biochemistry, 322, pp. 127-129, 2003.
38. Prazeres DMF, Diogo MM, Queiroz JA. "Purification of Plasmid DNA by Hydrophobic
Interaction Chromatography", United States Patent Application, US20040038393, 2004.
39. Ribeiro SC, Monteiro GA, Prazeres DMF. "The Role of Polyadenylation Signal
Secondary Structures on the Resistance of Plasmid Vectors to Nucleases", Journal of
Gene Medicine, 6, pp. 565-573, 2004.
40. Prazeres DMF, Ferreira GNM. "Design of Flowsheets for the Recovery and Purification
of Plasmids for Gene Therapy and DNA Vaccination", Chemical Engineering and
Processing, 43, pp. 609-624, 2004.
41. Diogo MM, Queiroz JA, Prazeres DMF. "Chromatography of Plasmid DNA ", Journal
of Chromatography A, 1069, pp. 3-22, 2005.
42. Sousa F, Tomaz CT, Prazeres DMF, Queiroz JA. "Separation of Supercoiled and Open
Circular Plasmid DNA Isoforms by Chromatography with a Histidine-agarose Support",
Analytical Biochemistry, 343, pp. 183-185, 2005.
43. Diogo MM, Queiroz JA, Prazeres DMF. “Purification of Plasmid DNA Vectors
Produced in Escherichia coli for Gene Therapy and DNA Vaccine Applications”, In:
Microbial Processes and Products. Methods in Biotechnology, Barredo, JL (ed.),
Totowa, Humana Press, 18, pp. 165-278, 2005.
44. Trindade IP, Diogo MM, Prazeres DMF, Marcos JC. "Purification of Plasmid DNA
Vectors by Aqueous Two-phase Extraction and Hydrophobic Interaction
Chromatography", Journal of Chromatography A, 1082, pp. 176-184, 2005.
45. Freitas SS, Santos JAL, Prazeres DMF. "Plasmid DNA Production", In: Development of
Sustainable Bioprocesses: Modeling and Assessment, Biwer, A, Heinzle, E, Cooney, C
(Eds), New York, Wiley, 2006 (in press).
46. Passarinha LA, Diogo MM, Queiroz JA, Monteiro GA, Fonseca LP, Prazeres DMF.
"Production of ColE1 Type Plasmid by Escherichia coli DH5 Alpha Cultured Under
Nonselective Conditions", Journal of Microbiology and Biotechnology, 16, pp. 20-24,
2006.
BIOTECHNOLOGY OF THE BACTERIAL GELLAN GUM: GENES AND ENZYMES OF THE BIOSYNTHETIC PATHWAY

Arsénio M. Fialho, Leonilde M. Moreira, Ana Teresa Granja, Karen Hoffmann, Alma Popescu and Isabel Sá-Correia

Grupo de Ciências Biológicas, Centro de Engenharia Biológica e Química, Instituto Superior Técnico, Universidade Técnica de Lisboa, 1049-001 Lisboa, Portugal; E-mail: afialho@ist.utl.pt
Abstract: Bacterial exopolysaccharides (EPS) are a diverse and remarkably versatile


class of materials that have potential applications in virtually all sectors of
modern industry and economy. Currently, many biopolymers are still in the
developmental stage, but important applications are beginning to emerge in the
areas of food production and biomedicine. A few bacterial EPS can directly
replace synthetically derived material in traditional applications, whereas
others possess unique properties that can open up a range of new commercial
opportunities. This is the case of the commercially important Sphingomonas elodea exopolysaccharide, gellan gum, one of the few bacterial gums with
gelling properties. In its native form, gellan is a linear anionic
heteropolysaccharide based on a tetrasaccharide repeat unit composed of 2
molecules of D-glucose, 1 of L-rhamnose and 1 of D-glucuronic acid. The
native gellan is partially esterified with acyl substituents (1 mole of glycerate
and 0.5 mol of acetate) per repeat unit. The significant changes in rheology
observed upon deacylation of gellan are essentially due to the glycerate
substituents. The potential for using gellan or gellan-like gums in industrial
applications is determined by their physical properties. Metabolic engineering
may be used as a tool to produce altered polysaccharides and/or to increase
gellan production. The eventual success of this approach requires a detailed
understanding of the molecular biology, biochemistry and physiology of its
biosynthesis. Gellan biosynthesis starts with the intracellular formation of the
nucleotide-sugar precursors, UDP-glucose, UDP-glucuronic acid and dTDP-L-
rhamnose, whose pathway was elucidated. The synthesis of the sugar
precursors is followed by the formation of the repeat unit, by sequential
transfer of the sugar donors to an activated lipid carrier by committed
glycosyltransferases, followed by gellan polymerization and export. Most of
these gellan-specific processes are catalysed by enzymes encoded in the gel
cluster of genes. The identification of genes and the elucidation of crucial

steps in the pathway, indicate that possibilities now exist for trying exerting
control over gellan production, by modifying the expression of any of the
individual genes or of groups of genes.

Key words: Bacterial exopolysaccharides, gellan gum, biopolymers, Sphingomonas


elodea, Polysaccharide engineering.

1. THE COMMERCIAL BACTERIAL EXOPOLYSACCHARIDE GELLAN: COMPOSITION AND FUNCTIONAL PROPERTIES
Microbial extracellular polysaccharides (EPS) are a class of high-value


polymers that may have many industrial applications in Biotechnology. This
is the case of gellan gum, a commercial gelling agent produced in high yield
by Sphingomonas elodea ATCC 31461. It has approval in the USA and EU
for food use as a gelling, stabilizing and suspending agent, either on its own
or in combination with other hydrocolloids [1]. In its native form, gellan is a
linear anionic heteropolysaccharide based on a tetrasaccharide repeat unit
composed of 2 molecules of D-glucose (Glc), 1 of D-glucuronic acid (GlcA)
and 1 of L-rhamnose (Rha). The native gellan is partially esterified with O-
acetyl and glyceryl moieties on the D-glucosyl residue adjacent to the D-
glucuronyl residue as side chains (1 mole of glycerate and 0.5 mol of acetate
per repeat unit) (Figure 1) [2]. Acyl substituents drastically affect the
rheology of the gels formed with small amounts of divalent cations;
chemical deacylation of the native form results in a change from soft, elastic
thermoreversible gels, to firm and more brittle gels [3]. This EPS has an
average molecular mass of about 500 kDa and has been shown to form a
double-helical structure in solution [4].
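A rough back-of-the-envelope calculation, assuming standard anhydro-sugar residue masses and the acyl substitution levels stated above, gives a feel for the size of a single 500 kDa gellan chain.

```python
# Approximate residue masses (g/mol) of the anhydro sugars in the repeat unit,
# plus the mass added by each ester substituent (values are approximate).
GLC, GLCA, RHA = 162.1, 176.1, 146.1
GLYCERYL, ACETYL = 88.1, 42.0

repeat_mass = 2 * GLC + GLCA + RHA + 1.0 * GLYCERYL + 0.5 * ACETYL
chain_mass = 500_000          # ~500 kDa average molecular mass

repeats = chain_mass / repeat_mass
print(f"repeat unit ~ {repeat_mass:.0f} g/mol")
print(f"~{repeats:.0f} tetrasaccharide repeats (~{4 * repeats:.0f} sugar residues) per chain")
```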

Figure 1. Repeating unit of the exopolysaccharide gellan produced by S. elodea ATCC 31461: [→3)-β-D-Glc-(1→4)-β-D-GlcA-(1→4)-β-D-Glc-(1→4)-α-L-Rha-(1→]n. In the native polymer, O-acetyl (O-Ac) and L-glyceryl (L-Gly) substituents are present at 0.5 mol and 1 mol per repeating unit, respectively. D-Glc, D-glucose; D-GlcA, D-glucuronic acid; L-Rha, L-rhamnose.
Gellan gum is produced by CPKelco and is commercially available in three chemical forms: no, low and high acyl content, with the respective denominations Gelrite®, Kelcogel® F and Kelcogel® LT100. Gelrite® is used as a substitute for agar in microbiological and tissue culture media. Kelcogels are food-grade gellan gums mainly used as gelling agents in foods and personal care applications. Blends of high and low acyl gellan gum can produce intermediate gel textures. Other applications of gellan, in the biomedical field, include capsules for drug delivery systems, particularly in ophthalmologic formulations [5]. More recently, gellan gum was also proved to be a suitable material for the construction of three-dimensional scaffolds for Tissue Engineering [6].
Structural and rheological characterization of gellan-related polysaccharides, differing essentially in the content of acetate and/or glycerate, confirmed both predictions from X-ray studies and results from rheological studies on chemically deacylated gellan; it is the glycerate substituent that is responsible for the significant changes in rheology observed upon deacylation [2]. These gellan-like polymers were produced by mutants obtained by exposure of cultures of the producing strain to chemical mutagens and to environmental stress, in particular antibiotic stress, followed by selection based on the distinct mucoid morphology of their colonies [2], as exemplified in Figure 2-A. These variants of the gellan structure contained both glycerate and acetate at different levels or only acetate substitution. A gellan variant with acetate and lacking glycerate (mutant CRS2; Figure 2-B) was examined for the first time [2] because this combination is not possible to obtain by chemical deacylation.

Figure 2. (A) – Mucoid colonial variants of S. elodea ATCC 31461 obtained by chemical mutagenesis or exposure to antibiotic stress. These mutants produce gellan-related polysaccharides with different substitution patterns. (B) – Content of L-glycerate and acetate (relative to 1 sugar unit) in wild-type and mutant (CRS2 and POLIS) gellan polysaccharides [2].
The characterization of the different structures and properties of the gellan produced by the industrial strain and of a larger number of gellan-like polysaccharides may provide further clues to understand the relationship between gellan structure and properties, a prerequisite for success in polysaccharide engineering. Another prerequisite to manipulate the gellan pathway is the elucidation of the molecular biology, biochemistry and physiology of its biosynthesis in the producing strain, S. elodea ATCC 31461.

2. THE GELLAN GUM PRODUCING STRAIN SPHINGOMONAS ELODEA ATCC 31461

The gellan gum producing strain Sphingomonas elodea ATCC 31461
(originally designated Sphingomonas paucimobilis) was isolated from the surface of a plant of the Elodea genus [7]. It is rod-shaped, non-capsulated and forms round, yellow, carotenoid-pigmented colonies on nutrient agar plates (Figure 3-C). The producing strain belongs to the Sphingomonas genus, which includes aerobic, yellow pigment-producing bacteria belonging to the α-4 subclass of Proteobacteria [8]. Like all the other members of the Sphingomonas genus, it lacks the characteristic Gram-negative lipopolysaccharide (LPS) on the outer membrane, which is replaced by glycosphingolipids (GSL) [9, 10].
colonize different environments, including water, plant tissue, soil and
sediments. Some strains are opportunistic pathogens [11]. These bacteria
have great potential for biotechnological applications, in the biodegradation
of organic xenobiotic pollutants [12, 13, 14], as bacterial antagonists of
phytopathogenic fungi [15] and for the production of industrially useful EPS
[1]. Six other Sphingomonad bacteria secrete gellan-related EPS (known as
sphingans), which are classified into the gellan family (Figure 3-A) [16].
Sphingans share the same carbohydrate backbone structure (-X-glucose-glucuronic acid-glucose-, where X is either L-rhamnose or L-mannose), to which
distinct side-groups are attached (Figure 3-A). These structural variations
have major effects on the physicochemical characteristics of the
polysaccharides, e.g., gelation, thermostability, and cation compatibility, and
have led to unique commercial food and industrial applications for three of
these polysaccharides (gellan, welan and rhamsan) (Figure 3-A). Sphingan-
producing bacteria were originally classified into diverse genera such as
Alcaligenes, Azotobacter, Pseudomonas, Xanthobacter and Xanthomonas.
However, the re-examination of their phenotypic characteristics indicated
that the sphingan producing bacteria are closely related to each other and to
Sphingomonas species [16]. A phylogenetic analysis based on the 16S
ribosomal-RNA gene sequences of sphingan producing strains, reveals that


these strains do not cluster with the species S. paucimobilis or with most of
the other representative members of the genus Sphingomonas. The only
exception is the proposed species Sphingomonas pituitosa [17] (Figure 3-B),
which is very closely related to the gellan producing strain S. elodea ATCC
31461.

Figure 3. (A) – Structure of gellan and related polysaccharides; (B) - Unrooted phylogenetic
tree based on the nearly full-length 16S rRNA gene sequence data, indicating the position of
sphingan-producing bacteria within the radiation of the Sphingomonas species and other
reference organisms. The PHYLIP program Dnadist was used to align the nucleotide
sequence of 16S rRNA gene sequence and the tree was generated by the Neighbor-Joining
method. The corresponding nucleotide sequence data appear in the GenBank Database with
the following accession numbers: N. aromaticivorans (AB 025012), S. aerolata (AJ 429240),
S. aurantiaca (AJ 429238), S. asaccharolytica (Y 09639), S. mali (Y 09638), S. echinoids (AB
021370), S. parapaucimobilis (D 84525), S. sanguinis (D 84529), S. roseiflava (D 84520), S.
yabuuchiae (AB 071955), S. adhaesiva (D 84527), S. paucimobilis (D 84528), Sphingomonas
sp. ATCC 31555 (AF 503280), S. elodea ATCC 31461 (AF 503278), S. pituitosa (AJ
243751), Sphingomonas sp. ATCC 31961 (AF 503281), Sphingomonas sp. ATCC 21423 (AF
503277), Sphingomonas sp. ATCC 35159 (AF 503283), Sphingomonas ATCC 31554 (AF
503279), Sphingomonas sp. ATCC 31853 (AF 503282), Z. mobilis (AJ 554206); (C) –
Mucoid phenotype of the yellow carotenoid pigmented colonies of the gellan producing
Sphingomonas elodea ATCC 31461.
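The tree in Figure 3-B was built with the PHYLIP Dnadist program and the Neighbor-Joining method. Purely as an assumed stand-in for that workflow (the input file name is hypothetical and the simple identity distance differs from Dnadist's nucleotide substitution models), an equivalent NJ tree can be obtained from an aligned set of 16S rRNA gene sequences with Biopython, as sketched below.

```python
from Bio import AlignIO, Phylo
from Bio.Phylo.TreeConstruction import DistanceCalculator, DistanceTreeConstructor

# Hypothetical multiple alignment of the 16S rRNA gene sequences listed in the
# Figure 3 legend (prepared beforehand with a standard alignment tool).
alignment = AlignIO.read("sphingomonas_16S_aligned.fasta", "fasta")

# Pairwise distances followed by Neighbor-Joining tree construction.
distances = DistanceCalculator("identity").get_distance(alignment)
tree = DistanceTreeConstructor().nj(distances)

Phylo.draw_ascii(tree)   # quick text rendering of the NJ topology
```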
3. GENES AND ENZYMES INVOLVED IN GELLAN


BIOSYNTHESIS

Gellan biosynthesis represents a multi-step process and follows


essentially the mechanisms established for other acidic
heteropolysaccharides of Gram-negative bacteria. The pathway can be
divided into three different parts: i) the intracellular synthesis of sugar-
activated precursors; ii) the assembly of the tetrasaccharide attached to a
membrane anchored C55-isoprenyl pyrophosphate carrier; iii) the
polymerization of the repeat units and export of the polysaccharide.

3.1 Genes and enzymes involved in the formation of


nucleotide sugar precursors

The early steps in gellan gum biosynthesis, leading from the hexose phosphates (glucose 6-phosphate and glucose 1-phosphate) to the sugar nucleotides (UDP-glucose, UDP-D-Glc; UDP-glucuronic acid, UDP-D-GlcA; and dTDP-rhamnose, dTDP-L-Rha), represent the interface between primary and secondary metabolism. Therefore, the corresponding enzymes constitute important targets if metabolic engineering strategies are to be implemented to increase gellan gum production and/or to modify its composition.
The scheme of the pathway leading to the nucleotide sugar precursors UDP-D-Glc, UDP-D-GlcA and dTDP-L-Rha, which are the donors of monomers for the biosynthesis of the tetrasaccharide unit of gellan, is shown in Figure 4 [18, 19]. The identification, sequence analysis and biochemical characterization of the genes/enzymes involved in the formation of the gellan gum nucleotide precursors were carried out: the pgmG gene, encoding a phosphoglucomutase (PGM; EC 5.4.2.2) which catalyses the reversible conversion of glucose-6-phosphate into glucose-1-phosphate [20]; the ugpG gene, encoding a glucose-1-phosphate uridylyltransferase [or UDP-glucose pyrophosphorylase (UGP; EC 2.7.7.9)] which catalyses the reversible conversion of glucose-1-phosphate and UTP into UDP-D-Glc and diphosphate [21]; the ugdG gene, encoding a UDP-glucose dehydrogenase which converts UDP-D-Glc into UDP-D-GlcA (Granja, AT et al., unpublished results); and rmlA, the first gene of the 4-gene rml cluster, which encodes a glucose-1-phosphate thymidylyltransferase [or dTDP-glucose pyrophosphorylase (TGP; EC 2.7.7.24)] that converts glucose-1-phosphate and dTTP into dTDP-D-Glc, necessary for the formation of dTDP-L-Rha [22].
Figure 4. Pathway leading to the nucleotide sugar precursors, UDP-D-glucose, UDP-D-


glucuronic acid and dTDP-L-rhamnose, involved in gellan gum biosynthesis. Abbreviations:
PgmG – Phosphoglucomutase (pgmG); UgpG – UDP-D-glucose pyrophosphorylase (ugpG);
TgpG – dTDP-D-glucose pyrophosphorylase (rmlA); UGD – UDP-D-glucose dehydrogenase
(ugdG); TRS –dTDP-L-rhamnose biosynthetic enzyme system [RmlB –dTDP-D-glucose-4,6
dehydratase (rmlB); RhsC – dTDP-4-dehydrorhamnose-3,5 epimerase (rmlC); RhsD – dTDP-
4- dehydrorhamnose reductase (rmlD)]. IM, inner membrane; OM, outer membrane.
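For quick reference, the precursor-forming steps described above and summarized in Figure 4 can be collected into a small lookup structure; the gene names and EC numbers are those given in the text, while the compact reaction strings are paraphrases.

```python
# Gene -> (enzyme, EC number, reaction) for the nucleotide-sugar precursor
# pathway of gellan biosynthesis (as described in the text and Figure 4).
precursor_pathway = {
    "pgmG": ("phosphoglucomutase (PGM)", "EC 5.4.2.2",
             "glucose-6-P <-> glucose-1-P"),
    "ugpG": ("UDP-glucose pyrophosphorylase (UGP)", "EC 2.7.7.9",
             "glucose-1-P + UTP <-> UDP-D-Glc + PPi"),
    "ugdG": ("UDP-glucose dehydrogenase (UGD)", None,
             "UDP-D-Glc -> UDP-D-GlcA"),
    "rmlA": ("dTDP-glucose pyrophosphorylase (TGP)", "EC 2.7.7.24",
             "glucose-1-P + dTTP -> dTDP-D-Glc"),
    "rmlB, rmlC, rmlD": ("dTDP-L-rhamnose biosynthetic enzymes (TRS)", None,
                         "dTDP-D-Glc -> dTDP-L-Rha"),
}

for gene, (enzyme, ec, reaction) in precursor_pathway.items():
    print(f"{gene:16s} {enzyme:45s} {ec or '-':12s} {reaction}")
```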

3.2 Genes and enzymes involved in the formation of


gellan tetrasaccharide unit and in gellan
polymerization and export

3.2.1 The gellan cluster of genes

The gellan genes pgmG, ugpG and ugdG referred to above do not map to the same locus and are not present in the cluster of genes involved in gellan synthesis [19]. The gel cluster contains 18 genes (gelQ,I,K,L,J,F,D,C,
E,M,N,B, rmlA,B,C,D and atrD,B) involved in the synthesis of dTDP-L-Rha,
glycosyltransferases and proteins required for gellan polymerization and
export (Figure 5-A) [19, 23]. The organization and the nucleotide sequence
of the gel cluster are highly similar to those described for the gene cluster
required for the synthesis of the sphingan S88 in Sphingomonas S88 (sps
cluster) [24].
The glycosyltransferases catalyse the sequential transfer of sugars from
the appropriate sugar donor to an activated lipid carrier. The gellan locus
includes gelB, homologous to spsB that encodes the priming transferase, a
glucosyl-isoprenyl phosphate-transferase that transfers glucose-1-phosphate
from UDP-glucose to the C55-isoprenylphosphate lipid carrier (PPL) [25].


The gelK gene was cloned and the encoded protein biochemically characterized; GelK is a β-1,4-glucuronosyltransferase, which catalyses the addition of GlcA, from UDP-glucuronic acid, into the glucosyl-α-pyrophosphorylpolyprenol intermediate [26]. The other two
glycosyltransferases required for the assembly of the tetrasaccharidic unit are
possibly encoded in the gel cluster by the genes gelL and gelQ, putatively
involved in the addition of the third and fourth sugars of the repeat unit,
respectively [25]. However, at the present time, there is no experimental
evidence supporting this prediction.

Figure 5. (A) – Schematic representation of the S. elodea ATCC 31461 gel cluster [19, 23].
Putative or known gene functions are indicated. (B) – The predicted individual roles of the gel
genes in the repeat unit biosynthesis, and subsequent polymerization and export of gellan; the
biochemical function of gelK, gelC and gelE was established. IM, inner membrane; OM,
outer membrane.

3.2.2 Genes and enzymes predicted to be involved in gellan


polymerization and export

Gellan and related EPS, group 1 capsules, and lipopolysaccharide O-


antigen are assembled by a Wzy-dependent polymerization system. This is a
rather complex step, presumably involving many interactions between
several proteins whose biochemical characterization is still lacking.
Globally, in the model proposed for polymerization and export of bacterial
EPS, lipid-linked repeat units are built upon undecaprenol diphosphate at the cytoplasmic face of the inner membrane, catalysed by the sequential activity of glycosyltransferases. The lipid-linked repeat units are then flipped to the periplasmic face of the inner membrane by a putative flippase, where a putative polymerase generates a long-chain polymer that is exported
through the outer membrane by a protein channel [27, 28]. The products of
two other conserved genes are also required for the late steps of EPS
biosynthesis. These gene products are a protein tyrosine kinase (PTK) and a
phosphotyrosine protein phosphatase (PTP). Although the precise role of
tyrosine phosphorylation is still unclear, mutations in the PTK originated
oligosaccharides with low degrees of polymerization [29].
Computational analysis and experimental evidence indicate that in S.
elodea the genes involved in the regulation of the polymerization and export
of gellan are present in the gel cluster of genes (Figure 5-A), but the putative
genes for the flippase and polymerase proteins are elsewhere in the genome.
Indeed, Harding et al. (2004) [23] identified a second DNA region involved
in gellan biosynthesis not adjacent to the gel cluster. One of the genes from
this region, named gelS, encodes a protein with 10 predicted transmembrane
domains. The GelS protein has homology to membrane proteins involved in
polysaccharide export, namely to the Wzx protein from E. coli, a putative O-
antigen flippase [23]. Downstream of gelS is gelG; GelG has a segment
homologous to a consensus sequence from a putative O-antigen polymerase
(Wzy) from E. coli. Topology prediction indicates that Gel G is an integral
protein located at the plasma membrane, with 13 transmembrane segments
and two large periplasmic loops, which are characteristic of other putative
polymerases [30]. Although there is no phenotypic or biochemical data on
both GelS and GelG, they are postulated to be the flippase and the
polymerase, respectively, in our model for polymerization and export
(Figure 6).
Two other genes important for gellan biosynthesis are gelC and gelE
present in the gel cluster of genes (Figure 5-A). The gelC and gelE genes are
homologous to the activator domain and the kinase domain, respectively, of
several tyrosine autokinases involved in polysaccharide chain length
determination [31]. The pair GelC/GelE is quite different from other
homologues that have been described for Gram-negative bacteria which are
encoded by a single gene. Instead, they exhibit a genetic organization similar
to the one described for Gram-positive bacteria, being composed of two
independent polypeptides encoded by two sequential genes. Moreira et al.
(2004) [31] showed that deletion mutants of S. elodea for gelC or gelE
display a non-mucoid phenotype when compared to the mucoid phenotype
of the wild-type strain.
The biological role of the tyrosine autokinases in polysaccharide


polymerization in bacteria has been linked to their autophosphorylating
tyrosine kinase activity [29, 32, 33, 34, 35, 36, 37, 38]. A positive correlation
was observed between phosphorylation and high-molecular weight polymer
synthesis for Sinorhizobium meliloti succinoglycan and E. coli K30 or
Streptococcus pneumoniae D39 capsular polysaccharides. Contrasting with
this, phosphorylated tyrosine kinases act as negative regulators in colanic
acid biosynthesis in E. coli K12 and capsular polysaccharide biosynthesis in
S. pneumoniae Rx1. In order to determine if tyrosine phosphorylation is also
important in gellan biosynthesis, each of the four tyrosine residues present in
the C-terminal region of GelE was mutated and the results revealed that the
tyrosine residue at position 198 appears to be essential for the synthesis of
high-molecular-weight gellan. Also, GelE having the tyrosine at position 198
as the single tyrosine of the C-terminal region was sufficient to restore the
production of the high-molecular-weight fraction of gellan, although not up
to the same level as with native GelE [31]. These data suggest a positive
correlation between tyrosine phosphorylation and high-molecular-weight
gellan; however, in vitro, it was not possible to demonstrate the presence of
phosphorylated tyrosine residues in GelE.
In another analysis of the GelE structure, a potential amphipathic helix was detected between amino acids 225TNVIGCVLNG234 of the C-terminal region, which may be involved in the insertion of GelE into the plasma membrane and therefore in its interaction with GelC. The exchange of the hydrophobic amino acids V227 and I228, or V231 and L232, for hydrophilic ones prevented complementation of the gelE deletion mutant (Hoffmann et al., unpublished data). This result suggests that GelE may insert into the membrane, where it probably interacts with the activator domain GelC to stimulate autophosphorylation (Figure 6).
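One generic way to probe the amphipathic character of such a segment (not necessarily the analysis performed by the authors) is to compute its mean hydropathy and an Eisenberg-style hydrophobic moment on an ideal helix; the sketch below applies Kyte-Doolittle values to the 225TNVIGCVLNG234 segment quoted above.

```python
import math

# Kyte-Doolittle hydropathy values for the residues occurring in the segment.
KD = {"T": -0.7, "N": -3.5, "V": 4.2, "I": 4.5, "G": -0.4, "C": 2.5, "L": 3.8}

def hydrophobic_moment(seq, angle_deg=100.0):
    """Eisenberg-style hydrophobic moment assuming an ideal alpha-helix
    (100 degrees of rotation per residue)."""
    sin_sum = cos_sum = 0.0
    for i, aa in enumerate(seq):
        theta = math.radians(angle_deg * i)
        sin_sum += KD[aa] * math.sin(theta)
        cos_sum += KD[aa] * math.cos(theta)
    return math.hypot(sin_sum, cos_sum) / len(seq)

segment = "TNVIGCVLNG"   # residues 225-234 of GelE, as quoted in the text
mean_h = sum(KD[aa] for aa in segment) / len(segment)
print(f"mean hydropathy: {mean_h:.2f}   hydrophobic moment: {hydrophobic_moment(segment):.2f}")
```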
In most of the gene clusters for polysaccharide biosynthesis, a gene encoding a phosphotyrosine protein phosphatase that has the protein tyrosine kinase as an endogenous substrate is present. However, this is not the case in the gellan cluster of genes, since none of the gel genes of unknown function are homologous to phosphatases. A similar situation was described for succinoglycan biosynthesis by S. meliloti, where no phosphatase was identified in the exo gene cluster [36]. If such an activity exists in S. elodea, according to our model it would be involved in the dephosphorylation of GelE and in lowering the degree of gellan polymerization (Figure 6).
The very last step in gellan biosynthesis is the export of the polymer to the cell surface. According to our model, the protein GelD is required for this translocation to occur (Figure 6). GelD has an N-terminal secretion signal and is homologous to several outer membrane proteins involved in the surface expression of EPS and capsular polysaccharides. These proteins are
surface-exposed outer membrane lipoproteins that form ring-like structures


[39]. The mutation of WzaK30 in E. coli severely restricted the formation of
K30 capsular structure on the cell surface [40]. The deletion of gelD in S.
elodea considerably reduces the production of gellan but, apparently, no
differences were observed in the viscosity of the polymer produced
(Hoffmann et al. unpublished data).
The model for polymerization and export of gellan here proposed (Figure
6) still requires experimental validation, in particular to unveil the molecular
mechanisms that, for example, interconnect protein tyrosine phosphorylation
and the regulation of polymer size and consequently the rheological
properties of gellan.

Figure 6. Location and proposed activities of the hypothetical biosynthetic complex required
for the coordinate synthesis and export of gellan in S. elodea. Step A, glycosyltransferases
assemble undecaprenol pyrophosphate (und-PP)-linked repeat-units at the cytoplasmic face of
the inner membrane. Step B, the repeat-units are flipped across the inner membrane by a
process involving GelS. Step C, the repeat-units are polymerized by GelG. Step D,
GelC/GelE are essential for high-level polymerization. To be active for gellan assembly, GelE has to autophosphorylate and to interact with GelC. Step E, export of gellan to the cell surface requires the outer membrane GelD multimeric complex, the putative export channel. Step
F, dephosphorylation of GelE by a phosphatase is hypothesized to be involved in the
regulation of gellan chain length.
4. EFFECTS OF GROWTH CONDITIONS ON


GELLAN PRODUCTION

Although the production yield and the composition, structure and properties of the gellan polymers produced by S. elodea ATCC 31461 are genetically determined, it is possible to influence these factors by modifying culture conditions such as temperature [41], the level of dissolved oxygen [42, 43] and growth medium composition, in particular the carbon [44] and nitrogen sources [45].
Gellan gum biosynthesis is temperature-dependent, with a maximal production yield at 20-25ºC, which is far below the optimal range for growth and for maximal activity of biosynthetic enzymes (30-35ºC) [41].
Despite the uncertainties of the composition of the polymers synthesized at
the various temperatures, it was proposed that a more rapid turn-over of the
carrier lipid, at temperatures causing higher specific growth rates, may lead
to an earlier release of a polymer with a shorter chain length [41]. According
to this hypothesis, the carrier lipid available is preferentially used for the
simultaneous essential synthesis of peptidoglycan, thus limiting the level and
size of the gellan produced, as proposed for other EPS-producing bacterial
systems [46].
Cheese whey is a nutrient-rich end product of cheese manufacture. The most
desirable way of handling this waste is to utilise it as a substrate for the
production of useful biotechnological products, gellan gum being considered
one of them, with the simultaneous reduction of the biological oxygen demand
of the whey [44]. Comparison of gellan
biosynthesis by S. elodea ATCC 31461 in a laboratory medium, containing
glucose or lactose (5-30 g/l) as the carbon source, and in diluted cheese
whey indicates that the alteration of the growth medium markedly affects the
polysaccharide yield, acyl substitution level, polymer rheological properties
and susceptibility to enzymatic degradation [44]. The lower levels of gellan
production from lactose compared with the production from glucose
(approximately 30%) did not appear to occur at the level of sugar nucleotide
synthesis [44]. The lactose-derived biopolymers exhibited the highest total acyl
content but yielded a lower modulus, whereas the glucose- and whey-derived
polymers had similar acyl contents but differed in their acetate and glycerate
levels.
Rheological studies showed that gellan viscosity appears to be directly
related to the level of glycerate present [2].
A number of complex nitrogen sources support gellan production by S.
elodea ATCC 31461, increasing the production yield when a complex
nitrogen source is available instead of ammonium sulfate [47]. Both the
nature and the concentration of the nitrogen source affect the gellan yield,
with gellan and biomass production being enhanced by yeast extract
supplementation [45].
Fermentor hydrodynamics also influence gellan fermentation kinetics and
the rheological properties of the culture broth, with gellan production
increasing when the oxygen transfer capacity is improved [42].

5. GENETIC ENGINEERING OF THE GELLAN PATHWAY

A few attempts to increase the relatively low conversion efficiency of
gellan from glucose in S. elodea ATCC 31461 (about 40–50%) [48]
compared with the nearly 60–80% of sugar conversion into xanthan gum
have been reported [48, 49]. By random mutagenesis, the gellan-competing
poly-β-hydroxybutyrate synthesis was eliminated, with no positive effect on
the efficiency of gellan production [49]. By site-specific mutagenesis, the zwf
gene encoding G6P-dehydrogenase was inactivated with the aim of diverting
carbon flow toward gellan synthesis, apparently without the expected
results [48].
Over the past few years, the production of gellan gum by S. elodea has
been extensively studied, and has led to important advances in our
knowledge about this biosynthetic process. Presently, most of the genes
encoding the enzymes involved in gellan biosynthesis are identified, as
described above. This gives the opportunity to evaluate the potential of
metabolic engineering strategies for the controlled modification of gellan
production yield and the chemical composition and properties of the
biopolymers produced by recombinant bacteria.
The early steps in gellan gum biosynthesis, leading from the hexose
phosphates (glucose 6-phosphate and glucose 1-phosphate) to sugar
nucleotides (UDP-glucose, UDP-glucuronic acid and TDP-rhamnose),
represent the interface between primary and secondary metabolism and,
therefore, constitute an important target to implement the increase of gellan
gum production and/or to modify its composition. Since sugar nucleotides
are also used for maintaining primary cellular functions, they have to be
diverted from central metabolism to polysaccharide production. Biosynthesis
of gellan starts with the intracellular formation of the sugar nucleotides
catalysed by PgmG, encoded by pgmG (Figure 4). PgmG is a key enzyme in
the pathway and, apparently, an ideal target for metabolic engineering as it
represents a branch point in carbohydrate metabolism; G6P enters catabolic
processes to yield energy and reducing power whereas G1P is a precursor of
all the sugar nucleotides that are used in the synthesis of gellan and other cell
polysaccharides. Afterwards, the UDP-glucose-pyrophosphorylase, encoded
by the ugpG gene, is directly involved in the formation of UDP-glucose,
from G1P, and indirectly involved in the formation of another gellan
activated precursor, UDP-glucuronic acid, derived from UDP-glucose
through the biosynthetic step catalysed by UDP-glucose dehydrogenase
(Figure 4). Synthesis of the sugar precursors is followed by synthesis of the
repeat unit by sequential transfer of the sugar donors to an activated lipid
carrier by committed glycosyltransferases. Among them is the protein
encoded by the gelK gene, catalysing the step by which glucuronic acid is
linked to the first lipid-linked glucose.
In summary, metabolic engineering of the gellan pathway can now be
attempted. Preliminary results indicate that augmentation of the expression
of individual gellan biosynthetic genes, by increasing the number of pgmG,
ugpG or gelK gene copies in recombinant plasmids, has apparently no
positive effect on gellan productivity. However, the simultaneous increase of
pgmG and gelK expression led to a 20% increase in the final concentration
of gellan produced and to a polymer yielding aqueous solutions with
higher viscosity. A similar result was reported before for the genetic
manipulation of the sphingan S-7 production in Sphingomonas S7 [50];
while a sixfold increase of PGM activity (by augmentation of the
chromosomal gene copy with multiple copies of a plasmid carrying the
cloned gene) had a negligible effect on glucose conversion, augmentation of
multiple biosynthetic genes from the S7 cluster, which code for assembly of the
lipid-linked carbohydrate repeat unit and secretion of the polymer, caused a 20%
increase in the yield from glucose and a larger increase in culture viscosity.
This increased viscosity was associated with a decrease in the ratio of
glucose to rhamnose compared with the original S-7 polymer [50]. In spite
of recent advances in the elucidation of the gellan biosynthetic pathway, a
better knowledge of the poorly understood steps and of the regulation and
bottlenecks of the pathway is crucial to the eventual success of the metabolic
engineering of gellan production.

ACKNOWLEDGEMENTS

This research is supported by FEDER and Fundação para a Ciência e a
Tecnologia (FCT), Portugal (grants: POCTI/BME/44441/2002, POCTI/BIO/
58401/2004 and PhD scholarships to ATG and KH and a BI grant to AP).

REFERENCES
1. R. Chandrasekaran and A. Radha, Molecular architectures and functional properties of
gellan gum and related polysaccharides. Trends in Food Science Technology 6, 143-148,
1995.
2. A.J. Jay, I.J. Colquhoun, M.J. Ridout, G.J. Brownsey, V.J. Morris, A.M. Fialho, J.H.
Leitão and I. Sá-Correia, Analysis of structure and function of gellans with different
substitution patterns. Carbohydrate Polymers 35, 179-188, 1998.
3. M. Rinaudo and M. Milas, Gellan gum, a bacterial gelling polymer. Novel
Macromolecules in Food Systems, G. Doxastakis and V. Kiosseoglou Eds., Elsevier,
239-263, 2000.
4. J. N. Bemiller, Structure-property relationships of water-soluble polysaccharides, Journal
Applied Glycosciences 43, 377-384, 1996.
5. Rozier, C. Mazuel, J. Grove and B. Plazonnet, Gelrite: a novel, ion-activated, in-situ
gelling polymer for ophthalmic vehicles. Effect on bioavailability of timolol.
International Journal of Pharmacology 57,163-168, 1989.
6. G. Ciardelli, V. Chiono, G. Vozzi, M. Pracella, A. Ahluwalia, N. Barbani, C. Cristallini
and P. Giusti, Blends of poly(caprolactone) and polysaccharides in tissue engineering
applications. Biomacromolecules 6, 1961-1976, 2005.
7. K.S. Kang and G.T. Veeder, Fermentation process for preparation of polysaccharide S-
60. U. S. Patent: 4, 377, 636, 1981.
8. E. Yabuuchi, I. Yano, H. Oyaizu, Y. Hashimoto, T. Ezaki and H. Yamamoto, Proposals
of Sphingomonas paucimobilis gen. nov. and comb. nov., Sphingomonas
parapaucimobilis sp. nov., Sphingomonas yanoikuyae sp. nov., Sphingomonas
adhaesiva sp. nov., Sphingomonas capsulata comb. nov., and two genospecies of the
genus Sphingomonas. Microbiology and Immunology 34, 99-119, 1990.
9. K. Kawahara, U. Seydel, M. Matsuura, M. Danbara, E.T. Rietschel and U. Zahringer,
Chemical structure of glycosphingolipids isolated from Sphingomonas paucimobilis.
FEBS Letters 292, 107-110, 1991.
10. S.R. Kawasaki, R. Moriguchi, K. Sekiya, T. Nakai, E. Ono, K. Kume and K. Kawahara,
The cell envelope structure of the lipopolysaccharide-lacking gram-negative bacterium
Sphingomonas paucimobilis. Journal of Bacteriology 176, 284-290, 1994.
11. J. Reina, A. Bassa, I. Llompart and D. Borrell, Infections with Pseudomonas
paucimobilis: Report of four cases and review. Reviews of Infectious Diseases 13, 1072-1076,
1991.
12. J.K. Fredrickson, D.L. Balkwill, G.R. Drake, M.F. Romine, D.B. Ringelberg and D.C.
White, Aromatic-degrading Sphingomonas isolates from the deep subsurface. Applied
and Environmental Microbiology 61, 1917-1922, 1995.
13. T.K. Dutta, S.A. Selifonov and I.C. Gunsalus, Oxidation of methyl substituted
naphthalenes: pathways in a versatile Sphingomonas paucimobilis strain. Applied and
Environmental Microbiology 64,1884-1889, 1998.
14. S. Nishikawa, T. Sonoki, T. Kasahara, T. Obi, S. Kubota, S. Kawai, N. Morohoshi, and
Y. Katayama, Cloning and sequencing of the Sphingomonas (Pseudomonas)
paucimobilis gene essential for the O demethylation of vanillate and syringate. Applied
and Environmental Microbiology 64, 836-842, 1998.
15. G. Berg and G. Balin, Bacterial antagonists to Verticillium dahliae Kleb. Journal of
Phytopathology 141, 99-110, 1994.
16. T.J. Pollock, Gellan-related polysaccharides and the genus Sphingomonas. Journal of
General Microbiology 139, 1939-1945, 1993.
17. E.B.M. Denner, S. Paukner, P. Kampfer, E.R.B. Moore, W-R. Abraham, H-J Busse, G.
Wanner, and W. Lubitz, Sphingomonas pituitosa sp. Nov., an exopolysaccharide-
producing bacterium that secretes an unusual type of sphingan. International Journal of
Systematic and Evolutionary Microbiology 51, 827-841, 2001.
18. L.O. Martins and I. Sá-Correia, Gellan gum biosynthetic enzymes in producing and
nonproducing variants of Pseudomonas elodea. Biotechnology and Applied Biochemistry
14, 357-364, 1991.
19. Sá-Correia, A.M. Fialho, P. Videira, L.M. Moreira, A.R. Marques and H. Albano, Gellan
gum biosynthesis in Sphingomonas paucimobilis ATCC 31461: genes, enzymes and
exopolysaccharide production engineering. Journal of Industrial Microbiology and
Biotechnology 29, 170-176, 2002.
20. P.A. Videira, L.L. Cortes, A.M. Fialho and I. Sá-Correia, Identification of pgmG gene,
encoding a bifunctional protein with phosphoglucomutase and phosphomannomutase
activities, in gellan gum producing strain Sphingomonas paucimobilis ATCC 31461.
Applied and Environmental Microbiology 66, 2252-2258, 2000
21. A.R. Marques, P.B. Ferreira, I. Sá-Correia and A.M. Fialho, Characterization of the
ugpG gene from the gellan gum-producing Sphingomonas paucimobilis ATCC 31461
encoding a UDP-glucose pyrophosphorylase. Molecular Genetics and Genomics 268,
816-824, 2003
22. E. Silva, A. R. Marques, A. M. Fialho, A. T.Granja, I. Sá-Correia, Proteins encoded by
Sphingomonas elodea ATCC 31461 rmlA and ugpG, involved in gellan gum
biosynthesis exhibit both dTDP- and UDP- glucose pyrophosphorylase activities.
Applied and Environmental Microbiology, 71, 4703 – 4712, 2005
23. N. E. Harding, Y.N. Patel and R.J. Coleman, Organization of genes required for gellan
polysaccharide biosynthesis in Sphingomonas elodea ATCC 31461. Journal of Industrial
Microbiology and Biotechnology, 31, 70-82, 2004.
24. M.Yamazaki, L. Thorne, M. J. Mikolajczak, R. W. Armentrout, and T. J. Pollock.
Linkage of genes essential for the synthesis of a polysaccharide capsule in
Sphingomonas strain S88. Journal of Bacteriology 178:2676-2687, 1996
25. T.J. Pollock, W.A. van Workum, L. Thorne, M.J. Mikolajczak, M. Yamazaki, J.W. Kijne
and R.W. Armentrout, Assignment of biochemical functions to glycosyl transferase
genes which are essential for biosynthesis of exopolysaccharides in Sphingomonas
strain S88 and Rhizobium leguminosarum. Journal of Bacteriology 180, 586-593, 1998.
26. P.A. Videira, A.M. Fialho, R.A. Geremia, C. Breton and I. Sá-Correia, Biochemical
characterization of the β-1,4-glucuronosyltransferase GelK in gellan gum producing
strain Sphingomonas paucimobilis ATCC 31461. Biochemical Journal 358, 457-464,
2001.
27. C. R. Raetz and C. Whitfield, Lipopolysaccharide endotoxins. Annual Reviews of
Biochemistry, 71,635-700, 2002.
28. C. Whitfield and A. Paiment, Biosynthesis and assembly of group 1 capsular
polysaccharides in Escherichia coli and related extracellular polysaccharides in other
bacteria. Carbohydrates Research, 338, 2491-2502, 2003.
29. T. Wugeditsch, A. Paiment, J. Hocking, J. Drummelsmith, C. Forrester and C. Whitfield,
Phosphorylation of Wzc, a tyrosine autokinase, is essential for assembly of group 1
capsular polysaccharides in Escherichia coli. Journal of Biological Chemistry, 276,
2361-2371, 2001.
30. C. Daniels, C. Vindurampulle and R. Morona, Overexpression and topology of the
Shigella flexneri O-antigen polymerase (Rfc/Wzy). Molecular Microbiology, 28, 1211-
1222, 1998.
31. L.M. Moreira, K. Hoffmann, H. Albano, A. Becker, K. Niehaus and I. Sá-Correia, The
gellan gum biosynthetic genes gelC and gelE encode two separate polypeptides
homologous to the activator and the kinase domains of tyrosine autokinases. Journal of
Molecular Microbiology and Biotechnology, 8, 43-57, 2004.
32. M.H. Bender, R.T. Cartee and J. Yother, Positive correlation between tyrosine
phosphorylation of CpsD and capsular polysaccharide production in Streptococcus
pneumoniae. Journal of Bacteriology, 185, 6057-6066, 2003.
33. J.K. Morona, J.C. Paton, D.C. Miller and R. Morona, Tyrosine phosphorylation of CpsD
negatively regulates capsular polysaccharide biosynthesis in Streptococcus pneumoniae.
Molecular Microbiology, 35, 1431-1442, 2000.
34. J.K. Morona, R. Morona, D.C. Miller and J.C. Paton, Mutational analysis of the carboxy-
terminal (YGX)4 repeat domain of CpsD, an autophosphorylating tyrosine kinase
required for capsule biosynthesis in Streptococcus pneumoniae. Journal of Bacteriology,
185, 3009-3019, 2003.
35. D. Nakar and D.L. Gutnick, Involvement of a protein tyrosine kinase in production of the
polymeric bioemulsifier emulsan from the oil-degrading strain Acinetobacter lwoffii
RAG-1. Journal of Bacteriology, 185, 1001-1009, 2003.
36. D. Niemeyer and A. Becker, The molecular weight distribution of succinoglycan
produced by Sinorhizobium meliloti is influenced by specific tyrosine phosphorylation
and ATPase activity of the cytoplasmic domain of the ExoP protein. Journal of
Bacteriology, 183, 5163-5170, 2001.
37. Paiment, J. Hocking and C. Whitfield, Impact of phosphorylation of specific residues in
the tyrosine autokinase, Wzc, on its activity in assembly of group 1 capsules in
Escherichia coli. Journal of Bacteriology, 184, 6437-6447, 2002.
38. Vincent, B. Duclos, C. Grangeasse, E. Vaganay, M. Riberty, A.J. Cozzone and P.
Doublet, Relationships between exopolysaccharides production and protein-tyrosine
phosphorylation in Gram-negative bacteria. Journal of Molecular Biology, 304, 311-321,
2000.
39. J. Nesper, C. M. D. Hill, A. Paiment, G. Harauz, K. Beis, J.A. Naismith and C.
Whitfield, Translocation of group 1 capsular polysaccharide in Escherichia coli serotype
K30. Journal of Biological Chemistry, 278, 49763-49772, 2003.
40. J. Drummelsmith and C. Whitfield, Gene products required for surface expression of the
capsular form of the group 1 K antigen in Escherichia coli (O9a:K30). Molecular
Microbiology, 31, 1321-1332, 1999.
41. L.O. Martins and I. Sá-Correia, Temperature profiles of gellan gum synthesis and
activities of biosynthetic enzymes. Biotechnology and Applied Biochemistry 20, 385-
395, 1994.
42. E. Dreveton, F. Monot, D. Ballerini, J. Lecourtier and L. Choplin, Effect of mixing and
mass transfer conditions on gellan production by Auromonas elodea. Journal of
Fermentation and Bioengineering 6, 642-649, 1994.
43. Giavasis, L.M. Harvey and B. McNeil, The effect of agitation and aeration on the
synthesis and molecular weight of gellan in batch cultures of Sphingomonas
paucimobilis. Enzyme and Microbial Technology 38, 101-108, 2006.
44. A.M. Fialho, L.O. Martins, M.L. Donval, J.H. Leitao, M.J. Ridout, A.J. Jay, V.J. Morris,
and I. Sá-Correia, Structures and properties of gellan polymers produced by
Sphingomonas paucimobilis ATCC 31461 from lactose compared with those produced
from glucose and from cheese whey. Applied and Environmental Microbiology 65,
2485-2491, 1999.
45. T.P. West and B. Strohfus, Influence of yeast extract on gellan production by
Sphingomonas paucimobilis ATCC 31461. Microbios 97, 85-93, 1999.
46. W. Sutherland, Biotechnology of microbial exopolysaccharides. In: Cambridge Studies
in Biotechnology, vol. 9, Ed. Baddiley J., Carey N.H., Higgins I.J. and Potter W.G.,
Cambridge University Press, Cambridge, 1990.
47. T.P. West and B. Strohfus, Effect of complex nitrogen sources upon gellan production
by Sphingomonas paucimobilis ATCC 31461. Microbios 94, 145-152, 1998.
48. N.B. Vartak, C.C. Lin, J.M. Cleary, M.J. Fagan and M.H. Saier Jr., Glucose metabolism
in Sphingomonas elodea: pathway engineering via construction of a glucose-6-phosphate
dehydrogenase insertion mutant. Microbiology 141, 2339-2350, 1995.
49. J.K. Baird and J.M Cleary, PHB-Free gellan gum broth US Patent 5300429, 1994.
50. L. Thorne, M.J. Mikolajczak, R.W. Armentrout and T.J. Pollock, Increasing the yield
and viscosity of exopolysaccharides secreted by Sphingomonas by augmentation of
chromosomal genes with multiple copies of cloned biosynthetic genes, Journal of
Industrial Microbiology and Biotechnology 25, 49-57, 2000.
EPIGENETICS: THE FUNCTIONAL MEMORY
OF RIBOSOMAL GENES

Wanda S. Viegas1, Manuela Silva1 and Nuno Neves1,2


1 Centro de Botânica Aplicada à Agricultura, Instituto Superior de Agronomia, Universidade
Técnica de Lisboa, Tapada da Ajuda, 1349-017 Lisboa, Portugal
2 Secção Autónoma de Biotecnologia, Faculdade de Ciências e Tecnologia, Universidade
Nova de Lisboa, Caparica, Portugal

Abstract: The functional importance of Epigenetics arises from DNA sequencing
programs that show the need for another code to explain the dynamics of gene
expression patterns observed during cell differentiation and organism
development. In this context, the study of ribosomal gene silencing is in fact
an excellent model to better understand the relationships that are established
between gene transcription and chromatin topology, and to unravel the
epigenetic switches involved in the framework of gene expression.

Key words: Epigenetics, Nucleolar Dominance, DNA methylation, Histone code.

1. INTRODUCTION

During the last decade Genomics revealed the complete code of genetic
information of an increasing number of organisms. Although DNA
sequencing programs are giving us important catalogues of protein coding
genes, it is becoming increasingly evident that sequence information alone is
not sufficient to understand how the genome is interpreted in a living cell. In
this context, the study of functional information has emerged in a new mode
as Epigenetics. Epigenetics relies on the identification of heritable gene
expression patterns, and the mechanisms associated with their modifications
without changes at the DNA sequence level. This reflects the importance of
epigenetics, since chromatin itself carries additional information that does
not reside in the nucleotide sequence, as was postulated by Conrad
Waddington [1]. Since then, several studies in animals, plants and yeast
disclosed the basic “epigenetic rules”, where condensed heterochromatin
represents a potent gene silencing capacity due to its tight conformation, in
contrast to the relaxed configuration of euchromatin, available for
transcription (Figure 1).

Figure 1. Condensed heterochromatin corresponds to a gene silencing state and a tight
conformation, contrasting with euchromatin, which is potentially active and shows
a relaxed configuration.

Those features established the idea of an epigenetic code that helps in shaping
chromatin topology and, consequently, gene expression patterns. In this
context the study of ribosomal RNA gene expression, and its organization
patterns, is fundamental to the growing understanding of epigenetic
pathways that rule chromatin remodeling events.

2. ORGANIZATION OF RIBOSOMAL CHROMATIN: FUNCTIONAL AND STRUCTURAL DOMAINS

Ribosomal RNA (rRNA) genes code for
three of the four RNA molecules needed to build up ribosomal sub-units, in
association with a large number of different proteins. Each ribosomal gene
encodes the information for a large 45S primary transcript which is further
processed into 18S, 5.8S and 25S rRNA molecules. Ribosomal genes are
present in multiple copies organized in tandem, with each gene unit
separated from the next by intergenic spacers (Figure 2A). Multiple
ribosomal DNA copies are clustered at particular chromosomal loci termed
NORs (Nucleolar Organizing Regions), since the transcription of the rDNA
units fabricates the most conspicuous nuclear compartment – the nucleolus,
where the assemblage of ribosomal sub-units takes place. The analysis of
ribosomal chromatin organization soon suggested that only particular arrays
of rDNA units in a NOR are active, as demonstrated by classical studies
showing a sub-set of ribosomal RNA genes engaged in RNA polymerase I
elongation complexes [2]. Several studies using in situ hybridization (ISH)
with ribosomal probes extensively confirmed [3, 4] two distinct chromatin
domains within each NOR: a large condensed perinucleolar block followed
by thin intranucleolar strands (Figure 2B), representing the differential
regulation of the excessive number of rRNA genes per cell through internal
changes in chromatin organization.

Figure 2. Ribosomal genes are organized in multiple DNA copies clustered at particular
chromosomal loci termed NORs (Nucleolar Organizing Regions) (A). Distinct functional
chromatin domains are observed within each NOR (B). The condensed perinucleolar block
corresponds to the excessive number of inactive rRNA genes, and the thin intranucleolar
strands to the potentially active ones.

3. NUCLEOLAR DOMINANCE: A CASE STUDY IN EPIGENETICS

Nucleolar dominance was initially described by Navashin [5] in Crepis
spp. hybrids, representing a genomic interaction where NORs of one
parental species are silenced; hence, these NORs comprise only one
continuous condensed domain. Navashin demonstrated that nucleolar
dominance is a reversible process, since when the hybrid is backcrossed to
the parent which contributed the chromosome with the silenced NOR, the
activity of this NOR is restored in the backcross plant in which it is carried.
This effect was later confirmed in a number of other plant and animal
species, showing that permanent damage or loss of silenced NORs does not
occur, prompting an epigenetic interpretation of this phenomenon. In this
context, two allopolyploid species, Triticosecale (triticale) and Arabidopsis
suecica, with marked differences in their DNA content and origin (Figure 3)
are currently used to disclose epigenetic marks and their developmental
dynamics. Triticale is a synthetic allopolyploid resulting from experimental
crosses between wheat (Triticum aestivum L., 2n=42) and rye (Secale cereale
L., 2n=14), both with large genomes. Arabidopsis suecica is a natural
allopolyploid with parental genomes originating from A. arenosa (4n=32)
and A. thaliana (2n=10), both of which have very small genomes.

Figure 3. Allopolyploid species – Triticosecale and A. suecica – are fertile hybrids with
different parental genomes sharing the same nuclear environment, but having a common
cytoplasm. Nucleolar dominance occurs with silencing of the rRNA genes from S. cereale and
A. thaliana origin respectively, although the DNA content of both species is dramatically
different. The DNA content of the haploid genome of A. thaliana corresponds to less than a
chromosome arm of S. cereale.

Analysis of the ribosomal chromatin organization and expression features
revealed silencing of ribosomal genes of rye origin in triticale, and of A.
thaliana origin in A. suecica [6]. Moreover, a marked developmental
regulation of nucleolar dominance was disclosed, through the
characterization of the exact moments in which that process is established
and further reprogrammed, as exemplified for triticale in Figure 4. The
evaluation of rye NOR expression patterns in triticale was performed in
developing seeds, revealing that nucleolar dominance is simultaneously
established shortly after pollination both in the embryo and in the endosperm,
indicating a total independence of the number of previous cell cycles [7].
Silencing of NORs in triticale is maintained during the development of the
sporophyte and is then reprogrammed during meiosis [8].

Figure 4. In triticale, nucleolar dominance is established during embryogenesis
and erased at meiosis.

4. EPIGENETIC MODULATION OF NUCLEOLAR DOMINANCE

The first epigenetic mark shown to be responsible for particular
ribosomal chromatin states and changes in gene expression patterns [9] was
the chemical modification of cytosines in CpG or CpNpG nucleotide
sequences, mediated by DNA methyltransferases. These enzymes are
capable of adding a methyl group de novo in both DNA strands, or
maintaining the previously established methylation pattern by methylation of
newly formed DNA strands after DNA replication. Several studies inducing
DNA hypomethylation in many hybrids demonstrated, at both the
cytological and the molecular level, the erasing of nucleolar dominance and
the consequent activity of NORs from any parental origin [10]. This direct
correlation between the differential heterochromatinization of NORs of one
parental origin in hybrids, DNA methylation at cytosine residues and the
switching off of rDNA units, clearly establishes the epigenomic origin of
nucleolar dominance. Other epigenetic tags usually associated with
chromatin remodeling are the various histone post-translational modifications
that can occur in nucleosomes. Histone modifications occur mainly on the
histone tails and are associated with the acetylation, methylation,
phosphorylation, ribosylation or ubiquitination of particular amino acid
residues. These histone marks lead to marked modifications in chromatin
organization patterns and to changes in nuclear topology of specific
chromatin domains. Disclosure of the “histone code” associated with
nucleolar dominance in hybrids was performed through identification of
distinct modified histones on NORs from both parental species that associate
with active or silent rDNA arrays.

Figure 5. Distinct epigenetic tags are associated with differential transcription states of
ribosomal chromatin.

These in-depth chromatin characterizations revealed that heterochromatic
rDNA domains display densely methylated DNA sequences, present low
levels of histone H4 acetylation, and also have an identifiable mark on
histones H3, which are methylated at lysine 9 residues. Conversely,
ribosomal euchromatin, where active rRNA genes reside, corresponds to
decondensed chromatin with mostly unmethylated DNA sequences, enriched
in acetylated histones H4 and with distinctive methylation at lysine 4
residues of histones H3 (Figure 5) [11].
Interconversions between ribosomal gene expression patterns are
mediated by several chromatin remodeling enzymes, which are being
searched for using RNA interference technology to generate loss-of-function
mutant lines. Some important enzymes responsible for the establishment and
the maintenance of nucleolar dominance have already been identified and are
directly related to the dynamics of epigenetic marks [11].

ACKNOWLEDGEMENT

We would like to express our sincere thanks to Margarida Delgado for
her inspirational help in producing the figures.

REFERENCES
1. Waddington, C.H. “Canalization of development and the inheritance of acquired
characters”, Nature, vol. 150, pp. 563-565, 1942.
2. Miller G, Berlowitz L, Regelson W. “Chromatin and histones in mealy bug cell explants:
activation and decondensation of facultative heterochromatin by a synthetic polyanion”,
Chromosoma, vol. 32, pp. 251-261, 1971.
3. Caperta A, Neves N, Morais-Cecílio L, Malhó R, Viegas W. “Genome restructuring in
rye affects the expression, organization and disposition of homologous rDNA loci” J.
Cell Sci vol. 115, pp. 2839-2846, 2002.
4. Pontes O, Lawrence RJ, Neves N, Silva M, Lee J-H, Chen ZJ, Viegas W, Pikaard CS
“Natural variation in nucleolar dominance reveals the relationship between nucleolus
organizer chromatin topology and rRNA gene transcription in Arabidopsis” Proc Natl
Acad Sci USA, vol. 100, pp.11418-11423, 2003.
5. Navashin M. “Chromosomal alterations caused by hybridization and their bearing upon
certain general genetic problems” Cytologia, vol. 5, pp. 169-203, 1934.
6. Viegas W, Neves N, Silva M, Caperta A, Morais-Cecílio L. “Nucleolar dominance: a
‘David and Goliath’ chromatin imprinting process” Curr Genomics, vol. 3, pp. 563-576,
2002.
7. Castilho A, Queiroz A, Silva M, Barão A, Neves N, Viegas W. “The developmental
stage of inactivation of rye origin rRNA genes in the embryo and endosperm of wheat x
rye F1 hybrids” Chrom Res, vol. 3, pp. 169-174, 1995.
8. Silva M, Queiroz A, Neves N, Barão A, Castilho A, Morais-Cecílio L, Viegas W.
“Reprogramming of rye rDNA in triticale during microsporogenesis” Chrom Res, vol. 3,
492-496, 1995.
9. Neves N, Castilho A, Silva M, Heslop-Harrison JS, Viegas W. “Genomic interactions:
gene expression, DNA methylation and nuclear architecture” Chrom Today, vol. 12, pp.
182-200, 1997.
10. Pikaard CS. “The epigenetics of nucleolar dominance” Trends Genet, vol. 16, 495-500,
2000.
11. Lawrence RJ, Earley K, Pontes O, Silva M, Chen ZJ, Neves N, Viegas W, Pikaard CS.
“A concerted DNA methylation/histone methylation switch controlling rRNA gene
dosage control and nucleolar dominance” Molecular Cell, vol. 13, 599-609, 2004.
BIOTECHNOLOGY OF REPRODUCTION AND
DEVELOPMENT: FROM THE BIOMEDICAL
MODEL TO ENTERPRISE INNOVATION

Luís Lopes da Costa, António Freitas Duarte and José Robalo Silva
Faculdade de Medicina Veterinária, Universidade Técnica de Lisboa, Rua Prof. Cid dos
Santos, Polo Universitário, Alto da Ajuda, 1300-477 Lisboa, Portugal,
e-mail: lcosta@fmv.utl.pt, aduarte@fmv.utl.pt, jrobalo@fmv.utl.pt

Abstract: Biotechnology methods provided a huge breakthrough in the knowledge of
reproduction and development processes in mammals and opened new
windows of opportunity for innovative enterprises in reproductive
technologies in livestock and in the pharmaceutical and biotechnological
industries. In this paper we review our recent methodological and scientific
developments and their integration in private commercially oriented
enterprises. The first part of the paper contains examples of reproductive
technologies (semen cryopreservation in the national equine breed, the
Lusitano, and embryo transfer in dairy cattle) and the second part includes
developmental biology studies on gene expression and genetic manipulation of
the laboratory mouse and their prospects in the biotechnology industry.

Key words: Reproduction, Development, Embryo, ES-Cells, Gene Targeting, Transgenesis

1. INTRODUCTION

Biotechnological methods have been of paramount relevance in the
exponential rise of our knowledge of the reproductive and developmental
processes of mammals. This has allowed applications that, in the recent past,
were little more than fictional anticipations. On the other hand, this new
knowledge and these methods open new windows of opportunity for innovative
enterprises in the livestock, pharmaceutical and biotechnological industries.
In this paper, we review our recent developments in reproductive and
developmental biology methodologies and their integration into private

259
M. Seabra Pereira (ed.), A Portrait of State-of-the-Art Research at the Technical University
of Lisbon, 259–272.
© 2007 Springer. Printed in the Netherlands.
260 L. Lopes da Costa et al.

oriented enterprises. In the first part of the paper, results on semen


cryopreservation in the national equine breed, the Lusitano, and on embryo
transfer in dairy cattle are presented and discussed. In the second part of the
paper, we outline the evolution of ongoing studies on embryonic
angiogenesis in the mouse.

2. REPRODUCTIVE TECHNOLOGIES

2.1 Semen cryopreservation in the Lusitano stallion

Techniques for artificial insemination (AI) in horses were established
long ago, but its use remained low for several decades [1] mainly due to the
fact that equine breed associations did not allow registration of foals
produced by AI. This has progressively changed and presently most equine
breed associations accept registration of foals conceived through the use of
transported cooled or frozen semen [2]. These changes were followed by a
steady increase of AI with chilled and frozen semen in some countries [3, 4]
but the percentage of inseminated mares is still low and AI is still far from
being the main method of breeding horses. However, transportation of
semen for on-farm AI and access to semen of valuable stallions are
important advantages favouring AI acceptance. The main reason why AI is
increasing so slowly is the lack of confidence in fertility, which depends on
the quality of cooled and frozen semen and is highly variable [5, 6]. This is
due to factors that are not as well studied as in other species such as cattle,
namely the lack of well standardized freezing procedures [7, 8], individual
variation among stallions in the ability of sperm to tolerate cooling and
freezing [9], and the fact that in vitro semen evaluation tests do not correlate
well with fertility [10]. Fertility evaluation through pregnancy rates of
inseminated mares is the best measure of fertility but gives a delayed answer
and is affected by fertility factors dependent on the mare, such as age, parity,
and reproductive history [4-11].
The Lusitano is a Portuguese horse breed well accepted in Europe, in
the United States of America and in South America. The Portuguese
Association of Pure-Blood Lusitano approved registration of foals produced
by AI with fresh semen in 2001 and with frozen-thawed semen in 2004. This
is expected to result in increasing interest in chilled or frozen semen
commercialisation. However, knowledge of the reproductive function of
Lusitano stallions, including seasonal variations of semen production and
quality, is scarce. That is why it was decided to carry out a study that
included evaluation of sperm production and testicular size over a whole
year. Data were taken in five Lusitano stallions, 6 to 8 years old, during the
transition periods (spring and autumn), the breeding (summer) and the non-
breeding (winter) seasons. Testicular measurements were carried out by
ultrasonography, prior to and after each period of semen collection. Ejaculates
were collected for 4 weeks in each of the sampling periods and frequency of
semen collection consisted of 2 ejaculates one hour apart every week.
Results obtained in this study [12, 13] showed that testicular volume and
semen production varied throughout the year and that testicular size was
positively correlated with total sperm number. Maximum testicular volume
and sperm production were reached in the spring, but while testicular
volume reached a minimum in the autumn, sperm production continued to
decrease and was minimum in the winter. Although the seasonal pattern of
variation was identical in all stallions there were significant differences
among stallions both for testicular size and for sperm production. These
results show that testes size and sperm production of Lusitano stallions are
depressed during the non-breeding season. This follows a pattern identical to
that observed in the rest of the northern hemisphere, but the depression is less
pronounced than at higher latitudes, where daily sperm production during winter is
reported to be half of that obtained during the breeding season [14]. The
results also show significant variability in sperm production amongst
stallions, anticipating pronounced differences in fertility, as observed in
other horse breeds after insemination with both fresh and frozen-thawed
semen [15].
Following the evidence that reproductive function of Lusitano stallions is
influenced by season, a study was undertaken on semen freezability throughout
the year. The aim was to test if spermatozoa collected during the non-
breeding season tolerate the freeze-thaw process as well as those collected
during the physiological period of breeding. Data were obtained from 6
Lusitano stallions, with ages ranging from 6 to 9 years, belonging to the
National Stud (Coudelaria Nacional) at Santarém. Four single ejaculates
were collected at 15-day intervals for freezing during spring, summer,
autumn and winter. All ejaculates for freezing were collected on Thursday,
after a double collection on Monday and single collections on Tuesday and
Wednesday, in order to avoid problems due to low ejaculation frequency that
is suggested to be associated with low fertility and low quality semen [16].
Semen was evaluated for motility, morphology and concentration
immediately after collection, and again for motility after centrifugation and
immediately before freezing. Post-thaw semen evaluation was done after two
or more months of freezing and consisted of the determination of progressive
motility and membrane integrity by the hypo-osmotic swelling test (HOS).
Sperm motility decreased with centrifugation and freezing but the decrease
was more pronounced after the freezing procedure. The effect of
centrifugation was pronounced in one of the stallions and generally there
was partial recovery of motility during the glycerolisation period. These
results show that motility is affected by freezing but the percentage of
progressively motile sperm after thawing is well above the 35 % generally
accepted as the minimum level for AI [3-5]. The efficacy of the freezing
procedure used is reinforced by the rate of spermatozoa with positive
reaction to the HOS test that evaluates whether spermatozoa membrane is
intact [17]. These results also show that parameters of sperm viability at
collection and decrease of those parameters during the freeze-thaw process
were identical in all seasons, suggesting that season affects semen
production but not semen freezability. This shows that semen can be
collected and frozen during the non-breeding season provided collection
frequency is adjusted to daily sperm production.
Progress in the use of Lusitano frozen semen is difficult to anticipate but
its evolution will require certification of AI centres and definition of criteria
for semen commercialisation. Having that in mind, technical assistance was
provided for organization of two AI centres that are awaiting official
approval. However, horse AI will also depend on development of
technologies able to allow good fertility with insemination doses much
smaller than those presently required [3-18]. Deep uterine, hysteroscopic and
oviductal inseminations are techniques under study that allow satisfactory
pregnancy rates with low doses of spermatozoa [19, 20] but these techniques
have limited practical application for routine artificial insemination.

2.2 Embryo transfer in dairy cattle

Embryo transfer (ET) has been an ancillary tool in the genetic
improvement of livestock and particularly of dairy cattle for the past 30
years. The technique involves the hormonal stimulation of a donor female
(usually of high genetic merit) to increase ovulation rate (superovulation),
the donor AI using frozen-thawed semen from a high genetic merit sire bull,
the recovery of early embryos from the donor uterus, the evaluation and
manipulation of embryos and, the transfer of embryos to recipient females
with their estrous cycle synchronized with that of the donor. This technology
still presents two main technical drawbacks for its wide use at the herd level:
a) the variation of response to the hormonal stimulation of donors (referred
as superovulatory response) and, b) the unsexed nature of the embryos
collected.
Although numerous studies have enlightened our knowledge of the
ovarian follicular population dynamics and of the cell signaling pathways,
and despite the several methods attempted to control ovulation rate, little
(if any) progress has been achieved regarding the number of good quality
(transferable) embryos produced on a per-donor and per-treatment basis. Present
embryo production efficiency has not changed, compared to that recorded
some 20 to 30 years ago [21, 22]. In a study in dairy cattle designed to
evaluate the relationship between plasma hormone (progesterone)
concentrations and embryo yield (n = 344 treatments), we reported an
average number of 5 to 6 good quality embryos in a per treatment and per
responsive donor basis [23]. We also reported a statistically significant
positive correlation between progesterone concentrations (in the luteal phase
of the preceding and of the treatment cycles) and embryo yield. In a large
data set generated from the commercial application of ET in 26 dairy herds,
we evaluated the factors related to the production of good-quality embryos
[24]. From the statistically significant factors that emerged from the analysis,
the herd of the donor was the most important source of variation in embryo
quality. These results, as also pointed out by others [25, 26], reflect the
importance, for the outcome of the operation, of animal management procedures
that, in most cases, are beyond the control of the ET technician.
The net result of this operation, i.e. the birth of a calf, is affected by the
inter-relationship of several factors. Our retrospective analysis of field
embryo transfers (n = 1653) showed that the main factors involved were the
quality of the embryo, the technical quality of the transfer procedure and the
type of recipient used [24, 27, 28]. Pregnancy rate at day 45 of gestation
varied from 71 % (excellent embryo quality combined with a good quality
transfer to a recipient heifer) to 54 % (good embryo quality combined with a
good quality transfer to a recipient cow). Cryopreservation of embryos
induced about a 15 % decrease in pregnancy rate.
One issue that is still a matter of controversy in the scientific community
is the relationship between concentrations of hormones and the probability
of carrying a pregnancy to term in the recipient. The presence of such a
correlation would enable the testing of recipients prior to transfer, in order to
select those with the most favorable hormonal background. Some authors
have reported that females with higher blood progesterone concentrations are
more likely to become pregnant and maintain pregnancy than those with
lower concentrations. However, most of these results were obtained with
inseminated females. Applying this information to recipients might not be
straightforward. In fact, an inseminated pregnant female, if sampled at the
time a recipient would receive an embryo (around day 7 of the cycle) might
provide biased information, because the hormone concentrations might
reflect the endogenous hormone output and/or the embryonic output and/or
its stimulation. The presence of an embryonic stimulatory effect on
progesterone concentrations is well known, although starting around day 15
of gestation in cattle and day 13 in sheep [29]. This is accomplished by
the embryonic product interferon-tau, which prevents the regression of luteal
function and thereby maintains pregnancy, the basis of the so-called maternal
recognition of pregnancy [30]. The presence of other earlier stimulatory
effects of embryonic origin is suggested by in vitro studies [31, 32], but no
conclusive in vivo experiments have been undertaken.
With this in view, we have designed a series of experiments to evaluate
the presence of a systemically detectable early luteotrophic factor of
embryonic origin [33]. In a controlled prospective study involving 325 virgin
heifers, we compared the progesterone profiles of females that were
submitted to AI, to ET and that were not served (controls). Females pregnant
either by AI or ET presented similar progesterone concentrations until the
day of ET. This shows that no early luteotrophic effect of embryonic origin
was detected at the peripheral level. Furthermore, differences between
pregnant and non-pregnant females were only observed after day 15 of the
cycle, which is consistent with the timing of the maternal recognition of
pregnancy mechanism. In a previous study [34] we showed that plasma
progesterone concentrations were not related to embryo survival in heifers,
but lactating cows with lower progesterone concentrations were more prone
to embryo-fetal mortality. These data suggest that lactating cows are under
production stress and that the nutritional and metabolic imbalance is
responsible for abnormal (or rather subnormal) luteal function (progesterone
production), which affects embryo survival. Also, in sheep we have
evaluated the relationship between plasma progesterone concentrations and
embryo survival and found no statistically significant relation until the time of
the maternal recognition of pregnancy mechanism [35]. These results
indicate that hormonal testing of heifer recipients, except for the rejection
of females with obviously abnormal luteal function, is of no value in a
commercial context. Testing of cow recipients prior to ET might prove
beneficial on some occasions, especially if lactating cows at their lactation
peak are used, because the prevalence of luteal dysfunction may not be
negligible under some management practices.

2.3 Embryo sexing and splitting

As stated above, the second main drawback of ET is the unsexed nature
of the recovered embryos. This aspect is highly relevant in Portugal due to
the small market and the low value of male calves. High genetic-merit male
calves originated by ET could be a potential source of income to the dairy
herd, making these products available to AI Centres or the breeding sire
market. Unfortunately, in Portugal such a breeding male market simply does
not exist. Therefore, the economics of the ET operation is severely compromised
because only female calves are of value, either to be incorporated in the herd
or to enter the breeding market. This represents a doubling of the ET
operation cost on a per-female-calf-produced basis, since the naturally
occurring sex-ratio is roughly 1:1 (male:female).
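The arithmetic behind this cost doubling is simple enough to write out explicitly; the minimal sketch below (in Python, with purely hypothetical cost figures that are not data from our programme, and ignoring biopsy costs or any effect of sexing on calving rate) only restates the reasoning of the previous paragraph:

# Hypothetical illustration of the sex-ratio cost penalty; the monetary
# figures are invented for the example and are not taken from the text.

def cost_per_female_calf(cost_per_calf_born, proportion_female):
    # If only female calves have value, the effective cost per female calf
    # is the cost of a calf born divided by the proportion of females.
    return cost_per_calf_born / proportion_female

# Unsexed embryos: a roughly 1:1 sex ratio doubles the effective cost.
print(cost_per_female_calf(1000.0, 0.5))          # 2000.0
# Sexed embryos (hypothetically ~90% of calves born being female):
print(round(cost_per_female_calf(1000.0, 0.9)))   # ~1111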
Sperm sexing would enable the AI of superovulated donors with sperm
of predetermined sex, thus resulting in all calves being of the desired sex.
However, the present low efficiency of the procedure, the high costs
associated with equipment and skilled personnel, the lower fertility of sexed
sperm, all of which are translated into a higher cost per insemination dose,
are main issues that need to be overcome in order to envisage the wide
dissemination of this technology [36]. While sperm sexing still awaits a cost-
effective method for the large-scale production of frozen fertile artificial
insemination doses, embryo sexing is the alternative of choice. Several
procedures for the sexing of embryos have been described but only those
using DNA probes proved to be reliable enough to be used in
commercial operations. In our lab, we have implemented the technology of
embryo sexing as a service to ET teams. The procedure involves performing
an embryo biopsy by micromanipulation techniques [37], the amplification
of Y-chromosomal DNA sequences by PCR and the visualization of specific
male-bands after electrophoresis of PCR products [38]. Embryos are
collected at the dairy herds (within 200 km of the lab) by ET teams and
brought to the lab for the sexing procedure. After completion of the biopsies,
embryos are sent back to the herd and, at their arrival, sexing results are
usually already available and communicated to the farmer. Embryos are then
transferred to recipients by ET teams according to farmer instructions.
Alternatively, all embryos are frozen after being biopsied and sent back to
the herd for future direct transfer to recipients.
One problem faced by our lab was that farmers wanted all embryos to be
sexed, irrespective of their quality. The results show that, compared to good
quality embryos, poor quality embryos gave a higher percentage of no sex
diagnosis (23.1 % vs 6.8 %). Also, poor quality embryos originated a lower
calving rate after transfer to recipient heifers than good quality embryos (14
% vs 50 %). The calving rate achieved after transfer of fresh biopsied and
sexed good quality embryos to recipient heifers (50 %) and the accuracy of
sexing (93 %; obtained through the comparison of the diagnosed sex with
the actual sex observed after birth) are similar to those reported by other
teams [39, 40]. Errors in the sex diagnosis occurred only when the sexing
result (DNA bands in the electrophoresis) was not clear. In fact, it seems
that whenever the sex diagnosis is not straightforward, no attempt should be
made to assign a sex to the embryo [39].
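For a rough sense of what these figures imply in practice, the expected number of confirmed female calves per good-quality embryo submitted for sexing can be chained together as below (a minimal illustrative sketch in Python; it assumes that roughly half of the diagnoses are "female", that only fresh female-diagnosed embryos are transferred to heifers and that the rates above apply independently, none of which is a reported result):

# Expected female calves per good-quality embryo submitted for sexing.
# Rates are taken from the text; the multiplication chain and the 50%
# "diagnosed female" fraction are assumptions made only for this example.

p_diagnosed     = 1 - 0.068  # good-quality embryos that receive a sex diagnosis
p_called_female = 0.5        # assumed: about half of the diagnoses are "female"
p_calving       = 0.50       # calving rate of fresh biopsied good-quality embryos
p_truly_female  = 0.93       # sexing accuracy (diagnosed sex confirmed at birth)

expected_female_calves = (p_diagnosed * p_called_female
                          * p_calving * p_truly_female)
print(round(expected_female_calves, 2))  # ~0.22 female calves per embryo sexed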
There exists a short period of embryonic life during which spontaneous or
induced division of the embryo will originate two (eventually up to four)
developing pregnancies carrying the same genomic and cytoplasmic
components, leading to the potential birth of true (monozygotic) twins [41,
42]. These twins can be obtained through the division (splitting) of early
preimplantation embryos and are of great value for animal experimentation
and clinical trials. Their use reduces the genetic variation amongst the
control and treatment groups within an experiment, which strengthens the
statistical approach and biological extrapolations. Also, it allows reduction of
the number of animals needed for the experiment, which is of relevance in
the context of the ethics of animal experimentation.
This approach (monozygotic twins) is, in our opinion, also of great
interest for the generation of biomedical models to study the interaction
between the genomics of the foetus and the uterine environment, including
epigenomic effects. We have been using this model to study embryo-mother
communication and embryo survival, as well as changes in fetal growth and
post-natal calf phenotype in the bovine species. Preliminary results
(manuscripts in preparation) show that even monozygotic twins incubated in
the same uterus will present phenotype variation, although to a lesser extent
than that observed in control unrelated populations.
Another application of this embryo technology is to increase the number
of embryos for transfer, particularly when the numbers collected are small.
This can be used at the commercial level by ET teams [43]. Together with
embryo sexing it can partially compensate for the discard of embryos of the
unwanted sex [44, 45]. We have been using embryo splitting in commercial
ET with a 35 % pregnancy rate in heifers (single half-embryo transfer) and
50 % in cows (double half-embryo transfer; 30 % of twin pregnancies). This
might represent a higher efficiency per original embryo, since commercial
pregnancy rates upon transfer of one whole fresh embryo are around 70 % in
heifers and 50 % in cows.
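The "efficiency per original embryo" comparison can be sketched as follows (a minimal illustration in Python using the rates quoted above; it assumes that both demi-embryos are always transferred and treats pregnancy rates as if they translated directly into offspring, ignoring the embryo-fetal losses discussed earlier, so the numbers are indicative only):

# Expected pregnancies (or calves) per original embryo, using the quoted rates.
# Assumptions for the example only: both halves of a split embryo are
# transferred and every pregnancy is carried to term.

def split_heifers(preg_rate_per_half=0.35):
    # One embryo -> two halves, each transferred singly to a recipient heifer.
    return 2 * preg_rate_per_half

def split_cows(preg_rate=0.50, twin_fraction=0.30):
    # One embryo -> two halves transferred together to a single recipient cow;
    # 30% of the resulting pregnancies carry twins.
    return preg_rate * ((1 - twin_fraction) + 2 * twin_fraction)

print(round(split_heifers(), 2))  # 0.7, on a par with the ~70 % whole-embryo rate in heifers
print(round(split_cows(), 2))     # 0.65, versus ~0.50 for one whole embryo in a cow

Under these simplifying assumptions, the gain per original embryo comes mainly from the cow recipients, while in heifers the split roughly matches the whole-embryo figure.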

3. DEVELOPMENTAL BIOTECHNOLOGY

We have identified a novel mammalian Delta homologue, Delta-like 4 (Dll4),
which is specifically expressed in arterial endothelial cells, a unique
expression that prompted us to address its function through the production
and analysis of loss- and gain-of-function mouse mutants. These have shown
that Dll4 alone is required in a dosage-sensitive manner for normal arterial
patterning in development, implicating it as the specific mammalian ligand
for autocrine endothelial Notch signalling. These mutants will now be used
to identify the downstream target genes responsible for the establishment of
the arterial endothelial phenotype. We are currently working towards the
characterization of the expression profiles of the endothelial cells from the
mutants and compare them with that of wild-type endothelial cells, an
approach that should lead to the discovery of genes implicated in arterial
specification downstream of Notch signalling and help clarify the
mechanistic basis of that process. In turn, our understanding of the
mechanisms by which Notch regulates arterial/venous specification should
provide insights into the pathological angiogenesis that supports cancer
growth.
Notch signalling is a conserved pathway that functions to modulate cell-
fate decisions of a wide variety of cell types [46]. In mammals, there are 4
Notch genes, Notch1–Notch4, and 5 ligands, Jagged1, Jagged2, Dll1, Dll3
and Dll4. Notch signaling was shown to inhibit or induce differentiation,
induce proliferation and promote cell survival, modulating this array of cell-
fate decisions by regulating the expression of genes in a cell-type-specific
manner. Receptors and ligands are transmembrane cell surface proteins. The
receptors are heterodimeric with an extracellular and an intracellular peptide.
Upon ligand binding, the cytoplasmic domain is released from the cell
surface by proteolytic cleavage and translocates to the nucleus, where it
interacts with the CSL transcriptional repressor, converting it to a
transcriptional activator. The direct target genes of Notch/CSL signalling are
the Hairy/Enhancer of Split (HES) and HES-related transcriptional
repressors. Mutations in humans, mice and zebrafish demonstrated the
importance of Notch signalling in the regulation of vascular development
[47]. In zebrafish, Notch signaling is required for arterial identity by
suppressing the venous fate in developing artery cells [48]. In mice, Notch4
and Dll4 are specifically expressed in arterial endothelial cells, suggesting a
similar role. Deletion of the Notch4 receptor gave no obvious phenotype
alone, although Notch1/Notch4 double mutant embryos show severe
vascular remodelling defects [49], also observed in Hey1/Hey2 double
mutants [50]. In contrast, we have shown that the Dll4 ligand alone is
required in a dosage-sensitive manner for normal arterial patterning in
development [51]. We observed the death in utero of a variable proportion of
the Dll4+/- embryos. The Dll4+/- embryos showed defects in vascular
structures, most notably the reduction of the calibre of the dorsal aortae. The
homozygous mutant embryos were found at normal Mendelian frequencies
at E9.5 but at E10.5 no viable Dll4-/- embryos were identified. The Dll4-/-
embryos showed more severe and precocious vascular defects than the
heterozygotes. The correct migration and aggregation of the angioblasts
occurred to form the dorsal aortae; however, these showed a marked
reduction in diameter by E8.75. By E9.0, null embryos were highly delayed
and abnormal, with drastically reduced dorsal aortic diameter. By E9.5 the
dorsal aortae were absent or reduced to a rudimentary capillary plexus.
Although the major arteries and veins of the embryo form in the absence of
Dll4, their later development is severely disrupted. In the Dll4-/- embryos
none of the arterial markers studied were expressed in the endothelium,
consistent with the proposed pathway from zebrafish, where VEGF
signalling upstream of Notch promotes the arterial cell fate. In addition, the
venous marker EphB4 was ectopically expressed in the dorsal aortae. In
some null embryos the dorsal aorta fused with the anterior cardinal vein,
indicating loss of separate identity of the two vessels. These data strongly
suggest an involvement of the Notch signalling pathway, mediated through
the Dll4 ligand in a cell-autonomous manner, in the establishment of the
endothelial arterial cell phenotype in mice. The similarity of the Dll4-/-
phenotype with Notch1/4 double mutants is consistent with this, as is the
contrasting effect of endothelial-specific activated Notch expression. Loss of
arterial vascular identity, in Dll4-/- mutants, in turn, could cause angiogenic
defects leading to a generalized disruption of the vasculature and embryonic
death. This is the first case of lethal haploinsufficiency for any Notch
signalling component in mammals. However, vascular development-related
haploinsufficiency has also been reported for VEGF, which lies upstream of
Notch signalling in arterial development. It seems that the development and
patterning of the arterial system may be controlled by levels of availability
of critical ligands. Sensitivity of the embryonic vasculature to Dll4 levels
raises the possibility that Dll4 might be a good target for intervention in
adult neovascularization. This sensitivity to expression levels was also
observed in the gain-of-function mutants. In contrast with the atrophy and
atresia of the dorsal aortae observed in Dll4 +/- and Dll4 -/-, respectively, the
Dll4 overexpression transgenic embryos exhibited aortic hypertrophy [52]. A
variable degree of arteriovenous merging was also observed, probably as a
consequence of loss of venous endothelial identity. In fact, Dll4
overexpression caused ectopic venous expression of arterial markers. These
two complementary phenotypes are suggestive of an important role for Dll4
in the development of the vascular bed, in particular in arteriogenesis, where
it may be responsible for the activation of genes involved in the
establishment of the arterial endothelial cell phenotype. Although there is
compelling evidence for such a role, nothing is yet known about its mechanistic
basis. We will attempt to fill this void by discovering novel downstream
genes that are responsible for mediating the effect of Notch activation in the
endothelial cells.
Our preliminary data from gene expression profiling of the mutants is
very promising. A number of novel genes have been found to be
differentially expressed, either upregulated in the gain-of-function mutant
and down-regulated in the knockout or vice-versa. We are therefore
confident that this approach will lead to the discovery of novel genes
implicated in arterial specification downstream of Notch signalling and help
clarify the mechanistic basis of that process. In turn, our understanding of
the mechanisms by which Notch regulates arterial/venous specification
should provide insights into the pathological angiogenesis that supports
cancer growth.
Our previous work suggests that the development and patterning of the
arterial system may be controlled by levels of availability of critical ligands.
Interestingly, both VEGF and Dll4 have been shown to be up-regulated by
hypoxia, which is one of the environmental factors that can impinge on
vascular patterning and growth. Presumably, exquisite sensitivity to ligand
levels helps to ensure appropriate vascular responses to changing external
environments. Sensitivity of the embryonic vasculature to Dll4 levels raises
the possibility that Dll4 will be a good target for therapeutic intervention in
adult neovascularization.

REFERENCES
1. Boyle MS. “Artificial insemination in the horse”, Annales de Zootechnie, 41, pp. 311-
318, 1992.
2. Loomis PR. “The equine frozen semen industry”, Animal Reproduction Science, 68, pp.
191-200, 2001.
3. Vidament M. “French field results (1985-2005) on factors affecting fertility of frozen
stallion semen”, Animal Reproduction Science, 89, pp. 115-136, 2005.
4. Loomis PR, Squires EL. “Frozen semen management in equine breeding programs”,
Theriogenology, 64, pp. 480-491, 2005.
5. Vidament M, Dupere AM, Julienne P, Evain A, Noue P, Palmer E. “Equine frozen
semen: freezability and fertility field results”, Theriogenology, 48, pp. 907-917, 1997.
6. Samper JC. “Management and fertility of mares bred with frozen semen”, Animal
Reproduction Science, 68, pp. 219-228, 2001.
7. Squires EL, Keith SL, Graham JK. “Evaluation of alternative cryoprotectants for
preserving stallion spermatozoa”, Theriogenology, 62, pp. 1056-1065, 2004.
8. Vidament M, Vincent P, Yvon JM, Bruneau B, Martin FX. “Glycerol in semen extenders
is a limiting factor in the fertility in asine and equine species”, Animal Reproduction
Science, 89, pp. 302-305, 2005.
9. Ecot P, Vidament M, deMornac A, Perigault K, Clément F, Palmer E. “Freezing of
stallion semen: interactions among cooling treatments, semen extenders and stallions”,
Journal of Reproduction and Fertility Supplement, 56, pp. 141-150, 2000.
10. Kirk ES, Squires EL, Graham JK. “Comparison of in vitro laboratory analysis with the
fertility of cryopreserved stallion spermatozoa”, Theriogenology, 64, pp. 1422-1439,
2005.
11. Watson PF. “The causes of reduced fertility with cryopreserved semen”, Animal
Reproduction Science, 60-61, pp. 481-492, 2000.
12. Robalo Silva J, Barbosa M, Agrícola R, Mateus L. “Effect of season on testicular size
and function of Lusitano stallions”, Proc. of the IV Congresso Ibérico de Reprodução
Animal, Arucas (Las Palmas), Spain, 2003, pp. 88.
13. Agrícola R, Barbosa M, Mateus L, Robalo Silva J. “Effect of season on testicular
dimensions and sperm production in Lusitano stallions”, Proc. of the Internacional
Veterinary Congress (Voorjaarsdagen congress), Amsterdam, The Netherlands, 2004,
pp. 223.
14. Picket BW, Amann RP. “Cryopreservation of semen”, In: McKinnon AO and Voss JL
(editors), Equine reproduction, Philadelphia, Lea & Febiger, 1993, pp. 769-789.
15. Backman T, Bruemmer JE, Graham JK, Squires EL. “Pregnancy rates of mares
inseminated with semen cooled for 18 hours and then frozen”, Journal of Animal
Science, 82, pp. 690-694, 2004.
16. Sieme H, Katila T, Klug E. “Effect of semen collection practices on sperm
characteristics before and after storage and on fertility of stallions”, Theriogenology, 61,
pp. 769-784, 2004.
17. Neild DM, Gadella BM, Chaves MG, Miragaya MG, Colenbrander B, Aguero A.
“Membrane changes during different stages of a freeze-thaw protocol for equine semen
preservation”, Theriogenology, 59, pp. 1693-1705, 2003.
18. Gahne S, Ganheim A, Malugren L. “Effect of insemination dose on pregnancy rate in
mares”, Theriogenology, 49, pp. 1071-1074, 1998.
19. Morris LH, Tiplady CA, Allen WR. “Pregnancy rates in mares after a single fixed-time
hysteroscopic insemination of low numbers of frozen-thawed spermatozoa onto
uterotubal junction”, Equine Veterinary Journal, 35, pp. 197-201, 2003.
20. Lyle SK, Ferrer MS. “Low-dose insemination – Why, when and how”, Theriogenology,
64, pp. 572-579, 2005.
21. Hasler JF. “The current status and future of commercial embryo transfer in cattle”,
Animal Reproduction Science, 79, pp. 245-264, 2003.
22. Hasler JF. “The Holstein cow in embryo transfer today as compared to 20 years ago”,
Theriogenology, 65, pp. 4-16, 2006.
23. Chagas e Silva J, Lopes da Costa L, Robalo Silva J. “Embryo yield and plasma
progesterone profiles in superovulated dairy cows and heifers”, Animal Reproduction
Science, 69, pp. 1-8, 2002.
24. Lopes da Costa L, Chagas e Silva J. “Interacting factors affecting fertility following
embryo transfer in dairy cattle”, Proc of the II Congreso Ibérico de Reproducción
Animal, Lugo, Spain, 1999, pp. 68-78.
25. Hahn J. “Attempts to explain and reduce variability of superovulation”, Theriogenology,
38, pp. 269-275, 1992.
26. Stroud B, Hasler JF. “Dissecting why superovulation and embryo transfer usually work
on some farms but not on others”, Theriogenology, 65, pp. 65-76, 2006.
27. Chagas e Silva J, Cidadão MR, Lopes da Costa L. “Effect of parity and type of estrus of
recipient on pregnancy rate following embryo transfer in dairy cattle”, Proc. of the 15th
Scientific Meeting of the European Embryo Transfer Association, Lyon, France, 1999,
pp. 132.
28. Chagas e Silva J, Cidadão R, Robalo Silva J, Lopes da Costa L. “Effect of embryo and
recipient on pregnancy rate and embryo-fetal mortality following transfer in cattle”,
Proc. of the Third Conference of the European Society for Domestic Animal
Reproduction, Anger, France, 1999, pp. 47-48.
29. Mann GE, Lamming GE, Robinson RS, Wathes DC. “The regulation of interferon-τ
production and uterine hormone receptors during early pregnancy”, Journal of
Reproduction and Fertility Supplement, 54, pp. 317-328, 1999.
30. Mann GE, Lamming GE. “Relationship between maternal endocrine environment, early
embryo development and inhibition of the luteolytic mechanism in cows”, Reproduction,
121, pp. 175-180, 2001.
31. Thibodeaux JK, Brossard JR, Godke RA, Hansel W. “Stimulation of progesterone
production in bovine luteal cells by coincubation with bovine blastocyst-stage embryos
or trophoblastic vesicles”, Journal of Reproduction and Fertility, 101, pp. 657-662,
1994.
32. Vasques MI, Marques CC, Pereira RM, Baptista MC, Horta AEM. “Luteotrophic effect
of bovine embryos and different sera supplementation on granulosa cell monolayers in
vitro”, Revista Portuguesa de Ciências Veterinárias, 525, pp. 25-30, 1998.
33. Chagas e Silva J, Lopes da Costa L. “Luteotrophic influence of early bovine embryos
and the relationship between plasma progesterone concentrations and embryo survival”,
Theriogenology, 64, pp. 49-60, 2005.
34. Chagas e Silva J, Lopes da Costa L, Robalo Silva J. “Plasma progesterone profiles and
factors affecting embryo-fetal mortality following embryo transfer in dairy cattle”,
Theriogenology, 58, pp. 51-59, 2002.
35. Chagas e Silva J, Lopes da Costa L, Cidadão R, Robalo Silva J. “Plasma progesterone
profiles, ovulation rate, donor embryo yield and recipient embryo survival in native
Saloia sheep in the fall and spring breeding seasons”, Theriogenology, 60, pp. 521-532,
2003.
36. Seidel Jr GE. “Sexing mammalian sperm – intertwining of commerce, technology, and
biology”, Animal Reproduction Science, 79, pp. 145-156, 2003.
37. Herr CM, Reed KC. “Micromanipulation of bovine embryos for sex determination”,
Theriogenology, 35, pp. 45-54, 1991.
38. Lopes da Costa L, Chagas e Silva J, Diniz P, Cidadão R. “Preliminary report on sexing
bovine pre-implantation embryos under the conditions of Portugal”, Revista Portuguesa
de Ciências Veterinárias, 97, pp. 95-98, 2002.
39. Thibier M, Nibart M. “The sexing of bovine embryos in the field”, Theriogenology, 43,
pp. 71-80, 1995.
40. Shea BF. “Determining the sex of bovine embryos using polymerase chain reaction
results: a six year retrospective study”, Theriogenology, 51, pp. 841-854, 1999.
41. Willadsen SM. “A method for culture of micromanipulated sheep embryos and its use to
produce monozygotic twins”, Nature, 277, pp. 298-300, 1979.
42. Bredbacka P, Huhtinen M, Aalto J, Rainio V. “Viability of bovine demi- and quarter-
embryos after transfer”, Theriogenology, 38, pp. 107-113, 1992.
43. Gray KR, Bondioli KR, Betts CL. “The commercial application of embryo splitting in
beef cattle”, Theriogenology, 35, pp. 37-44, 1991.
44. Bredbacka P, Velmala R, Peippo J, Bredbacka K. “Survival of biopsied and sexed
bovine demi-embryos”, Theriogenology, 41, pp. 1023-1031, 1994.
45. Lopes RFF, Forell F, Oliveira ATD, Rodrigues JL. “Splitting and biopsy for bovine
embryo sexing under field conditions”, Theriogenology, 56, pp. 1383-1392, 2001.
46. Artavanis-Tsakonas S, Rand MD, Lake RJ. “Notch signaling: cell fate control and signal
integration in development”. Science, 284 (5415), pp. 770-776, 1999.
47. Shawber CJ, Kitajewski J. “Notch function in the vasculature: Insights from zebrafish,
mouse and man”. Bioessays, 26, pp. 225-234, 2004.
48. Lawson ND, Vogel AM, Weinstein BM. “Sonic hedgehog and vascular endothelial
growth factor act upstream of the Notch pathway during arterial endothelial
differentiation”, Developmental Cell, 3, pp. 127-136, 2002.
49. Krebs LT, Xue Y, Norton CR, Shutter JR, Maguire M, Sundberg JP, Gallahan D,
Closson V, Kitajewski J, Callahan R, Smith GH, Stark KL, Gridley T. “Notch signaling
is essential for vascular morphogenesis in mice”. Genes and Development, 14, pp. 1343-
1352, 2000.
50. Fisher A, Schumacher N, Maier M, Sendtner M, Gessler M. “The Notch target genes
Hey1 and Hey2 are required for embryonic vascular development”, Genes and
Development, 18, pp. 901-911, 2004.
51. Duarte A, Hirashima M, Benedito R, Trindade A, Diniz P, Bekman E, Costa L, Henrique
D, Rossant J. “Dosage-sensitive requirement for mouse Dll4 in artery development”,
Genes and Development, 18, pp. 2474-2478, 2004.
52. Trindade A, Lopes da Costa L, Duarte A. Unpublished data.
PART V

ENGINEERING AND TECHNOLOGIES


EVOLUTION AND CHALLENGES IN
MULTIMEDIA REPRESENTATION
TECHNOLOGIES

Fernando Pereira, João Ascenso, Catarina Brites, Pedro Fonseca,
Pedro Pinho and Joel Baltazar
Instituto de Telecomunicações, Instituto Superior Técnico, Universidade Técnica de Lisboa,
Av. Rovisco Pais, 1049-001, Lisboa, Portugal, e-mail: fp@lx.it.pt

Abstract: Multimedia information has been conquering a central role in our society in
recent years. For multimedia services and applications to be possible,
multimedia information has to be efficiently and effectively represented. This
paper addresses three challenges in this area: coding, description and
adaptation. In order for multimedia information to be used, it is essential that it
is coded not only in an efficient way but also in an error resilient way due to
the growing importance of error prone channels such as mobile networks;
moreover, it must allow random access, interactivity and low-complexity
implementation. In addition, the huge amount of multimedia information
available requires efficient management in terms of identification, filtering
and retrieval: information that cannot be found is the same as information
that does not exist. This asks for powerful ways to describe multimedia
information, that is, to create data about the data that allows the data to be
identified, filtered and retrieved, making it more useful simply because it can
be effectively found when needed. Finally, the increasing heterogeneity of the
multimedia consumption conditions in terms of networks, terminals,
environments and users asks for content to be adequately adapted and
customized so that users get the
best possible multimedia experience for the conditions in place. This paper
describes research work developed at the Image Group of Instituto de
Telecomunicações at Instituto Superior Técnico in the areas of the three main
challenges above identified: coding, description and adaptation.

Key words: Multimedia representation, coding, description, metadata, summarization,
adaptation, MPEG standards.


1. INTRODUCTION

Producing content is nowadays easier and easier. Digital still cameras
directly storing in JPEG format have hit the mass market. Together with
digital video cameras directly recording in MPEG formats, this represents a
major step for the acceptance, in the consumer market, of digital audiovisual
acquisition technology. This step transforms every one of us in a potential
content producer, capable of creating content that can be easily distributed
and published using the Internet and even with mobile terminals. Moreover
more content is being synthetically produced – computer generated – and
integrated with natural material in truly hybrid audiovisual content. The
various pieces of content, digitally encoded, can be successively reused
without the quality losses typical of the previous analog processes. While
audiovisual information, notably the visual part, was until recently only
carried over very few networks, the trend is now towards the generalization
of visual information in every single network. Moreover the increasing
mobility in telecommunications is a major trend. Mobile connections are not
limited to voice, but other types of data, including real-time video, are
already available. The explosion of the Web and the acceptance of its
interactive mode of operation have clearly shown that the traditional
television paradigm does not suffice for audiovisual services. Users want to
have access to audio and video like they now have access to text and
graphics. This requires moving pictures and audio of acceptable quality at
low bitrates on the Web, and Web-type interactivity with live content.
For the scenario above to happen, some technologies in the area of
multimedia representation had to become mature and provide powerful
solutions. Among these technologies, compression, description and
adaptation are among the most relevant, still proposing technical challenges
that have to be addressed.
Content compression, also known as coding, has been the first big
challenge in the area of digital multimedia representation. This relates to the
fact that the straight digital representation associated with the sampling of the
analogue information using the Nyquist theorem simply results in huge
amounts of data and bitrates which would not allow the viable deployment
of digital services and applications. This fact gave rise to a growing
investment in technologies able to reduce the amount of data associated with
the representation of multimedia information with a certain quality,
exploiting concepts such as temporal and spatial redundancy, and
irrelevancy. The set of audio and video compression solutions associated with
the MPEG standards (MPEG-1, MPEG-2 and MPEG-4) allowed the
explosion at the end of the last century of important audiovisual digital
services and products such as digital television, audio downloading, digital
cameras, DVD players, etc.
With the growing ease of producing digital multimedia content,
notably images, music and video, it became evident that it would be essential
to develop technologies to manage all this content more efficiently,
notably for filtering and retrieval. Content description or metadata, that is,
data about the data, has been the second major challenge in the area
of content representation. It targets the representation of the audiovisual
information at the description (not reproduction) level, that is, not for
seeing or hearing but for management, identification, retrieval and filtering.
The descriptors involved may be the more usual textual descriptors such as
the title, director and year of a movie, or the less common low-level
descriptors such as the melody of a song or the colors and motions in a
video. In this area, the MPEG-7 standard provides the most relevant
metadata solution available and, based on it, powerful applications are
already possible such as summarization.
Finally, a more recent challenge: content adaptation and customization.
The growing heterogeneity of networks, terminals, and consumption
environments as well as the consideration of the user preferences, needs and
handicaps asks for the adaptation of the available multimedia content to the
usage conditions in order to provide the user with the best possible experience for
the content he/she wants in the conditions at hand. The adaptation process
may involve transcoding (content is transformed keeping the modality and
amount of information but changing fidelity, e.g., video to video conversion
changing spatial resolution, coding format or just quality), transmoding
(content is transformed changing modality but keeping as much as possible
the amount of information, e.g., video to picture, text to speech) or semantic
filtering (content is transformed keeping the fidelity and modality but
changing the amount of information, e.g., a video summary according to
some criteria, region of interest, adult/aggressive content filtering).
This paper presents research work developed at the Image Group of
Instituto de Telecomunicações at Instituto Superior Técnico [1] in the areas
of the three main challenges above identified. First, developments on the
hottest research issue in the area of video coding at the beginning of 2006,
distributed video coding, are presented. Next, a summarization solution
based on the MPEG-7 standard is described and, finally, an image-to-video
transmoding system is presented.
2. ADVANCED VIDEO CODING

Nowadays, the most popular digital video coding solutions are
represented by the ITU-T and ISO/IEC MPEG standards, and rely on the
powerful hybrid block-based transform and interframe predictive coding
paradigm. In this coding framework, the encoder architecture is based on the
combination of motion estimation tools with DCT transform, quantization
and entropy coding in order to exploit the temporal, spatial and statistical
redundancy in a video sequence. In this framework, the encoder has a higher
computational complexity than the decoder, typically 5 to 10 times more
complex [2], especially due to the very computationally demanding motion
estimation operations. This type of architecture is well-suited for
applications where the video is encoded once and decoded many times, i.e.
one-to-many topologies, such as broadcasting or video-on-demand, that is,
where the cost of the decoder is more critical than the cost of the
encoder.
In recent years, with emerging applications such as wireless low-power
surveillance, multimedia sensor networks, wireless PC cameras and mobile
camera phones, the traditional video coding architecture is being challenged.
These applications have different requirements than those related to
traditional video delivery systems. For some applications, it is essential to
have a low-power consumption both at the encoder and decoder, e.g. in
mobile camera phones. In other types of applications, notably when there is
a high number of encoders and only one decoder, e.g. surveillance, low cost
encoder devices are needed. To fulfill these new requirements, it is essential
to have a coding configuration with a low-power and low-complexity
encoder device, possibly at the expense of a high-complexity decoder. In this
configuration, the goal in terms of compression efficiency would be to
achieve a coding efficiency similar to the best available hybrid video coding
schemes (e.g. the recent H.264/AVC standard [2]); that is, the shift of
complexity from the encoder to the decoder should ideally not compromise
the coding efficiency. Although this is currently rather far from happening
and much research is still needed in this area, the most promising solution
around is the so-called distributed video coding (DVC) paradigm, explained
in the following.
From information theory, the Slepian-Wolf theorem [3] states that it
is possible to compress two statistically dependent signals, X and Y, in a
distributed way (separate encoding, jointly decoding) using a rate similar to
that used in a system where the signals are encoded and decoded together,
i.e. like in traditional video coding schemes. The complement of Slepian-
Wolf coding for lossy compression is the Wyner-Ziv work on source coding
with side information at the decoder [4]. This corresponds to a particular
case of Slepian-Wolf coding, which deals with source coding of the X
sequence considering that the Y sequence, known as side information, is
only available at the decoder. Wyner and Ziv showed that there is no
increase in the transmission rate if the statistical dependency between X and
Y is only exploited at the decoder compared to the case where it is exploited
both at the decoder and the encoder (with X and Y jointly Gaussian and a
mean-square error distortion measure) [4].
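For reference, these two classical results can be stated compactly as follows; this is a standard restatement of [3] and [4] (R denotes encoding rate, H entropy and D distortion), not something specific to the codec described in this paper.

```latex
% Slepian-Wolf: admissible rate region for separate encoding and joint
% decoding of two correlated sources X and Y (lossless case):
\begin{aligned}
  R_X &\geq H(X \mid Y),\\
  R_Y &\geq H(Y \mid X),\\
  R_X + R_Y &\geq H(X, Y).
\end{aligned}
% Wyner-Ziv (lossy case, side information Y available only at the decoder):
% in general R^{WZ}_{X|Y}(D) \geq R_{X|Y}(D), with equality when X and Y are
% jointly Gaussian and the distortion measure is the mean-square error.
```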
Today, one of the most studied distributed video codecs uses a turbo-
based pixel domain Wyner-Ziv coding scheme [5], because of its simple and
low complexity encoder architecture. The decoder is responsible for exploiting
the source statistics, and therefore for achieving compression, in the Wyner-Ziv
solution, which represents a major departure from current video coding
architectures. In the proposed solution, the video frames are organized into
key frames and Wyner-Ziv frames; the key frames are encoded with a
conventional intraframe codec and the frames between them are Wyner-Ziv
encoded. At the decoder, the side information is generated using previously
decoded key frames and motion interpolation tools, responsible to obtain an
accurate interpolation of the frame to be Wyner-Ziv decoded. The more
accurate the side information is the fewer are the Wyner-Ziv bits required to
provide a reliable decoding of the Wyner-Ziv frame. Thus, the rate-distortion
(RD) performance of such a Wyner-Ziv video coding scheme is highly
dependent on the quality of the side information and the challenge is how to
generate the best side information (a frame) as close as possible to the
current Wyner-Ziv frame to be decoded. In this context, this paper describes
a novel motion compensated frame interpolation scheme based on spatial
motion smoothing and evaluates the RD performance when this scheme is
used in a turbo-based pixel domain Wyner-Ziv video codec in comparison
with simpler motion estimation solutions.

2.1 Pixel Domain Wyner-Ziv Video Codec Architecture

The Pixel Domain Wyner-Ziv (PDWZ) solution presented here is based
on the pixel domain Wyner-Ziv coding architecture proposed in [5]. The
main advantage of this approach is the low computational complexity
offered, since it uses only a uniform quantizer and a turbo encoder for the
Wyner-Ziv frames; as an initial solution, the key frames are perfectly
reconstructed at the decoder. This scheme can provide interesting coding
solutions for some applications where low encoding complexity is a major
goal, e.g. video-based sensor networks.
Figure 1 illustrates the global architecture of the PDWZ codec. This
general architecture makes use of a quantizer, a turbo-code based Slepian-
Wolf codec, a frame interpolation module and a reconstruction module.
However, there is a major difference between the PDWZ solution proposed
here and the solution in [5] regarding the frame interpolation tools used to
generate the side information, which is the best estimate made at the decoder
of the Wyner-Ziv frame being decoded.
In a nutshell, the coding procedure illustrated in Figure 1 is described as
follows: each even frame of a video sequence, X2i, called a Wyner-Ziv frame, is
encoded pixel by pixel. Over the resultant quantized symbol stream q2i
(constituted by all the quantized symbols of X2i using M levels) bitplane
extraction is performed and each bitplane is then independently turbo-
encoded. In the PDWZ solution, the turbo coding is performed at the
bitplane level. The decoder frame interpolation module generates the side
information, Y2i, which is then used by the turbo decoder to obtain the
decoded quantized symbol stream q’2i. The side information is also
necessary in the reconstruction module, together with the q’2i stream, to help
in the X2i reconstruction task. As shown in the architecture, the Slepian-Wolf
encoder includes a turbo encoder and a buffer and it produces a sequence of
parity bits (redundant bits) associated to each pixel bitplane; the amount of
parity bits produced for each bitplane depends on the turbo encoder rate.
Only the luminance or also the chrominances may be encoded in a similar
way. In this architecture, two identical recursive encoders of rate ½ are used.
The parity bits generated by the turbo encoder are then stored in the buffer,
punctured and transmitted upon request by the decoder; the systematic bits
are discarded. The puncturing operation allows sending only a fraction of the
parity bits and follows a specific puncturing pattern.
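A minimal sketch of the two encoder-side operations described above, uniform quantization to M levels followed by bitplane extraction, is given below; it is only illustrative under these assumptions and does not reproduce the actual IST-PDWZ implementation, and the turbo encoding and puncturing steps are omitted.

```python
import numpy as np

def wz_encode_frame(frame: np.ndarray, M: int = 16):
    """Illustrative encoder-side steps of a pixel-domain Wyner-Ziv codec:
    uniform quantization of an 8-bit luminance frame to M levels followed
    by bitplane extraction (most significant bitplane first)."""
    # Uniform quantizer: map the 0..255 range into M bins.
    q = (frame.astype(np.uint16) * M // 256).astype(np.uint8)
    n_bits = int(np.ceil(np.log2(M)))
    # Extract bitplanes; each bitplane would then be turbo-encoded
    # independently and only (punctured) parity bits kept.
    bitplanes = [((q >> b) & 1).astype(np.uint8)
                 for b in reversed(range(n_bits))]
    return q, bitplanes
```

With M = 16 levels, for instance, four bitplanes per frame are produced, each of which would then be turbo-encoded and punctured as described above.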

Figure 1. IST-PDWZ video codec architecture.

The feedback channel is necessary to adapt to the changing statistics
between the side information and the frame to be encoded, i.e. to the quality
(or accuracy) of the frame interpolation. In this way, it is guaranteed that
only a minimum of parity bits are sent in order to correct the
mismatches/errors which are present in each bitplane, and thus a minimum
rate is achieved. An ideal error detection capability is assumed at the
decoder, i.e. the decoder is able to measure in a perfect way the current
bitplane error rate, Pe. If Pe exceeds 10⁻³, the decoder requests more parity bits
from the encoder. In the decoder, the iterative MAP (Maximum A Posteriori)
turbo decoder employs a Laplacian noise model to aid in the error correction
capability of the turbo codes. This model provides a good fit to the residual
distribution between the side information and the frame to be encoded. The
distribution parameter of the Laplacian distribution was found by
constructing the residual histogram of several sequences using the proposed
techniques to generate the side information, i.e. the frame interpolation tools.
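As an illustration of the last point, the sketch below fits the Laplacian parameter by maximum likelihood from the residual between a frame and its side information; this is a generic estimation procedure, not necessarily the exact fitting method used by the authors.

```python
import numpy as np

def fit_laplacian_alpha(frame: np.ndarray, side_info: np.ndarray) -> float:
    """Illustrative maximum-likelihood fit of the Laplacian parameter used to
    model the residual between a frame and its side information: for a
    zero-mean Laplacian with density (alpha/2)*exp(-alpha*|r|), the ML
    estimate is alpha = 1 / mean(|r|)."""
    residual = frame.astype(np.float64) - side_info.astype(np.float64)
    mad = np.mean(np.abs(residual))
    return 1.0 / mad if mad > 0 else np.inf
```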

2.2 Frame interpolation solutions

There are several frame interpolation techniques that can be employed at
the Wyner-Ziv decoder to generate the side information, Y2i. The choice of
the technique used can significantly influence the PDWZ codec rate
distortion performance; more accurate side information through frame
interpolation means fewer errors (Y2i is more similar to X2i) and therefore
the decoder needs to request less parity bits from the encoder and the bitrate
is reduced for the same quality. The side information is normally interpreted
as an “oracle”, that is, an attempt by the decoder to predict the current
Wyner-Ziv frame based on temporally adjacent frames (key frames).
The simplest frame interpolation techniques that can be used are to make
Y2i equal to X2i-1, i.e. the previous temporally adjacent frame, or to perform
bilinear (average) interpolation between the key frames X2i-1 and X2i+1.
However, if these techniques are used to generate the side information in
medium or high motion video sequences, Y2i will be a rough estimate of X2i
since the similarity between two temporally adjacent frames will be rather
low. In this case, the decoder will need to request more parity bits from the
encoder when compared to the case where Y2i is a closer estimate to X2i and
thus the bitrate will increase for the same PSNR. Subjectively, these simple
schemes will introduce “jerkiness” and “ghosting” artifacts in the decoded
image X’2i, especially for low bitrates. These observations motivate the need
to use more powerful motion estimation techniques since the accuracy of the
decoder frame interpolation module is a key factor for the final compression
performance. However, the traditional motion estimation and compensation
techniques used at the encoder for hybrid video coding are not adequate to
perform frame interpolation since they attempt to choose the best prediction
for the current frame in the rate-distortion sense. For frame interpolation, we
need to find an estimate of the current frame, and therefore a good criterion is
to estimate the true motion, and based on that to perform motion
compensation between temporally adjacent frames. In this paper, block-
based motion compensated interpolation is proposed due to its low
complexity and the need to maintain some compatibility with the current
video compression standards. Figure 2 shows the architecture proposed for
the frame interpolation scheme. Besides the low pass filter and the motion
compensation modules which are always used, the three modules in the
middle are associated to increasingly more powerful motion estimation
solutions when 1, 2 or 3 modules are used (always starting from the first
module on the left, this means the forward motion estimation module). In the
following, all modules are described in more detail.

Figure 2. Frame interpolation framework.

2.2.1 Forward motion estimation

First of all, both key frames are low pass filtered to improve the
reliability of the motion vectors; this will help to estimate motion vectors
closer to the true motion field. Then a block matching algorithm is used to
estimate the motion between the next and previous key frames. The
parameters that characterize this motion estimation technique are the search
window size, the search range and the step size. The step size is the distance
between the pixels in the previous key frame for which a motion vector is
searched; it reduces the computational complexity of the scheme at the cost
of providing only a coarse approximation of the true motion field. However, this
rigid block based motion estimation scheme fails to capture all aspects of the
motion field, and if frame interpolation is performed, overlapped and
uncovered areas will appear. This is because the motion vectors obtained do
not necessarily intercept the interpolated frame at the center of each non-
overlapped block in the interpolated frame. The motion vectors obtained in
the previous step serve as candidates for each non-overlapped block in the
interpolation frame in such a way that for each block of the interpolation
frame is selected, from the available candidate vectors, the motion vector
that intercepts the interpolated frame closer to the center of block under
consideration. Now that each block in the interpolated image has a motion
vector, bidirectional motion compensation can be performed to obtain the
interpolated frame or further processing is done in the next modules.
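The sketch below illustrates the kind of block matching described above, using the sum of absolute differences as matching criterion (an assumption, since the text does not name the criterion) and interpreting the step size as the spacing of the candidate displacements inside the search range (one possible reading); the low-pass filtering and the candidate re-assignment to interpolated-frame blocks are omitted.

```python
import numpy as np

def forward_motion_estimation(prev, nxt, block=8, search=8, step=2):
    """Illustrative full-search block matching between two key frames using
    the sum of absolute differences (SAD). 'search' is the search range in
    pixels and 'step' the spacing of the candidate displacements. Returns a
    dict {(y, x): (dy, dx)} with one motion vector per block of 'nxt'."""
    h, w = nxt.shape
    prev = prev.astype(np.int32)
    nxt = nxt.astype(np.int32)
    vectors = {}
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            ref = nxt[y:y + block, x:x + block]
            best, best_mv = None, (0, 0)
            for dy in range(-search, search + 1, step):
                for dx in range(-search, search + 1, step):
                    py, px = y + dy, x + dx
                    if 0 <= py <= h - block and 0 <= px <= w - block:
                        cand = prev[py:py + block, px:px + block]
                        sad = int(np.abs(ref - cand).sum())
                        if best is None or sad < best:
                            best, best_mv = sad, (dy, dx)
            vectors[(y, x)] = best_mv
    return vectors
```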
2.2.2 Bidirectional motion estimation

The bidirectional motion estimation module refines the motion vectors
obtained in the previous step by using a bidirectional motion estimation
scheme similar to the B-frames coding mode used in current video coding
standards. However, since here the interpolated pixels are not known, a
different motion estimation technique is used. This technique selects a linear
trajectory between the next and previous key frames passing at the center of
the blocks in the interpolated frame. The search range is confined to a small
displacement around the initial block position and the motion vectors
between the interpolated frame and previous and next key frames are
symmetric, i.e. (x1, y1) = (xi, yi) + MV(Bi) and (x2, y2) = (xi, yi) - MV(Bi),
where (x1, y1) are the coordinates of the block in the previous key frame, (x2,
y2) are the coordinates of the block in the next frame and MV(Bi) represents
the motion vector obtained in the previous section divided by two, since the
interpolated frame is equally distant from both key frames.

2.2.3 Spatial motion smoothing based estimation

Once the bidirectional motion field is obtained, it is observed that the
motion vectors sometimes have low spatial coherence; this can be improved
by spatial smoothing algorithms targeting the reduction of the number of
false motion vectors, i.e. incorrect motion vectors when compared to the true
motion field. The proposed scheme uses weighted vector median filters,
extensively used for noise removal in multichannel images, since all the
components (or channels) of the noisy image are to be taken into
consideration. The weighted median vector filter maintains the motion field
spatial coherence by looking, at each block, for candidate motion vectors at
neighboring blocks. This filter is also adjustable by a set of weights
controlling the filter smoothing strength (or spatial homogeneity of the
resulting motion field) depending on the prediction MSE (Mean Square
Error) of the block for each candidate motion vector (calculated
between key frames). The spatial motion smoothing algorithm is effective
both at the image boundaries, where abrupt changes in the direction of the
motion vectors occur, and in homogeneous regions (with similar
motion), where the outliers are effectively removed.
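A minimal sketch of a weighted vector median filter is given below; the choice of weights (here supplied by the caller, e.g. derived from the block prediction MSE mentioned above) is a hypothetical choice, and the sketch does not reproduce the exact filter used in the codec.

```python
import numpy as np

def weighted_vector_median(candidates, weights=None):
    """Illustrative weighted vector median: among the candidate motion
    vectors of a block and its neighbours, return the one minimizing the
    weighted sum of L2 distances to all candidates. Uniform weights give
    the plain vector median."""
    v = np.asarray(candidates, dtype=np.float64)        # shape (N, 2)
    w = np.ones(len(v)) if weights is None else np.asarray(weights, float)
    # cost[j] = sum_i w[i] * ||v[j] - v[i]||
    dists = np.linalg.norm(v[:, None, :] - v[None, :, :], axis=-1)
    costs = dists @ w
    return tuple(v[int(np.argmin(costs))].astype(int))
```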

2.2.4 Bidirectional motion compensation

Once the final motion vector field is obtained, the interpolated frame can
be filled by simply using bidirectional motion compensation as defined in
standard video coding schemes. The assumption is that the time interval
between the previous key frame and the interpolated frame is similar to the
time interval between the interpolated frame and the next key frame, so each
reference image has the same weight (½) when motion compensation is
performed.
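The sketch below illustrates this bidirectional compensation for a single block, assuming the symmetric vector convention given in Section 2.2.2 and ignoring image boundary handling; it is not the codec's actual implementation.

```python
import numpy as np

def bidirectional_compensation(prev, nxt, y, x, mv, block=8):
    """Illustrative bidirectional compensation of one interpolated block at
    (y, x): fetch the block displaced by +mv in the previous key frame and
    by -mv in the next key frame and average them with equal weights (1/2)."""
    dy, dx = mv
    fwd = prev[y + dy:y + dy + block, x + dx:x + dx + block].astype(np.float64)
    bwd = nxt[y - dy:y - dy + block, x - dx:x - dx + block].astype(np.float64)
    return ((fwd + bwd) / 2.0).round().astype(np.uint8)
```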

2.3 Experimental results

In order to evaluate the rate-distortion performance of the proposed
PDWZ codec, four frame interpolation techniques will be considered in the
following to generate the side information: i) average frame interpolation; ii)
only forward motion estimation (FME), iii) forward motion estimation
followed by bidirectional motion estimation (BiME) and, finally, iv) forward
motion estimation followed by bidirectional motion estimation and spatial
motion smoothing (SS). Bidirectional motion compensation is always
performed to fill the interpolated frame.
The experiments will show the contribution of each functional block
proposed for the frame interpolation framework in the overall PDWZ
performance. In all the experiments, the block size is 8×8; the search range is
±8 pixels and the step size is 2 for the forward motion estimation; for the
refinement process the search range is adjusted by ±2 pixels. These
parameters were obtained after performing extensive experiments and are
those that best fit QCIF resolution sequences. The PSNR versus bitrate
results for all the frames of the Foreman and Coastguard QCIF sequences are
shown in Figure 3. In the charts, only the luminance rate and distortion of
the Wyner-Ziv frames are included; the Wyner-Ziv frame rate is 15 fps. It is
assumed that the odd frames (key frames) are available at the decoder,
perfectly reconstructed. The results are compared against the H.263+
intraframe coding and the H.263+ interframe coding with an IBIB structure.
In the last case, only the rate and PSNR of the B frames is shown.

[Two rate-distortion charts, Coastguard (left) and Foreman (right): PSNR of even frames (dB) versus rate of even frames (kbps), with curves for H.263+ Intra, H.263+ I-B-I-B, PDWZ Average, PDWZ FME, PDWZ BiME and PDWZ SS.]

Figure 3. PDWZ RD performance for the Coastguard and Foreman QCIF sequences.

The results obtained show that the proposed motion estimation
techniques improve significantly the PDWZ RD performance, especially
when compared to the average frame interpolation solution. RD
improvements are observed for both sequences when the frame interpolation
solution is successively made more powerful by adding additional tools,
which validates the approach (and architecture) of the proposed frame
interpolation scheme. Bidirectional ME provides better gains for the
Coastguard sequence (up to 0.8 dB) than for the Foreman sequence (up to
0.3 dB) when compared to the forward ME scheme and spatial smoothing
has similar gains for both sequences when compared to the BiME scheme
(up to 0.8 dB). From the results, it is also possible to observe remarkable
gains over H.263+ intra-frame coding for all bitrates and sequences.
However, there is still a gap when compared to H.263+ interframe coding
(IBIB); as expected, this gap is smaller for sequences with well-defined
camera motion like Coastguard, since the interpolation tools can provide
better performance for this type of sequences. Additionally, when the results
of the first 100 frames are compared (the same number of coded frames in
[6]) with the most recent pixel domain Wyner-Ziv coding results available in
the literature [6], the PDWZ SS solution shows an improvement of up to 2
dB in coding efficiency, for the conditions stated above.
Experimental results prove that the described interpolation framework
improves the PDWZ coding efficiency compared to other similar solutions,
without increasing the encoder complexity. This way, it is possible to
approximate the PDWZ performance to the interframe H.263+ performance,
thus reducing the gap in quality. As future work, it is planned to further
enhance the RD performance of the codec with algorithms that take into
account the strong spatial correlation among neighboring pixels or by using
some iterative motion refinement approach using an intermediate decoded
frame; please check [1] for further developments.

3. MULTIMEDIA CONTENT SUMMARIZATION

Advances in the area of telecommunications have led, in the past few
years, to a boom in the production, transmission and availability of
multimedia content. Facing this scenario, there is an increasing need to
develop automatic systems allowing us to find and consume the
information considered ‘essential’ to each of us, adapted to our tastes, our
preferences, our time availability and our capacities of receiving, consuming
and storing that information – in other words there is an increasing need to
summarize and personalize multimedia content. Automatic video
summarization may be defined as the selection and representation of a
sequence of still images or video segments (with or without the
corresponding audio content) expressing the full content available in such a
way that only concise and relevant information (according to some criteria)
is presented to the user. The main novelty of the automatic video
summarization system described in this paper is that it uses previously
created MPEG-7 descriptions [7] of the video content to summarize it, rather
than directly analyzing the content in question. The proposed summarization
system consists of two applications: an automatic video indexing application
– which creates MPEG-7 descriptions for a given video content asset – and
an automatic summarization application – which, based on MPEG-7
descriptions, creates and describes or represents summaries of that content.
Due to space restrictions, this paper will concentrate on the description of the
summarization application.

3.1 Automatic video summarization system

A summarization process basically consists of the selection of the
elements (video segments, or keyframes) considered ‘essential’ to represent
the original video content according to some criteria. This selection is based
on the analysis of the video content’s features which is performed by
accessing a previously created MPEG-7 description of the content’s visual
features. The process of describing a given video content asset is called
description or indexing and is performed by a so-called indexing application,
which is controlled by an outside entity, the description producer. This entity
determines which are the most relevant features and therefore the features
that shall be instantiated in the content description using adequate (MPEG-7)
descriptors for later use in the creation of summaries. The summarization is
performed by a summarization application, which can be divided into two
parts: the summary description creation part, which creates an MPEG-7
description of the summary, based only on the MPEG-7 description of the
content and on criteria specified by a summary producer or by the consumer
himself, and the summary assembly part which, based on the created
description and on the original video content, creates the summaries and
presents or delivers them to the user. It should be emphasized that the
original video content is not used on the process of summarization but only
on the actual assembly of the summary. Since many times the creation of the
summary description does not happen at the same time or even at the same
place as the assembly of the summary based on the description, this split
between the summary description and summary assembly process is
important.
The summary description creation part can be further subdivided into
several parts with distinct functionality: after the MPEG-7 description
parsing block parses the description of the content, the summarization block
stores in its data structures the information about the video structure and
features provided by the description schemes and descriptors instantiated in
the description. Based on this information and on criteria entered by the user
through the user interface, this block creates a description of the summary.
This description can then be stored in an MPEG-7 compliant file by the
MPEG-7 description creation block.
The summary assembly part, based on the description of the summary
created by the first part and on the original video content, can then or later
on assemble the summary using chosen frames and/or video segments from
the original content, presenting or delivering it to the user. Note that this
assembly process can be performed at a different time or location from the
summary description creation process.

3.2 Summarization examples

In this section, a summarization example using the summarization system
developed will be presented. In the examples shown, each image has two
indices: the first index indicates the ranking of the image in terms of the
query performed (e.g., keyframe with number 1 is the most relevant,
keyframe with number 2 is the second most relevant and so forth) and the
second index indicates its final combined relevance computed as described
before. In an example query, a frame is given as example. This frame can
either be an arbitrary video frame or a keyframe, i.e., a frame identified as
such in the MPEG-7 description of the content. The relevance in terms of
summaries will be determined by comparing each keyframe’s and shot’s
descriptor instantiations with the query description for the same descriptor
extracted from the example frame (if an arbitrary frame is used as example)
or directly available from the MPEG-7 description of the content (if a
keyframe is used as example). The user can specify which query descriptors
shall be used among those which are instantiated in the MPEG-7 description
of the video content (uninstantiated descriptors, obviously, cannot be used):
textual annotation (keywords), Motion Activity, Scalable Colour, Dominant
Colour and Colour Structure.
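As an illustration of how such a query could be turned into a ranked keyframe summary, the sketch below normalizes per-descriptor distances and combines them into a single relevance score; the distance functions, the min-max normalization and the weighted average are hypothetical choices, not the combination rule actually used by the application.

```python
def rank_keyframes(query_desc, keyframe_descs, distance_fns, weights=None):
    """Illustrative ranking of keyframes against an example frame: for each
    selected query descriptor (e.g. 'dominant_colour', 'colour_structure'),
    a descriptor-specific distance is computed, distances are normalized to
    [0, 1] over the keyframe set and combined into a combined relevance
    score (here a simple weighted average)."""
    weights = weights or {name: 1.0 for name in distance_fns}
    # Raw distances per descriptor over all keyframes.
    raw = {name: [fn(query_desc[name], kf[name]) for kf in keyframe_descs]
           for name, fn in distance_fns.items()}
    scores = []
    for i, kf in enumerate(keyframe_descs):
        total, wsum = 0.0, 0.0
        for name, dists in raw.items():
            lo, hi = min(dists), max(dists)
            norm = 0.0 if hi == lo else (dists[i] - lo) / (hi - lo)
            total += weights[name] * (1.0 - norm)   # similarity = 1 - distance
            wsum += weights[name]
        scores.append((total / wsum, kf))
    # Highest combined relevance first.
    return sorted(scores, key=lambda s: s[0], reverse=True)
```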
In this example query, the frame presented in Figure 4 is given as the
example. Figure 5 illustrates the keyframe summary obtained if the
Dominant Colour descriptor is the only descriptor used as query descriptor.
Figure 6 illustrates the keyframe summary obtained if the Dominant Colour,
Scalable Colour and Colour Structure descriptors are specified
simultaneously as query descriptors (to be extracted from the example
frame). Both summaries were generated with a constraint of eight keyframes.
As can be seen, the simultaneous use of
three colour descriptors dramatically improves the quality of the results if the
goal is to find all the keyframes which are similar to the example frame not
only in terms of the global set of colors but also in terms of their structure
and spatial distribution. In fact, the Dominant Colour descriptor does not
express the way the colours are spatially distributed in the image; for this
reason, the keyframes in Figure 5 have similar colours to the example frame
but the colours are not spatially distributed in the same way as happens for
Figure 6 after using colour descriptors which also express the colour spatial
distribution.

Figure 4. Example frame.

Figure 5. Keyframe summary after an example query using only the Dominant Colour
descriptor.

Figure 6. Keyframe summary after an example query using the Dominant Colour, Scalable
Colour and Colour Structure descriptors.

The novelty of the summarization system described here is that it can be
used to create summaries of any given content and also the fact that the
summary creation process requires solely the availability of an MPEG-7
description of the content. Although the summarization system described
includes an indexing application, the summarization application can work
with MPEG-7 descriptions created by any indexing application provided that
they are MPEG-7 compliant. The approach to use MPEG-7 descriptions as
the base for the summarization process has revealed itself quite flexible and
effective since it avoids repeatedly performing feature information extraction
whenever a summary is created. Finally, since the summary
description creation process is based on a description that can exist
physically separated from the video content, this type of summarization
process has the advantage of not requiring the availability of the content
itself.

4. MULTIMEDIA CONTENT ADAPTATION

Networks, terminals and users are becoming increasingly heterogeneous.
In this context, the growing availability and usage of multimedia content
have been raising the relevance of content adaptation technologies able to
fulfill the needs associated with all usage conditions, without multiplying the
number of versions available for the same piece of content. This means that
adaptation tools are becoming increasingly important to provide different
presentations of the same information that suit different usage conditions.
Furthermore, the importance of the user and not the terminal as the final
point in the multimedia consumption chain is becoming clear.
Nowadays people share many of their important moments with others
using visual content such as photographs, which they can easily capture on
their mobile devices anywhere, at any time. Therefore, images are very
important in mobile multimedia applications. However, mobile devices have
several limitations, notably regarding computational resources, memory,
bandwidth, and display size. While technological advances will solve some
of these limitations, the display size will continue to be a major constraint on
small mobile devices such as cell-phones and handheld PC’s. Currently, the
predominant methods for viewing large images on small devices are down-
sampling or manual browsing by zooming and scrolling. Image down-
sampling results in significant information loss, due to excessive resolution
reduction. Manual browsing can avoid information loss but is often time-
consuming for the users to catch the most crucial information in an image. In
[8] an adaptation tool that allows the automatic browsing of large pictures on
mobile devices is proposed by transforming the image into a simple video
sequence composed of pan and zoom movements which are able to automate
the scrolling and navigation of a large picture.
In this paper, an adaptation system (called Image2Video) whose major
objective is to maximize the user experience when consuming an image in a
device with a small size display is described. The processing algorithms
developed to reach this purpose imply determining the regions of interest
(ROIs) in an image based on knowledge of the human visual attention
mechanisms, and generating a video sequence that displays those regions
according to certain user preferences, while taking into consideration the
limitations of the display’s size, as shown in Figure 7. User preferences refer
to the video display modes the user can choose for the visualization of the
adapted video sequence, e.g. the duration of the video. The created video is
intended to provide a final, better user experience, compared to the down-
sampled still image or the manual scrolling alternatives.

Figure 7. Image2Video adaptation system.

4.1 Image2Video system architecture

The developed adaptation system is inspired by the knowledge of the
human visual system (HVS) attention mechanisms to determine ROIs in the
image, and it uses a multi-stage architecture to perform all the necessary
tasks to transform the original image into a (more interesting) video clip. The
multi-stage architecture proposed for the adaptation system is presented in
Figure 8; it was conceived to produce a high-level description of the most
interesting contents in the image and then combine that description with the
user preferences and terminal device limitation descriptions to perform the
image-to-video transmoding. Transmoding refers to all media adaptation
processes where content in a certain modality is transformed into content in
another modality, e.g. video to images, text to speech. The proposed
architecture includes four stages, with the first two being responsible for the
determination of a map that identifies all the ROIs of the image, and the
remaining two responsible for the generation of the video that displays the
image following a path that conveniently links the various ROIs to each
other.

Figure 8. Image2Video system architecture.

The main objectives of the four main architectural stages are presented in
the following:
- Composite Image Attention Model - The HVS attention mechanism is
  able to select ROIs in the visual scene for additional analysis; the ROI
selection is guided by bottom-up and top-down approaches. Based on the
knowledge of the human visual attention mechanisms, a composite image
attention model has been developed to detect ROIs and provide a
measure of the relative importance of each one. The model integrates
three elementary visual attention models:
  - Saliency attention model - The objective of this model is to
    identify ROIs without an associated specific semantic value, i.e.
    regions with statistical properties different from those of the
    neighboring regions are considered ROIs [9].
  - Face attention model - The objective of this model is to identify
ROIs that contain faces. The detection of faces is a task
performed daily by humans since they are one of their most
distinctive characteristics, providing an easy way to identify
someone. Therefore faces are one of the semantic objects present
in an image that are more likely to captivate human’s attention
[10].
  - Text attention model - The objective of this model is to identify
    ROIs that contain text. People spend a lot of their time reading,
    be it newspapers, e-mails, SMS, etc. Text is a rich source of
    information, many times enhancing the message that an image
    transmits. Therefore, text is a kind of semantic object that attracts
    viewers' attention [11].
- Attention Models Integration - The ROIs computed by the three
elementary attention models in the previous stage are integrated into a
single image map, which contains all the ROIs, their location and type
(saliency, face, or text). A novel method has been implemented to solve
the cases where overlapping exists between the different types of ROIs.
- Optimal Path Generation - This stage is responsible for generating the
  path used to display the whole image as video, i.e. the path that
  transforms the image into video. Two mechanisms are used for this:
  - Display Size Adaptation - Given the display size limitations, the
    ROIs are adapted to fit them, i.e. they can be split into blocks that
    fit into the display, grouped with other ROIs to increase the
    presented information, or simply remain as they are (a minimal
    sketch of this step is given after this list).
  - Browsing Path Generation - This mechanism is responsible for
the determination of the browsing path, which takes into
consideration the spatial distribution of the ROIs and their type,
targeting the provision to the user of the best possible video
experience.
- Video Generation - The last stage of the adaptation system is
responsible for creating a video sequence based on the previously
calculated browsing path, a set of directing rules and user preferences.
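As announced in the Display Size Adaptation item above, the sketch below shows one simple way to adapt a ROI to the display size: ROIs that fit are kept, larger ones are split into display-sized tiles. The grouping of small neighbouring ROIs and the rules actually used by the application are not reproduced here.

```python
def adapt_roi_to_display(roi, display_w, display_h):
    """Illustrative display size adaptation: if a ROI (x, y, w, h) does not
    fit the display, split it into display-sized tiles covering it; ROIs
    that already fit are kept as they are."""
    x, y, w, h = roi
    if w <= display_w and h <= display_h:
        return [roi]
    tiles = []
    for ty in range(y, y + h, display_h):
        for tx in range(x, x + w, display_w):
            tiles.append((tx, ty,
                          min(display_w, x + w - tx),
                          min(display_h, y + h - ty)))
    return tiles
```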

4.2 User evaluation study

The purpose of the user study was not only to evaluate the global performance of the adaptation system, but also to assess the individual algorithms that were developed. As there is no objective measure for this kind of evaluation, the user study provides a subjective assessment of the results produced by the Image2Video application. Using the developed application interface to show the original image and the video clip on a small-size display, a group of 15 volunteers was invited to give subjective judgments regarding the following three questions:
• Question 1: How good is the video experience compared to the still image experience?
a) Very bad b) Bad c) Reasonable d) Good e) Very good
• Question 2: Are all the interesting regions of the image focused in the video?
a) None b) Some c) Almost all d) All
• Question 3: How well does the order of the focused regions reflect their real relative importance?
a) Very bad b) Bad c) Reasonable d) Well e) Very well.
For this performance evaluation, a set of 8 images with a resolution of 352×288 pixels was selected. The images are divided into four classes: saliency, face, text and mixed. Based on these 8 images, the adapted video clips were produced with a resolution of 110×90 pixels (corresponding to the selected display resolution), to simulate viewing the image and the video clip on a display-size-constrained device.
Regarding Question 1, the average results show that 39% and 33% of the respondents consider the video experience, compared to the still image experience, good and very good, respectively. These results indicate that the majority of the users prefer the adapted video clip to the still image. Regarding Question 2, the average results show that 59% of the respondents consider that all of the interesting regions of the image are focused in the video. The average results for Question 3 show that 41% and 33% of the respondents consider that the ordering of the focused regions reflects their real relative importance well and very well, respectively.
Based on the evaluation study results, it is possible to conclude that the developed Image2Video application achieves its main objective, i.e., the quality of the experience provided by the adapted video clips created with the proposed application is better than that provided by the down-sampled still image.

5. CONCLUSIONS

This paper describes research work developed at the Image Group of Instituto de Telecomunicações at Instituto Superior Técnico in three of the most important areas in multimedia representation: coding, description and adaptation. Far from closed, these areas still pose many challenges to be solved and thus they will remain interesting research areas for the years to come.
REFERENCES
1. Image Group Home Page, http://www.img.lx.it.pt/
2. J. Ostermann, J. Bormans, P. List, D. Marpe, M. Narroschke, F. Pereira, T. Stockhammer and T. Wedi, “Video Coding with H.264/AVC: Tools, Performance, and Complexity”, IEEE Circuits and Systems Magazine, Vol. 4, No. 1, 2004.
3. J. Slepian and J. Wolf, “Noiseless Coding of Correlated Information Sources”, IEEE
Trans. on Information Theory, Vol. 19, No. 4, July 1973.
4. A. Wyner and J. Ziv, “The Rate-Distortion Function for Source Coding with Side
Information at the Decoder”, IEEE Trans. on Information Theory, Vol. 22, No. 1,
January 1976.
5. A. Aaron, R. Zhang and B. Girod, “Wyner-Ziv Coding for Motion Video”, Asilomar
Conference on Signals, Systems and Computers, Pacific Grove, USA, November 2002.
6. A. Aaron, S. Rane, E. Setton and B. Girod, “Transform-Domain Wyner-Ziv Codec for
Video”, VCIP, San Jose, USA, January 2004.
7. B. S. Manjunath, P. Salembier and T. Sikora (editors), ‘Introduction to MPEG-7:
Multimedia Content Description Language’, John Wiley & Sons, 2002.
8. H. Liu, X. Xie, W.Y. Ma, H.J. Zhang, “Automatic browsing of large pictures on mobile
devices”, ACM Multimedia’2003, Berkeley, CA, USA, November 2003.
9. L. Itti, C. Koch, and E. Niebur, “A model of saliency-based visual attention for rapid
scene analysis”, IEEE Transactions on Pattern Analysis and Machine Intelligence
(PAMI), Vol. 20, No. 11, pp. 1254-1259, 1998.
10. J. Ascenso, P. L. Correia, F. Pereira, “A face detection solution integrating automatic and
user assisted tools”, Portuguese Conf. on Pattern Recognition, Porto, Portugal, Vol. 1, pp. 109-116, May 2000.
11. D. Palma, J. Ascenso, F. Pereira, “Automatic text extraction in digital video based on
motion analysis”, Int. Conf. on Image Analysis and Recognition (ICIAR’2004), Porto -
Portugal, September 2004.
BIOINFORMATICS:
A NEW APPROACH FOR THE CHALLENGES
OF MOLECULAR BIOLOGY

Arlindo L. Oliveira¹, Ana T. Freitas¹ and Isabel Sá-Correia²

¹ IST/INESC-ID, Universidade Técnica de Lisboa, R. Alves Redol, 1000 Lisboa, Portugal, {aml,atf}@inesc-id.pt
² IST/CEBQ, Universidade Técnica de Lisboa, Av. Rovisco Pais, 1000 Lisboa, Portugal, isacorreia@ist.utl.pt

Abstract: We describe the research being undertaken by the ALGOS/KDBIO and Biological Sciences groups of Instituto Superior Técnico in the field of
bioinformatics and computational biology, with emphasis on the efforts under
way to develop new approaches, methods and algorithms for the determination
of gene regulatory networks. We put the field in perspective by first looking at
recent developments in the field of bioinformatics, and how these
developments contributed to the advance of science. We then describe the
approach that is being followed, based on the development of algorithms and
information systems for the problems of motif detection, gene expression
analysis and inference of gene regulatory networks. We conclude by pointing
out possible directions for future research in the fields of systems biology and
synthetic biology, two critical areas for the development of science in the
coming years.

Key words: Bioinformatics, Molecular Biology, Biotechnology, Regulatory Networks.

1. INTRODUCTION

The analysis of biological systems using computational tools has


emerged as an autonomous area that is central to the advancement of
science. The ability to sequence efficiently the genomes of many organisms,
including Homo sapiens, can be seen as the first significant impact of

bioinformatics in the study of complex biological systems. However, this


ability represents only the first step in a long journey that will, in the end,
contribute to the development of algorithms, methods and systems that will
be able to simulate, in-silico, complex biological systems.
The techniques required to achieve this ambitious goal will come from
the confluence of a number of fields of knowledge. Bioinformatics has
emerged as a field that integrates contributions from many different fields to
analyze efficiently the large volumes of data made available by modern
technologies for sequencing and global expression analysis.
This article starts, in section 2, by putting in perspective the recent
history of Bioinformatics, giving a necessarily brief overview of the
techniques developed in the past decades, and the most important results
obtained.
It then goes on to describe, in section 3, the approaches that are currently
being taken by two research groups at IST in a number of important areas,
which include the development of biological information systems, methods
for the analysis of promoter regions of genes and algorithms for the
treatment of results obtained from global expression analysis. These
approaches have one major goal in common, that of modeling in a precise
and effective way the dynamics of gene and metabolic networks that govern
the life cycles of biological systems.
Finally, section 4 lists a number of promising areas for future research
and provides a number of educated guesses on the possible impacts of this
areas of research in the future of biotechnology and the natural sciences.

2. COMPUTERS, SEQUENCES AND GENOMES

When, in 1953, Watson and Crick [1] identified the double helix of the
DNA as the repository for genetic information, stored in digital format in a
molecule that, until then, was relatively uninteresting, it did not become
immediately obvious that computers would play such a fundamental role in
understanding biological systems. At the time, computers were so rare and
unknown that the parallel between the encoding of genetic information in
DNA and the stored programs of digital computers did not become
immediately obvious.
Roughly 25 years after the discovery of the DNA structure, advances in technology led to what can be considered the first convergence between biology and computation. Soon after the first sequencing of a fragment of
DNA, the first algorithms that aimed at reconstructing DNA sequences from
fragments were developed [2].
Far from being consensual, whole-genome sequencing projects met a number of objections [3]. These objections were centered not only on the technical difficulties perceived in such an undertaking, but also on the perception that this information would not be especially useful, given the challenges that would face a biologist interested in exploring it in an effective way.
The first realistic proposal for the sequencing of the human genome
appeared in 1985. However, only in 1988 did the Human Genome Project
[4] make clear that such an ambitious objective was within reach. The
selection of model organisms with increasing degrees of complexity enabled
not only the progressive development of the technologies necessary for the
sequencing of the human genome, but also the progressive development of
research in genomics of the communities involved in the study of those
model organisms.
A number of model organisms were selected to be sequenced, including Haemophilus influenzae (1.8 Mb), Escherichia coli (5 Mb), Saccharomyces cerevisiae (12 Mb), Caenorhabditis elegans (100 Mb), Arabidopsis thaliana (125 Mb) and Drosophila melanogaster (200 Mb).
With the development of better sequencing techniques, the genome sequences of these organisms were successively obtained: H. influenzae in 1995 [5], S. cerevisiae in 1996 [6], E. coli in 1997 [7], C. elegans in 1998 [8] and A. thaliana and D. melanogaster in 2000 [9, 10].
The final steps of the process that led to the simultaneous publication of
the human genome papers in Science [11] and Nature [12], using two
different (but not totally independent) approaches are well known. The
sequence of the human genome brought the additional somewhat surprising
information that our genome is not larger (3000Mb) nor more complex than
that of the mouse. The approach being taken for human genomic sequencing
was originally the same as the one used for the S. cerevisiae and C. elegans
genomes, based on the serial sequencing of overlapping arrays of large DNA
inserts in E. coli clones. The alternative approach, taken by Celera, was to
apply shotgun sequencing to the whole human genome, an approach made
possible only by advances in algorithms and bioinformatics [13].
At the end of 2005, there are 321 completely sequenced genomes, of
which 281 are bacterial and 40 are eukaryotes. At the time of this writing,
more than 1350 new sequencing projects are currently under way to obtain
the genomes of more than 800 prokaryotes and 550 eukaryotes.
Having the sequences available, however, is only the beginning of the
long path that will lead to the full understanding of biological systems.
Researchers in bioinformatics have been concerned with the development of
algorithms for the analysis, manipulation and annotation of biological
sequences since the connection between biology and computation became
obvious. In the last 20 years, a number of important algorithms for sequence analysis were developed in order to cope with the increasingly large amounts of data available. Amongst these, alignment methods, both exact [14, 15] and approximate [16, 17], have become an indispensable tool for scientists, and are now part of the toolbox of every molecular biologist.
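As an illustration of the exact alignment methods cited above [14], the following short Python sketch computes a global alignment score with the classical Needleman-Wunsch dynamic programming recurrence; the match, mismatch and gap scores are arbitrary choices made for the example and not values used by the IST groups, whose tools rely on established alignment software.

```python
def needleman_wunsch(a: str, b: str, match=1, mismatch=-1, gap=-2) -> int:
    """Global alignment score between sequences a and b (Needleman-Wunsch).
    Scores are illustrative; real tools use substitution matrices such as BLOSUM."""
    n, m = len(a), len(b)
    # dp[i][j] = best score aligning the prefix a[:i] with the prefix b[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = dp[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            dp[i][j] = max(diag, dp[i - 1][j] + gap, dp[i][j - 1] + gap)
    return dp[n][m]

print(needleman_wunsch("GATTACA", "GCATGCU"))  # prints the optimal global score
```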
Gene finding in sequences, using either automatic or semi-automatic
procedures, is another field where algorithms have come to the rescue of
biologists. Accurate identification of genes is dependent on the accurate
identification of signals, namely on the accurate identification of splice sites
and of start and stop codons. A large number of methods has been proposed
for this problems, and they can be classified into two broad classes:
homology based methods and ab initio methods.
Homology based methods use the fact that conservation of DNA
sequences leads to strong similarities in the coding sequences of related
genes, a fact that can be used to predict coding regions in unlabeled DNA
sequences. They use sequence alignment methods like BLAST [16] or
FASTA [17] to identify highly conserved regions in the DNA that provide
evidence of the existence of signals and/or coding regions. These methods
are limited in their ability to discover new genes encoding proteins that are
not sufficiently similar to those encoded by already annotated DNA
sequences.
Ab initio based methods use pattern recognition algorithms to learn, from
labeled sequences, the rules that can be used to recognize the relevant
signals. Among the pattern recognition algorithms that have been used,
neural networks, decision trees and hidden Markov models have been
extensively applied.
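A minimal example of this learn-from-labeled-data idea is the position weight matrix (PWM), one of the simplest signal models used in gene finding. In the sketch below the tiny training set is invented for illustration and the pseudocount scheme is one common choice among several; it is not a description of any specific gene finder mentioned above.

```python
import math

def build_pwm(sites: list[str], pseudocount: float = 1.0) -> list[dict]:
    """Estimate a position weight matrix (log-odds vs. a uniform background)
    from aligned example signal sequences of equal length."""
    length = len(sites[0])
    pwm = []
    for pos in range(length):
        counts = {b: pseudocount for b in "ACGT"}
        for s in sites:
            counts[s[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((counts[b] / total) / 0.25) for b in "ACGT"})
    return pwm

def score(pwm: list[dict], window: str) -> float:
    """Log-odds score of a candidate signal window under the PWM."""
    return sum(col[base] for col, base in zip(pwm, window))

# Toy donor-site-like examples (invented); real models are trained on curated sets.
training = ["GTAAGT", "GTGAGT", "GTAAGA", "GTGAGA"]
pwm = build_pwm(training)
print(round(score(pwm, "GTAAGT"), 2), round(score(pwm, "CCCCCC"), 2))
```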
In practice, effective gene finding methods combine both approaches,
and are usually difficult to classify squarely as either homology based or ab
initio. A number of well known gene finding programs have been made
available, amongst which GRAIL [18], MZEF [19], GENSCAN [20] and
GENEID [21] deserve to be mentioned. Performance of the different
methods varies with a number of factors [22] but, when coupled with human
based annotation, these methods are adequate to perform the identification of
the coding regions in the genomes, and do not represent the limiting factor
for the advancement of biological knowledge.
The first years of the twenty-first century have, therefore, led to the sequencing,
mapping and annotation of the genes that control the development of all life
forms. Using all this information to understand biological phenomena
appears now as the major challenge of this coming century.
3. UNDERSTANDING REGULATORY NETWORKS

As the knowledge about biology advanced, pushed by advances in


sequencing and annotation, it became clear that the next steps in the
understanding of biological systems depend critically on the identification of
the complex regulatory networks that are present in every living organism.
In fact, knowing where the genes are, and the proteins they code for, is of
little help if one does not know the relations between genes, proteins and the
other chemical compounds that are present in every cell. Understanding
biology from a systemic standpoint came to be known as Systems Biology,
and is the main challenge of the next decades in the natural sciences.
Information needed for this task comes from two main sources: genomic
sequence data and whole-genome measures of gene expression obtained
using microarrays and quantitative proteomics. Using sequence and gene
expression data together with phylogenetic information to infer network
structures has emerged as the only realistic avenue that can be pursued in
order to address the challenge of identifying, mapping and documenting the
complex architectures of gene regulatory networks of a living organism, the
central goal of systems biology. To achieve this goal, we need to be
successful in, at least, four major areas.
In the first place, we need to successfully integrate, curate and make
widely available existing knowledge about regulatory mechanisms in
different organisms. Model organisms, such as S. cerevisiae, represent an
important intermediate step, because they can more easily be used in
experiments than more complex organisms. To support the experimental
procedures needed to validate mathematical models, we need to develop
information systems that make available, in an integrated and uniform way,
the vast amounts of knowledge that already exist about specific gene
regulatory mechanisms.
In the second place, we need to develop algorithms, methods, models and
systems that can guide researchers in their search for still unknown
regulatory mechanisms. This will help us complete our understanding of
the rules that govern gene regulation and the interaction between the many
other components of living organisms.
In the third place, we need to develop effective methods for the analysis
of gene expression data, in order to help researchers in their quest for
understanding the massive amounts of data generated by modern
technologies such as microarrays.
Finally, we will need to integrate all this information into a coherent
model of biological networks that can be used to model, simulate and predict
responses of organisms to specific conditions. Although the complete
achievement of such a goal is still far off in the future, the development of
these enabling tools stands in the critical path, and has emerged as a central
goal in systems biology.
The next sections describe some of the efforts currently being undertaken
at IST in each of these areas, while also providing some broader perspective
on the important issues at stake.

3.1 Databases for gene regulatory information

The availability of the complete sequence of a number of organisms


implied a significant change in biology and biotechnology in the last decade.
Making these sequences available to the scientific community has emerged
as an important objective in itself. Creating and making widely available
biological databases with sequence and annotation information represents
now a very significant part of the activity of the scientific community. More
than 850 biological databases are listed in a single source of information [23], and many more are available elsewhere.
No single group can contribute a significant fraction of the total information that is made available, and every research community focuses on specific sub-fields. The IST groups have focused on documenting, organizing and making publicly available information about gene regulation mechanisms in Yeast, complementing the many databases already available for this organism.
Since the release of the complete genome sequence of S. cerevisiae, a
number of computational methods and tools have become available to
support research related with this organism. Most significant for the Yeast community, the Saccharomyces Genome Database (SGD) [24], and other
databases specialized in Yeast, like CYGD, the Comprehensive Yeast
Genome Database [25] or YRC, the Yeast Resource Center [26], make
available extensive information on molecular biology and genetics of S.
cerevisiae.
The precise coordinated control of gene expression is accomplished by
the interplay of multiple regulatory mechanisms. The transcriptional
machinery is recruited to the promoter leading to the transcription of the
downstream gene through the binding of transcription regulatory proteins to
short nucleotide sequences occurring in gene promoter regions. To support
the analysis of the promoter sequences in the yeast genome, a set of software
tools is provided by RSAT (Regulatory Sequences Analysis Tools [27]).
RSAT makes available pattern matching methods, supporting the search for
given nucleotide sequences (e. g. transcription factor binding sites) within
the promoter region of chosen genes, thus leading to the identification of
putative target genes for specific transcription factors. However, the
transcription factor binding sites, used for pattern matching in RSAT, have
to be provided by the user, since RSAT does not hold a database of


transcription factor binding sites. Existing databases do not fill this gap. The
IST groups involved in this research have therefore developed a database of
known regulatory associations between genes in this organism,
YEASTRACT [28].
YEASTRACT (YEAst Search for Transcriptional Regulators And
Consensus Tracking; www.yeastract.com) database is a repository of more
than 12500 regulatory associations between genes and transcription factors
in S. cerevisiae and includes the description of 269 specific DNA binding
sites for 106 characterized transcription factors. This publicly available
database was developed to provide assistance in three major issues:
identification of documented and potential regulatory associations for an
ORF/Gene; microarray data clustering based on regulatory associations;
search for a DNA motif within known transcription factor binding sites and
promoter regions. In the first three months of 2006, more than 150 different
groups from 36 different countries have performed over 80000 queries using
YEASTRACT.
Currently, new algorithms for the analysis of biologically significant
over-represented motifs in promoter regions are under development. When
available, these tools will be integrated with YEASTRACT and will simplify
the analysis of the complex regulatory mechanisms underlying
transcriptional response in Yeast.
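To give a concrete flavour of the kind of promoter pattern matching supported by RSAT-style tools and by YEASTRACT's motif search, the short sketch below expands a degenerate IUPAC consensus into a regular expression and locates its occurrences in a promoter sequence. The Yap1p-like consensus and the promoter fragment are illustrative assumptions, not data taken from the database, and only the forward strand is searched for brevity.

```python
import re

# IUPAC degenerate nucleotide codes mapped to regular-expression character classes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "Y": "[CT]",
         "S": "[CG]", "W": "[AT]", "K": "[GT]", "M": "[AC]", "B": "[CGT]",
         "D": "[AGT]", "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]"}

def find_binding_sites(consensus: str, promoter: str) -> list[int]:
    """Return the start positions where the degenerate consensus matches the promoter."""
    pattern = "".join(IUPAC[base] for base in consensus.upper())
    return [m.start() for m in re.finditer(pattern, promoter.upper())]

# Illustrative only: a Yap1p-like consensus against an invented promoter fragment.
print(find_binding_sites("TTACTAA", "ccgtTTACTAAggatcgtaacTTACTAAtt"))  # [4, 21]
```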

3.2 Detection of cis-regulatory modules in promoter regions

Algorithms for identifying relevant motifs in promoter regions have been


known for a number of years. Two main types of models have appeared in
the literature: pattern and weight matrix. Each of these models was
developed together with corresponding methods for inferring regulatory
signals in DNA sequences. More recently, the sequencing of different
genomes from closely related species has enabled the use of evolutionary
information through genome comparative approaches to improve the
identification process.
The first methods for detecting promoter regions in DNA sequences
[29][30] looked for a binding site composed of adjacent nucleotides. In the
search for more complex cis-regulatory models, methods have appeared that extract DNA sites composed of non-contiguous nucleotides.
Existing methods obtain comparable results. However, although these
results have appeared promising for a long time, the algorithms have so far
fallen short of becoming a truly systematic and automatic method for
identifying signals, particularly at the scale of whole genomes.
There are many difficult issues related with this problem, including the computational hardness of most variants of the problem. One major difficulty, however, is that it has not yet been possible to arrive at an accurate method for estimating, either probabilistically or combinatorially, the parameters that should be passed to the algorithms, that is, the characteristics, even roughly defined, of the motifs to be identified.
problem is often alleviated by using available biological knowledge to do a
pre-processing of the data, thus reducing it to sizes and levels of noise that
are more manageable by current methods. Even in such cases, important
information may be missed leading to the repeated identification of what is
already reasonably well known. More importantly, biological knowledge,
even partial, is not always available, in particular at the scale of the full
regulatory system of an organism.
Research developed at IST has attacked this problem from a number of
different perspectives.
One approach is based on the development of more efficient algorithms
for the discovery of complex motifs [31][32]. These methods use
sophisticated string processing techniques to process large volumes of
sequence data, and efficiently identify over-represented motifs in promoter
regions. Current research is focused on applying these techniques to the
derivation of more general models, and on using additional biological
information related with the structure of the DNA to improve the quality of
the results obtained.
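To indicate what "over-represented motif" means in practice, the toy sketch below counts exact k-mers in a set of promoters and reports those occurring in more sequences than a chosen fraction. It only illustrates the notion: the algorithms in [31][32] extract structured motifs with mismatches and spacers using suffix-tree-based techniques and proper statistical assessment, none of which this example attempts to reproduce, and the promoter fragments and the planted motif are invented.

```python
from collections import Counter

def overrepresented_kmers(promoters: list[str], k: int = 6, min_fraction: float = 0.75):
    """Toy motif report: k-mers present in at least min_fraction of the promoter set.
    Real structured-motif algorithms allow mismatches, spacers and assess significance."""
    presence = Counter()
    for seq in promoters:
        seen = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        presence.update(seen)          # count each k-mer at most once per sequence
    threshold = min_fraction * len(promoters)
    return sorted(kmer for kmer, n in presence.items() if n >= threshold)

# Invented promoter fragments sharing the planted motif "TGACTC".
promoters = ["aaTGACTCggtacc", "ccgTGACTCtttaa", "tTGACTCagcgtag", "ggataccgtgacgt"]
print(overrepresented_kmers([p.upper() for p in promoters], k=6, min_fraction=0.75))
```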
We have explored the use of massively parallel computation to achieve
significant speedups in this computationally difficult problem [33]. The
application of the Grid based computing paradigm enlarges the range of
problems that can be tackled by the algorithms, using available and unused
CPU time.
Current research is focused on the development of methods that can be
used to accurately identify the parameters that should be passed to motif
finding algorithms, relieving biologists from this difficult task. Preliminary
results obtained using this approach, coupled with a more accurate
assessment of the statistical significance of the motifs, have already led to
the discovery of previously unknown biological knowledge.

3.3 Analysis of gene expression data

Additional information about regulatory networks comes from gene


expression data. The experimental analysis of whole, complex genetic
networks was revolutionized about ten years ago by the development of
DNA microarrays [34]. This technology enables genome-wide analysis of
gene expression and is now widely used to obtain data about the interactions
between genes.
Since then, microarrays have given thousands of snapshots of gene
expression in many organisms, including S. cerevisiae, using a broad panel
of experimental conditions [35]. The coupling of genetic engineering and
microarrays has also been extensively used to identify the target genes and
physiological impact of many yeast transcription factors. More than 100 million individual expression data points are now available for S. cerevisiae alone.
These large amounts of data created the need for novel computational
tools that perform gene expression data analysis. Since microarrays can
provide a snapshot of the expression level of all the genes in a cell at a given
time, and since it has been demonstrated that gene expression is a
fundamental link between genotype and phenotype, the analysis of gene
expression data is bound to play a major role in our understanding of
biological processes and systems including gene regulation, development,
evolution and disease mechanisms. Other sources of gene expression data
like quantitative proteomics [36] also provide important information that
should be processed using adequate computational tools.
Clustering of genes and/or conditions has been used, with some success,
to pursue the objectives of understanding regulatory mechanisms, starting
from gene expression data. However, one should expect subsets of genes to
be co-regulated and co-expressed under certain experimental conditions, but
to behave almost independently under other conditions. Discovering such
local expression patterns may be the key to uncovering many genetic
regulatory pathways that are not apparent otherwise. It is therefore important
to move beyond the standard clustering paradigm, and to develop approaches
capable of discovering local patterns in microarray data and identifying
regulatory mechanisms that adequately model the observed patterns.
In this context, biclustering algorithms [37, 38] represent a powerful
mechanism for gene expression data analysis and, therefore, for the
identification of co-regulated genes and potential gene regulatory networks.
Unlike clustering algorithms, biclustering algorithms identify groups of
genes that show similar activity patterns under a specific subset of the
experimental conditions. Biclustering is thus particularly relevant when only
a subset of the genes participates in a cellular process of interest, when an
interesting cellular process is active only in a subset of the conditions or
when a single gene may participate in multiple pathways that may or may not be co-active under all conditions.
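The quality measure at the heart of the biclustering approach of [37] can be stated compactly: for a bicluster defined by a set of genes I and a set of conditions J, the mean squared residue compares each entry with its row, column and bicluster means. The sketch below evaluates that score for a candidate bicluster; it illustrates only the criterion, not the node-deletion search of [37] or the algorithms developed by the IST group, and the toy data matrix is invented.

```python
def mean_squared_residue(matrix: list[list[float]], rows: list[int], cols: list[int]) -> float:
    """Mean squared residue H(I, J) of the bicluster (rows, cols), as in Cheng & Church:
    H = (1/|I||J|) * sum_{i,j} (a_ij - a_iJ - a_Ij + a_IJ)^2."""
    row_mean = {i: sum(matrix[i][j] for j in cols) / len(cols) for i in rows}
    col_mean = {j: sum(matrix[i][j] for i in rows) / len(rows) for j in cols}
    all_mean = sum(row_mean[i] for i in rows) / len(rows)
    residue = sum((matrix[i][j] - row_mean[i] - col_mean[j] + all_mean) ** 2
                  for i in rows for j in cols)
    return residue / (len(rows) * len(cols))

# Toy expression matrix (genes x conditions); rows 0-2 shift together under columns 0-2.
data = [[1.0, 2.0, 3.0, 9.0],
        [2.0, 3.0, 4.0, 1.5],
        [0.0, 1.0, 2.0, 7.2],
        [5.0, 0.5, 8.0, 3.3]]
print(mean_squared_residue(data, rows=[0, 1, 2], cols=[0, 1, 2]))  # 0.0 (perfect bicluster)
```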
Time-series data are especially interesting for researchers interested in identifying regulatory networks, since they give information not only about the connections between genes, but also about the dynamics of such connections. Algorithms developed specifically for the analysis of time-
series gene expression data have been developed by our research group [39],
and will be used in an integrated system for the analysis of sequence and
expression information that will aim at identifying the complex gene
regulatory mechanisms present in higher organisms.

3.4 Modeling gene regulatory networks

Once high quality sequence and expression data is available, accurate


models for the transcription mechanisms and the promoter regions have been
identified, and appropriate techniques for the analysis of expression data are
in place, the next challenge faced by researchers is the assembly of this
information into a coherent representation of a biological interaction
network.
Biological networks are abstract representations of functional
interactions, and are usually modeled using a graph-based formalism. In the
case of genetic networks, the most commonly proposed approaches use
nodes to model genes, and links to model regulation between genes. Each
node represents a single variable which can have multiple values, from a
discrete or continuous range.
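As a deliberately simplified instance of this graph formalism, the sketch below encodes a three-gene network as a directed graph with signed edges and iterates a synchronous Boolean update. The network, the threshold rule and the Boolean abstraction are illustrative choices made for the example, not the models under development at IST, which also accommodate discrete multi-valued and continuous variables.

```python
# Signed directed edges: (regulator, target, effect), effect +1 = activation, -1 = repression.
EDGES = [("A", "B", +1), ("B", "C", +1), ("C", "A", -1)]
GENES = ["A", "B", "C"]

def step(state: dict) -> dict:
    """Synchronous Boolean update: a gene turns on if the summed signed input
    from its currently active regulators is positive, otherwise it turns off."""
    new_state = {}
    for gene in GENES:
        influence = sum(effect for reg, tgt, effect in EDGES
                        if tgt == gene and state[reg])
        new_state[gene] = influence > 0
    return new_state

state = {"A": True, "B": False, "C": False}
for t in range(5):
    print(t, state)          # the activation signal propagates A -> B -> C and dies out
    state = step(state)
```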
Many methods have been proposed for the identification of genetic
networks, using a number of heuristics and biases to guide and limit the
search, ranging from connectivity considerations [40] to differential analysis
using mutant strains [41]. A number of such partial networks have already
been identified for Yeast, but, to date, this analysis is partial and has been
done mostly by hand since we are still lacking the tools to perform an
automatic, high-confidence identification of such mechanisms. A complete
analysis of detailed interactions between transcription factors and genes in
Yeast has been performed [42] and has been instrumental in the
understanding of the regulatory mechanisms in this organism. However, it
has not contributed significantly to the advance of our models for these
regulatory mechanisms, since the methods used do not give additional
information about the reasons why a given transcription factor regulates a
particular gene.
Most current methods proposed to date (e.g., [43, 44]) work with single
sources of information, e.g., with transcriptional control deduced from large-
scale microarray experiments. However, integrating evidence obtained from
as many different sources as available, coming from sequence and
expression analysis, as well as from other data such as proteomics and
general metabolism, in such a way as to identify the characteristics, structure
and evolution of genetic regulation networks is the key idea that will support
the next quantum step in the understanding of living organisms. Such
integration may also enable a considerable reduction of the noise resulting from erroneous or incomplete data.
Approaches proposed to date fall short in their ability to seamlessly,
efficiently and accurately integrate evidence coming from many different
and heterogeneous sources. Indeed, the integration of information coming
from just gene expression and metabolism for the purpose of identifying
genetic networks is currently being addressed by only a few research groups
in the world. Even at this limited level, integration poses formidable
problems since determining if and when gene expression correlates with
metabolic flux, via translation, enzyme regulation etc., is already not a trivial
question.
A high-confidence, automatic identification of whole networks is
inaccessible with current methods due to the computational complexity of
the problem. One possible route to overcome this barrier is to arrive at a
better understanding of the structure of biological networks, in particular their potentially modular character.
Most existing methods that perform some integration, for instance of
sequence with expression data, or of expression data with metabolic
topology information, do not consider all available information together, or
assume analysis of part of the data has already been performed thereby
splitting the inference process into different, independent steps. We are
currently aiming at developing methods that will integrate seamlessly
information from several different sources.
A number of important gene regulatory networks in S. cerevisiae are
being used as test platforms by the teams involved in this project. For their theoretical and practical importance, the regulatory networks related with the
activation of gene FLR1 in the presence of benomyl and mancozeb
fungicides [45] have been chosen as one experimental platform and are
being used to validate the hypotheses obtained using computational methods.

4. FUTURE WORK

The potential impact of this line of research in science and society is so


large that it is hard to estimate accurately at this point. Even partial success,
leading to methods that can infer, with high confidence, regulatory networks,
would greatly improve our knowledge of biological systems and open the
door to advances in such diverse fields as cancer treatment, drug design,
disease control and food technology, to name only a few application areas.
It is now commonly accepted that the complexity of living beings comes,
to a large extent, from the dynamics of biological networks, in general, and of
genetic networks, in particular. A clear understanding of the way the
building blocks of genetic networks are reused by nature has the potential to
uncover many complex issues that are presently stopping us from fully
understanding biological systems. The potential advantages obtained by a
significant advance in this area far outweigh the risks inherent to the
development of radically new techniques and models.
Advances in our abilities to model, analyze and simulate genetic
networks will also lead to breakthroughs in an emerging field that will
become extremely important in the next decade, that of synthetic biology.
Using computer models of cells to design and fabricate biological
components and systems that do not already exist in the natural world is the
aim of this new discipline. Synthetic biology takes off where systems biology, the study of complex biological systems as integrated wholes using modeling and simulation tools, ends. The aim is to build artificial biological
systems using many tools and experimental techniques developed for
systems biology. The focus will shift to finding new ways of taking parts of
natural biological systems, characterizing and simplifying them, and using
them as components of a newly engineered biological system.
The research now under way at Lisbon Technical University, together
with that being developed in many other research centers worldwide will
contribute to an old dream of humanity, that of understanding the grand
design of life on earth.

ACKNOWLEDGEMENTS

The authors would like to acknowledge the contributions of past and


present members of the ALGOS/KDBIO group of INESC-ID and the
Biological Sciences Group of CEBQ. In particular, we acknowledge the
contributions of Miguel Teixeira, Pedro Monteiro, Pooja Jain, Sandra
Tenreiro, Alexandra Fernandes, Nuno Mira, Marta Alenquer, Alexandra
Carvalho, Sara Madeira, Nuno Mendes, Ana Ramalho, Luís Coelho, Ana
Casimiro, José Caldas, Miguel Bugalho, Alexandre Francisco, Artur
Lourenço, Luís Russo, Christian Nogueira, Carlos Oliveira, Rodrigo Moisés,
Orlando Anunciação, Óscar Lopes and Susana Vinga. They also thank
external collaborators that have helped in a number of initiatives, including
Marie France Sagot and André Goffeau.
This research was supported by FEDER, FCT and the POSI, POCTI and
PDCT programs under projects POSI/EIA/57398/2004,
POSI/SRI/47778/2002, POCTI/AGG/38110/2001,
POCTI/AGR/45347/2002, POCTI/BME/46526/2002, POSI/EIA/57398/2004
and PDCT/BIO/56838/2004.
REFERENCES
1. J. Watson and F. Crick, A structure for Deoxyribose Nucleic Acid, Nature, 171, pp.
737:738, 1953.
2. R. Staden, A new computer method for the storage and manipulation of DNA gel reading data, Nucleic Acids Research, 8(16), pp. 3673:3694, 1980.
3. R. A. Gibbs, Pressing ahead with human genome sequencing, Nature Genetics, 11, pp. 121:125, 1995.
4. Commission on Life Sciences, National Research Council. Mapping and Sequencing the
Human Genome, National Academy Press: Washington, D.C., 1988.
5. R. D. Fleischmann, A. D. Adams, O. White, R. A. Clayton, E. F. Kirkness, A. R. Kerlavage, C. J. Bult, J. F. Tomb, B. A. Dougherty, J. M. Merrick et al. Whole-genome random sequencing and assembly of Haemophilus influenzae Rd, Science 269(5223), pp. 496:512, 1995.
6. A. Goffeau, B. G. Barrell, H. Bussey, R. W. Davis, B. Dujon, H. Feldmann, F. Galibert, J. D. Hoheisel, C. Jacq, M. Johnston, E. J. Louis, H. W. Mewes, Y. Murakami, P. Philippsen, H. Tettelin, S. G. Oliver. Life with 6000 genes. Science, 274(546), pp. 563:567, 1996.
7. Blattner et al., The complete genome sequence of Escherichia coli K-12, Science,
277(5331), pp. 1453:1474, 1997.
8. The C. elegans Sequencing Consortium, Genome Sequence of the Nematode C. elegans:
A Platform for Investigating Biology, Science 282(5396), pp. 2012:2018, 1998.
9. The Arabidopsis Initiative, Analysis of the genome sequence of the flowering plant Arabidopsis thaliana, Nature 408, pp. 796:815, 2000.
10. M. Adams et al., The Genome Sequence of Drosophila melanogaster, Science 287(5461), pp. 2185:2195, 2000.
11. J. C. Venter et al, The Sequence of the Human Genome, Science, 291(5507), pp.
1304:1351, 2001.
12. International Human Genome Sequencing Consortium, Initial sequencing and analysis of
the human genome, Nature 409, pp. 860:921, 2001.
13. J. L. Weber and E. W. Myers, Human Whole-Genome Shotgun Sequencing, Genome Research, 7(5), pp. 401:409, 1997.
14. S. B. Needleman and C. D. Wunsch, A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48, pp. 443:453, 1970.
15. T. F. Smith and M. S. Waterman, Identification of Common Molecular Subsequences.
Journal of Molecular Biology 147, pp. 195:197, 1981.
16. S. F. Altschul, W. Gish, W. Miller, E. W. Myers, D. J. Lipman, Basic local alignment search tool. Journal of Molecular Biology, 215(3), pp. 403:410, 1990.
17. W. R. Pearson and D.J. Lipman. Improved tools for biological sequence comparison.
Proceedings of the National Academy of Sciences, 85, pp. 2444:2448, 1988.
18. E. Uberbacher and R. Mural. Locating protein-coding regions in human DNA sequences
by a multiple sensor-neural network approach. Genetics, 88, pp. 11261:11265, 1991.
19. M. Zhang. Identification of protein coding regions in human genome by quadratic discriminant analysis, Genetics, 94, pp. 565:568, 1997.
20. C. Burge and S. Karlin. Prediction of complete gene structures in human genomic DNA.
Journal of Molecular Biology, 268, pp. 78:94, 1997.
21. R. Guigó, S. Knudsen, N. Drake and T. F. Smith. Prediction of gene structure. Journal of
Molecular Biology, 226, pp. 141:157, 1992.
22. P. Monteiro, A. Ramalho, A. T. Freitas and A. L. Oliveira, DECIDE A Gene Finding Evaluation Tool, Proceedings of BKDB2005 - Bioinformatics: Knowledge Discovery in Biology, pp. 68:72, 2005.
23. M. Y. Galperin, The Molecular Biology Database Collection: 2006 update. Nucleic Acids Research, 34, pp. D3:D5, 2006.
24. J. M. Cherry, C. Adler, C. A. Ball, S. A. Chervitz, S. S. Dwight, E. T. Hester, Y. Jia, G.
Juvik, T. Roe, M. Schroeder, S. Weng and D. Botstein, SGD: Saccharomyces Genome
Database. Nucleic Acids Research., 26, pp. 73:79, 1998.
25. U. Güldener, M. Münsterkötter, G. N. Kastenmüller, J. Strack, C. Lemer, J. Richelles, S.
J. Wodak, J. García-Martínez, J. E. Pérez-Ortín, H. Michael, A. Kaps, E. Talla, B.
Dujon, B. André, J. L. Souciet, J. D. Montigny, E. Bon, C. Gaillardin and H. W. Mewes
CYGD: the comprehensive yeast genome database, Nucleic Acids Research, 33, pp.
D364:D368, 2005.
26. M. Riffle, L. Malmström and T. N. Davis, The yeast resource center public data repository, Nucleic Acids Research, 33, pp. D378:D382, 2005.
27. J. van Helden, Regulatory sequence analysis tools. Nucleic Acids Research, 31, pp.
3593:3596, 2003.
28. M. C. Teixeira, P. Monteiro, P. Jain, S. Tenreiro, A. R. Fernandes, N. P. Mira, M. Alenquer, A. T. Freitas, A. L. Oliveira and I. Sá-Correia, The YEASTRACT database: a tool
for the analysis of transcription regulatory associations in Saccharomyces cerevisiae,
Nucleic Acids Research, 34, pp. D446:D451, 2006.
29. M. F. Sagot. Spelling approximate repeated or common motifs using a suffix tree,
Proceedings of Latin'98, LNCS 1380, pp. 111:127, 1998.
30. J. van Helden, A. F. Rios and J. Collado-Vides. Discovering regulatory elements in non-
coding sequences by analysis of spaced dyads. Nucleic Acids Research, 28, pp.
1808:1818, 2000.
31. M. Carvalho, A. T. Freitas, A. L. Oliveira and M. F. Sagot, An efficient algorithm for the
identification of structured motifs in DNA promoter sequences, IEEE/ACM Transactions
on Computational Biology and Bioinformatics, 3(2), 2006.
32. M. Carvalho, A. T. Freitas, A. L. Oliveira and M. F. Sagot, A highly scalable algorithm
for the extraction of cis-regulatory regions, Proceedings of the 3rd Asia Pacific Bioinformatics Conference, pp. 273:282, 2005.
33. M. Carvalho, A. T. Freitas, A. L. Oliveira and M. F. Sagot, A parallel algorithm for the
extraction of structured motifs, Proceedings of the 19th ACM Symposium on Applied
Computing, pp. 147:153, 2004.
34. J. L. DeRisi, V. R. Iyer, P. O. Brown, Exploring the metabolic and genetic control of
gene expression on a genomic scale. Science, 278(5338), pp. 680:686, 1997.
35. S. le Crom, F. Devaux, C. Jacq, P. Marc, yMGV: helping biologists with yeast
microarray data mining, Nucleic Acids Research, 30, pp. 76:79, 2002.
36. J. E. Celis, M. Kruhøffer, I. Gromova, C. Frederiksen, M. Østergaard, T. Thykjaer, P.
Gromov, J. Yu, H. Pálsdóttir, N. Magnusson and T. F. Ørntoft, Gene expression
profiling: monitoring transcription and translation products using DNA microarrays and
proteomics. FEBS Letters 480, pp. 2:16, 2000.
37. Y. Cheng and G. M. Church, Biclustering of Expression Data, Proceedings of the Eighth
International Conference on Intelligent Systems for Molecular Biology, pp. 93:103,
2000.
38. S. C. Madeira and A. L. Oliveira, Biclustering algorithms for biological data analysis: A
survey. IEEE/ACM Transactions on Computational Biology and Bioinformatics 1(1), pp.
24:45, 2004.
39. S. C. Madeira and A. L. Oliveira, A Linear Time Biclustering Algorithm for Time Series
Gene Expression Data, Proceedings of the 5th Workshop on Algorithms in Bioinformatics, LNCS 3692, pp. 39:52, 2005.
40. X. Zhou, X. Wang, R. Pal, I. Ivanov, M. Bittner and E. Dougherty, A Bayesian connectivity-based approach to constructing probabilistic gene regulatory networks. Bioinformatics, 20, pp. 2918:2927, 2004.
41. K. Kyoda, K. Baba, S. Onami and H. Kitano, DBRF–MEGN method: an algorithm for
deducing minimum equivalent gene networks from large-scale gene expression profiles
of gene deletion mutants, Bioinformatics. 20, pp. 2662:2675, 2004.
42. T. I. Lee, N. J. Rinaldi, F. Robert, D. T. Odom, Z. Bar-Joseph, G. K. Gerber, N. M. Hannett, C. T. Harbison, C. M. Thompson, I. Simon, J. Zeitlinger, E. G. Jennings, H. L. Murray, D. B. Gordon, B. Ren, J. J. Wyrick, J. B. Tagne, T. L. Volkert, E. Fraenkel, D. K. Gifford and R. A. Young, Transcriptional regulatory networks in Saccharomyces cerevisiae. Science 298, pp. 799:804, 2002.
43. J. Ihmels, S. Bergmann and N. Barkai, Defining transcription modules using large-scale
gene expression data, Bioinformatics, 20, pp. 1993:2003, 2004.
44. M. Zou and S. D. Conzen, A new dynamic Bayesian network (DBN) approach for identifying gene regulatory networks from time course microarray data. Bioinformatics,
21(1), pp. 71:79, 2005.
45. S. Tenreiro, A. R. Fernandes and I. Sá-Correia Transcriptional activation of FLR1 gene
during Saccharomyces cerevisiae adaptation to growth with benomyl: role of Yap1p and
Pdr3p. Biochemical and Biophysical Research Communications, 280, pp. 216:222, 2001.
RESEARCH AND DEVELOPMENT IN METAL
FORMING TECHNOLOGY AT THE
TECHNICAL UNIVERSITY OF LISBON

Jorge M. C. Rodrigues¹ and Paulo A. F. Martins²

Departamento de Engenharia Mecânica, Instituto Superior Técnico, Universidade Técnica de Lisboa, Av. Rovisco Pais 1049-001 Lisboa, Portugal

¹ jrodrigues@ist.utl.pt, ² pmartins@ist.utl.pt

Abstract: The research and development portfolio of the metal forming group of
Instituto Superior Técnico is driven by fundamental topics in the field of
computer simulation and by the needs of manufacturing industries. The group
was formed in the late 70’s and conducts activities in bulk forming, sheet
forming, tube forming and powder forming.
The purpose of this paper is to provide a brief overview of the activities of the
metal forming group. The paper is organized around a number of examples
that are representative of the activities of the group in the fields of numerical
and experimental simulation of forming processes and of technology
development and transfer into industrial companies. Concerning numerical and experimental simulation, the paper includes three-dimensional
numerical modelling of the closed-die forging of spiders by means of the finite
element flow formulation and modelling of the backward extrusion process by
means of state-of-the-art meshless approaches. Collaboration with the industry
is illustrated by means of selected joint partnerships with the Portuguese Mint
House and with an industrial company producing intercoolers for automotives.

Key words: Metal Forming, Numerical Simulation, Experimentation, Technology Transfer.

1. INTRODUCTION

The metal forming group of Instituto Superior Técnico has a long


pedigree of teaching, research and development activities going back to the

end of the 70’s. The group conducts an integrated strategy of teaching,


research and development that offers students the opportunity to experience
and benefit from state-of-the-art lectures on numerical, experimental and
technological topics, and stimulates the production of textbooks aimed at the
widest possible range of audiences, from university students of mechanical,
manufacturing, industrial and materials engineering to professional
engineers that design metal forming processes and parts in daily practice
(figure 1) [1, 2].

Figure 1. Integrated strategy of teaching, research and development.

Research and development has from the very beginning comprised two
complementary lines: numerical methods for computer simulation of
forming processes and experimentation under controlled laboratory
conditions. In the mid 80’s the metal forming group started to develop their
own two and three dimensional finite element computer programs for bulk
and powder forming (I-form2 and I-form3). Simulation of sheet metal
forming processes started in the late 80’s by means of the utilisation of state-
of-the-art commercial software. During the 90’s the group has increased
quite considerably its international recognition due to the diversity of
research work in the field of numerical simulation of forming processes. As
a matter of fact, the group was successfully involved in the majority of the
global research trends of the last decade: (i) the use of implicit finite element
methods, (ii) implementation of adaptive methods, (iii) simulation of two
and three-dimensional cold and hot metal forming processes, (iv) mixed
Eulerian-Lagrangian methods, (v) simulation of heat transfer during metal
forming processes, (vi) contact and friction models, (vii) error estimates and
control, (viii) mesh generation and adaptive remeshing, (ix) finite element
integration with CAD/CAM, (x) simulation of defects/material damage (xi)


interaction between deformation of materials and microstructure and (xii)
instability and flow localization.
Present numerical and theoretical developments within the metal forming
group are centred on enlarging the application of their own two and three
dimensional finite element computer programs to the computer simulation of
tube forming and metal cutting processes. More recently, the group started to
be involved in new metal forming simulation trends associated with the
utilization of meshless methods based on EFG and RKPM approaches. The
work in both finite elements and meshless methods is currently ongoing in a
wide range of subject areas in close connection with Post-Doctoral, Ph.D.
and MSc. research projects and within networks of international
collaboration.
In recent years the metal forming group has made considerable investments in the establishment of adequate laboratory facilities for both teaching and research, which led to the acquisition of new forming tools, new data acquisition systems and new machine tools. In connection with this, a
new hydraulic testing machine with a nominal capacity of 1200 kN is
expected to be installed during the next summer. This equipment is aimed to
eliminate a key-gap in the mechanical characterization of engineering alloys
using compression tests at warm and hot temperatures.
The purpose of this paper is to provide a brief overview of the activities
of the metal forming group. Because the group is engaged in a wide variety
of research activities it was decided to organize the presentation around a
number of examples that are representative of the activities in the fields of
numerical and experimental simulation of forming processes and of
collaboration with industrial companies.

2. NUMERICAL AND EXPERIMENTAL SIMULATION

Selected examples comprise the three-dimensional finite element simulation of the closed-die forging of spiders, recently awarded the A.M. Strickland Prize for the best paper on a manufacturing industries mechanical engineering subject published by IMechE (Institution of Mechanical Engineers) [3], and the numerical modelling of a backward extrusion process by means of meshless analysis [4]. This last work is part of an ongoing collaboration with Northwestern University (USA) in the field of meshless modelling of forming processes.
2.1 Finite element simulation of the closed-die forging of a spider

The material used in the experiments was a technically-pure Aluminium


(99.95%). The corresponding stress-strain curve was obtained by means of
compression tests carried out at room temperature on cylindrical specimens
using a Teflon lubricating foil (0.1 mm thickness) for ensuring homogeneous
deformation,
$\bar{\sigma} = 177.46\,\bar{\varepsilon}^{\,0.190}$ MPa    (1)
The cold forming experiments were performed with lubrication (Castrol
Iloform PNW 124 mineral oil). The tribological conditions at the die-
workpiece contact interface were estimated by the ring-test technique. The
ratio of the outer diameter, the inner diameter and the thickness of each ring
specimen were taken as 6:3:2. Friction was specified in terms of a constant
friction factor m and the calibration curves relating the changes of
minimum internal diameter as a function of the reductions in height were
obtained by the finite element method. A value of $m = \tau / k = 0.12$ was determined.
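As a worked illustration of how these quantities are used (the strain value of 0.5 below is chosen arbitrarily for the example, and the von Mises relation $k = \bar{\sigma}/\sqrt{3}$ is the usual assumption for the shear yield stress in the constant friction factor model):

$\bar{\sigma}(\bar{\varepsilon}=0.5) = 177.46 \times 0.5^{0.190} \approx 155.6\ \text{MPa}, \qquad k = \bar{\sigma}/\sqrt{3} \approx 89.8\ \text{MPa}, \qquad \tau = m\,k = 0.12 \times 89.8 \approx 10.8\ \text{MPa}.$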
Cylindrical preforms of 14.54 mm radius and 17.70 mm height were
manufactured from the supplied stock so as to allow the plan of experiments.
Before being placed in position the preforms were degreased and then
lubricated by dipping into mineral oil. The experiments were conducted
using a 500 kN computer-controlled hydraulic press under a constant upper-
die displacement rate of 2.5 mm/s at room temperature. Details of the
process and specific tool parts are presented in figure 2, showing a tool and
the geometrical evolution of the forged part at different stages of
deformation.

Figure 2. Closed-die forging of a spider. Tooling and evolution of the geometry at different
stages of deformation (46%, 55%, 59% reduction in height).

Due to the symmetry of the experimental set-up only one eighth of the preform and dies was analysed. The preform was modelled by means of a
structured mesh consisting of 4104 hexahedral elements. Larger elements
were used to fill the body and the initial free regions of the shape whereas
the surfaces of interest (where contact is expected to occur at the early stages
of deformation) were meshed with smaller elements in order to obtain more
accurate results on the filling behaviour.
The number of elements progressively increased during the seven
intermediate remeshing operations that were necessary to perform due to the
large amount of mesh distortion occurring in the simulation. The last mesh
consisted of 5800 hexahedral elements and each intermediate mesh density
conforms to the geometric features of the workpiece and dies at each step of
deformation. The semi-automatic remeshing procedure utilized by I-form3
involves four basic steps: surface recognition, volumetric meshing, nodal
reordering and transfer of time-integrated quantities from the old distorted to
the new mesh [5].
The distribution of effective strain illustrated in figure 3 presents the
highest values in the regions of the flash where the thickness is smaller as
well as in the transition regions between the core and the arms of the spider
forged parts. Results also indicate a progressive increase of these values as
the arms are being formed.

Figure 3. Computed mesh and predicted distribution of effective strain for the cold forging of
a spider part under different height reductions: initial shape, 29%, 41%, 52% and 62%.

The overall agreement between the predicted distribution of effective strain in a cross section through an arm of the spider and the experimentally determined values is good (figure 4). The experimental
technique utilised for estimating the effective strain distribution was based
on micro-hardness tests performed on the metal component itself. This was made possible by applying a machining technique (milling with low cutting depth, followed by grinding and polishing) which allows the material to be
removed without causing significant mechanical and metallurgical changes
to the material placed in the immediate vicinity on which measurements are
to be made.
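To indicate how such micro-hardness measurements can be turned into strain estimates, the flow curve (1) can be inverted; the Tabor-type relation HV ≈ 3σ̄ used in the numerical example below is a generic approximation introduced here only for illustration, not the calibration actually used by the authors:

$\bar{\varepsilon} = \left(\bar{\sigma}/177.46\right)^{1/0.190}$, so that a hardness reading corresponding to $\bar{\sigma} \approx 150$ MPa (about 46 HV under the HV ≈ 3σ̄ approximation, with HV expressed in MPa) would map to an effective strain of roughly $(150/177.46)^{5.26} \approx 0.4$.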
Figure 4. Computed and experimental distribution of effective strain in a cross section
through an arm of the spider at 58% height reduction. (a) Detail of the experimental part
showing the micro-hardness indentations, (b) distribution predicted by finite element
computations and (c) distribution obtained from micro-hardness measurements.

The computed and experimental evolution of the forging load is shown in figure 5. Two different stages can be identified: an initial stage where the evolution of the forging load is smooth and similar to what is generally obtained for heading operations with flat punches (although multidirectional flow into the arms and the flash begins at the early stages of deformation), and a final stage where the forging load increases sharply. The latter starts when the thickness of the flash is reduced and the die cavities corresponding to the arms commence to be filled.
Figure 5. Computed and experimental evolution of the forging load vs. displacement for
the cold forging of spider parts.
Having validated the computed forging load against experimental


measurements, the elastic analysis of the tools was performed by employing
an algorithm also developed by the authors [6]. The tool parts that were
utilized in the closed-die forging of the spiders were made of an ASP23
(powder metallurgy produced) high-speed steel at 60 HRC in order to
withstand high pressures and to provide good fatigue strength and
toughness. From the hardness values measured at the surface of the tool parts
it was possible to estimate a yield strength of 1800~1900 MPa and a tensile
strength of 2400 MPa. The elastic modulus of the tool material was taken as
210 GPa.
The distribution of the applied pressure ($\sigma_z$) derived from the elastic
analysis of tooling is depicted in figure 6. The highest values, located at the
centre of the upper and lower tool parts, are below 30% of the yield stress of
the tool material allowing for safety and preventing excessive deformation of
the tooling.

Figure 6. Applied pressure (MPa) in the dies for the closed-die forging of a spider part at 58%
height reduction.

2.2 Meshless simulation of the backward extrusion with a hemi ellipsoidal punch

The material employed in the experimental tests was an AlMgSi1 aluminium alloy. The stress-strain curve was obtained by means of compression tests carried out at room temperature,

σ = 285.81 ε^0.113 MPa    (2)
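For reference, equation (2) can be evaluated directly; the short Python sketch below, which is not part of the original work, simply tabulates the flow stress over a range of effective strains.

```python
# Flow stress of the AlMgSi1 alloy from equation (2): sigma = 285.81 * eps**0.113 (MPa).
def flow_stress(eps):
    return 285.81 * eps ** 0.113

for eps in (0.05, 0.1, 0.5, 1.0, 2.0):
    print(f"effective strain {eps:4.2f} -> flow stress {flow_stress(eps):6.1f} MPa")
```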


The tests were performed on cylindrical preforms with initial dimensions D0 × L0 = 19.4 × 8 (mm × mm) that were backward extruded up to 6 mm of vertical punch displacement (figure 7). The cross section of the punch head is an ellipse of semi-major axis 8.3 mm and semi-minor axis 7 mm. The internal diameter of the die is 20 mm and the initial clearance between the preform and die is equal to 0.6 mm.
The tribological conditions at the contact interfaces between the workpiece and tooling were similar to those of section 2.1 and, therefore, the friction factor m = 0.12.
The computer simulation was carried out under axisymmetric
assumptions and on account of symmetry conditions the RKPM model was
accomplished by discretizing only one half of the cross section of the billets
by means of 21x21 nodal points and 400 uniform quadrilateral background
cells. The models employing triangular background cells made use of 800
cells (figure 7).



Figure 7. Numerical modelling of the backward extrusion of a cylindrical preform using a
hemi ellipsoidal punch (dimensions in (mm)). (a) Initial RKPM model (using arbitrary
adaptive triangular background cells), (b) deformed shape after 4.6 mm punch displacement,
(c) deformed shape after 6.0 mm punch displacement and (d) computed distribution of
effective strain after 6.0 mm punch displacement.

Figures 7a) to d) present the initial RKPM model (using arbitrary adaptive triangular background cells) as well as the corresponding predictions of geometry and effective strain at different stages of deformation. As can be seen, a substantial part of the initial free surface comes into contact with the punch during extrusion and, therefore, the deformation mechanics of the process makes wrapping prevail over surface expansion. This mechanism is opposite to that of the heading process and suggests that material flow is predominantly conditioned by contact rather than friction.

It is important to recognize that a finite element analysis of this process would require remeshing after 4.6 mm of punch displacement while RKPM
modelling is able to simulate the entire extrusion procedure without
remeshing. Similarly, the utilisation of RKPM with uniform quadrilateral
background cells also suffered from some difficulties related to numerical
integration errors as soon as the cells became heavily distorted. This allows
us to conclude that the utilization of RKPM with triangular background cells
is a much better option for the numerical analysis of metal forming problems
involving large non steady-state plastic deformations.
The evolution of the extrusion load with respect to the displacement of
the punch is plotted in figure 8. The agreement between the numerical
estimates provided by RKPM, finite elements (FEM) and experimental data
is good if the case identified as ‘RKPM 3 – without smoothing’ is not taken
into account. This case was included with the objective of demonstrating the
importance of smoothing the average stresses when using triangular
background cells. Without smoothing, the triangular cells become somewhat
overly stiff and suffer from volumetric locking as soon as plastic
deformation reaches high levels (say, punch displacements above 2.5 mm in
figure 8). The consequence is a very bad estimate of the extrusion load.
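A minimal sketch of the kind of smoothing involved is given below, using assumed data structures rather than the authors' RKPM code: cell-average stresses are scattered to the nodes with area weights and then averaged back per cell, which removes the piecewise-constant jumps responsible for the overly stiff response.

```python
import numpy as np

# Minimal sketch of area-weighted stress smoothing over triangular background cells.
nodes = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
cells = np.array([[0, 1, 2], [0, 2, 3]])        # two triangular background cells
cell_stress = np.array([100.0, 180.0])          # e.g. mean stress per cell (MPa)

def tri_area(tri):
    a, b, c = nodes[tri]
    return 0.5 * abs((b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0]))

areas = np.array([tri_area(t) for t in cells])

# Scatter area-weighted cell values to the nodes, then normalise.
num = np.zeros(len(nodes))
den = np.zeros(len(nodes))
for cell, area, s in zip(cells, areas, cell_stress):
    for node in cell:
        num[node] += area * s
        den[node] += area
nodal_stress = num / den

# Smoothed cell value: average of its nodal values (the jumps between cells are relaxed).
smoothed = nodal_stress[cells].mean(axis=1)
print("nodal:", nodal_stress, "smoothed per cell:", smoothed)
```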

Figure 8. Computed and experimental evolution of the load vs. displacement for the
backward extrusion of a cylindrical billet using a hemi ellipsoidal punch. Note: Insets present
the backward extrusion tool and a picture of the final part.

The deviation that can also be observed between the estimates of the extrusion load obtained from RKPM with adaptive and non-adaptive triangular cells at large plastic deformations is probably due to some loss of information caused by the algorithm for performing the adaptive redefinition of the topology [7].

3. COLLABORATION WITH INDUSTRIAL COMPANIES

Selected examples of collaboration with industry include load certification of the 1 and 2 Euro bimetallic trade coins, the development of an industrial process for minting a new type of bimetallic coin [8] and a brief overview of a recently completed engineering consultancy to an industrial company that produces intercoolers for the automotive industry [9].

3.1 Minting of bimetallic trade and proof coins

Coining is a closed-die forging operation, in which a prepared blank is compressed between the dies whilst it is being retained and positioned by a collar. The result of a coining operation is a well-defined imprint of the dies on the disk (figure 9a)).
The first bimetallic coin to be produced on a large scale was introduced by the Italian Government Mint in 1982 (500 Lire). The basic idea was not new, since experiments with bimetallic coin technology had already been reported in France and England during the 19th century. Although more complex and more expensive to manufacture, bimetallic coins offer several advantages over conventional monometallic coins, such as security against counterfeiting, ease of distinction by humans and ease of recognition by machines. Apart from these advantages, bimetallic coins may also provide enhanced aesthetic features.
The commonly used technology for producing bimetallic coins, hereafter named bimetallic ring technology, consists of using an inner disk (center) and an outer ring of different materials, generally with different colours (please refer to figure 9b)). The coins are currently produced in two stages: firstly, the center is placed inside the outer ring with a small clearance, and secondly, the two parts are assembled together during the imprinting of the surface details by the minting dies.

Figure 9. Minting process and terminology. (a) Die set-up for producing coins and medallions. The two types of bimetallic coinage: (b) bimetallic ring technology and (c) bimetallic foil technology.

The success of the 500 Lire bimetallic coin stimulated other countries to
develop their own bimetallic coins; e.g. France in 1988 (10 francs) and
Portugal in 1989 and 1991 (100 and 200 escudos), among others. On January
1st 2002, the Euro system was introduced in 12 member states of the
European Union, with seven different banknotes and eight coins: 1, 2, 5, 10,
20, 50 cent and two bimetallic trade coins of 1€ and 2€.
The European Central Bank established a destructive test to certify the industrial production of bimetallic trade coins by the Mint Houses all around Europe (figure 10a)). In Portugal this task was performed by the metal forming group of Instituto Superior Técnico in close collaboration with the Portuguese Mint House. Figure 10b) shows typical load-displacement curves for separating the center disk from the outer ring of bimetallic Euro trade coins.

[Figure 10 plots: separation load (N) vs. displacement (mm)]

Figure 10. Bimetallic trade coins of 1€ and 2€. (a) Schematic representation of the destructive
test that was utilized for load certification and (b) typical load-displacement evolution
acquired in the destructive tests.

Until the mid-1990s bimetallic coins were exclusively manufactured by the above-described technology. In 1996 the metal forming group of Instituto Superior Técnico, in collaboration with the Portuguese Mint House, developed a new technology, hereafter named bimetallic foil technology,
which is based on a concept entirely different from that of bimetallic ring technology. Bimetallic foil technology employs two disks of different materials, one being very thin (foil) (please refer to figure 9c)), assembled together by mechanical means during the imprinting of the surface details by the minting dies. The coin is produced in a multi-stage sequence consisting of three metal forming operations (preforming, rimming and coining) and one intermediate annealing before the coining operation (figure 11) [8].


Figure 11. Metal forming steps involved in bimetallic foil technology; (a) preforming, (b)
rimming, (c) coining. The foil (1) is to be assembled with the disk (2)

Bimetallic foil technology was specifically developed for the production of collection coins and has been experiencing great success worldwide. The first series, dedicated to the 150th anniversary of the Bank of Portugal, was awarded the Mint Directors Conference prize for the most technically advanced commemorative coin in 1998 (figure 12). In fact,
bimetallic foil technology allows the coining of gold-silver specimens with
larger diameters and lower costs than those that would arise from the
utilisation of monometallic gold blanks. Therefore, new market opportunities
were opened for brilliant uncirculated coins (BU) and proof coins as well as
for medals.

Figure 12. First bimetallic foil proof coin – 150th anniversary of the Bank of Portugal.

3.2 Assembling of intercooler parts

When the engine inlet air of an automobile is compressed by a turbocharger its temperature increases and its density decreases (heated air takes up more space than cooler air). Because air density determines the amount of oxygen per volume of air consumed by the engine, it follows that, by cooling the air, more oxygen can be delivered to the engine and, therefore, more fuel can be supplied. More fuel means more power generated from each explosion in each cylinder and, for this reason, turbochargers can significantly improve the power-to-weight ratio of the vehicle.
The intercooler is a heat exchanger that is utilized for cooling down the inlet air flowing into each cylinder. A typical intercooler is composed of an aluminium core and two plastic end tanks containing the air inlets and outlets, which are usually placed on opposite sides of the core (figure 13a)). The core and each end tank are lock seamed together in a tool system that is capable of horizontally moving a cam slide die during the downstroke of the pressure cylinder pads.

Figure 13. a) Intercooler of an automotive showing the b) core and the end tank to be lock
seamed together and c) some typical defects that may arise from incorrect assembling
procedures/parameters.

One of the main problems arising in the production of intercoolers is the occurrence of cracks and/or defects during the lock seaming operation (figure 13c)). Because this problem was found to be related to the load applied on the plastic end tanks by the cylinder pressure pads, it was decided to build a tool able to reproduce the assembly operation under laboratory-controlled conditions (figure 14). The tool was instrumented with
conventional load cells based on traditional strain-gauge technology, in order to allow measurement of the load at previously selected positions, and pressure-sensitive films that change colour as a function of the applied level of pressure were employed for obtaining the corresponding distribution of pressure on the rubber gasket placed along the perimeter of the lock seaming region. The contact load applied by each cylinder pad and the distribution of pressure on the rubber gasket are depicted in figure 15.

Figure 14. Experimental set-up to characterize and measure the load applied by the cylinder
pads and the distribution of pressure along the perimeter where the core and the end tank are
lock seamed together.

The analysis of the experimental data obtained under laboratory-controlled conditions and of the corresponding lock seaming procedure allowed the development of an alternative solution aimed at minimizing the occurrence of defects. The solution was developed by the metal forming group of Instituto Superior Técnico and it is presently being patented by the industrial company.

4. CONCLUSIONS

The metal forming group of Instituto Superior Técnico conducts an integrated strategy of teaching, research and development that offers students and researchers the opportunity to experience and benefit from state-of-the-art activities in bulk forming, sheet forming, tube forming and powder forging.

Figure 15. (a) Contact load applied by each cylinder pad and (b) correspondent distribution of
pressure. The inset shows the effect of the applied pressure on a film.

The research and development portfolio of the group is driven by fundamental topics in the field of computer simulation of metal forming processes and by the needs of manufacturing industries. Teaching at the undergraduate and postgraduate levels has been benefiting greatly from this strategy, in the sense that the metal forming curricula of the engineering courses are organized to fulfil the demands of industry and, at the same time, to prepare students for lifelong learning and for future research and academic activities.

ACKNOWLEDGEMENTS

The authors would like to express their gratitude to all the students and colleagues who have collaborated with them over the past two decades and helped the metal forming group of Instituto Superior Técnico to acquire national and international recognition.
The authors also wish to acknowledge Professors Manuel J. M. Barata Marques and Costa André Júnior for their past work at the head of the manufacturing unit of the mechanical engineering department of Instituto Superior Técnico.

REFERENCES
1. J. M. C. Rodrigues and P. A. F. Martins, Tecnologia mecânica: Tecnologia da deformação plástica – Vol. I, Escolar Editora, 2005.
2. J. M. C. Rodrigues and P. A. F. Martins, Tecnologia mecânica: Tecnologia da deformação plástica – Vol. II, Escolar Editora, 2005.
3. M. L. Alves, J. M. C. Rodrigues and P. A. F. Martins, Three-dimensional modelling of
forging processes by the finite element flow formulation, Journal of Engineering
Manufacture, 218, 1695-1707, 2004.
4. Shangwu Xiong and P. A. F. Martins, Numerical solution of bulk metal forming
processes by the reproducing kernel particle method, Metal Forming 2006 - The 11th
International Conference on Metal Forming, University of Birmingham, U.K., 2006.
5. M. L. Alves, J. L. M. Fernandes, J. M. C. Rodrigues and P. A. F. Martins, Finite element
remeshing in metal forming using hexahedral elements, Journal of Materials Processing
Technology, 141, 395-403, 2003.
6. M. L. Alves, J. M. C. Rodrigues and P. A. F. Martins, Simulation of three-dimensional
bulk forming processes by the finite element flow formulation, Modelling and
Simulation in Materials Science and Engineering – Institute of Physics, 11, 803-821,
2003.
7. Shangwu Xiong, J. M. C. Rodrigues and P. A. F. Martins, On background cells during
the analysis of bulk forming processes by the reproducing kernel particle method,
Computational Mechanics, 2005 (submitted).
8. P. J. Leitão, A. C. Teixeira, J. M. C. Rodrigues and P. A. F. Martins, Development of an
industrial process for minting a new type of bimetallic coin, Journal of Materials
Processing Technology, 70, 178-184, 1997.
9. E. M. Pimentel, Estudo teórico-experimental do mecanismo de cravação de caixas em
permutadores de calor para a indústria automóvel, MSc Thesis, Instituto Superior
Técnico, 2006.
AGRONOMY: TRADITION AND FUTURE

Pedro Aguiar Pinto


Instituto Superior de Agronomia, Universidade Técnica de Lisboa, Tapada da Ajuda,
1349-017 Lisboa, Portugal, papinto@isa.utl.pt

Abstract: Agronomy is an integrative science born as a synthesis of knowledge coming from the biological and physico-chemical sciences, from agricultural practices as changed by technological development, and from higher education schools. The knowledge obtained was very successfully introduced in agricultural practice. The success of agriculture and the negative impact of some agricultural practices, as well as the specialization in science, brought Agronomy into an identity crisis. The systems approach is presented as a tool to restore, for the future, the integrative tradition of Agronomy.

Key words: Agronomy, agriculture, integrative science.

1. AGRICULTURE, AGRONOMY AND ECOLOGY

Agricultural management is engaged with fields of plants and areas of land. This requires knowledge of plant communities together with their aerial and soil environments. These organismal and higher levels of biological organization are the subject fields of Ecology, but explanation of behavior at these levels depends upon integration of relevant knowledge spanning lower levels, from molecules and cells to organs [1].
The production of organic materials in the field depends upon the
physiological ability of plants and on the characteristics of the environment
within which they are grown. These production processes are the subject of
ecological analyses based on biological, chemical and physical principles.
The integrated result of these analyses is the core subject of Agronomy.


The crops that are grown and how they are grown are human decisions that depend also upon the utility of the products, the costs of production and the risk involved. The reason why they are grown is mainly the production of food and fiber, and this is the economic activity that we call Agriculture.

1.1 The birth and growth of a scientific discipline

Although Agriculture is as ancient as history, the systematic study of the processes involved in agricultural production is relatively recent.
While Ecology has a precise birth date and even a father, I could not yet find a single father for Agronomy, nor a precise birth date.
Table 1 presents a sample chronology of apparently unrelated events. Its main purpose is to set the scenario in which Agronomy was born and took its initial steps as an integrative science.

Table 1. Chronology of events in science, technology and education related to Ecology, Agronomy and Agriculture in the nineteenth century. Information from [2-3].
1798 – An Essay on the Principle of Population (T. Malthus)
1802 – Steam engine thresher; Germany: 1st Agronomy school at Möglin (Thaer)
1813 – Elements of Agricultural Chemistry (Humphry Davy)
1815 – Hungary: 2nd Agronomy school at Georgikon (Samuel Tessedik)
1818 – Germany: Hohenheim (Schwertz)
1820 – Introduction of guano in England; École Agro-Forestière de Roville (1822)
1823 – Systema mycologicum (Elias Fries); École Agro-Forestière de Nancy (1824)
1826 – First mechanical mower (England); École Agro-Forestière de Grignon (1826)
1831-6 – H.M.S. Beagle voyage
1838 – Entomology manual (Carl Burmeister)
1840 – Law of the minimum (Justus von Liebig)
1843 – Rothamsted Experimental Station (John Bennet Lawes)
1845 – Superphosphate (England)
1850 – Use of sulphur for powdery mildew (France)
1853 – Instituto Agrícola at Lisbon
1855 – Géograph. Botan. raisonnée (Alp. de Candolle)

Table 2. Chronology of events in science, technology and education related to Ecology, Agronomy and Agriculture in the nineteenth century. Information from [2-3]. (cont.)
1859 – The Origin of Species (Charles Darwin)
1860 – Chain mechanization in the Chicago slaughterhouse (USA)
1862 – Land-Grant Universities (USA)
1865 – Phylloxera in France, Portugal, Spain and Italy
1866 – Hybridization experiments in plants (Gregor Mendel); Ernst Haeckel coins “oecologia”
1870 – Grain-binder (USA)
1871 – Encicl. Agraria Italiana (Gaetano Cantoni)
1879 – Thomas phosphate
1886 – Soil classification (Vassilii Dokouchaev)
1888 – Rhizobium (Martinus Willem Beijerinck)
1890 – First herbicides (France)
1911 – Instituto Superior de Agronomia (Lisboa)

Agronomy developed as an integrative science, incorporating and interrelating knowledge from the scientific disciplines that dealt with the biology of plants and animals of interest to agricultural production and with the characterization and understanding of the physical environment, while agricultural production was accelerating and requiring further technological developments. The need for higher education in these interrelated areas fostered the appearance of specialized schools almost everywhere in Europe and North America. All these events occurred in little more than one hundred years and their combination gave rise to systematic knowledge that carries the fingerprints of science, technology and education, with the characteristics of engineering: solving problems raised by a fast-growing economic activity, Agriculture.

2. THE PERFORMANCE OF AGRICULTURE

During the 20th century, Agriculture showed an astonishing performance.
While the world population grew threefold, to 6 × 10^9 around the year 2000, the food output of Agriculture has been able to meet its food demands, which seemed impossible according to the most widespread forecasts [4].
In fact, agricultural production occupies only 38% of the available land area (Table 3). The two major food crops (wheat and rice) that occupy only 7.5% of the agricultural land produce more than enough energy and protein
for the present world population [1].

Table 3. World land use. Data from [5].


Land use            Area (Mha)   Agricultural area (Mha)     %
Arable land              1375
Permanent crops           128            4936               38%
Permanent pasture        3433
Forest                   4157
Other uses               3955
Sub-total               13048
Inland waters             339
Total                   13387

While the population keeps growing, the fraction of agricultural area has stabilized worldwide and decreased in the United States, in Europe and, particularly, in Portugal (Figure 1).

Figure 1. Evolution of the fraction of agricultural land in the World, USA, EU15 and
Portugal.

At the same time this performance was obtained with ever fewer people directly involved in farming operations (Figure 2). The percentage of the labor force in Agriculture is decreasing everywhere and apparently tends to be less than 5 per cent, which is the case both in the United States and in Europe.


Figure 2. Evolution of the active agricultural population (%) in the World, USA, EU15
and Portugal.

3. THE PERFORMANCE OF AGRICULTURE

This impressive performance was ensured by an increase in cultivated area that is still noticeable at a global scale, but has already ended in the Western world. This increase in area was obtained at a considerable cost in land clearing and is not exempt from environmental consequences.
It seems clear that an increase in area could not, by itself, explain the tremendously increased capacity of sustaining the world population.
Although in the developing countries the increase in production still comes from increases in cultivated area, in the developed countries most of the increase in production has come from increases in yield per unit area.
Figure 3 shows the historical trends in the yields of wheat and rice in the United Kingdom and Japan, respectively. They are compared with present yields in very diverse situations. While low wheat yields in Australia and in the USA are explainable by the severe environmental conditions in which this crop is grown, low rice yields in the Philippines or Indonesia reflect inadequate agronomic practice.
This increase in yield per unit area can be attributed both to different and improved plant crops and to better technology. It is noticeable that the steepest increase begins with Agronomy, as shown in Table 1.

[Figure 3 plot: grain yield vs. year, historical curves for wheat (United Kingdom) and rice (Japan), with individual country yields plotted for comparison]

Figure 3. Historical trends in the grain yield of rice in Japan and of wheat in England compared with 1968 yields of wheat and rice in several countries. Adapted from [7]. Yields updated to 1999 from [8].

From that time on, advances in several scientific disciplines were instrumental in the understanding and improvement of agricultural production.
The use, at first, of natural fertilizers (guano, phosphates), followed by the synthesis of nitrogen fertilizers, raised the attainable yield of most crops.
The knowledge of the mechanisms of Genetics and its application in Plant Breeding allowed the appearance of new varieties better adjusted to higher levels of fertilizers.
The losses due to weeds, pests and diseases were better controlled either by using pesticides or by obtaining resistant or more competitive crop varieties.
At the same time, the mechanization of crop operations reduced the harshness of farm work and increased labor productivity several-fold.
In Figure 4 an attempt is made to distribute the increase in the yield of corn among several possible causes, as a percentage of the total. The change in agronomic practices is not always positive, although the balance is a net improvement.

[Figure 4 data – per cent impact of each factor: introduction of improved cultivars (Plant Breeding and Genetics) +58; increased commercial fertilizer use (Chemistry) +47; increased control of pests and diseases (Phytopathology) +21; better determination of planting date (Climatology) +8; better spatial arrangement of plants (Physiology) +8; increased crop mechanization (Physics) +5; change in crop sequence (intensification) -7; new pests and diseases -8; erosion increase -8; other unidentified negative factors -23; reduction in organic matter and in manure -28]

Figure 4. Per cent impact of several factors of different origin on the doubling of the corn yield in Minnesota from 1930 to 1979. Adapted from [9].

The integrated contribution of the several scientific disciplines that was characteristic of the rise of Agronomy as a science is also illustrated in Figure 4. Yet, it has been gradually lost due to an increased specialization in research and teaching.
Liebig's law of the minimum provides an explanatory hypothesis for this fact.
Most of the scientific progress in the understanding of the limiting factors of agricultural production was made by identifying the most limiting factor in a given set of circumstances, isolating it, and investigating the means of improving its impact on production. These cycles of identification, isolation and research are often too long, and it is foreseeable that ecologists, physiologists, plant breeders or pathologists might reach different and sometimes opposing solutions.
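Liebig's law can be written as a simple "minimum" model; the toy sketch below, with invented numbers, shows why improving a factor that is not the most limiting one leaves the estimated yield unchanged.

```python
# Toy illustration of Liebig's law of the minimum: yield is set by the most limiting
# factor. The potential yield and the 0-1 sufficiency indices are invented numbers.
POTENTIAL_YIELD = 10.0   # t/ha, hypothetical

def liebig_yield(sufficiency):
    """sufficiency: dict mapping each factor to an index in [0, 1]."""
    return POTENTIAL_YIELD * min(sufficiency.values())

factors = {"nitrogen": 0.55, "water": 0.80, "pest control": 0.90}
print(liebig_yield(factors))        # 5.5 t/ha: nitrogen limits the yield
factors["water"] = 1.00             # improving a non-limiting factor...
print(liebig_yield(factors))        # ...still 5.5 t/ha
```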
Furthermore, this process of isolating a single factor leads to the isolation of the scientific discipline that tries to study it. The enormous amount of information in each discipline also pushes towards specialization and isolation. The result is that, frequently, the scientific disciplines that contribute to making Agronomy an integrative discipline are themselves on a centrifugal trajectory.
The success of Agriculture in providing more than enough food supply for the world population is one of the explanations of the present identity crisis of Agronomy as a science. Abundant supply and problems that seem solved lower the interest in the economic activity as well as in the scientific challenge.
At the same time, the reverse of the success is an increased public scrutiny of the negative environmental impacts of Agriculture, already foreseen in the case depicted in Figure 4.

4. THE SYSTEMS APPROACH

Bertalanffy's General Systems Theory [10] provides the approach that can return integration to Agronomy.
The setting where agriculture takes place is the “agricultural field”, which can be viewed as an ecosystem with two main singularities: most primary and secondary production is exported, and, in order to replace this flux of mass and energy, the ecosystem must be “subsidized” (Figure 5).
[Figure 5 diagram: energy flows of an agricultural ecosystem - solar radiation into crops and animals; harvest, conservation and processing of plant and animal products for export; and energy subsidies through fuel, fertilization, irrigation and pesticides]

Figure 5. Diagram of the energy flow of an agricultural ecosystem.

In order to be operational, the systems approach must consider four sub-systems of agriculture [11]:
(a) the biological sub-system – integrating knowledge about crops and
animals and about the effects on them of the physical environment and
of man’s activities.
(b) the work sub-system – integrating knowledge about the physical tasks
in agricultural operations

(c) the production economics sub-system – integrating knowledge about prices of products sold and factors of production bought, production plans and risk and uncertainty
(d) the socio-economic sub-system - integrating knowledge about
markets, legislation, research and education
It is easily recognisable that each sub-system affects and is affected by all the others (Figure 6).

[Figure 6 diagram: the biological sub-system (plants, animals, physical environment, crop practices), the production economics sub-system (produce price, factor prices, production plan, uncertainty and risk), the work sub-system (farm operations, work, machinery, energy) and the socio-economic sub-system (agricultural markets, land market, politics and laws, research and education)]

Figure 6. Relationships among Raeburn’s four sub-systems of agriculture.

The use of the systems approach will hopefully provide, as it has successfully done in Ecology, a framework in which it will be possible to foresee the impact of a single measure on all components of the system, as well as on the system as a whole, stimulating the co-operative effort of several scientific disciplines in approaching the problem.

5. APPLICATIONS

In our Department at the Instituto Superior de Agronomia the need for a systems approach was clearly felt. The possibilities opened by information technologies are enormous, and an example of their promise is the use of relational databases.
AGRIBASE is a data model incorporating crop simulation models, relational databases and geographical information systems, allowing an integrated environmental characterization and its matching with crop characteristics (Figures 7 and 8).
characteristics (Figures 7 and 8).

[Figure 7 diagram: Climate, Soils, Prices and Crop technology data feeding the AgriBase model]
Figure 7. Schematic representation of the AGRIBASE data model.

[Figure 8 diagram: relational tables of the agricultural activities data model (crops, technologies, phases, operations, machinery, tractors, labour, factors, prices, markets and regions), grouped as Crops, Technology, Phases, Operations, Cost components and Cost definition]

Figure 8. Relationships diagram of the agricultural activities data model

The matching of crop characteristics, including not only their ecological requirements but also the range of possible technologies that can be employed, with the environmental restraints of a given field, farm or region has been successfully used in agricultural planning. It is analogous to the process of filling a given ecological niche with the adequate species.
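This matching step can be thought of as a relational query; the sketch below uses an in-memory SQLite database with invented table and column names (it is not the AGRIBASE schema of Figure 8) purely to illustrate how crops whose ecological requirements fit a given field can be selected.

```python
import sqlite3

# Illustrative crop/field matching with invented tables and values; the real AGRIBASE
# model is far richer (technologies, operations, machinery, prices, markets, ...).
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE crops  (name TEXT, t_min REAL, t_max REAL, soil_texture TEXT);
CREATE TABLE fields (name TEXT, t_mean REAL, soil_texture TEXT);
INSERT INTO crops  VALUES ('wheat', 5, 25, 'loam'), ('rice', 20, 35, 'clay'),
                          ('maize', 10, 30, 'loam');
INSERT INTO fields VALUES ('Tapada-1', 17, 'loam');
""")

rows = db.execute("""
SELECT f.name AS field, c.name AS crop
FROM fields f JOIN crops c
  ON f.t_mean BETWEEN c.t_min AND c.t_max
 AND f.soil_texture = c.soil_texture
""").fetchall()
print(rows)   # (field, crop) pairs whose requirements match, e.g. wheat and maize
```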

6. CONCLUSIONS

The once intuitive and easy integration of different disciplines into Agronomy, which has been threatened by the centrifugal trajectory of the contributing areas of science, might be restored by resorting to some of the tools made available by the developments in information technologies, namely simulation models, database management and environmental monitoring.

REFERENCES
1. Loomis, RS, Connor, DJ. Crop Ecology, Cambridge, Cambridge University Press, 1992.
2. Maroto, JV. Historia de la Agronomia, Ediciones Mundi-Prensa, Madrid, 1998.
3. Dodson, S.I, Allen, TFH, Carpenter, SR, Ives, AR, Jeanne, RL, Kitchell, JF, Langston,
NE, Turner, MG. Ecology. Oxford: Oxford University Press, 1998.
4. Malthus, T. An Essay on the Principle of Population, Digital, 2004.
5. FAOSTAT. Agricultural data, http://apps.fao.org/page/collections?subset=agriculture,
2002.
6. FAOSTAT. Agricultural data, http://apps.fao.org/page/collections?subset=agriculture,
1998.
7. Evans LT (ed.) Crop physiology. Cambridge: Cambridge University Press, 1975.
8. FAOSTAT. Agricultural data, http://apps.fao.org/page/collections?subset=agriculture,
2000.
9. Stoskopf N. Cereal grain crops, Prentice Hall, 1985
10. Bertalanffy L. General Systems Theory. Foundations, development, applications. Revised edition. George Braziller, Inc., New York, NY, 1998.
11. Raeburn, JR. Agriculture: Foundations, principles and development, John Wiley & Sons,
1985.
TOWARDS A CLEAN ENERGY FOR THE
FUTURE – THE RESEARCH GROUP ON
ENERGY AND SUSTAINABLE DEVELOPMENT
OF IST

Maria da Graça Carvalho1 and Luis Manuel Alves2


1 Instituto Superior Técnico, Universidade Técnica de Lisboa, Avenida Rovisco Pais, 1049-001 Lisbon, Portugal, email: maria.carvalho@ist.utl.pt
2 Instituto de Engenharia Mecânica, Instituto Superior Técnico, Universidade Técnica de Lisboa, Pav. Mecânica I–2º, Avenida Rovisco Pais, 1049-001 Lisbon, Portugal, email: lalves@ist.utl.pt

Abstract: Energy is central to achieving sustainable social and economic development. As a matter of fact, it is not possible to achieve the economic growth needed to satisfy the current basic needs of Humanity without an intensive use of energy. On the other hand, one of the main challenges for the use of energy is the negative effect of greenhouse gases on the global climate. Europe needs a greater security of supply, and to achieve this goal it is necessary to promote competitive markets and efficient regulations. It is also necessary to diversify the energy mix, and renewable energy sources could contribute to this objective. This paper presents the work performed by the Research Group on Energy and Sustainable Development of the Instituto Superior Técnico of the Technical University of Lisbon on new and renewable energy sources for sustainable development.

Key words: Greenhouse gases, climate change, clean energy technologies, renewable
energy, hydrogen, rational use of energy, energy system sustainability.


1. INTRODUCTION

Energy is central to sustainable development goals, and is at the forefront of world leaders' concerns, since energy is the major contributor to climate change. The challenge lies in finding ways to reconcile this necessity and demand for energy with its impact on natural resources, in order to ensure that sustainable development goals are realised. It is not possible to achieve the economic growth needed to satisfy all the basic needs of Humanity without an intensive use of energy. As a consequence, the increasing prosperity and living standards of the global population are based on energy. In this perspective, sustainable development must be based on energy policies with a future vision while responding to present needs.
One of the main challenges in the use of energy is the reduction of the negative effects of greenhouse gas emissions, which cause global temperature increase and scientifically proven climate change. The need to reduce those emissions, especially CO2 emissions, is stated in the United Nations Framework Convention on Climate Change and regulated in its Kyoto Protocol, which assigns emission targets and greenhouse gas reductions to developed nations. In the case of Europe, achieving the Kyoto Protocol targets will require an integrated approach on both the supply (energy production) and demand (energy use) sides.
To ensure security of energy supply in free markets, Europe will need to
promote competitive markets and efficient regulations. It is also required to
diversify energy sources, mainly by using renewable energy sources,
decreasing dependence on fossil fuel imports. The use of alternative fuels
and renewable energies is an excellent opportunity to believe in a better
future for energy in Europe. However, those sources are still responsible for
a small fraction of the energy consumed in Europe. Thus, it is necessary to
pursue research, technology development and demonstration to increase the share of new and renewable energy sources in the European energy systems.
The present article describes the work performed by the Research Group
on Energy and Sustainable Development of the Mechanical Engineering
Department of the Instituto Superior Técnico of Lisbon in the areas of: Clean
Combustion; Renewable Energy; Hydrogen and Fuel Cells; CO2 and Climate
Change; Clean Energy Technologies and Energy Efficiency; and Sustainable
Energy Systems [1].

2. THE PROBLEM AND CHALLENGES

Keeping the current social and economic development path is greatly dependent on the intensive use of fossil energy – oil, coal and gas. The combustion of fossil fuels produces enormous amounts of CO2 and other gases, whose high concentration in the atmosphere is having negative effects on air quality and on climate change. Several scientific studies already prove the negative effects of the concentration of greenhouse gases in the atmosphere on the global climate and, thus, that climate change has an important anthropogenic origin.
However, we live in a world that has only begun to consume energy. During the next 50 years, as Earth's population expands from 6 billion toward 9 billion, humanity will consume more energy than the combined total used in all previous history. With carbon emissions now threatening the very stability of the biosphere, the security of our world requires a massive transformation to clean energy. The Kyoto Protocol calls for developed nations to make all efforts to reduce their greenhouse emissions to the levels of 1990. However, today India and China are gaining rapidly on Europe and America in per capita energy consumption. As a consequence, with the entry into force of the Protocol, discussions have begun to include the developing world in the global effort to reduce carbon emissions.
To reduce carbon emissions without compromising living standards, while giving the developing countries a way to also achieve social and economic development, societies need to adopt measures on both the supply and demand sides of the energy sector: using clean technologies to produce energy, increasing the penetration of renewable energy sources in the energy systems and adopting rational use of energy techniques.

3. THE RESEARCH GROUP ON ENERGY AND SUSTAINABLE DEVELOPMENT OF THE INSTITUTO SUPERIOR TÉCNICO

The Research Group on Energy and Sustainable Development (RGESD) is part of the Institute of Mechanical Engineering (IDMEC - IST), which is one of the Research Institutes of Instituto Superior Técnico (IST), the engineering school of the Technical University of Lisbon (UTL). Since 1985, RGESD has developed tools to simulate and to optimize the design, operation and control of full-scale energy equipment. Later, the Research Group redirected part of its research efforts to the development of tools and practices for energy management, the efficient use of energy, renewable energy technologies and their potential in islands and remote areas.

3.1 The research group structure

The RGESD is currently engaged in several research areas, namely:

a. Energy supply:
- Clean Combustion;
- Renewable Energies;
- Hydrogen and Fuel Cells.
b. Energy Efficiency:
- Clean Urban Transport;
- Energy in Urban Environment;
c. Sustainable Energy Development:
- Energy Policies and Energy Planning;
- Integrated Environmental Studies;
- Sustainable Energy Development;
- Sustainability, Kyoto Protocol;
- CO2 and Climate Change;
- Capacity Building in Energy and Environment in 3rd Countries;
- Promotion of Energy Technologies.

The Research Group has over 50 Researchers, Technicians and Administrative staff. The Group has been involved as a Partner or Coordinator in more than 100 European funded projects covering a wide variety of multidisciplinary subjects.
The Research Group on Energy and Sustainable Development is
currently involved in several European projects related to Clean
Combustion, Renewable Energies, Hydrogen and Fuel Cells, CO2 and
Climate Change, Clean Technologies and Energy Efficiency in Industry,
Clean Urban Transport, Energy Policies and Energy Planning, Sustainability
and Promotion of Energy Technologies [1].

3.2 Activities on clean combustion

The RGESD is part of a European consortium co-ordinated by the University of Ulster (UK) developing a Thematic Network for Clean Power Generation (POWERCLEAN), whose objectives are:

- To encourage collaboration, co-operation, and exchange between EC-supported research projects and researchers;
- To help maintain the technical and industrial content of future European
energy-related research and to contribute to identifying future research
priorities for clean power generation;

- To help communication between national and EC activities;


- To encourage the formation of new RTD partnerships between
stakeholders in the power sector including manufacturers, suppliers,
users and researchers;
- To improve dissemination of the results of EU Energy R&D
programmes;
- To support innovation in the European power generation and equipment
manufacturing sector.

The project aims at creating enhanced networking and co-operation among energy RTD projects addressing fuel conversion and generation issues supported in the European Commission Framework programmes in order to advance the objectives in a coherent manner focused on market/societal/environmental needs [2].

3.3 Activities on renewable energies

Research activity in the field of the integration of Renewable Energy Sources into energy systems has been performed mainly for islands and remote regions. Lately, the group has developed activities for urban environments.
The RGESD has co-ordinated a European project, EURO-ISLAS – New and Renewable Energy Sources for Islands and Remote Regions, which performed social, economic and technical studies of several technological scenarios to integrate renewable sources in the energy systems of the Madeira, Canaria and Crete islands.
One of these technical scenarios was modeled for the island of Porto Santo (Madeira) in a new project, RenewIslands, also co-ordinated by IST. The RenewIslands project – Renewable Energy Solutions for Islands – was aimed at the development of solutions and strategies integrating intermittent renewable energy supply, fuel cell and hydrogen infrastructure for:

- Greater penetration of intermittent renewable energy sources and innovative decentralised power systems in islands and other markets;
- Increasing the market penetration of new energy systems, combining fuel cells (FC), renewable energy sources (RES) and hydrogen (H2) in islands and remote regions.

The RenewIslands project developed a computer model (H2RES) aimed at analysing issues associated with intermittent RES penetration in islands and assessing the potential for hydrogen energy storage, at understanding and configuring integrated RES/H2/FC applications and modelling their technical, economic and environmental characteristics, at checking the technical and economic feasibility of a grid-connected integrated RES/H2/FC installation in Porto Santo Island, and at assessing the barriers and opportunities and developing and implementing strategies for integrated RES/H2/FC systems in islands and other markets [3].
The H2RES model is based on time series analysis that includes wind, precipitation and solar as renewable resources and considers reversible hydro, batteries and hydrogen as energy storage. It also allows for deferrable and hydrogen loads. The model was also applied to the case of other European and third-country islands - Corvo and Graciosa (Azores), Mljet (Croatia), and Sal and S. Vicente (Cape Verde). The model will be applied to the case of a bigger island, the Island of Hainan in China.
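The kind of time-series balance performed by H2RES can be illustrated in a few lines; the sketch below is a drastically simplified hourly balance with invented wind and load series and a single generic storage, and is not the H2RES model itself.

```python
# Drastically simplified hourly balance of the H2RES type: renewable output first covers
# the load, the surplus charges a generic storage (hydrogen, battery or reversible hydro),
# and any deficit is met from storage or from a backup unit. All numbers are invented.
wind = [120, 300, 420, 80, 20, 260]        # kWh produced in each hour
load = [200, 210, 220, 230, 240, 220]      # kWh demanded in each hour
store, store_cap, eta = 150.0, 500.0, 0.6  # state of charge, capacity, round-trip efficiency
backup = 0.0

for w, d in zip(wind, load):
    surplus = w - d
    if surplus >= 0:                       # charge the storage with the surplus
        store = min(store_cap, store + surplus * eta)
    else:                                  # discharge, then call the backup unit
        draw = min(store, -surplus)
        store -= draw
        backup += -surplus - draw

print(f"final storage: {store:.0f} kWh, backup energy used: {backup:.0f} kWh")
```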

3.4 Activities on hydrogen and fuel cells

The project Vaillant – System Development, Build, Field Installation and European Demonstration of a Virtual Fuel Cell Power Plant, consisting of residential micro-CHPs – is co-ordinated by the German enterprise Vaillant GmbH. The aim of the project is to develop, install, test and demonstrate a Virtual FC Power Plant in order to open the market to environmentally friendly, innovative fuel cell technology [4]. It is also the aim of the project to apply innovative fuel cell technology to solve the problems of transplantation of a “lab-scale technology” into an “every-day technology”. In
the scope of the Vaillant project, a fuel cell prototype was installed at the IST pool, generating electricity (4.5 kW) and heat from a natural gas reformer. The system worked for a period of 1,800 h without problems.
The IST, through the Research Group on Energy and Sustainable Development, co-ordinated the European project “Hy-Society – The European Hydrogen (Based) Society”, whose main objective was to support the introduction of a safe and dependable hydrogen-based society in Europe through the creation of an enabling environment, by:

- Identifying non-technical barriers and proposing policies and measures to remove or reduce existing barriers;
- Quantifying the technological, social, economic and environmental impacts of the introduction of hydrogen in European society;
- Providing European decision and policy makers with a plan of action for
the introduction of hydrogen in European society;
- Fostering the broad public awareness and debate as to the opportunities
and challenges of the hydrogen society, stimulating the dialogue with all
interest groups.

The Hy-Society project developed a complete Action Plan for the introduction of hydrogen in Europe as well as a “First Stage” European Hydrogen Roadmap [5].
Based on the work performed by the Hy-Society project, IST is participating in a new European Union action, the HyWays project, whose major aim is to develop the European Hydrogen Energy Roadmap. The objectives of this project are not only to assess policy strategies for the transition towards a hydrogen-based society and to define targets and milestones, but also to establish a European Hydrogen Consensus. To achieve these objectives HyWays is implementing activities on three different layers, meeting the scientific and technical, strategic as well as political concerns of a possible transition towards a hydrogen economy.
At the national level, the RGESD is participating in a project funded by the Portuguese Agency for Innovation (Ministry of Economy and Innovation), the “EDEN – Endogenizar o Desenvolvimento de Energias Novas (Endogenous Development of New Energies)”. The overall objective of the Project is to create a national technological platform for hydrogen capable of mobilising an active intervention and participation of the community - scientists, manufacturers, utilities, end-users, politicians - in the transformation process of today's energy paradigm, promoting hydrogen as a key player.
More specifically, the goals of the project are:

- Integrate the technological bases associated with FC in distributed energy production;
- Create national expertise in the conception, design and engineering of energy production systems based on FC;
- Develop models to integrate hydrogen and renewable energy sources;
- Consolidate the development of technological and scientific expertise in FC-based systems, in order to promote national participation in European and international projects;
- Develop scientific and technological expertise in support technologies related to FC and assess the associated opportunities for the private sector;
- Study and assess the logistic issues related to the large-scale introduction of H2 in the energy chain and establish a national action plan to facilitate the changing process;
- Study and assess the technological, economic and social impacts related to the introduction of H2 in the national energy system;
- Promote dissemination and training in the area of FC and associated support technologies, and applications.

3.5 Activities on CO2 and climate change

The current work on CO2 and Climate Change is mainly developed in the frame of the project “Delphi CO2 Removal – Future perspective of CO2 removal at power plant”. The Delphi project is co-ordinated by the University of Essen (Germany) and its main objective is the assessment of the future prospects of CO2 removal in fossil-fuel-fired power plants in Europe. The Delphi project will establish the state of the art of CO2 removal, the expected impact on CO2 levels as a result of CO2 removal from power plants, the obstacles that may delay the application of CO2 removal technologies, and further research requirements.

3.6 Activities on clean technologies and energy efficiency

The project “Green Hotel – Integrating Self Supply into End Use for
Sustainable Tourism”, co-ordinated by AREAM – Regional Agency for
Energy and Environment of Madeira, has the participation of the RGESD in
a large European consortium integrating 12 institutions from eight countries
[6].
The action aims at developing and introducing an integrated, efficient system for energy and water production, as well as waste management, in a new hotel resort and marina under construction in Madeira Island. The action is intended to provide solid technological solutions to increase the use of renewable energy technologies from a diversity of energy supplies and the public awareness of cost-effective and environmentally friendly technologies, and to contribute to the social and economic development of insular communities, through the use of innovative technologies aimed at demonstrating the economic and social benefits of sustainable local energy sources.
The Green Hotel project will boost the use of new and innovative technologies in insular regions through a permanent exhibition, open to the people of the local communities, explaining the technologies used and their advantages. Building construction and refurbishment will use sustainable construction solutions: low heat transfer coefficients and low solar factors, preference for endogenous materials, energy-saving procedures, water-saving equipment and special procedures for water usage, and a local wastewater treatment system. A fuel cell will be used in the marina services to produce heat and as an emergency
electricity generator. The whole technological system will be based on the most economic running costs. The project will include the study, and the project costs presented are based on those that represent the lowest investment, to demonstrate the possibility of making decentralised generation from renewable energy sources the main energy source to satisfy the needs of a community. Furthermore, the action will create an appropriate display of several innovative technological solutions for insular community problems, addressed to local residents and visiting tourists. Those visitors will be the main disseminators of the action's success, along with the promotion of campaigns and well-targeted publications creating new market opportunities.
The major result of the action will be a modern hotel resort, including a marina and related services, incorporating an innovative integrated energy system with appropriate technological solutions for new and RES integration, the use of energy-efficient building materials and methods, efficient water desalination and treatment, and solid waste and landscape treatment processes. The complex is intended to demonstrate to local communities and tourist visitors an innovative and well-balanced technological, social and economic solution for the integration of RES in insular communities. Reports on approaches for share-holding by time-sharing owners and on the impact of the action on green tourism implementation and on job and wealth creation in Madeira and other EU islands will be produced.

3.7 Activities on clean urban transport

The work performed by the Research Group in the area of energy use in transportation is carried out mainly within the project “CUTE – Clean Urban Transport for Europe”, a project co-ordinated by the German company EvoBus. The aim of the project is to develop and demonstrate an emission-free and low-noise transport system, including the accompanying energy infrastructure, which has great potential for reducing the global greenhouse effect according to the Kyoto Protocol, improving the quality of the atmosphere and of life in densely populated areas and conserving fossil resources. For this purpose the application of innovative hydrogen-based fuel cell technology is to be established by using fuel cell powered buses in an urban environment, together with novel hydrogen production and support systems, as part of a European Union-wide demonstration scheme [4]. The project will serve to strengthen the competitiveness of European industry in the strategically important areas of hydrogen processing, fuel cell and mobility technology. In tandem with this, the project also demonstrates to European society the closeness of such innovative technology to its everyday concerns, like improved employment, human health, environmental protection and quality of life. The major objectives are as follows:

- Demonstration of 27 fuel cell powered regular service buses over a period of two years in 9 European inner city areas to illustrate the different operating conditions to be found in Europe;
- Design, construction and operation of the necessary infrastructure for
hydrogen production, including the required refueling stations;
- Collection of findings concerning the construction and operating
behavior of hydrogen production for mobile and static use, and exchange
of experiences including bus operation under differing conditions among
the numerous participating companies;
- Ecological, technical and economical analysis of the entire life cycle and
comparison with conventional alternatives. Accompanying social study
to analyse and increase awareness of these new technologies;
- Quantification of the abatement of CO2 at European level and
contribution to commitments of Kyoto Protocol;
- Registration of the socio-economic effects such as the impact on
employment and relations between individual industrial sectors as a
result of the changes to energy and transport systems.

The major tasks of IST in the CUTE project were a well-to-wheel
analysis and a comparison of different fuel alternatives – hydrogen, diesel and
natural gas – in terms of energy consumption, pollutant emissions (NOx, CO, HC,
particulates) and greenhouse gas emissions.

3.8 Activities on energy policies and planning, and sustainability

The group implemented a technical and economic feasibility study for
the introduction of Liquefied Natural Gas (LNG) in islands, with a case study for the
Island of Madeira. The main objectives of the study were to replace the use
of oil products with natural gas for energy supply to islands in Europe, to
increase the efficiency of poly-generation systems using LNG, to develop
new technology for direct use of cold from LNG, and to develop and
demonstrate maritime small-scale distribution of LNG in Europe [7].
The Research Group on Energy and Sustainable Development has
supported several African Countries in the field of Energy Policies and
Planning towards achieving energy sustainability. The Group elaborated the
National Energy Plan of Cape Verde and supported the elaboration of legal
framework and energy policies in Cape Verde, Angola and Mozambique [8].
In the frame of an EU project, IST co-ordinates a European and African
consortium to implement the project “CDM for Sustainable Africa: Capacity
Building for CDM in Sub-Saharan African Countries”, with capacity building
activities in Niger, Botswana, Mozambique, Zambia, Tanzania and South
Africa, resulting in the identification and technical and economic feasibility
studies of CDM projects in these countries.
Currently, the Research Group is co-ordinating a project financed by the
European Commission COOPENER programme, the “IE4Sahel –Energy for
Poverty Alleviation in Sahel”, targeting the Sahelian countries – Mauritania,
Cape Verde, Senegal, The Gambia, Guinea-Bissau, Mali, Burkina Faso,
Niger and Chad. The action will contribute to strengthening capacity in the
Sahel Region, in the areas of training and networking for regional, national
and local energy policy makers, regulators, and planners, by strengthening
and reorienting an existing regional training and information centre, the
ARC - AGRHYMET Regional Centre, located in Niamey, the capital of
Niger. The action addresses the most important issues related to poverty
alleviation, energy and energy planning. The existence of a Centre of
Excellence will have a long-lasting effect in the region, helping to develop
important tools for critical support to energy policy and regulation.
In the field of sustainability, the Research Group is collaborating in a
project for the Sustainable Development of Croatian capacity in CHP
(Combined Heat and Power) Sector [9]. The objectives of the LIFECROCH
project are:

- Increasing the sustainability of Croatian development;


- Capacity building of Croatian institutions, raising the awareness of
policy and decision-makers on the challenges and opportunities of the
Kyoto Protocol;
- Improving co-operation between Croatia and EU countries;
- Reuse of results in the economies in transition of Central and Eastern
Europe that have CHP capacities;
- Assessing the needs and priorities of the target region and country in the
restructuring of the CHP sector;
- Contributing to environmental improvements of local air and water
quality and reduction of GHG emissions, while avoiding increased use
and potential imports of fossil fuels;
- Enhancing local expertise.

4. CONCLUSIONS

To maintain the current living standards of developed countries and to allow
developing countries to achieve sustainable social and economic
development, enormous amounts of energy are required. However, dealing
with energy in the future will also require an integrated approach on the supply
side as well as on the demand side. Only a systemic and integrated effort
in the energy sector at all levels will allow the Kyoto Protocol objectives
to be achieved. Thus, a new way of thinking about energy is necessary.
Europe needs to obtain a greater security of supply and, as a
consequence, it will be necessary to diversify energy sources. In this matter,
new and renewable energy sources will have an important role to play. Also,
the rational use of energy is an important issue for reducing greenhouse gas
emissions and should therefore be taken into account while developing a new
vision for energy production and use in the future.
The European energy sector has an enormous potential in the implementation
of clean technologies in third countries and will make a crucial contribution
to the economic growth of Europe.

ACKNOWLEDGEMENTS

The authors would like to thank the European Commission, and the
Portuguese Ministries for Science, Technology and Higher Education and
for Economy and Innovation, for supporting the projects on which
this paper is based.

REFERENCES
1. Carvalho, MG. “Energia: Uma Visão para o Futuro", Conference A Investigação
Científica na Universidade Técnica de Lisboa , Instituto Superior Técnico, Lisboa, 3
February 2006.
2. Zsigraiová, Z., Tavares, Semião, V, and Carvalho, MG. – “Municipal Solid Waste
Incineration – Problems and Future Perspectives”, Seventh International Conference on
Energy for a Clean Environment, Lisbon, Portugal, 7-10 July 2003.
3. Duic, N, Lerer, M and Carvalho, MG – “Increasing the Supply of Renewable Energy
Sources in Island Energy Systems”. International Journal of Sustainable Energy, Vol.
23, No. 4, December 2003, pp. 177-186.
4. Chen, F, Fernandes, TRC, Yetano Roche, M and Carvalho, MG – “Investigation of
Challenges to the Utilization of Fuel Cell Buses in the EU vs Transition Economies”.
Renewable and Sustainable Energy Reviews (about to be published).
5. Fernandes, TRC, Yetano Roche, M, Hugh, M, Duic, N, Gonçalves, G and Carvalho, MG
– “A Discussion on the Potential for the Development of Hydrogen as an Energy Carrier
in Portugal”. Eighth International Conference on Energy for a Clean Environment –
CLEAN AIR 2005, Lisbon, Portugal, 27-30 June 2005.
6. Melim Mendes, J.M, Carvalho, MG, Duic, N and Alves, L – “Green Hotel”. 2nd
Dubrovnik Conference on Sustainable Development of Energy, Water and Environment
Systems, Dubrovnik, Croatia, 15-20 June 2003.
7. Alves, LM, Domingues, A and Carvalho, MG – “Small Scale LNG in Madeira Island”.
1st International Conference on Small Scale LNG in Europe, Oslo, Norway, 29-30
September 2005.
8. Costa, A, Monteiro, C, Miranda, V, Alves, LM and Carvalho, MG – “Assessment of
Renewable Energy using a Geographical Information System”. Presented at the Euro
Conference on New and Renewable Technologies for Sustainable Development”,
Funchal, Madeira, Portugal, 26-29 June 2000. New and Renewable Technologies for
Sustainable Development, (Ed. Naim Hamdia Afgan, Maria da Graça Carvalho), Kluwer
Academic Publishers, 2002, pp. 211-220.
9. Afgan, NH and Carvalho, MG. “Sustainable Assessment Method for Energy Systems -
Indicators, Criteria and Decision Making Procedure”, Kluwer Academic Publishers,
USA (180 pp.), 2000.
PART VI

NATURE, ENVIRONMENT AND SUSTAINABILITY
INDUSTRIAL ECOLOGY:
A STEP TOWARDS SUSTAINABLE
DEVELOPMENT

Paulo Manuel Cadete Ferrão


Centre for Innovation, Technology and Policy Research, Instituto Superior Técnico,
Universidade Técnica de Lisboa, Av. Rovisco Pais, 1049-001, Lisboa, Portugal,
e-mail: ferrao@ist.utl.pt

Abstract: Industrial ecology is a broad framework for thinking and acting in the realm of
sustainability. The name suggests, metaphorically, the blending of ecological
systems and industrial economies. The ecological side offers possibilities to
learn from observing resilient, robust, long-lived ecological communities as
examples of sustainable systems. The industrial side suggests that society can
move towards sustainable economies by embedding the principles learned
from ecological systems to the design of firms and larger social institutions.
Industrial Ecology promotes a holistic view of engineering systems where the
system under analysis must be viewed in a global context. This framework is
quite challenging and requires the development of a set of tools to bridge
different scales, from site or product specific analysis to the whole economy
diagnostic and from the economic to the socio-environmental dimension, thus
resulting in a multi-disciplinary set of analytical tools. Providing an adequate
framework for this “Industrial Ecology Toolbox” and putting it at the service
of the promotion of sustainable development is the major objective of the
R&D reported in this paper. R&D at IST-UTL on the development and
application of different tools aimed at providing a coherent framework for this
“Industrial ecology toolbox” is reviewed and its contribution to the promotion of
sustainable development policies and practices in the socioeconomic arena is
demonstrated with specific case-studies. The tools analyzed range from macro-
economic techniques to specific environmental analysis tools, and it was
shown how other tools could be developed and used to promote the interaction
between economic and environmental analysis within macro and micro-scales,
thus enabling the design of more sustainable systems of different complexity
levels.

Key words: Industrial ecology, MFA, LCA, ecodesign, sustainable development.


1. INTRODUCTION

The physical nature of the economy is emerging as a new paradigm,


based on increasing public recognition of environment-economy
interconnections. In this context, modern economies can be seen as living
organisms that ingest raw materials, which are metabolized into products and
services, but also into waste, in the form of materials/products without use and
other emissions. This metaphor allows economic and political decision
makers, industries, banks, insurance companies, and others, to change the
way they look at the environment, namely to see beyond end-of-pipe
approaches, adopting a broader perspective that considers the full product
life-cycle and systems interaction.
In fact, it was only recently, i.e., during the last decade of the 20th
century, that environmental strategies were increasingly integrated in
business strategies [1]. Companies have moved from initially treating
environmental management as a case-by-case problem-solving issue, to a
regulatory compliance issue, and, more recently, to an issue for proactive
management at the centre of business strategy. In the past decades, as
discussed in [2], firms have begun to introduce environmentally innovative
products and services to the market place. Examples include recyclable
goods of all kinds, novel leasing systems for durable products, energy-
efficient computers, and many more items.
However, for example, to turn a recyclable good into an effectively
recycled material, adequate infrastructures must be available, and a proper
policy framework should be designed for these activities to be economically
sustainable in a market driven economy. This framework is considered to be
the next step in the evolution of the historical pattern of environmental
strategies, the new “Industrial Ecology stage”.
The evolution required to step up to this new “Industrial Ecology stage”
can be classified at three levels: the need for an appropriate
“Industrial Ecology methodologies toolbox”; the establishment of a
structured set of indicators to support sustainable policies and priority setting
at a regional level; and, finally, the development of a new organization of
infrastructures, technologies, sectors and firms to promote co-operation
between the various actors involved within an Industrial Ecology
framework.
This paper is intended to discuss the contribution of the research and
development at the Instituto Superior Técnico, UTL, on these emerging
fields and, in particular, its contribution to the establishment of an “Industrial
Ecology Toolbox”.

2. INDUSTRIAL ECOLOGY

Industrial ecology is a broad framework for thinking and acting in the


realm of sustainability [3]. The name suggests, metaphorically, the blending
of ecological systems and industrial economies. The ecological side offers
possibilities to learn from observing resilient, robust, long-lived ecological
communities as examples of sustainable systems. The industrial side
suggests that society can move towards sustainable economies by embedding
the principles learned from ecological systems to the design of firms and
larger social institutions. For many, industrial ecology is paradigmatic in that
it provides a new vocabulary for talking about and understanding
sustainability.
Another broad set of arguments that claim connections to the roots of
unsustainability can be found in the literature of industrial ecology and
ecological economics. Both fields argue that the present condition of the
world is created, at least in part, from a reductionist set of assumptions that
lead to a failure to address the interconnectedness of human activities with
the natural world [2]. Existing belief structures are generally unaware of
features of model ecosystems that lead to systems properties such as
robustness and resiliency corresponding to similar properties that arguably
would be found in sustainable human systems.
These systems have evolved to steady states in which none of the
processes produce outputs that are toxic to the system as a whole (they may
be toxic to other components in a self-defensive sense). They are
characterized by strong loop-closing pathways for material flows compared
with the largely once-through nature of modern, industrial economies.
Energy in Nature is utilized to an extent that comes closer to the ultimate
thermodynamic availability than in human systems. Tools can be constructed
that embody these ideas in forms of design guides and rules of thumb for
designers of product systems, focusing on several dimensions of
sustainability, as follows:

- Reduction of environmental impacts, such as global warming, toxic
pollution, ozone depletion and others.
- Creation of networks and connections:
  - creation of material loop closing, including disassembly and
    recycling, for example;
  - improvement of energy utilization effectiveness.
- Dematerialization, including product life extension, lightweighting,
and the promotion of function-based rather than material-based products
(knowledge provision, for example).

Industrial Ecology does, as a consequence, promote a holistic view of


engineering systems (here considered as anthropogenic systems), where the
system under analysis must be viewed in a global context (“the system as a
whole”), i.e. a system of systems which interact within the economy
boundaries and with Nature. This framework is quite challenging and
requires the development of a set of tools to bridge different scales, from site
or product specific analysis to the whole economy and from the economic to
the socio-environmental dimension, thus resulting in a multi-disciplinary set
of analytical tools.
Providing an adequate framework for this “Industrial Ecology Toolbox”
and putting it at the service of the promotion of sustainable development
policies is the major objective of the R&D reported in this paper, and this is
basically motivated by the inadequacy or incompleteness of monetary based
tools to trace the relationship between the human economy and its habitat.

3. AN INDUSTRIAL ECOLOGY TOOLBOX: A MULTISCALE BRIDGE BETWEEN ENVIRONMENT AND ECONOMICS

The need for an Industrial Ecology Toolbox is based on the assumption
that the characterization of the physical nature of the human economy is vital for
understanding the sources and full nature of impacts of society upon the
natural environment. It is similarly assumed that effective strategies toward
sustainable development will rely on the systematic collection of physical
measures of material and energy flows.
This led to the development of Material Flow Analysis (MFA) models,
which are intended to quantify the relation between the economy and the
environment, where the economy is seen as a subsystem inserted in the
environment and depending on a continuous throughput of materials and
energy – much like a living being. Raw materials, water, and air are extracted
from the natural system and partially transformed into products and services
and partially sent back to nature, as wastes and emissions.
According to the principle of conservation of mass, the
total of inputs must equal the total of outputs plus the net accumulation of
materials in the system. Therefore, any material added to economic
activity that does not end up in useful products or services will inevitably be
returned to Nature and will cause some type of perturbation to the natural
system. Thus, the starting point to avoid environmental burdens is to monitor
the materials we provide to the economic metabolism and to follow them in
order to assess material efficiency of economic processes. This is the
ultimate goal of MFA tools.

There are a number of MFA tools available [4,5], which vary in terms of
the flow domains covered, their spatial or institutional extent, and the degree
of detail with which they are compartmentalized and quantified within the
economy (mass, volume, spatial or thermodynamic measures).
Life Cycle Assessment (LCA) constitutes another environmentally
oriented tool, [6], which is used with the purpose of accounting for
environmental impacts of products and services along their complete life
cycle, i.e. from cradle to grave. Considering that each industry is dependent,
directly or indirectly, on a great set of other industries, this approach is
expensive and time-consuming because resource input and environmental
discharge data have to be estimated for each of the modeled processes for the
life cycle of a product or service. In addition, the method induces arbitrary
boundary analysis decisions and, consequently, the lack of completeness
may promote truncation errors.
On the other hand, in the domain of economic tools dedicated to
explaining the metabolism of an economy through the quantification of
monetary flows between different economic sectors, Economic Input-Output
tables have been established worldwide for decades [7].
This macroeconomic approach, which characterizes the inter-industry
effects of products/processes for a diverse set of commodities, can be
extended by making use of environmental information associated with the
emissions and other environmental burdens of each economic sector per unit
value produced. This approach is known as Economic Input-Output Life
Cycle Assessment (EIO-LCA) and allows for accounting for all the direct and
indirect inputs to producing a product or service by using the input-output
matrices of a national economy [8].
The use of input-output models may be advantageous since it takes into
account the entire supply chain for a product (including indirect suppliers),
allowing for tracing the full range of inputs to a process, and consequently
providing a complete system boundary. However, EIO-LCA has several
limitations in evaluating the economy metabolism, such as:

- EIO tables and their environmental extensions are integrated
environment-economy accounts but involve enormous data
requirements. Consequently, there are long time lags in data
collection and table preparation, and related problems in discerning
key patterns and trends in the vast array of generated information.
- The aggregated structure of the economic input-output tables
available does not allow for characterizing specific processes or
products, but rather economic sectors with different levels of detail,
from country to country.

This analysis shows that there are tools available to characterize the
economy metabolism that range from specific environmental analysis (LCA)
to macro-economic analysis (EIO-LCA), but it is clear that for intermediate
levels of analysis, relevant for sustainable product or systems design, we
need alternative tools that bridge economics and environment with
multiscale and multidisciplinary skills.
A new Industrial Ecology framework is thus required and Figure 1
represents the framework that has guided the R&D work developed at IST-
UTL. Figure 1 illustrates the bridging role of the Hybrid EIO-LCA
techniques that include a set of algebraic manipulations that enable the
incorporation of any required details on particular processes inherent to the
life cycle of the product to be modeled, thus helping to overcome the
limitation associated with the poor disaggregation of the available economic IO
tables, and offering an interesting compromise and integration between
macro-economic data and detailed process information, thus resulting in a
cost-effective strategy to promote product life cycle assessments.
[Figure 1 places the tools along two axes – spatial scale (micro to macro) and domain (environment to economy): LCA and MFA on the environmental side, EIO and EIO-LCA on the economic side, and Ecodesign, LCAA and Hybrid EIO-LCA bridging the two.]

Figure 1. The IST-UTL Industrial Ecology toolbox framework

The tools analyzed so far are basically descriptive, providing adequate


sources of data to monitor and characterize systems behavior, contributing
to a better understanding of the economy’s metabolism at different
scales and to improving its performance towards a sustainable development
paradigm. However, they are not directly intended to design more
sustainable systems, and this was the motivation for the R&D in the two

remaining tools represented in Figure 1, Ecodesign and Life Cycle Activity


Analysis (LCAA).
Most design tools now used by designers focus on the way a product is to
be used and then only on those aspects that were of concern to the user.
Ecodesign tools are those that go beyond this stage and, based on LCA
information, can inform them of the totality of environmental impacts
created by a product or service over its whole life cycle. If they now wish to
innovate in these aspects of a product or service, they can obtain information
to guide their processes.
Life Cycle Activity Analysis (LCAA) ties mathematical programming
formulations of activity analysis to their environmental impacts. LCAA is
based on the integration of Activity Analysis, a well-known procedure in
economics, solving for optimal levels of production and for the optimal
allocation of resources, with environmental Life Cycle Assessment, which
aims to quantify the environmental impacts of a product or a service from
“cradle” to “grave”. LCAA integrates engineering, environmental and
economical sciences, including operations research, as it solves for optimal
solutions of multi-variable complex systems, and can thus be interpreted as a
new sustainable systems design tool, featuring an economically optimized
system under environmental constraints.

4. INDUSTRIAL ECOLOGY TOOLBOX: R&D AT IST-UTL

4.1 MFA: Material Flow Analysis

MFA is acknowledged as a set, or family, of models that focus upon a


given geographical area, characterized by the systematic physical
measurement of the magnitude and “location” of the mass of specific flows
of environmentally significant materials for purposes of environmental
monitoring, analysis and management [9]. It examines national or regional,
economy-wide driving forces behind induced flows that incorporate most
major materials metabolized in the economy, though at various levels of
disaggregation – in terms of identifying flows between economic system
components (e.g. sectors) and between these components and the natural
environment.
MFA is a tool with both capability of taking a snapshot of the economy
metabolism as well as of its dynamics, as evidenced in this section through
the analysis of the Portuguese economy.

One of the most-used methodologies of MFA at the national level,


adopted by the Eurostat, [10], the statistical office of the European Union,
considers each country’s economic system included in the domestic
environmental system that, in turn, is included in the global environmental
system. From the material input side, a main indicator can be derived, the
Direct Material Input (DMI): the total amount of materials extracted from
the domestic environment or imported, as raw materials, intermediary, or
final products, that enter the economic system. The main output indicator is
the Domestic Processed Output (DPO) – the sum of the “outputs to nature.”
This includes emissions to air and water, wastes deposited in landfills, and
dissipative flows. Although material flow indicators are evaluated
aggregating materials by mass, water and air are generally excluded because
these flows are orders of magnitude higher than others and their use could
hide the weight of other materials [10].
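As a toy illustration of how these indicators fit together, the following Python sketch computes DMI, DPO and the net addition to stock from a simple material balance; all figures are hypothetical and are not Eurostat data for any country.

# Minimal material-flow accounting sketch (illustrative figures only,
# not actual Eurostat data for any country). Units: Mt/year.
domestic_extraction = {"biomass": 30.0, "minerals": 90.0, "fossil_fuels": 0.5}
imports             = {"biomass": 5.0,  "minerals": 10.0, "fossil_fuels": 24.0}
exports = 12.0   # goods leaving the economy
dpo     = 60.0   # Domestic Processed Output: emissions, landfilled waste, dissipative flows

# Direct Material Input: everything extracted domestically or imported
# (water and air are excluded by convention).
dmi = sum(domestic_extraction.values()) + sum(imports.values())

# Mass balance: inputs = outputs + net accumulation in the economy.
net_addition_to_stock = dmi - exports - dpo

for label, value in [("DMI", dmi), ("Exports", exports), ("DPO", dpo),
                     ("Net addition to stock", net_addition_to_stock)]:
    print(f"{label:>22}: {value:6.1f} Mt  ({value / dmi:5.1%} of DMI)")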
A research work [11] characterized the evolution of the metabolism of a
large number of economies during the last decades. DMI per capita was
tested econometrically as a function of the Gross Domestic Product (GDP)
per capita, using panel data ranging from 1960 to 1998 for 16 industrialized
countries and the results obtained suggest that as industrialized economies
grow, the intensity of material consumption first increases, but eventually
starts exhibiting a decreasing trend after a certain income threshold is
reached, as represented in Figure 2.

Figure 2. Predictions of the relation between the DMI per capita and the GDP per capita
(adapted from [11]).

The period of increasing material consumption was associated with
transitional economies. An empirical analysis of the development of the

Portuguese economy was conducted in [9], showing that during the last two
decades of the 20th century Portugal has crossed this development stage.
The behavior of the Portuguese economy during this period was
characteristic of a transitional economy, and this is
particularly relevant at a moment when 10 new member states have joined
the EU. Here, the analysis of the Portuguese development may provide an
interesting case study to be considered for policymaking in sustainability
domains.
The analysis described in [9] showed that the increase in the Portuguese direct
material input during the transitional phase of the economy’s metabolism was
mainly due to the extraction of industrial rock (e.g. sand, clay, limestone)
and biomass, and to the increase in fuel imports.
This suggests that the development model for this transitional period
corresponds to an intensive growth of materials consumption associated with
the strengthening of critical infrastructures, such as roads and housing.
On the other hand, the ability of MFA to catch a snapshot of the economy
metabolism is verified by analyzing the Portuguese economy’s metabolism
in the year 2000, as presented in [9]. The material balance is presented as a
matrix, where the input materials are disaggregated in different components
of the domestic and imported DMI, and then aggregated in terms of biomass,
minerals and others, and fossil fuels. The destinations of these materials
were then identified as addition to stock, exports of goods, or domestic
processed output, which corresponds to environmental burdens. As far as
fossil fuels are concerned, for the specific purpose of producing the mass
balance diagram represented in Figure 3, inputs correspond to the mass of
fossil fuels input in the economy and outputs correspond to the mass of
carbon and hydrogen contained in the gaseous emissions and ashes, which is
to say that oxygen used in the combustion processes is not represented either
in the input or in the output, in order to keep the mass balance correct.
The global balance of the use given to the material flows that entered
the Portuguese economy during the year 2000 resulted in an accumulation of
48% of these materials in the economy, adding to stock. A share of 9% was
exported as goods; 43% resulted in outputs to nature and constitute potential
environmental burdens. In short, results seem to confirm that materials were
largely accumulated in the socioeconomic system, or returned to Nature.
The results presented in Figure 3 show that the inputs in the form of
biomass were processed by the economy and largely transformed into waste,
and this process is characterized by a short time scale, as this transformation
occurs in a period of months. A fraction of this waste was landfilled,
becoming biodegradable urban waste, and most of the rest became
wastewater organic matter. A residual fraction was also used as fuel. Only

about 23% of the biomass was either added to the stock, for example wood
for construction, or exported, for example as pulp and paper.
Fossil fuels are also characterized by short stocking times and are rapidly
consumed, resulting in gaseous emissions and ashes in the power plants and
other burning equipment.
On the other hand, 76% of the minerals extracted domestically or
imported remained in the economy and constituted stocks, while 9% were
transformed into products that were exported, and 15% were transformed into
different types of wastes that had some potential to be recycled. However, it
is estimated [9], that only around 18% of that mineral DPO is recycled.
In order to evaluate the potential of recycling as a strategy to increase
resource use efficiency, DPO was disaggregated in its different forms,
namely biomass wastes, emissions to air, construction and demolition wastes
and ashes from combustion processes, and municipal solid waste (excluding
biomass).

Figure 3. Schematic representation of the Portuguese metabolism, base year 2000 [9].

The analysis has shown that Portugal still had a great potential to reuse
and recycle materials, particularly mineral and biomass wastes, which
represent 56 million tons/year [9].
This potential and the need to reduce the environmental burden have led
to the development of different national strategic plans and specific
legislation. It is upon two great groups of wastes – the urban solid wastes
and the industrial wastes – that the majority of strategic plans have been
defined, but still with modest results, due mainly to the young age of
the plans and the need to create supporting structures (physical and legislative).
The instruments and infrastructures being developed suggest that the
recycling market may achieve considerable importance in Portugal during
the next years, and here MFA tools may be relevant to improve materials
management.
It can thus be concluded that the exercise of materials accounting and
balancing allows, from a macroeconomic perspective, the identification of development
models and, above all, by picturing the flows, provides a special
contribution to the design and implementation of strategies for the prevention and
optimization of resources use. Recycling, together with reduction at source,
an objective always present in the strategy oriented to a “preventive
approach,” along with consumption pattern changes are vital to avoid
considerable amounts of virgin materials extraction from the environment. In
short, materials balances are confirmed to be an essential tool for supporting
sustainability policies.

4.2 LCA: Life Cycle Assessment

Life cycle assessment (LCA) is a general methodology for gathering


information on the environmental impacts produced by a product or service
over its entire life cycle. In the vocabulary of LCA, product life cycle means
all the stages, from the production of the raw materials and energy that are
used in manufacture, through the manufacturing process itself, to marketing
and distribution, use and finally the processes involved at the end-of-the-
product life [2]. These may be disposal or some form of recovery and reuse.
Many techniques for performing LCAs have been developed, but most
follow the lines defined by ISO 14040, the LCA part of the international
standard for environmental management systems. LCA is defined herein as,
“… a technique for assessing the environmental aspects and potential
impacts associated with a product, by: compiling an inventory of
environmentally relevant inputs and outputs of a system; evaluating the
potential environmental impacts associated with those inputs and outputs;
interpreting the results of the inventory and impact phases in relation to the
objectives of the study.”

LCA has been the preferential tool to support the recent movement for
extended producer responsibility policies which has driven environmental
concerns to a product life-cycle level and constituted a major driver for the
introduction of environmentally innovative products and services. Here, a
vast number of examples can be provided, such as those involving IST R&D
activity, on the automobile industry [12, 13], on food packaging [14] or on
the electric and electronic industry [15].
These examples range from adopting a life cycle approach for the
continuous improvement process of a product, involving the certification of
the product according to ISO 14040 [15], to establishing a
broader analysis of a sector in an Industrial Ecology perspective, as reported
for the automobile [12] or for the food packaging [14] sectors.
However, the most generalized use of LCA is in the analysis of the
environmental impact of a product during its complete life cycle. Here, the
LCA of an automobile assembled in Portugal, a Multi-Purpose Vehicle
(MPV), the VW-Sharan, is reported in order to illustrate the type of
information that LCA can produce and how it can be used to promote more
sustainable practices and policies. A detailed inventory table for the MPV
was gathered during one year of work at the manufacturing plant, and the
resulting quantification of its constituent materials can be divided into 7
major groups: steel, plastics, cast iron, non-ferrous metals, rubber, fluids and
glass. Figure 4 compares the typical composition of automobiles in the 50’s,
in the 90’s and those obtained for VW Sharan [13].

Figure 4. Typical composition of automotive vehicles in the 50’s, the 90’s [16] and the VW
Sharan. “Others” includes materials such as leather or synthetic rubber. [13]

The results obtained show that the VW Sharan was in line with the current
tendency of manufacturing lightweight cars by increasing plastic content
in order to minimize fuel consumption. Looking at the vehicle life cycle in

Europe, one has to consider that end-of-life vehicles (ELV) are subject to the
following legal requirements:
- By 01/01/2006: reuse and recovery of 85% on a mass basis
(recycling 80%) for vehicles produced after 1980.
- By 01/01/2015: reuse and recovery of 95% on a mass basis
(recycling 85%).
These targets are to be achieved by ensuring that ELVs are delivered to
authorized treatment facilities that are responsible for the removal of
hazardous substances from the ELV and the subsequent shipment of ELVs to
other actors, without costs for the last owner. One of the main objectives of
the analysis developed in [13], was to evaluate the environmental
consequences of ELV recycling targets. Three end of life scenarios were
modeled, where different recycling rates for specific materials and different
technology efficiencies were considered. The use phase was characterized
only by the impact of burning petrol over 200,000 km. The LCA results
obtained for one ELV processing scenario are represented in Figure 5, for 11
environmental impact categories. Data is normalized, i.e. divided by the
annual environmental burdens attributable to an average European citizen in
1990 for each environmental impact category considered.
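For illustration only, the normalization step can be sketched in Python as below; the impact values and the per-capita reference burdens are hypothetical placeholders, not the actual figures behind Figure 5.

# Normalisation sketch: life-cycle impacts of one vehicle divided by the
# assumed annual burden of an average European citizen (hypothetical numbers).
impacts = {                              # impact over the vehicle life cycle
    "global_warming_kgCO2eq": 45000.0,
    "acidification_kgSO2eq":    180.0,
    "solid_waste_kg":          2500.0,
}
per_capita_reference = {                 # assumed annual burden per citizen
    "global_warming_kgCO2eq": 12000.0,
    "acidification_kgSO2eq":     55.0,
    "solid_waste_kg":           450.0,
}

normalised = {cat: impacts[cat] / per_capita_reference[cat] for cat in impacts}
for cat, value in normalised.items():
    print(f"{cat:26s} {value:6.2f} person-equivalents")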

Figure 5. Normalized LCA results for VW Sharan. [13]

Figure 5 quantifies the relative contribution of the three life cycle phases
for each relevant environmental impact category (production, use and end-
of-life). The results obtained show that except for solid waste emissions, all
environmental burdens are clearly dominated by the fuel consumption and

subsequent emissions, during the use phase of the vehicle. The


environmental impacts associated with the use phase are essentially
dependent on the fuel burnt, the type of use and on the engine performance.
The results obtained also show that the implementation of the EU
directive on ELV provides significant reductions in emission of solid waste
during the life cycle but its effectiveness requires technological or
organizational developments that guarantee the economic feasibility of the
dismantling/shredding activities, if a drawback on current lightweight
tendency is not to be observed. However, it is also shown that the
implementation of the EU directive does not induce any relevant benefit on
the greenhouse gas emissions and on the release of heavy metals, when
compared to current practices in ELV processing.

4.3 EIO-LCA: Economic Input-Output Life Cycle


Assessment

Economic I–O analysis describes the interdependence among sectors of a


given economy by a set of linear equations expressing the balances between
the total input and the aggregate output of each commodity and service
produced and used in the course of one or several periods of time.
Considering that the relationship between a sector’s output and its inputs is
represented in a matrix of technical coefficients, A, the output
required from each sector, X, to satisfy an increase in demand, Y, is
quantified by X = (I - A)^-1 · Y, where (I - A)^-1 is commonly referred to as the
Leontief inverse and I is the identity matrix. Details of the matrix
mathematics can be found in [8].
The EIO-LCA methodology complements the economic input-output
analysis by linking economic data with resource use (such as energy, ore,
and fertilizer consumption) and/or environmental impact categories (such as
greenhouse gas emissions). At a European level, environmental data is
available from the National Accounts Matrix including Environmental
Accounts (NAMEA), which accounts for the GHG emissions in the form of
a matrix (b) of gaseous emissions per economic sector. Considering that B
represents the vector of different GHG emissions (CO2, CH4…), if b is a
matrix of GHG emissions per monetary unit of each sector’s output,
environmental impacts can be estimated by:

B = b · X = b · (I - A)^-1 · Y        (1)

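As a minimal numerical illustration of equation (1) – a hypothetical three-sector economy, not the Portuguese NAMEA data – the following Python sketch solves the Leontief system and evaluates the emissions induced by a given final demand.

import numpy as np

# Toy three-sector EIO-LCA example (all coefficients hypothetical).
A = np.array([[0.10, 0.20, 0.05],      # technical coefficients: input from
              [0.15, 0.05, 0.10],      # sector i per unit output of sector j
              [0.05, 0.10, 0.20]])
b = np.array([[0.80, 0.30, 1.50],      # GHG emissions (e.g. CO2, CH4 rows)
              [0.10, 0.02, 0.40]])     # per monetary unit of each sector's output
Y = np.array([100.0, 50.0, 20.0])      # final demand

# X = (I - A)^-1 · Y : total (direct plus indirect) output required
X = np.linalg.solve(np.eye(3) - A, Y)

# B = b · X : environmental burdens induced by the final demand Y
B = b @ X
print("Total sector output X:", np.round(X, 1))
print("Induced emissions  B:", np.round(B, 2))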
As discussed in [17], a few different attempts to integrate the
benefits of process-based analysis and input-output models have been made,
including the addition of input-output based results to process-based models
and the disaggregation of monetary input-output tables. A hybrid model that
allows for full interaction between a process-based LCA model and an input-
output model was suggested in [17] and constituted the basis for the model
presented here, which was extended to develop a computer model for the
Portuguese economy.
In the hybrid method, a new algebraic formulation is adopted that
includes in the same matrix the background processes associated with EIO
data and the foreground processes that are specific to the system to be
analyzed and provide greater disaggregation to the analysis. These processes
are modeled including material inputs, emission outputs, and their
interaction with economic activity (the background system). Here, the
foreground processes are those characteristic of the product life cycle under
investigation, and the background correspond to the economic sectors
activity, as represented in the national accounting systems.
The integration of the two models has to be done carefully because on
one side the foreground and background matrix have different units, and, on
the other, it is necessary to avoid duplication of material/processes
accounting.
The algebraic formulation of this model is as follows. In the foreground
system, let the external demand of process output i be given as k, where the
use of tilde denotes any activity in the foreground system. If the technical
coefficients of the foreground system quantify the products/commodities
required in each process, for accomplishing one unit activity level, t, the
technical coefficients are denoted by, Ã , and:

Ã · t = k        (2)

This equation can be solved for t (unit activity level required by each
process) by inverting the technology matrix à and multiplying it with the
vector of external demand of process output k.

t = Ã^-1 · k        (3)

Considering that the environmental burdens associated with the processes
in the foreground system are expressed by b̃, the environmental
interventions in the foreground system are expressed as:

B = b̃ · Ã^-1 · k        (4)

The matrix b is the intervention matrix, since its coefficients represent
interventions of the different economic processes in the environment: inputs
(mainly extractions of resources) and outputs (mainly emissions of chemicals).
If we combine the formulation of the emissions for the foreground system
(4) with the one for the input-output system (1), the hybrid method can be represented
by the following general expression:

B = [ b̃  b ] · [ Ã    M   ]^-1 · k        (5)
               [ L   I - A ]

The coefficients used to link the foreground and background systems, and to
normalize their units, are calculated by expressions (6) and (7) [17]. L and M
denote inputs from the background and foreground systems to one another,
respectively. In linking the foreground and background matrices, the dimensions
of the L and M matrices should match the corresponding rows and columns. L shows
the monetary input to each sector per given operation time, while M shows total
physical output per total production in monetary terms.

lpq = qpq · pp        (6)

mpq = apq / pp        (7)

where
qpq : input of sector p in each unit process q,
pp : unit price of the product from sector p,
apq : technical coefficient from the economic input-output matrix.
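A minimal numerical reading of this hybrid scheme is sketched below in Python, assuming the downstream cut-off M is zero, so that the calculation reduces to the tiered form of equations (2)-(4) plus a background contribution obtained through equation (1); all coefficients are hypothetical and the sketch is not the IST software itself.

import numpy as np

# Tiered reading of the hybrid EIO-LCA scheme: one foreground process and a
# two-sector background economy (all numbers hypothetical; M assumed zero).
A_fg = np.array([[1.0]])            # Ã: foreground technology matrix
k    = np.array([1.0])              # external demand for the process output

A_bg = np.array([[0.10, 0.25],      # A: background technical coefficients
                 [0.05, 0.15]])
L    = np.array([[120.0],           # monetary purchases from each background
                 [ 40.0]])          # sector per unit of foreground activity (eq. 6)

b_fg = np.array([[2.5]])            # b~: direct emissions per unit activity
b_bg = np.array([[0.8, 0.3]])       # b: emissions per monetary unit of output

t = np.linalg.solve(A_fg, k)        # eq. (3): foreground activity levels
y = L @ t                           # monetary demand placed on the background
x = np.linalg.solve(np.eye(2) - A_bg, y)   # background output, (I - A)^-1 · y

B = b_fg @ t + b_bg @ x             # foreground (eq. 4) plus background (eq. 1)
print("Total life-cycle emissions:", round(float(B[0]), 2))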

The formulation of the hybrid methodology has been implemented in
dedicated software developed at IST [8]. This software enables the user to
select the products/raw materials/energy sources requested, from a database
where more than 12000 items are available for the Portuguese economy
characterization. This data includes the products/raw materials/energy
designation, quantities consumed/produced per sector and its average price,
information that is crucial to model the purchases of the foreground
processes in the background economy.
When the software is run, the following steps have to be followed:

1. Characterization of the foreground processes, making use of the
following information:
   - Process Available Products: the available products, raw materials or
     energy to be consumed in the process are displayed.
   - Demand: the amount of products, raw materials or energy chosen
     to be consumed in the process.
   - Activity level: the amount of the process unit activity used in the
     functional unit.
2. Identification of the sectors available in the Portuguese EIO tables,
which are used in the Process: Here the sectors of the EIO matrix
that are part of the foreground processes are identified, and the
amount used is quantified.
3. Characterization of the environmental burdens associated with the
foreground process.

Once the foreground processes and the respective commodities consumed
have been identified, the software automatically fills in matrix M according to (7). These
calculations are based on each commodity price, provided by national
statistics, which is available in the program databases, and on the technical
coefficients in the background system, for the economic sector in which the
commodity is classified.
A more basic example of the use of EIO-LCA in the Portuguese
economy, [8], is provided by the evaluation of the contribution of each
economic sector in terms of the GHG, which is relevant in defining policies
under the framework of the Kyoto protocol. Those figures can be obtained
through the National Accounts Matrix Including Environmental Accounts
(NAMEA). Data available for 1993 and 1995, allows for characterizing the
evolution of the Global Warming Potential (GWP) and GDP per economic
sector between 1993 and 1995. Figure 6 has been designed to provide the
basis for this analysis. The evolution of the GWP verified between 1993 and
1995 has been plotted as a function of the evolution of their contribution to
the national GDP. The characteristics of the evolution of each economic
sector have been associated to their location in the graph. Sectors for which
the economic development rate is higher than the GWP growth rate may be
considered weak sustainability providers, while those sectors that combine
economic growth with a decrease of its environmental impact can be
considered to have an evolution towards sustainability.
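This reading of the diagram can be sketched in a few lines of Python; the sector figures used below are hypothetical and are not the 1993-1995 NAMEA values plotted in Figure 6.

# Classify sectors by comparing GWP growth with GDP growth (hypothetical data).
sectors = {                        # (delta_GDP, delta_GWP) as fractions
    "Textile & clothing":   ( 0.12, -0.05),
    "Construction":         ( 0.15,  0.20),
    "Petroleum":            ( 0.08, -0.10),
    "Restaurants & hotels": ( 0.05,  0.02),
}

def classify(d_gdp, d_gwp):
    if d_gdp > 0 and d_gwp < 0:
        return "towards sustainability"
    if d_gdp > d_gwp:
        return "weak sustainability provider"
    return "non-sustainable"

for name, (d_gdp, d_gwp) in sectors.items():
    print(f"{name:22s} dGDP={d_gdp:+.0%}  dGWP={d_gwp:+.0%}  -> {classify(d_gdp, d_gwp)}")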

[Figure 6 plots the variation in GWP (ΔGWP) against the variation in GDP (ΔGDP) for nine economic sectors – textile & clothing industry, construction, transport vehicles and equipment, agriculture & hunting, chemical products, financial services, services rendered to companies, petroleum, and restaurants & hotels – with the regions above and below the diagonal marked as “non-sustainable” and “towards sustainability”, respectively.]

Figure 6. Evolution of the contribution of different economic sectors for GWP as a function
of their contribution to the GNP, between 1993 and 1995, [8].

The analysis of Figure 6 shows that, between 1993 and 1995, the
Portuguese economy grew by 10%, while GWP increased by only 0.6%. This
evolution can be interpreted as an increase in the global efficiency of the
economy with respect to its contribution to global warming. In fact, the
environmental performance of most economic sectors lies just below the
diagonal in Figure 6, which can be interpreted as “business as usual”, with the
exception of the textile and petroleum sectors, which increased their
contribution to the national GNP while decreasing their impact on global
warming; this was due to productivity gains combined with environmental
improvement programs, particularly in the oil sector.

4.4 Ecodesign

Tools that extend the temporal and spatial world that frame a designer’s
problem space can expand the limited horizons that have been developed
over time [2]. Life cycle analysis, although perhaps not developed as a
methodology for this specific purpose, certainly has such a potential. Most
tools used by designers are mainly focused on the way a product is to be
used and then only on those aspects that were of concern to the user.
Designers now have tools that can inform them of the totality of
environmental impacts created by a product or service over its whole life
cycle. If they now wish to innovate in these aspects of a product or service,
they can obtain information to guide their processes.

The development of environmentally sound products requires new


paradigms in the product development process and new approaches in
particular regarding a computerized virtual product development process. In
this context, a new Ecodesign software tool has been developed at IST,
which incorporates novel design for recycling (DfR) strategies that combine
the use of emerging technologies dedicated to shredder residue (SR)
recycling, together with design for dismantling (DfD) strategies [18].
In fact, the current understanding about DfR of complex products that
incorporate nonmetallic materials has been, until now, closely related to the
practice of design for disassembly, which allows for the separation and use
of recycled materials in substitution for their virgin counterparts. There are
three primary reasons why disassembly is practiced: (1) to service or repair a
product during its useful life, (2) as end-of-life separation to recover
valuable parts and/or materials, and (3) for the removal of hazardous or toxic
parts and/or materials.
However, the relatively high costs associated with disassembly
operations are leading to the development of new SR separation
technologies, aimed at recycling nonmetallic materials after shredding, and
this is a new approach to DfR that requires new tools to be considered at a
design level.
As, in particular, car manufacturers are responsible for the overall vehicle
life cycle, the widely adopted solution to exercise this responsibility
consisted in the establishment of ELV managing societies, all over the EU.
These societies are generally governed by car manufacturers, and they
enable, for the first time in history, their control over the complete ELV
processing chain. It is within this framework that a consistent effort has been
put forward in the development of new technologies dedicated to recycling
automotive shredder residues as an alternative/complement to more labor-
intensive dismantling activities. This effort has been focused on upgrading
the available technologies for processing the light and heavy fractions of the
SR, [19], namely by developing separation technologies and finding
recycling possibilities for the products gained from the separation.
These technological innovations provide a great motivation for the
development of new DfR strategies, but this requires the ability to manage
information on ELV treatment technologies that considers both strategies,
disassembly and SR recycling. The development of a new DfR tool on these
emerging premises constituted the main contribution of the work developed
at IST, which resulted in a series of algorithms implemented in a new
software tool [18], illustrated in Figure 7.

Figure 7. Sample view of the DfE software tool developed.

The new methodology provides the identification of economically


optimum recycling strategies for achieving given recycling and reuse rates,
by combining dismantling, shredding, and post-shredding activities.
An innovative approach was adopted in the software development, by
making use of genetic algorithms, and this reduces computation time and
user intervention. This is done while ensuring that the information on parts
connection is considered in the optimization of the disassembly sequence,
together with a novel introduction of the concept of shredding and shredder
residue recycling as alternative operations.
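A highly simplified sketch of this kind of genetic-algorithm search is given below in Python. It is not the IST tool: the part masses, dismantling costs, shredder-residue recycling rate and recycling target are all hypothetical. Each gene indicates whether a part is dismantled before shredding, and the fitness penalizes solutions that miss the recycling-rate target.

import random
random.seed(1)

# Gene i = 1: part i is dismantled for recycling; 0: it stays in the hulk and
# goes through shredding and SR processing (all figures hypothetical).
mass      = [12.0, 4.0, 7.5, 2.0, 9.0, 3.5]   # kg per part
dismantle = [ 1.8, 0.9, 1.5, 0.4, 2.1, 0.7]   # cost of removing each part (EUR)
sr_rate   = 0.55      # fraction of shredded mass recovered by SR recycling
sr_cost   = 0.05      # SR processing cost per kg (EUR)
target    = 0.80      # required recycling rate (mass basis)

def fitness(genes):
    removed  = sum(m for m, g in zip(mass, genes) if g)
    shredded = sum(mass) - removed
    rate     = (removed + sr_rate * shredded) / sum(mass)
    cost     = sum(c for c, g in zip(dismantle, genes) if g) + sr_cost * shredded
    return cost + 100.0 * max(0.0, target - rate)   # penalty if target is missed

def evolve(pop_size=30, generations=60, p_mut=0.1):
    pop = [[random.randint(0, 1) for _ in mass] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        parents = pop[: pop_size // 2]                # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, len(mass))      # one-point crossover
            child = [1 - g if random.random() < p_mut else g for g in a[:cut] + b[cut:]]
            children.append(child)
        pop = parents + children
    return min(pop, key=fitness)

best = evolve()
print("Dismantle parts:", [i for i, g in enumerate(best) if g],
      "objective =", round(fitness(best), 2))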
The new approach, tested in several case studies, shows that the solutions to
be adopted at a design level to improve recyclability are not necessarily
based on substituting all the fixed connections by new techniques/materials
that make use of fast-removing connections, which enable easy
dismantling. The ecodesign strategy to be adopted depends on the targeted
recycling rate, the cost of recycling and, above all, on the characteristics of
the local recycling infrastructure. Therefore, the software allows for
customization of the techno-economical parameters that characterize the
main operators in these infrastructures.

4.5 LCAA – Life Cycle Activity Analysis

LCAA is a multidisciplinary tool that integrates engineering,
environmental and economic sciences, including operations research,
solving for optimal solutions of multivariable complex systems, and can thus
be interpreted as a new sustainable systems design tool.
Environmental analysis tools like Life Cycle Assessment (LCA) or
Materials Flows Analysis (MFA), however valuable, generally do not
include the description of economic mechanisms (allocation, optimization,
substitution) or costs and benefits. Traditional economic models, on the
other hand, have mainly focused on the general notion of externalities and do
not explicitly describe the flows and transformation of materials.
In this context, a new analytic tool, Life Cycle Activity Analysis (LCAA),
was proposed [20,21], which ties mathematical programming formulations
of activity analysis to their environmental impacts. LCAA is based on the
integration of Activity Analysis, a well-known procedure in economics,
solving for optimal levels of production and for the optimal allocation of
resources, with environmental Life Cycle Assessment, which aims to
quantify the environmental impacts of a product or a service from the
“cradle” to the “grave”.
The classical formulation of Activity Analysis distinguishes three classes
of goods: primary goods (natural resources or materials), intermediate goods
and final goods (outputs). LCAA extends the concept of linear activities to
embrace mass and energy fluxes over the entire life cycle of products. In
particular, the proposed LCAA model includes one additional category of
goods: "environmental goods", which represent the emissions of pollutants,
energy consumption and the dumping of waste. These environmental outputs
can be further aggregated into a number of environmental impact categories,
such as global warming, ozone depletion, etc. This approach links up with
the development of Life Cycle Assessment methodology and its aim is
twofold. Firstly, it interprets the environmental burdens included in the
output table in terms of environmental problems or hazards. Secondly, it
aggregates the data for practical reasons, particularly for decision-making.
The mathematical model of LCAA uses an input-output format, and may
have the following formulation [20,21]:

Decision variables, to be determined:


x is a column vector of levels of production activities,
t is a column vector of levels of transportation activities,
w is a column vector of supply levels of primary resources.

Parameters:
Apr is a matrix of input coefficients; each element denotes the quantity of
inputs required to operate a production activity at unit level;
Atr is a matrix of input coefficients; each element denotes the quantity of
resources (e.g. fuel) required to operate a transportation activity at unit
level;
Bpr is a matrix of output coefficients; each element is the quantity of
outputs obtained when an activity is operated at unit level;
Btr is a matrix of output coefficients; each element denotes the quantity
of outputs emitted when a transportation activity is operated at unit
level;
cpr is a row vector of unit costs of operating the various production
activities, it is known and given;
ctr is a row vector of unit costs of operating the various transportation
activities, it is known and given;
crs is a row vector of unit costs of primary resources, it is known and
given;
d is a column vector of final demand, it is known and given;
g is a column vector of environmental goals set by a policy-maker.

The list of goods is partitioned into four classes: inputs of primary goods
(P); intermediate goods (I); final goods (F) and environmental goods (E).
Correspondingly, matrices Apr and Bpr become partitioned into Apr = (-AP, -AI, 0, -AE)
and Bpr = (0, BI, BF, BE). Conventionally, one enters the A-coefficient of each
input with a minus sign and the B-coefficient of each output with a plus sign.
This format includes the possibility of having -AE, i.e. sinks of pollutants.
Matrices Atr and Btr, however, are only partitioned into Atr = (-APtr) and
Btr = (BEtr), since the list of goods used in the transportation activities only
includes primary resources and environmental emissions (no intermediate or
final goods are considered).
mathematical format of Life Cycle Activity Analysis can now be written as
the following linear program:

min cpr · x + ctr · t + crs · w        (8)

subject to

-APpr · x - APtr · t + w ≥ 0        (9)
(-AIpr + BIpr) · x = 0        (10)
BFpr · x ≥ d        (11)
(-BEpr + AEpr) · x - BEtr · t ≥ -g        (12)
x, t, w ≥ 0        (13)

To assure that, for each intermediate commodity in each link, there is


conservation of the quantities of goods being produced, transported and used
in the subsequent activities, additional equations have to be included. In
short, one equation is needed for balancing the quantity of each intermediate
good leaving a region and another equation should be added for balancing
each intermediate good entering a region.
In addition, the x, t and w vectors may be bounded from above, to reflect capacity constraints on production and transportation activities and limits on the availability of primary resources. Capacity bounds can also be included to reflect current behavioral patterns or to impose environmental policy options.
The objective (8) is to minimize the total cost of operating the production and transportation activities plus the cost of all primary resources. Constraint (9) establishes the balance between the quantities of primary resources used by the activities and the
between the quantities of primary resources used by the activities and the
amounts extracted from the environment. Constraint (10) states market
clearing for the intermediate goods. Constraint (11) says that the demand
must be satisfied. Constraint (12) states that the environmental impacts
should be at most equal to the targets defined (vector g).
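A toy numerical instance of the linear program (8)-(13) can be solved with scipy.optimize.linprog, as sketched below. The system has two production activities (x1 makes an intermediate good, x2 makes the final good), one transportation activity and one primary resource; every coefficient, cost, the demand d and the goal g are illustrative assumptions, and a single simplified transport-balance equation of the kind discussed above stands in for the full regional balancing.

from scipy.optimize import linprog

# Decision vector z = [x1, x2, t1, w1]; costs cpr (x1, x2), ctr (t1), crs (w1).
c = [2.0, 3.0, 1.0, 0.5]

# Inequalities written as A_ub @ z <= b_ub.
A_ub = [
    [1.0, 0.2, 0.05, -1.0],   # (9)  primary-resource balance: use <= supply w
    [0.0, -1.0, 0.0, 0.0],    # (11) final demand: BFpr.x >= d
    [0.1, 0.5, 0.2, 0.0],     # (12) emissions <= environmental goal g
]
b_ub = [0.0, -10.0, 8.0]      # d = 10 units of final good, g = 8 kg of emissions

# Equalities: (10) intermediate-good market clearing, plus one simplified
# balancing equation stating that t1 carries the intermediate good used by x2.
A_eq = [
    [1.0, -1.0, 0.0, 0.0],
    [0.0, -1.0, 1.0, 0.0],
]
b_eq = [0.0, 0.0]

# linprog's default bounds (0, None) enforce the non-negativity constraint (13).
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, method="highs")
x1, x2, t1, w1 = res.x
print(f"activity levels: x1={x1:.1f}, x2={x2:.1f}, t1={t1:.1f}, w1={w1:.1f}")
print(f"minimum total cost: {res.fun:.2f}")

In this toy instance both the demand constraint and the emissions cap turn out to be binding at the optimum, which is exactly the kind of trade-off between cost, demand satisfaction and environmental targets that LCAA is designed to expose.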
This formulation shows that LCAA integrates engineering, environmental and economic sciences, including operations research. It solves for optimal solutions of complex multivariable systems and can be interpreted as a new Industrial Ecology tool, in that it can be used to promote ecologically and environmentally optimal systems design.
The feasibility and potential of the LCAA methodology for optimizing the life cycle of products, with emphasis on alternative end-of-life processing activities, were demonstrated through the analysis of integrated economic, environmental, energy and product system models, developed and applied to specific case studies [20, 21].

5. INDUSTRIAL ECOLOGY R&D: INTERACTION WITH SOCIETY

The European Union has introduced several policy instruments based on Extended Producer Responsibility (EPR), in order to improve the
environmental performance of products and services through their life-
cycles, and, in particular, their end-of-life (EOL) disposal.
In this context, the Portuguese government has been creating legal
frameworks for extending the responsibility over different types of products,
mandating producers and importers to achieve end-of-life recycling and
recovery targets. As a consequence, landfill and incineration without energy
recovery were severely limited and collection, recycling and reuse were
strongly promoted. In practice, producers were obliged to set up end-of-life management systems to accomplish these objectives. Several products, such as vehicles, tyres, lubricants and electrical and electronic equipment, went through this process, and IST made use of the experience gained through the R&D on Industrial Ecology to support the design and implementation of these end-of-life product management societies.
The case study of the end-of-life tyre management society is briefly reported here. The Portuguese Government considered end-of-life tyre management a matter of great environmental importance and therefore introduced specific laws establishing the principles and norms applied to it. Tyre disposal targets required that producers accomplish the following (an illustrative translation of the 2007 targets into tonnages is sketched after this list):

By January 2003:
- collection of 85% of the end-of-life tyres;
- retreading of 25% of the end-of-life tyres;
- recycling and incineration of all recovered tyres that are not retreaded, of which at least 60% must be recycled.

By January 2007:
- collection of 95% of the end-of-life tyres;
- retreading of 30% of the end-of-life tyres;
- recycling and incineration of all recovered tyres that are not retreaded, of which at least 65% must be recycled.

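The sketch below translates the January 2007 percentage targets into tonnages for a hypothetical annual amount of end-of-life tyres; the 80,000 t figure is an assumption for illustration only, and it is likewise assumed that retreaded tyres are counted within the collected amount.

# Hypothetical illustration of the January 2007 targets.
arisings_t = 80_000                     # assumed end-of-life tyres generated per year (tonnes)

collected = 0.95 * arisings_t           # 95% collection target
retreaded = 0.30 * arisings_t           # 30% retreading target
to_recover = collected - retreaded      # recovered tyres that are not retreaded (assumed reading)
recycled_min = 0.65 * to_recover        # at least 65% of these must be recycled
incinerated_max = to_recover - recycled_min

print(f"collected:         {collected:,.0f} t")
print(f"retreaded:         {retreaded:,.0f} t")
print(f"recycled (min):    {recycled_min:,.0f} t")
print(f"incinerated (max): {incinerated_max:,.0f} t")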
This required producers, distributors, recyclers and retreaders to be identified and characterized, and local processing infrastructures to be analyzed. The information gathered was the basis for the development of a strategic approach for the end-of-life management system, according to three key priorities, in the following order: fulfillment of the environmental and legal targets; fulfillment of the targets in an economically reasonable way; and minimization of distortions in market practices and among major stakeholders. The work developed at IST, based on the Industrial Ecology R&D reported here, contributed to the design of an economically optimized EOL tyre management system; this resulted in the establishment of a "green fee" to be paid when a new tyre is acquired. The challenge was to keep this fee to a minimum while promoting the activity of all EOL operators so that the targets could be fulfilled.
This process resulted in the formation of the end-of-life tyre management
society, Valorpneu, which was licensed by the government. Valorpneu started its operation in February 2003, with a network of 11 collection centers. By the end of the year, the number of collection centers had increased to
28, significantly reducing the distance that distributors and other agents must travel to dispose of their end-of-life tyres.
Valorpneu managed to fulfill its collection, retreading and recycling targets in its very first year of operation and ensured financial and organizational stability in the Portuguese end-of-life markets and infrastructures, particularly in relation to recyclers. Additionally, Valorpneu has assumed responsibility for the EOL tyre stocks that had accumulated in the country over previous years and had been abandoned in the environment, causing environmental problems. This stock, estimated at 60,000 tonnes, is to be processed by 2007.
Finally, it should be mentioned that the Industrial Ecology R&D in
Portugal is interacting with society in other relevant areas, such as:

- Supporting the establishment of new waste management policies and practices.
- The establishment of eco-industrial parks.
- The identification of new business opportunities, derived from innovative solutions to new environmental challenges.

Overall, it can be concluded that R&D in Industrial Ecology cannot be decoupled from the society with which it interacts, and that this interaction between science and society is, after all, an intrinsic characteristic of a scientific domain that is intended to contribute to sustainable development.

6. CONCLUSIONS

Industrial Ecology, as a broad framework for thinking and acting in the realm of sustainability, is based on the metaphor that views industrial economies as ecological systems. It is intended to improve the design of firms and larger social institutions, i.e. complex systems, and aims to be a step towards sustainable development.
Industrial Ecology promotes a holistic view of engineering systems
which requires the development of a set of tools to bridge different scales,
from site or product specific analysis to the whole economy and from the
economic to the socio-environmental dimension, thus resulting in a multi-
disciplinary set of analytical tools, the “Industrial ecology toolbox”.
The need for an Industrial Ecology Toolbox is based on the assumption that the characterization of the physical nature of the human economy is vital for understanding the sources and full nature of the impacts of society upon the natural environment. It is similarly assumed that effective strategies toward
sustainable development will rely on the systematic collection of physical measures of material and energy flows.
This paper analyzed the R&D developed at IST-UTL aimed at providing
a coherent framework for this “Industrial ecology toolbox” and its
contribution to the promotion of sustainable development policies and
practices in the socio-economic arena.
The different tools analyzed range from macro-economic techniques to specific, locally dependent environmental analysis tools, and it was shown how other tools can be developed and used to promote the interaction between economic and environmental analysis across macro and micro scales, thus enabling the design of more sustainable systems at different levels of complexity.

REFERENCES
1. P. Ferrão and M. V. Heitor, Integrating environmental policy and business strategies:
The need for innovative management in industry. P. Conceição, D. Gibson, M. Heitor
and S. Shariq eds. Science Technology and Innovation Policy: opportunities and
challenges for the knowledge economy, Quorum Books, 503-518, 2000.
2. J. Ehrenfeld, P. Ferrão and I. Reis, Tools to Support Innovation of Sustainable Product
Systems. Knowledge for Inclusive Development, Intl. Series on Technology Policy and
Innovation, Quorum Books, 417-433, 2002.
3. J. R. Ehrenfeld, Industrial ecology: a framework for product and process design. Journal
of Cleaner Production, 5 (1-2), 87-95, 1997.
4. P. Daniels, S. Moore, Approaches for quantifying the metabolism of physical economies
- Part I: Methodological Overview. Journal of Industrial Ecology, 5 (4), 69-93, 2002.
5. P. Daniels, S. Moore, Approaches for quantifying the metabolism of physical economies
- Part II: Review of Individual Approaches. Journal of Industrial Ecology, 6 (1), 65-88,
2002.
6. P. Ferrão, Introdução à gestão ambiental: A avaliação do ciclo de vida de produtos. IST
PRESS, Lisbon, Portugal, 1998. (In Portuguese).
7. J. Nhambiu, P. Ferrão, M. Baptista, M. Quintela, Environmental accounting of the
Portuguese Economy: a tool to support Policy Making. ConnAccount Conference.
Stockholm, Sweden, 26-29 June, 2001.
8. P. Ferrão and J. Nhambiu, The use of EIO-LCA in assessing national environmental
polices under the Kyoto protocol: the Portuguese economy. Paper accepted for
publication in: International Journal of Technology, Policy and Management, 2006.
9. S. Niza and P. Ferrão, A transitional economy´s metabolism: The case of Portugal.
Resources, Conservation and Recycling, 46, 265-280, 2006.
10. Eurostat, Economy-wide Flow Accounts and Derived Indicators. A Methodological
Guide. Luxembourg, 2001.
11. Canas, P. Ferrão and P. Conceição, A new environmental kuznets curve? Relationship
between direct material input and income per capita: evidence from industrialized
countries. Ecological Economics, 46 (2), 217-229, 2003.
12. P. Ferrão and J. Figueiredo eds. A ecologia industrial e o automóvel em Portugal. Celta
Editora, 2000.
13. P. Ferrão, I. Reis, and J. Amaral, The Industrial Ecology of the Automobile: a
Portuguese perspective. International Journal of Ecology and Environmental Sciences,
28, 27-34, 2002.
14. P. Ferrão, P. Ribeiro and P. Silva, A ecologia industrial e a embalagem de produtos
alimentares em Portugal. Celta Editores, 2005. (In Portuguese).
15. Giacommucci, M. Graziolo, P. Ferrão and A. Caldeira Pires, Environmental assessment in the electromechanical industry. P. Conceição, D. Gibson, M. Heitor and F. Veloso eds. Knowledge for the Inclusive Development, Quorum Books, 465-476, 2002.
16. T. E. Graedel and B. R. Allenby, Industrial Ecology and the Automobile, Prentice Hall, 1998.
17. S. Suh, Functions, commodities and environmental impacts in an ecological economic model. Ecological Economics, 48(4), 451-467, 2004.
18. J. Amaral and P. Ferrão, Design for recycling in the auto industry: new approaches and
new tools. Journal of Engineering Design, 17-3, 2006.
19. P. Ferrão and J. Amaral, Assessing the economics of auto recycling activities in relation
to European Union Directive on End of Life Vehicles. Technological Forecasting and
Social Change, 73, 277-289, 2006.
20. F. Freire, P. Ferrao, C. Reis, S. Thore, Life Cycle Activity Analysis Applied to the
Portuguese Used Tire Market, SAE Transactions, American Technical Publishers, 109,
1980-1988, 2000.
21. F. Freire, S. Thore and P. Ferrão, Life Cycle Activity Analysis: Logistics and
environmental policies for bottled water in Portugal, OR Spektrum, 23 (1), 159-182,
2001.
FORESTS FOR THE 21st CENTURY?

João Santos Pereira1, Helena Martins2 and José G. C. Borges3

Departamento de Engenharia Florestal, Instituto Superior de Agronomia, Universidade Técnica de Lisboa, Tapada da Ajuda, 1349-018, Lisboa, Portugal
1 jspereira@isa.utl.pt, 2 hmartins@isa.utl.pt, 3 joseborges@isa.utl.pt

Abstract: The present Portuguese forests resulted from reforestation in the context of socio-economic changes in rural areas, which have been occurring since the middle of the 20th century. Hence, some of their vulnerabilities are related to the lack of tradition in the management of forests in a country where agriculture was the dominant activity until recently. In addition to the vulnerabilities resulting from inadequate management, the changes that forests are facing today, as well as in the future, in the environment (climate) and in potentially harmful biotic invasions - e.g. pests and diseases - must be taken into account. Simultaneously, global markets are changing, as is the impact of agricultural and rural development policies on forestry. In spite of these constraints, forestry is a relevant sector for the national economy that calls for research results to support policies and management planning concerning its sustainability. This paper describes the national forest sector in its economic, social and environmental facets. Based on current research, we discuss some of the impacts of climate change scenarios and of the new forest management paradigms on the forests of the future in the Portuguese continental territory.

Key words: Forest sciences, challenges, research priorities, future scenarios.

1. INTRODUCTION

According to the last National Forest Inventory, forest areas represent around 38% of the national territory [1]. This area tends to increase, following the abandonment of less competitive agriculture. In spite of several constraints to productivity related to poor management, production forests have been acknowledged as an important source of income,
especially timber and cork. Forests are also attractive for leisure activities, and a few have important historical and cultural value. Moreover, a large proportion of the forests in the national territory are associated with high biodiversity and important ecological values, which led to the inclusion of 162,613 ha in protected areas and 594,509 ha in the Natura 2000 Network.
As with other natural resources, management of forest resources at
present is complex, involving concepts such as sustainability, multi-purpose
management and public participation, which poses new challenges to
managers and policy makers. Scientists have an important role in providing
credible and relevant information that can support reasoned decision-
making. This contribution involves a better understanding of forest systems
and the assessment of the probable consequences associated with various
proposed management actions [2, 3]. Additionally, it has been argued that
scientists have to make sure that this information is interpreted and used
correctly [4].
The present work intends to briefly characterize the challenges that forests currently face and how these and new problems will affect forests in the near future. A future scenario for Portuguese forests is built based on the trends that can already be identified in terms of climate change, the evolution of the characteristics of forest systems, the demand for products and services, and the availability of information and tools to support management decision-making.

2. THE PORTUGUESE FORESTS

To understand the future we need to know the past. Most of the existing
Portuguese forests were either created (re-created) by man or allowed to
grow over abandoned agricultural land, as part of a secondary ecological
succession, on a changing rural territory. We do not know when the deforestation of the Portuguese territory began, but it probably started in pre-historical times [5, 6]. Forests were progressively cut for firewood and construction, and replaced by pasture and crop fields. During the Middle Ages forests were protected for timber, but also for hunting by the royalty and upper classes. There was an awareness that forests not only produced wood and fruits but also provided environmental services. The most famous story of royal action to protect forests features King Dinis, in the 13th century. He expanded and improved one of the most charismatic pine forests in Portugal (Pinhal de Leiria), in order to prevent the invasion of arable fields by sand dunes. It was also around this time that the profession of Monteiro-Mor (a sort of gamekeeper) was created, in order to have personnel
responsible for the protection of the natural resources of woodlands,
especially game. This profession has been considered the ancestor of the
forester. In 1751, the management of forest areas became a responsibility of
the navy. Its activity and excess of power were, however, curtailed by the end of the 18th century through the intervention of Queen Maria I [7].
The modern concept of the forester was introduced in the beginning of the 19th century with José Bonifácio de Andrada e Silva, who had attended lectures given by the Monteiro-Mor of Brandenburg, in Germany. He became the first trained forester in Portugal. He was also the author of the first published work about forestry, "Memoir about the need and the usefulness of planting woodlands in Portugal". This book was a call to address the urgent need to protect and increase forest areas, which by then were restricted to 10% of the total national territory. The position of General Administrator of Woodlands was then created, which could only be occupied by personnel with an education in Natural Sciences and practical knowledge of woodland management [7, 8]. The main concern and priority of foresters at that time was to expand the area covered by forests.
In spite of earlier attempts at reforestation, the area of production forest only increased from the beginning of the 20th century (Table 1). However, from then on "the expansion (of the forest) was more extensive and faster in Portugal than in the other European countries" [9]. This expansion was mainly due to the expansion of production forest, and it was a natural response to the abandonment of agriculture, the delayed industrialization and the emigration from the agricultural world [9]. Given the high speed of the "forest transition" (i.e., the change from a situation of net deforestation to one of net reforestation), a true forestry tradition was not created.
Even though the results of the last National Forest Inventory have not yet been fully published, an increasing trend is expected, in response to the persisting abandonment of less competitive agricultural areas and the increasing demand for forest products and services. For example, the 'paperless office' scenario put forward in the 1970s did not materialise. Several studies showed that new technologies contributed to the growth of paper consumption for communication [10].

Table 1. Evolution of the forest area from 1874 to 1998 (10^3 ha) [adapted, 1, 11].

Year            1874  1902  1920  1934  1939  1970  1980  1990  1998
Area (10^3 ha)   600  1900  2000  2500  2500  2800  3000  3000  3350

The most important forest systems in Portugal are maritime pine forests,
the cork oak stands and the eucalyptus plantations (Table 2). The area
occupied by cork oak stands has increased due to the economic interest
surrounding cork and its industrialization since the beginning of the 20th
century. The area of eucalyptus has also increased following the installation
of the pulp industry.

Table 2. Evolution of the areas of the main species (10^3 ha) [adapted, 1, 11].

Year            1902  1928  1940  1956  1980  1988  1995  1998
Maritime pine    430  1000  1161  1288  1300  1300  1140   976
Cork oak         366   560   690   637   650   670   720   713
Holm oak         417   380   360   579   535   470   460   462
Eucalyptus         -    10     -    99   215   360   690   672
Others           743   160   256   223   310   200   270   379
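For reference, the relative weight of each system in 1998 can be computed directly from the last column of Table 2; the short sketch below does so in Python (the total is the sum of the table rows, which differs slightly from the Table 1 total).

# Share of each species in the 1998 forest area, from Table 2 (10^3 ha).
area_1998 = {
    "maritime pine": 976,
    "cork oak": 713,
    "holm oak": 462,
    "eucalyptus": 672,
    "others": 379,
}
total = sum(area_1998.values())
for species, area in area_1998.items():
    print(f"{species:14s} {area:5d} x10^3 ha ({100 * area / total:4.1f}%)")
print(f"{'total':14s} {total:5d} x10^3 ha")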
Several aspects have been constraining the productivity and the profitability of these forest systems. Alves [12] classified them into two major groups: technical-economic and institutional. Inadequate or non-existent silviculture and management practices account for heavy losses in productivity, particularly in the case of maritime pine [13, 14]. Moreover, they increase the risk of fire. The industry places little value on the wood, and the overly frequent fires impose a high risk on investment in management. Concomitantly, the area of pine forest will tend to decrease. The structure of land ownership, absenteeism or old age and the low bargaining power of forest landowners are the main factors that still contribute to this situation.
Management of eucalypt plantations compares somewhat favorably with
maritime pine as the pulp industry invests in forestry and manages a
substantial eucalypt forest area. Nevertheless, climate and soil conditions of
the territory and the environmental concerns of the urban population appear
to limit the possibility of expanding the area of plantations. However,
investment in forestry research and development, especially in tree breeding, may lead to higher productivity and offset the lack of new plantations.
Cork oak forests are truly agro-forestry systems in which dry land
agriculture, cattle and pig rearing coexisted. For most of the last 100 years,
cork has been a major product. The stripping of cork is done on a regular basis, every 9 years on average, after a minimum tree diameter is reached. Productivity has been declining as a consequence of several factors. Among them are poor silviculture, namely poor cork-stripping practices, inadequate agricultural practices and over-grazing. This, combined with root diseases and adverse climatic conditions, has led to cork oak mortality and to lower stand density. The cork industry invests little at the forestry level, and policy instruments aiming at cork oak protection have not been effective
and markets.
Most forests are privately owned. In the case of pine forests, in addition
to the constraints listed above, there are land tenure and investment
problems, namely the small size of forest holdings - the average private property area is lower than 5 hectares - and the old age, low literacy or absenteeism of private owners, which represent limitations to adequate forest management [16]. Moreover, there is no tradition of association among landowners. These rarely have bargaining power, and forest product prices are often set by firms that specialize in harvesting and trade. For decades, the state has been mostly concerned with the management of its own forests and did not promote a coherent forest policy capable of confronting current challenges.
Forest fires are a major constraint on forest production, management and
investment. In five years (2001-2005), about 20% of forested and native
(successional) vegetation burnt. Every year, an average of ca. 4% of the maritime pine forest area (ca. 1 million ha) burnt (DGRF, unpublished results). This is simply too risky for business and insurance
companies.
In summary, the problems that the forest sector faces must be analyzed
within the framework of the national development options and current
macroeconomic policy. The fragilities of the Portuguese forest systems mentioned above can be included in a broad group of aspects that have motivated intervention areas in international agreements for forest sustainability. At the 2nd Ministerial Conference on the Protection of Forests in Europe (MCPFE) in Helsinki, 1993, the signatory countries agreed upon a definition of sustainable forest management and accepted its implications for forest management (Resolutions H1 and H2). These concern [11]: 1) avoidance of activities that cause the degradation of forest soils, bio-physical conditions, fauna and flora and of their influence (services); 2) encouragement of multipurpose forestry and support of its cost by society in general; 3) stable policies and instruments to support decision making in territorial management; 4) adequate planning of forest management based on resource inventories and environmental impact assessment; 5) the best combination of products and services for a specific forest area while keeping an economic and ecological rationality; 6) protection of areas that are susceptible due to the presence of ecological, cultural and landscape values as well as water resources; 7) maintenance and improvement of the stability, vitality, regenerative ability and resistance of forest ecosystems, and of their ability to adapt to disturbance, fire, diseases, pests and overexploitation of their resources; 8) use of well-adapted species; 9) adaptability of forest species to local conditions; 10) promotion of recycling and of the use of forest products and debris for energy purposes. These intervention areas have been
motivating and inspiring the most recent policies and funding orientations
for the national forest sector.

3. CLIMATE CHANGE SCENARIOS AND FORESTS

The carbon dioxide (CO2) concentration in the atmosphere has been rising since the pre-industrial era, from a long-term concentration of 280 ppm (parts per million) to nearly 375 ppm today (2005). As CO2 is a greenhouse gas (GHG), this increase may contribute to global warming. There is strong evidence that most of the observed recent climate change is attributable to GHG emissions resulting from human activities, especially the burning of fossil fuels and land-use changes (namely deforestation). The magnitude and the speed of current climate change are unprecedented at least in the last thousand years.
Practically the whole of the territory of continental Portugal is under the
climatic influence of the Mediterranean, i.e. long dry summers and mild
rainy winters. Most scenarios of regional climate change affecting Portugal point to warmer winters and an increase in the length of the dry season [17]. Furthermore, the peculiarities of the rainfall patterns produce relatively frequent climatic hazards such as floods and droughts [18], or heat waves.
In brief, the impacts of these scenarios of relevance to the future of
national forests might be [19]:
Elevated CO2 - per se, this can result in a modest increase in productivity
of trees. However, part of these gains may be offset by the increased length
of the dry summer season, which may result in severe plant water stress;
Warming - the higher winter and spring temperatures may be beneficial. The longer growing season resulting from earlier bud break in the spring may improve productivity in the absence of water deficits;
Lower rainfall in spring - may cause more severe water deficits and greater tree vulnerability to climatic variation. Water availability determines (1) the mere existence of forests (below a lower limit of annual rainfall - ca. 500 mm at the latitudes of the Portuguese mainland - forests are no longer the dominant biomes, being substituted by savannas or deserts) and (2) the primary productivity of forests, which is often correlated with available water (e.g., as measured by the actual evapo-transpiration). These changes in growth and survival of plants may also affect their geographical distribution as well as the physiognomy and composition of plant communities. The rapidity of climatic change and habitat fragmentation due to human activity inhibit natural plant species migration. That may be accentuated by the aggravation of the meteorological risk of forest fire;
Greater frequency of extreme events - may trigger a greater frequency and an increasing risk of forest fires [19, 20, 21];
Catastrophic droughts - may irreversibly change the geographical
distribution of tree species [22] and, for that matter, ecosystem function.
There are several cases reported in the USA and Spain, for instance.
The previously mentioned impacts may induce pest and disease outbreaks and increase the vulnerability of forest species to them. The temperature increase in winter and spring and the rainfall decrease might cause outbreaks of both native and invasive species. The increase in water deficit and in fire occurrence can also favour pests, owing to the higher vulnerability and mortality of trees. A higher vulnerability to boring insects can promote attack by certain diseases for which these insects are vectors. High temperatures together with high humidity can promote the dissemination of pathogenic fungi through the soil. Finally, there will be a higher risk of invasion by tropical and sub-tropical pests and diseases.
In summary, the scenarios of climate change indicate a shift in the
geographical optima for most forest tree species and a higher risk of low
productivity and high mortality in the areas of present occupation. This
certainly requires a better understanding of the adaptation of forests to the new scenarios.
Another aspect worth mentioning on this topic is the role that forests
might have in the mitigation of emissions of GHGs. The Kyoto Protocol and
the Marrakech agreements have increased the interest in using forests as a
cost-effective means of mitigating CO2 emissions, through their capacity to store carbon, while providing other environmental, economic and social benefits. In a long-term perspective, though, carbon accumulated in forests will be released back to the atmosphere as a result of the decay of litter and soil organic matter and as a consequence of disturbances in forests. Climate warming may enhance the rate of ecosystem respiration, i.e., carbon loss from soil and vegetation. Therefore, in order to understand whether a specific forest acts as a long-run sink or a source of atmospheric CO2, it is necessary to measure the net change in the carbon stocks of forest systems (vegetation, soil, products) [23, 24].
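A minimal sketch of this stock-change bookkeeping is given below: a forest system is a net sink over a period if the sum of the changes in its carbon pools is positive. The pool names and figures are illustrative assumptions, not measured values.

def net_carbon_balance(delta_pools_tC):
    # Classify a forest system as sink or source from the changes in its carbon pools (t C).
    net = sum(delta_pools_tC.values())
    role = "sink" if net > 0 else "source" if net < 0 else "neutral"
    return f"net stock change {net:+.1f} t C -> {role}"

# Illustrative pool changes over one period (tonnes of carbon, assumed values).
print(net_carbon_balance({"vegetation": +120.0, "soil": -15.0, "products": +8.0}))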
The present capacity of Portuguese forests to store carbon may be considered potentially high [19]. On average, European forests store annually 124 g C per m2 of forest area, with a coefficient of variation of 62%. In comparison, Portuguese pine forests and eucalyptus plantations can be much more productive, with a NEP (Net Ecosystem Productivity) greater than 800 g C m-2 year-1. Year-long eddy covariance carbon dioxide fluxes were measured from 2003 through 2005 at the Portuguese CarboEurope forest site ‘Herdade da Espirra’ (38º38’N, 8º36’W), an 11-year-old Eucalyptus globulus plantation (300 ha). The forest acted as a strong carbon sink, reaching an uptake of 938 g C m-2 year-1 in 2003, when precipitation was near the 30-year mean value (709 mm). In 2004, however, the carbon uptake was reduced by 10% as a consequence of lower precipitation (48% below the mean value). A further reduction in rainfall in 2005 (rainfall between January and September was 67% lower than the precipitation in the same period of 2003) led to a reduction in NEP of 56% relative to the NEP between January and September of 2003 [25]. Cork oak stands are an exception, since they are at present weak sinks (NEP well below 100 g C m-2 year-1) that will tend to turn into sources in years with adverse conditions like the 2005 drought (see Table 3).

Table 3. The net ecosystem productivity (NEP) measured by an eddy covariance system, the calculated gross primary productivity (GPP, or total photosynthesis), both in g m-2 year-1, and the annual rainfall at Mitra (Évora, Portugal; 38º32’26” N, 8º00’01” W, 220-250 m a.s.l.).

Year   NEP (g m-2 year-1)   GPP (g m-2 year-1)   Annual precipitation (mm)
2003            12                  623                    639
2004            31                  520                    469
2005           -35                  796                    442
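The arithmetic behind the figures quoted above and in Table 3 can be reproduced as follows. It is assumed here that the tabulated fluxes are in grams of carbon per m2 per year; the whole-plantation total and the ecosystem-respiration estimate (Reco = GPP - NEP) are simple identities applied to the reported numbers, not additional measurements.

# Eucalyptus plantation (Herdade da Espirra, 300 ha), figures from the text.
nep_2003 = 938                            # g C m-2 yr-1 uptake in 2003
nep_2004 = nep_2003 * (1 - 0.10)          # 10% reduction reported for 2004
area_m2 = 300 * 10_000                    # 300 ha in m2
sink_2003_tC = nep_2003 * area_m2 / 1e6   # grams -> tonnes of carbon
print(f"2004 NEP ~ {nep_2004:.0f} g C m-2 yr-1")
print(f"2003 whole-plantation uptake ~ {sink_2003_tC:.0f} t C")

# Cork oak site (Mitra, Table 3): respiration implied by NEP = GPP - Reco.
for year, nep, gpp in [(2003, 12, 623), (2004, 31, 520), (2005, -35, 796)]:
    reco = gpp - nep
    role = "sink" if nep > 0 else "source"
    print(f"{year}: NEP={nep:+d}, GPP={gpp}, Reco~{reco} g m-2 yr-1 ({role})")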
At present, the importance of Portuguese forests as carbon sinks is lowered by the importance of forest fires [26]. It may be even lower in the future, because primary productivity and standing biomass may decrease due to climate change (warming and drought) [19, 22]. The question arises whether this could be avoided by enhancing the capacity of forests to store carbon, either by choosing better adapted species or by adapting forest management.

4. FOREST MANAGEMENT

In the 19th century, forest management had the sole concern of ensuring a regular supply of wood, taking into account financial and biological parameters that constrained productivity and profitability. Nowadays, it has to take into account multiple resources and services, impacts and society's expectations. The sustainability and multi-functional forest paradigms have turned the attention of managers, politicians and society in general to a broader range of resources and services associated with forest areas. While undertaking a management planning process, foresters have to take into account biodiversity, scenic values and leisure opportunities, among other more profitability-related concerns. Battaglia et al. [27] have described these new forest management challenges as being "the demand for a balance between increasing production and profitability and environmental stewardship and sustainable management". More recently, there is an increasing awareness

of the need to also take into account fire risk and climate change. In practical terms this represents an increase in the complexity of forest management planning, which calls for tools and methods to support decision-making [28]. The integration of information technologies and modeling techniques has resulted in some of the most promising of these tools, such as Decision Support Systems (DSS).
Information technologies applied to forest management planning in general, and DSS in particular, have allowed a better understanding of the relationships between ecology, environment and economy, as well as of the impacts of silvicultural interventions. Thus, they allow a more adequate representation of the complexity of forest systems, and higher efficacy and efficiency at lower costs [29]. This is due to the ability of DSS to integrate diverse information, models and methods (such as Multicriteria Decision Analysis methods) in a synergistic way. DSS have a modular structure characterised by four components: 1) an information management system (that integrates geographic information) and follows a data quality framework; 2) a prescription simulator; 3) a decision model; and 4) a final module that provides the information concerning the proposed solution. This modular structure allows the integration of further developments and updated information without requiring a new system. Namely, in order to address specific multifunctional problems, DSS might further need to accommodate traditional biometric data, data on other resources (e.g. wildlife), new modelling approaches, wildlife habitat models, vegetation models, biodiversity models and new decision models [30]. Therefore, as
pointed out by Reynolds et al. [30] "there is clearly a need for
multidisciplinary efforts that can bring together modelers, ecologists,
foresters and other expertise/responsibilities in forest management to
effectively address ever-changing challenges".
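A schematic sketch of this four-module structure is given below; the class and method names are invented for exposition and do not correspond to the API of any of the systems cited here.

from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class StandRecord:
    stand_id: str
    species: str
    area_ha: float
    age: int


class InformationManager:
    """Module 1: manages the (geo-referenced) inventory data."""
    def __init__(self, stands: List[StandRecord]):
        self.stands = stands


class PrescriptionSimulator:
    """Module 2: generates candidate management prescriptions per stand."""
    def simulate(self, stand: StandRecord) -> List[Dict]:
        # Toy prescriptions: two rotation lengths with assumed yields.
        return [{"stand": stand.stand_id, "rotation": r,
                 "yield_m3": stand.area_ha * r * 0.8} for r in (10, 12)]


class DecisionModel:
    """Module 3: selects one prescription per stand according to an objective."""
    def solve(self, options: List[Dict], objective: Callable[[Dict], float]) -> Dict:
        return max(options, key=objective)


class Reporter:
    """Module 4: presents the information concerning the proposed solution."""
    def report(self, chosen: List[Dict]) -> None:
        for c in chosen:
            print(f"stand {c['stand']}: rotation {c['rotation']} yr, "
                  f"expected yield {c['yield_m3']:.0f} m3")


# Wiring the four modules together for a two-stand example.
info = InformationManager([StandRecord("A1", "eucalyptus", 25.0, 4),
                           StandRecord("B2", "maritime pine", 40.0, 18)])
simulator, model, reporter = PrescriptionSimulator(), DecisionModel(), Reporter()
chosen = [model.solve(simulator.simulate(s), objective=lambda o: o["yield_m3"])
          for s in info.stands]
reporter.report(chosen)

The point of the modular layout is the one made in the text: any single module (for instance the prescription simulator) can be replaced with richer models without rebuilding the whole system.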
DSS application is, however, limited by the amount of quantitative
information available. Ill-defined and data-poor forest management
problems cannot be conveniently addressed by these systems. Management
problems of Portuguese forests are frequently of these types, which requires, on the one hand, an adaptive approach towards management planning and, on the other hand, qualitative and participatory approaches to management modeling. In cases where the difficulty in management planning lies in the lack of understanding of the forest system's functioning and of its responses to silvicultural interventions, this can be achieved by integrating monitoring indicators. The information and model base can then be improved as more knowledge is gathered while analyzing the evolution of the indicators.
Another approach to adaptive management implies further development
of DSS for forest management in the area of hybrid systems. First, more
effective systems for addressing complex problems might be derived from
integrating the best features from among existing systems to create new
hybrid systems. An alternative approach to hybrid systems could be based on
the integration of logic modeling systems, that provide frameworks for
interpreting and synthesizing potentially diverse sources of information
about ecosystem state, with specialized information systems that manage
data and models to generate data inputs for the logic models. A third
promising development concerns the hybridization of systems based on the
technical/rational model with soft systems (qualitative modelling)
approaches in contemporary decision theory such as systems that support
consensus building, collaborative learning, social DSS, etc. This last
approach can complement either of the first two, and overlaps significantly
with our final topic, discussed next. A hybrid DSS of this sort could be particularly relevant to support the development of management plans for a recent multi-owner forest land planning unit that has been created in Portugal, the Forest Intervention Area (ZIF).
The use of information technologies for collaborative planning implies further developments in their ability to promote communication, exchange of
information, awareness, understanding and trustworthiness. This can be
achieved by developing user-friendly interfaces and visualization tools [31].
Moreover, the integration of internet technology and accompanying
development tools would allow dispersed and asynchronous working [32].
In terms of silviculture, there has also been an evolution and adaptation to the new challenges that forest management imposes. For example, aesthetic and habitat-related concerns have been limiting the extent, the location and the timing of interventions. Forests located in protected areas have also been the object of special concern, specifically regarding the selection of species, the environmental impacts of harvesting, the protection of native species and management activities. The increasing demand for other forest products such as mushrooms, berries and game has also called for integrated management, in which the canopy and the understorey are managed in an integrated and complementary way. Moreover, this integrated management has to balance the amount and the type of understorey with the fire risk.
Forest management has also evolved in order to increase resistance and
resilience of forests to wildfires. A series of interventions have been defined
to decrease the amount of fuel, to create landscape fragmentation and to
diversify stands' structure and composition.
With regard to climate change, three perspectives should be considered when approaching forest management, although they can overlap. The first implies considering forest management as a means to enhance the ability of forests to sequester carbon. Several studies have demonstrated that this could be done by changing the rotation length, the harvesting period, the thinning
intensity and the type of products [e.g. 33, 34]. The second perspective
implies the use of wood to produce bio-energy, thus saving fossil fuel emissions. However, this option could drive forest management towards simplification, reducing the amount of wood available for other purposes and exploiting plantations for that purpose alone. Finally, the third implies the adaptation of forest management to future forest scenarios driven by climate change. In this perspective, Pereira et al. [19] present some alternatives that, depending on the type of forest management, may be considered:
- multifunctional structure: silvicultural management should aim at maximizing the economic output in a multifunctional perspective. Higher quality products should be promoted (e.g. quality timber, larger dimensions), supporting the higher cost of environmentally sound practices. Possible increase of natural regeneration, especially on low-productivity sites;
- monofunctional silviculture: focus on fast-growing species and cork oak, fit to supply the industries. Possible introduction of alien species better fitted to future climatic conditions. Tree improvement of commercial species. Intensive management of plantation sites, allowing for shorter rotations and maximizing production;
- natural silviculture: indigenous species should be used where afforestation or reforestation is needed. Possible tree improvement programmes to ensure the survival of endangered indigenous species. Forest management aims at maintaining healthy and stable forest stands. Priority on natural regeneration.
Many of the adaptations of forest management mentioned so far are
complementary or overlap. This is basically because the new challenges have driven multi-purpose forest management practices towards a common thread: sustainability. This complex concept leads forest managers
to consider the impact of management practices not only on the forest aspect
they are intended for, but also on the entire forest system.

5. FORESTS IN THE 21st CENTURY

The knowledge that has been gathered so far and the observed trends
allow the analysis of possible future scenarios for the evolution of the
national forest sector. It is expected that the problem of property
fragmentation will be solved by market pressure. Small properties will tend
to aggregate in order to promote a scale effect on production and profitability, in response to the decrease in the real price of traditional forest products. Moreover, only through this aggregation will investment in technology and better silvicultural practices become possible, which is determinant to ensure competitive forestry. The most recent national government policies are promoting this aggregation by providing financial support to the creation of forest areas with at least 1000 ha (ZIF).
A more urbanised population will tend to seek out forest areas more frequently for leisure activities and traditional forest products. Their awareness of ecological problems related to soil erosion, depletion of water resources and global warming will increase. The predictable reinforcement of environmental protection policies [34], together with the continuing fragmentation of habitats, will increase the importance of forest areas for conservation purposes. Therefore, it is also expected that forests will be increasingly regarded as multifunctional systems. The use of forests for carbon sequestration will remain a priority, but it is possible that their capacity to store carbon will reach its maximum limit around the middle of the 21st century. In Portugal, the national priority will continue to be wildfire prevention, which will have to be articulated with the impact that climate change will have on species distribution. Therefore, higher investment by public and private entities in the adequate planning and management of forest areas is expected.
In order to be able to face the challenges that future scenarios present, the Portuguese forest sector has to invest in research, education and innovation. The following vignette was written for an IUFRO publication on Information Technology and the Forest Sector1 [34] and illustrates the positive outcomes of such investment. Non-italic text represents adaptations of the original text to the national forest context.

1 Hetemäki, L. and Nilsson, S. (eds., 2005). Information Technology and the Forest Sector. IUFRO World Series W.18 - Vienna, 245 pp.

Forest management in 2025: a vignette

J.B. is a regional forest planner [...]. On arriving at work six weeks ago,
she found an email message from the national forest planning staff, advising
that it was time for J.B.’s region to update its forest plan.
J.B. started by consulting the region’s web site. Stakeholders, via satellite
internet access, regularly visit the regional site to review and comment on
regional plans and express their [...] concerns and interests with the forest
environment. J.B. queried the site’s content-bots who gave her an updated
analysis of key, recent issues raised by stakeholders. Concerns for forest
sustainability remained the top issue [....] and there was now increased
interest [...] in promoting forest sector jobs.
Issues had changed enough since the last round of planning that J.B.
decided to visit IUFRO’s online planning resources site. Querying the site’s
model database, she found a model from 5 years ago, developed for central
Europe, that was actually a pretty close fit to the current issues in her
region. The selected model needed some minor modifications, but J.B. had
not yet had in-depth training in designing these particular kinds of planning
models, so she visited the online training area of the site. The self-paced
training took her four hours. At the end, the training program administered
a short test to check that key concepts of model design had not been missed.
The program also checked its own database of knowledge resources, and
recommended a colleague in Hungary that J.B. might want to consult if she
needed advice on model design and application.
Model revisions required two days, and, on review, J.B.’s Hungarian
colleague concurred that her modifications seemed appropriate. The
regional web site notified the stakeholders by email that a new planning
model had been proposed. Although these models are technologically very
advanced, they also are very intuitive and easy to understand. They were
quickly reviewed and validated by the elders.
The planning model defined the data requirements for an initial
assessment of current condition. J.B. visited the GlobalForestCommunicator
site, and quickly assembled the appropriate GIS layers for her region, all
suitably transformed to the projection her government routinely uses. The
initial assessment was presented to the national forest planning staff, who
suggested three strategic alternatives for further consideration. The regional
planning site advised the Stakeholders about this new information. After
their review, a fourth strategic alternative was added.
Evaluating the alternatives required running a number of programs,
including, for example, a harvest scheduling optimizer, a stand growth
simulator, various expert systems, etc., to project the consequences of the
four alternatives into the future. The planning model actually documented
this sort of information for its users, but only in a general way. J.B. also
needed more specific guidance on how to tune parameters for the
recommended models, so she visited IUFRO’s ForestModelArchive web site.
Once the projections had been run, initial results were again reported to
the national planning staff, who recommended choosing their original
alternative C. All of the map products, analyses, recommendations, etc. from
the planning process were organized with the region’s e-Plan application
and posted to the regional web site, where they were now reviewed by the
villagers. The village elders encouraged everyone to review and comment,
so there were actually several thousand comments received. However, the e-
Plan application’s automated processing of comment content made it easy to
track public response and document the adequacy of comment handling by
the agency.
J.B. reviewed the content analysis and presented her findings to the
national planning staff. While the national planning staff originally
recommended alternative C, the stakeholders were almost overwhelmingly in
favor of alternative D, and, using map products and documents from the e-
Plan web site, they made a rather compelling case. On further, review and
discussion with the stakeholders, a compromise alternative, capturing
important elements of both C and D was mutually agreed to by the national
and regional planning staffs and the stakeholders.
With a strategic alternative now agreed to by all parties, J.B. ran
additional components of the planning application to develop specific,
tactical plans for what sorts of management activities to perform in what
areas of the planning region. These plans launched the initial phase of plan
implementation. Interestingly, the basic evaluation system that was used to
perform the initial assessment of current condition, and the assessment of
alternatives, would now be used in plan implementation to track and report
progress.
J.B. leaned back in her chair, and paused to reflect at the end of the
process. She recalled those horror stories from graduate school of how
forest planning processes in North America and Europe could take 8 to 10
years back in the 1980s and 1990s. Why, even in the 2010s, it was not
unusual for a planning process to run 30 to 36 months. She had to smile,
realizing that 6 weeks really wasn’t long at all.

REFERENCES
1. DGRF/Direcção Geral dos Recursos Florestais, Inventário Florestal Nacional. 1995-
1998, Lisboa, 233 pp, 1998.
2. T.J. Mills and R.N. Clark, Roles of research scientists in natural resource decision-
making. Forest Ecology and Management, 153, 189-198, 2001.
3. M. Krott, Catalyst for innovation in European forest policy sciences. Evaluation of the
EFI Research Program 3: policy analysis. Forest Policy and Economics, 5, 123-134,
2003.
4. C.G.Shaw III, F.H. Everest and D.N. Swanston, Working with knowledge at the
science/policy interface: a unique example from developing the Tongass Land
Management Plan. Computers and Electronics in Agriculture, 27,377-387, 2000.
5. M. Williams, Deforesting the earth: from prehistory to global crisis. University of
Chicago Press, Chicago, xxvi, 689 pp, 2003.
6. A.A. Alves, N. Devy-Vareta, A. Oliveira and J.S. Pereira, A floresta e o fogo através dos
tempos. J.S. Pereira, J.M.C. Pereira, F.C. Rego, J. Silva and T. Silva eds. Incêndios
Florestais em Portugal: Caracterização, Impactes e Prevenção Eds. ISAPress, Lisboa,
Portugal, 15-40, 2006.
7. Leitão, N. Forest and foresters in the Portuguese history, http://www.naturlink.pt/canais/
Artigo.asp?iArtigo=9678&iLingua=1 (in Portuguese)
8. A.A.M. Alves, Pontos de referência da evolução das Ciências Florestais em Portugal no
séc. XX. História e desenvolvimento da ciência em Portugal no séc. XX. Publicações do
II centenário da Academia das Ciências de Lisboa, 858-869, 1992.
9. A.S. Mather and J.M.C. Pereira, Transição florestal e fogo em Portugal. Incêndios
Florestais em Portugal: Caracterização, Impactes e Prevenção. J.S. Pereira, J.M.C.
Pereira, F.C. Rego, J. Silva and T. Silva eds, ISAPress, Lisboa, Portugal, 258-282, 2006.
10. Hetemaki, L. and S. Nilsson, (eds) Information Technology and the Forest Sector.
IUFRO World Series Volume 18, Vienna, Austria: International Union of Forest
Research Organizations, 150-171, 2005.
11. M.C. Radich and A.A.M. Alves, Dois séculos de floresta em Portugal. CELPA, Lisbon,
Portugal, 2000.
12. A.A.M. Alves, Forestry development in Portugal. Potentialities and constraints.
Proceedings of the JNICT/NAS/USAID, Workshop on Future Expectations of
Portuguese Forestry, Póvoa do Varzim, 13-16 December, 1-22, 1983.
13. M. Páscoa and A. Alves, A condução dos povoamentos como factor determinante da
produtividade florestal. Comunicações ao 1º Congresso Florestal Nacional, FCG,
Lisboa, 2-6 Dezembro, 69-70, 1986.
14. A.C.Oliveira, J.S. Pereira and A.V. Correia, A Silvicultura do Pinheiro Bravo. Centro
Pinus, Porto, Portugal, 2000.
15. CESE, Livro Verde da Cooperação Ensino Superior-Empresa. Sector Florestal.
Conselho Para A Cooperação Ensino Superior-Empresa, Lisboa, 1998.
16. F.O. Baptista and R.T. Santos, Os Proprietários Florestais. Celta Editora, Oeiras,
Portugal, 93 pp, 2005.
17. P.M.A. Miranda, M.A. Valente, A.R. Tomé, R. Trigo, M.F.E.S. Coelho, A. Aguiar and
E.B. Azevedo, O clima de Portugal nos séculos XX e XXI. F.D. Santos and P. Miranda
eds. Alterações Climáticas em Portugal. Cenários, Impactes e Medidas de Adaptação.
Gradiva, Lisboa, 2006.
18. J. Luterbacher and E. Xoplaki, 500-year winter temperature and precipitation variability
over the Mediterranean area and its connection to the large-scale atmospheric circulation.
H.-J. Bolle ed. Mediterranean Climate. Variability and Trends, Ed. Springer Verlag,
Berlin Heidelberg, 2003.
19. J.S. Pereira, A.V. Correia, A.P. Correia, M. Branco, M. Bugalho, M.C. Caldeira, C.
Souto-Cruz, H. Freitas, A.C. Oliveira, J.M.C. Pereira, R.M. Reis and M.J. Vasconcelos.
Forests and Biodiversity. F.D. Santos, K. Forbes and R. Moita eds. Climate Change in
Portugal. Scenarios, Impacts and Adaptation Measures. Gradiva, Lisboa, Portugal, 363-
414, 2002.
20. Durão, R.M. and J. Corte-Real, Alterações climáticas: futuro dos acontecimentos
extremos e do risco de incêndio. J.S. Pereira, J.M.C. Pereira, F.C. Rego, J.N. Silva and
T. Pereira da Silva eds. Incêndios Florestais em Portugal. Caracterização, impactes e
prevenção, ISA Press, Lisboa, 231-255, 2006.
21. J.M.C. Pereira and M.T.N. Santos. Fire risk and burned area mapping in Portugal,
Direcção Geral das Florestas, Lisboa, 2003.
22. J.S. Pereira, M.M. Chaves, M.C. Caldeira and A.V. Correia. Water availability and
productivity. J.I.L. Morison and M.D. Morecroft eds. Plant growth and climate change,
Blackwells, London, 2006.
23. T. Karjalainen, A. Pussinen, S. Kellomäki and R. Mäkipää. Scenarios for the carbon
balance of Finnish forests and wood products. Environmental Sciences and Policy, 2,
165-175, 1999.
24. I.J. Bateman and A.A. Lovett, Estimating and valuing the carbon sequestered in
softwood and hardwood trees, timber products and forest soils in Wales. Journal of
Environmental Management, 60, 301-323, 2000.
25. J.Mateus, G. Pita and A.M. Rodrigues. Seasonality and inter-annual forest atmosphere
carbon and water exchanges in a Portuguese Eucalyptus plantation (Mediterranean
climate), p. in press, 2006.
26. CAC/Comissão para as Alterações Climáticas, Programa Nacional para as Alterações
Climáticas, Lisboa, 80 pp, 2001.
27. M.Battaglia, P. Sands, D. White and D. Mummery, CABALA: a linked carbon, water
and nitrogen model of forest growth for silvicultural decision support. Forest Ecology
and Management, 193, 251-282, 2004.
28. J.G.C. Borges, Sistemas de Apoio à Decisão em planeamento em recursos naturais e
ambiente. Revista Florestal, 9(3), 37-44, 1996.
29. J.G. Borges, A. Falcão, C. Miragaia, P. Marques and M. Marques, A decision support
system for forest resources management in Portugal. G.J. Arthaud and T.M. Barrett eds.
System Analysis in Forest Resources, Kluwer Academic Publishers, Managing Forest
Ecosystems, Dordrecht, The Nederlands, 155-164, 2003.
30. K.M. Reynolds, J.G. Borges, H. Vacik and M.J. Lexer, Information and communication
technology in forest management and conservation. L. Hetemaki and S. Nilsson eds.
Information Technology and the Forest Sector, International Union of Forest Research
Organizations, IUFRO World Series Volume, Vienna, Austria, pp. 150-171, 2005.
31. A.O. Falcão, M. Próspero-dos-Santos and J.G. Borges 2006. A real-time visualization
tool for forest ecosystem management decision support. Computer and Electronics in
Agriculture, in press.
32. S. Belton and T.S. Stewart, Multiple Criteria Decision Analysis. An integrated approach,
Kluwer Academic Publishers, Massachusetts, 2002.
33. T. Karjalainen, Model computations on sequestration of carbon in managed forests and
wood products under changing climatic conditions in Finland. Journal of Environmental
Management, 47, 311-328, 1996.
34. P. Lasch, F.-W. Badeck, F. Suckow, M. Lindner and P. Mohr, Model-based analysis of
management alternatives at stand and regional level in Brandenburg (Germany). Forest
Ecology and Management, 207, 59-74, 2005.
THE ROLE OF THE EMERGENT
TECHNOLOGIES TOWARDS AN INTEGRATED
SUSTAINABLE ENVIRONMENT

Elizabeth Duarte1, Maria N. Pinho2 and Miguel Minhalma3

1 Departamento de Química Agrícola e Ambiental, Instituto Superior de Agronomia, Universidade Técnica de Lisboa, Tapada da Ajuda 1349-017 Lisboa, Portugal, e-mail: eduarte@isa.utl.pt
2 Departamento de Engenharia Química e Biológica, Instituto Superior Técnico/ICEMS, Universidade Técnica de Lisboa, Av. Rovisco Pais, 1, 1049-001 Lisboa, Portugal, e-mail: marianpinho@ist.utl.pt
3 Departamento de Engenharia Química, Instituto Superior de Engenharia de Lisboa/ICEMS, Rua Conselheiro Emídio Navarro, 1, 1959-007 Lisboa, Portugal, e-mail: mminhalma@mail.ist.utl.pt

Abstract: Most industrial production processes were developed in the 1950s, at a time of cheap and abundant raw materials, energy and water resources. The intensive use of water of good quality and the search for new processes/products aiming at maximum profits led to scarcity and degradation of natural resources. Reducing material waste is one of the greatest challenges facing industry today. Because water is one of industry's major waste products, the ability to reduce wastewater would be a giant step in the direction of overall waste reduction. Water conservation and water use were considered justifiable only if they represented economic savings, either in materials recovery or in the avoidance of treatment costs. However, today's industrial facilities are constantly striving to operate more efficiently, and the most successful plants are relentless in their search for the following: higher product yields; beneficial use of by-products; improved energy efficiency; safer and more reliable operations; improved public image; reduced environmental impacts. This paper provides a systematic approach with four outstanding examples from diverse industries: corrugated board, dairy, coke and cork. The authors have combined the use of proven and accepted technologies and practices with some new emergent technologies, developing a new systematic approach for minimizing net water usage at industrial facilities, and present it in a straightforward manner.

Key words: Emergent Technologies; Corrugated Board Industry; Dairy Industry; Coke
Industry; Cork Industry; Sustainable Environment.

1. INTRODUCTION

Water has become a critical issue on the global sustainable development agenda. The focus is on managing resources more efficiently by improving the quality of water supplies and easing the pressure of demand. In fact, most of the industrial production processes were developed in the 1950s, at a time of cheap and abundant raw materials and unlimited water resources. The intensive use of water of good quality led to the production of large volumes of wastewater, which prompted the development of end-of-pipe wastewater treatment technologies, as shown in Figure 1.

Figure 1. Traditional approach based on end-of-pipe technologies.

The exclusive recourse to end-of-pipe technologies resulted in an imbalance between water resources and their demand, which, in association with the high levels of pollution generated, made integrated water management a necessity. This water management is further reinforced by increased public awareness, stricter legal requirements, the growing scarcity of natural resources and economic restrictions, and is leading to the development of new sustainable processes [1]. The development of sustainable processes requires the incorporation of emergent technologies that lead to more efficient and selective production processes with [2]:

- reduction of raw materials, water, subsidiary materials and energy consumption;
- treatment/recycling of process streams as process water/solvent(s)/surplus raw materials;
- treated wastewater reuse, preventing degradation of the receiving water bodies and the environment.
The European Commission is pursuing these objectives through the identification, in the different industrial sectors, of the best available techniques (BAT), which can be either emergent or conventional technologies. This information is compiled in the Integrated Pollution Prevention and Control (IPPC) Directives.
This paper highlights the development guidelines that refocus our approach on an innovative technological and management strategy, which will allow the achievement of the appropriate technology, helping to solve industrial pollution problems [3].
An overview of technologies applicable to the improvement of the production process/water quality is described in connection with [4]:

- Source reduction (in process)
- Water treatment (end of pipe)
- Recycle (external)

Four case studies are described, based on the approach of an integrated sustainable environmental strategy.

2. METHODOLOGY

Five basic contaminant categories have been identified as representative of the key wastewater quality issues associated with water and wastewater reclamation: inorganic, organic, dissolved, suspended and biological compounds. These categories were related to the applicable BAT, as displayed in Table 1 [5].
The development of a specific process incorporating conventional/emergent technologies requires the integration of wastewater characterization data with water quality requirements.
The list of technologies is not exhaustive; rather, it summarizes the technologies available for water-reuse applications [6].
Technologies are generally applied for one of the following reasons:

- Source reduction (in-process)
- Water treatment (end-of-pipe)
- Recycle (external)

According to the same methodology described in Table 1, a matrix was developed relating the type of technology with its applicability (source reduction, water treatment, external recycle).
Four case studies were selected to demonstrate this methodology approach, based on pilot-scale results obtained prior to the design and implementation of the final process.

Table 1. Wastewater contaminant treatment applicability by technology type.

Contaminants: Inorganic | Organic | Dissolved | Suspended | Biological

Technology
Anaerobic, aerobic, nitrification, others                            ✓ ✓ ✓ ✓ ✓
Centrifuge separation                                                ✓
Flotation                                                            ✓ ✓ ✓
Sedimentation or settling                                            ✓ ✓ ✓ ✓
Precipitation/Crystallization                                        ✓ ✓ ✓ ✓
Activated carbon, ion exchange, gas absorption                       ✓ ✓ ✓
Filtration (granular bed, vacuum drum, press, belt filter, others)   ✓ ✓ ✓ ✓
Pressure driven membrane processes                                   ✓ ✓ ✓ ✓ ✓
Electrodialysis                                                      ✓ ✓
Pervaporation                                                        ✓ ✓
Evaporation, Distillation, Stripping (air, steam)                    ✓ ✓ ✓
Drying, incineration, spray drying                                   ✓ ✓ ✓ ✓
Solvent Extraction                                                   ✓ ✓
Chemical Oxidation (ozone, others)                                   ✓ ✓ ✓
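One practical way of using a screening matrix such as Table 1 is to encode it as a simple mapping from each technology to the contaminant categories it addresses and then query it for the contaminants found in a given wastewater. The sketch below is a minimal Python illustration of that idea; only the two rows of Table 1 that cover all five categories are filled in, and the remaining technologies would be added in the same way.

# Encoding the technology/contaminant screening matrix of Table 1 as a mapping.
# Only the two rows that cover all five categories are filled in here; the
# remaining technologies of Table 1 would be added in the same way.
CATEGORIES = {"inorganic", "organic", "dissolved", "suspended", "biological"}

APPLICABILITY = {
    "Anaerobic, aerobic, nitrification, others": set(CATEGORIES),
    "Pressure driven membrane processes": set(CATEGORIES),
    # ... other technologies, each mapped to its applicable categories
}

def candidate_technologies(contaminants, matrix=APPLICABILITY):
    """Return the technologies whose applicability covers every contaminant
    category present in the wastewater being screened."""
    needed = set(contaminants)
    return [tech for tech, covered in matrix.items() if needed <= covered]

# Example: screening a wastewater containing dissolved organics and suspended matter.
print(candidate_technologies({"organic", "dissolved", "suspended"}))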

3. EMERGENT TECHNOLOGIES APPLICATION – CASE STUDIES

3.1 Corrugated board production industry

3.1.1 Introduction

The base case presented in this paper was developed in a corrugated board production industry with a production capacity of 48 × 10^6 square meters of corrugated board. An evaluation of the water consumptions – industrial and domestic – was carried out. All the water consumed was potable water. Concerning the discharge of wastewater streams, the domestic wastewater was discharged into the public sewer and the industrial wastewater was discharged into a receiving water body without any treatment. After this evaluation of the inputs and outputs, a water use optimization plan was developed and implemented. The strategy developed and the results obtained are described in the present work.

3.1.2 Methodology

Diagnosis of water use and wastewater discharges – initial situation
The first part of the strategy was to carry out a diagnosis of the corrugated board plant, including the study of the internal water cycles, the identification and quantification of the wastewater streams and the study of the production requirements regarding water quality.
Water needs and consumptions
The average annual water supply needs of the industrial plant, before the implementation of the water use optimization plan, were around 25,600 m3. The main water uses in the industrial plant analyzed are industrial use, similar to urban use and garden irrigation.
In the initial situation, the similar to urban use was responsible for 25.3% of the total water consumption in the industrial plant, the industrial use for 71.1% and the garden irrigation for 3.6%.
Figure 2 summarizes the distribution of water needs by the different uses
in the industrial plant, before the implementation of the water use
optimization plan.

Figure 2. Distribution of water needs by the different uses in the industrial plant.
The water consumption distribution by processes before the implementation of the optimization plan is described in Table 2.

Table 2. Water consumption distribution by processes in the initial situation.

Water Use                        Water consumption (m3/year)    % by use
Similar to Urban Use
  Canteen                        2,076                           32
  Showers                        2,604                           40
  Toilets                        1,800                           28
  Total Similar to Urban Use     6,480                          100
Industrial Use
  Starch production              1,716                            9
  Washing operations             3,312                           18
  Cooling processes              2,064                           12
  Steam production               3,024                           17
  Other consumptions             8,076                           44
  Total Industrial Use           18,192                         100
Garden Irrigation                  924                          100
Total                           25,596                            -

The analysis of this table shows that the major use, in terms of water consumption, is the industrial use. The item “Other consumptions” refers to non-identified consumption due to leakages in the water supply pipe systems. The second largest water consumption occurred in the equipment washing operations.
Figure 3 summarizes the percentage water consumption by industrial
process/activity before the implementation of the water use optimization
plan.

Figure 3. Percentage water consumption by industrial process/activity in the initial situation.


Wastewater production and discharges
Concerning the industrial wastewater production, it was concluded that it originates mainly from the equipment washing processes (flexographic printers, starch production and application), the cooling processes and steam production. The domestic wastewater originates mainly from the toilets, canteen and office areas [7].
In the initial situation, the total daily productions of industrial and domestic wastewater were 20.9 m3 and 24.4 m3, respectively.
Concerning the wastewater discharges, in the initial situation all the domestic streams were discharged into the public sewer and the industrial streams were discharged into a receiving water body without any treatment. The storm waters were contaminated with industrial wastewater composed of the washing waters from the naphtha tanks area. The domestic wastewater had a high level of oils and fats, due to the discharge of fried oils from the canteen.
Definition and implementation of the water use optimization plan
The water use optimisation plan defined integrates five main components:

- Segregation of the wastewater streams (domestic, industrial and storm waters), in order to make possible the definition of a wastewater management strategy;
- A wastewater production reduction plan, through the definition of internal measures;
- The development of an analytical study of the different industrial wastewaters, in order to obtain a hierarchical classification based on their pollutant load;
- The selection and implementation of the appropriate technology for the industrial wastewater treatment, in order to maximize the reuse of treated wastewater, and the redefinition of the internal water cycles;
- A water consumption reduction plan, through the implementation of internal measures and an intensive human resources training program.

Segregation of the wastewater streams
Concerning the segregation of the wastewater streams, and considering the main goals proposed, the first measure advised was to eliminate the contamination of the storm water stream by the wastewater produced in occasional washing operations in the naphtha tanks area, collecting it at its source in an adequate container. The second measure was to collect the fried oils from the canteen in an adequate container, eliminating their discharge into the domestic stream.
Wastewater production reduction plan through the definition of internal measures
The second component was the introduction of the concept of “waste minimisation”, which means reducing to the most feasible extent the waste generated and subsequently treated, stored or disposed of. In this case study, some preventive internal measures were advised, through operating procedures, mainly in the flexographic printers sector, by introducing pressure devices in the washing equipment, followed by the education of the operators on the need to save water during the washing operations.
Another measure advised was the installation of water meters in each production sector, in order to make possible the control of water consumption and the identification of, and immediate intervention in, eventual anomalies.
Hierarchical classification of the industrial wastewater streams
The analytical study of the different industrial wastewater streams produced, coupled with the study of the internal water cycles (this industry has four major points of water consumption) and considering the different water quality requirements (high-quality water for steam production and cooling processes, and lower-quality water for the two other points of consumption, washing operations and the starch making process), allowed the definition of the most adequate strategy for wastewater management. Figure 4 illustrates the strategy defined, with the different wastewaters selected to be treated, as well as the low-pollutant-load wastewaters with reuse potential in the industrial process.

[Figure 4: flow diagram linking the inputs (fresh water and treated wastewater), the industrial process steps (flexographic printer washing operations, equipment cooling processes, starch making process, corrugate washing operations and boilers), their outputs (ink waters, cooling waters, starch waters and boiler purges) and their integrated management, i.e. direct reuse in the industrial process or treatment in the wastewater treatment plant.]

Figure 4. Integrated management of the different types of industrial wastewater.


Selection and implementation of the appropriate industrial wastewater treatment technology
From the point of view of the water quality goals, emphasis was laid upon techniques to remove heavy metals and organic micro-pollutants and, at the same time, to improve the quality of the treated wastewater for reuse, minimising its influence on production and product quality [8].
The result was the implementation of a wastewater treatment plant based on a physical-chemical treatment followed by filtration in a filter press. In this process, the standard processes were rationally allocated to the specific circumstances in the most effective engineering and cost terms. All the treated wastewater is reused in industrial operations, as described in Figure 4 [9].
Water consumption reduction plan
The water consumption reduction plan was implemented through the
definition of internal measures, which result from the following actions:

- Redefinition of the internal water cycles by reusing treated wastewater and wastewater streams with a very low pollutant load (such as the cooling waters) in industrial processes (industrial washing operations and starch make-up);
- Identification and repair of all leakages in the water-conducting pipe systems;
- Introduction of water meters in order to prevent and immediately detect anomalies;
- Periodic intensive human resources training programs, aimed at motivating all workers regarding the importance and benefits of an efficient use of water.

3.1.3 Implementation of the water use optimization plan

After the implementation of the water use optimisation plan, the potable water uses were reduced to:

- Industrial uses: cooling water, steam production and starch production;
- Domestic uses: canteen, showers and toilets;
- Garden irrigation.

The industrial activities that initially used potable water, such as washing operations and starch make-up, reuse treated wastewater after the implementation of the plan. The item “Other consumptions” of the initial situation (see Table 3) no longer exists, since it was due to leakages that were repaired.
The water needs and consumptions after the implementation of the water use optimisation plan, and the respective reductions obtained, are shown in Table 3.
Figure 5 summarizes the distribution of water needs by the different
sectors in the industrial plant before and after the implementation of the
water use optimisation plan.

Table 3. Potable water consumptions by processes and reduction obtained.

Water Use                        Initial consumption    Consumption after       Water
                                 (m3/year)              the plan (m3/year)      reduction
Similar to Urban Use
  Canteen                        2,076                  1,452                   30%
  Showers                        2,604                  2,460                   5.5%
  Toilets                        1,800                  1,548                   14%
  Total Similar to Urban Use     6,480                  5,460                   15.7%
Industrial Use
  Starch production              1,716                  38                      97.8%
  Washing operations             3,312                  0                       100%
  Cooling processes              2,064                  2,064                   0%
  Steam production               3,024                  3,024                   0%
  Other consumptions             8,076                  0                       100%
  Total Industrial Use           18,192                 5,126                   71.8%
Garden Irrigation                924                    924                     0%
Total                            25,596                 11,510                  55%

Figure 5. Distribution of water consumptions by the different sectors in the industrial plant.
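As a quick consistency check, the savings reported in Table 3 can be reproduced directly from the annual consumption figures; the short Python sketch below uses only the values listed in the table and recovers the per-use reductions and the overall reduction of about 55%.

# Consistency check of the water savings in Table 3 (values in m3/year).
initial = {
    "Canteen": 2076, "Showers": 2604, "Toilets": 1800,
    "Starch production": 1716, "Washing operations": 3312,
    "Cooling processes": 2064, "Steam production": 3024,
    "Other consumptions": 8076, "Garden irrigation": 924,
}
after_plan = {
    "Canteen": 1452, "Showers": 2460, "Toilets": 1548,
    "Starch production": 38, "Washing operations": 0,
    "Cooling processes": 2064, "Steam production": 3024,
    "Other consumptions": 0, "Garden irrigation": 924,
}

for use, before in initial.items():
    reduction = 100.0 * (before - after_plan[use]) / before
    print(f"{use:20s} {reduction:5.1f} % reduction")

total_before, total_after = sum(initial.values()), sum(after_plan.values())
print(f"Total: {total_before} -> {total_after} m3/year "
      f"({100.0 * (total_before - total_after) / total_before:.0f} % reduction)")
# Prints an overall reduction of about 55%, matching the last row of Table 3.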
3.2 Dairy industry - Cheese whey and second cheese whey valorization

3.2.1 Introduction

Cheese whey (CW) and second cheese whey (SCW) are by-products of cheese and curd cheese production that are usually not recovered and therefore contribute substantially to the negative environmental impact of cheese manufacturing plants. Membrane technology, namely ultrafiltration (UF) and nanofiltration (NF), may be used for the recovery of the organic nutrients of the CW and SCW resulting from “Serpa” cheese and curd production. The objectives behind the integration of membrane technology in the cheese plants are by-product recovery (namely fat, proteins, lactose and amino acids), process water reuse and the reduction of the wastewaters discharged to the environment/wastewater collectors.

3.2.2 Integrated process for the valorization of cheese whey and second cheese whey

“Serpa” cheese is made from ovine milk and has a very well defined
geographic origin and quality. Figure 6 shows the diagram of an integrated
process for the valorization of the by-products resulting from “Serpa” cheese
manufacture.
The cheese whey (CW) resulting from the cheese production can be
defatted and filtered in an ultrafiltration (UF) unit. The separated fat (product
3, Fig. 6) can be used in the production of highly nutritive butter. The UF
concentrate (product 1, Fig. 6) is very rich in proteins and can be purified for
a wide range of applications, such as dietary proteins for functional foods
and pharmaceuticals [10, 11].
The “Serpa” cheese whey is currently used in the production of curd cheese. The effluent of the curd cheese production is called second cheese whey (SCW), a by-product with a very high content of organic matter; its characterization is shown in Table 4.

Table 4. Second cheese whey characterization.

Parameter                            Value
pH                                   6.2
Specific conductivity                23.3 mS/cm
Total Organic Carbon (TOC)           31.2 g C/l
Lactose                              50.6 g/l
Total nitrogen (Kjeldahl method)     1.74 g N/l
Proteins and free amino acids        8.3 g/l
[Figure 6: block diagram in which the CW (from cheese production) undergoes removal of fines by filtration (1) and of fat by centrifugation (2) before ultrafiltration (3), while the SCW (from curd production) undergoes removal of fines (4) and fat (5) before nanofiltration (6). Product 1 – protein concentrate (UF concentrate); Product 2 – lactose concentrate, free amino acids and some bivalent salts (NF concentrate); Product 3 – fat for butter production; the NF permeate (water and salts) is depurated wastewater for reuse, e.g. in CIP.]

Figure 6. Process for nutrients recovery and valorization of “Serpa” cheese by-products [12].

The “Serpa” SCW has a very high lactose concentration and is very rich
in mineral salts (essentially NaCl), vitamins and free amino acids. The very
high salt concentration is due to the addition of NaCl during the production
process of cheese and curd. Small amounts of fat and residual proteins are
also present in the SCW composition. Nowadays, the majority of the “Serpa”
cheese factories (if not all) treat the SCW as a common waste and mix it
with the domestic sewage and other less pollutant wastewaters. Without a
purification and recovery process like the one shown in Figure 6, the SCW is
a strongly pollutant effluent. The negative environmental impact and the loss
of this very valuable product are the reasons for the implementation of a
recovery and valorization process.
With the increasing evolution and utilization of membrane technologies
in the dairy industry since the late 1960’s, nanofiltration (NF) is a possible
economic option in the treatment of the SCW. The dashed line in Figure 6
delimits the NF unit operation (part of the overall valorization process). NF
membranes have low rejections (high permeability) to monovalent salts
(NaCl, KCl) and have high rejections to the organic compounds and to some
bivalent salts dissolved in the SCW. The NF operation processes a combined
feed of SCW and a CW ultrafiltration permeate, since both streams have
similar qualitative compositions. The NF processing of these streams has two major advantages: first, the production of a clean effluent and the reduction of wastewater volumes, due to the possible reuse of some water in the process (e.g. “cleaning in place”, CIP); and second, the production of a
lactose concentrate (product 2, Fig. 6) with potential application in pharmaceuticals, sugar-cellulose fibers [13] and the food industry [14].
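For reference, the membrane “rejection” referred to above is usually quantified through the observed rejection coefficient; the definition below is the standard textbook one (it is not written out in the chapter, and the symbols are ours):

\[ R_{\mathrm{obs}} = 1 - \frac{C_p}{C_f} \]

where C_p and C_f are the solute concentrations in the permeate and in the feed, respectively. A low rejection of NaCl and a high rejection of lactose are precisely what allow the NF step to concentrate the organic fraction while letting the monovalent salts pass into the permeate.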

3.3 Coke plant wastewater

3.3.1 Introduction

In the past ten years, special attention has been given to industrial ammoniacal wastewaters and to their environmental impact. The negative effect of ammonia/ammonium compounds in the environment occurs at three different levels: over-fertilisation of surface waters, toxicity towards waterborne organisms and consumption of oxygen through nitrification. Ammonia is commonly present in the industrial wastewaters from petroleum refineries, steelworks, fertilizer, pulp and paper, abattoir and coke plants, and its removal is carried out through different methods, depending on its concentration and on the accompanying contaminants. The wastewaters from coke plants present a twofold difficulty in what concerns their treatment or purification: besides the very high ammonia content, they also contain very harmful species such as cyanides and phenols, which render the treatments frequently used for ammoniacal waters, such as the biological ones involving nitrification/denitrification, inefficient. The coke plant wastewater characterization is presented in Table 5.

Table 5. Wastewater physico-chemical characterization.

Parameter                 Value
TOC (mg C/l)              501.2
Conductivity (mS/cm)      27.3
Color (Hazen units)       248.6
pH                        9.4
Ammonium (g/l)            8.7
Phenols (mg/l)            151
Cyanides (mg/l)           176

The methodology approach in the present work envisages the integration of nanofiltration as a fractionation technique that allows the confinement of the priority pollutants in different streams, thus generating an ammonia/phenol concentrate stream, which is depleted of cyanide anions, and a cyanide-enriched permeate stream. The NF concentrate stream is further fractionated by steam stripping, with the ammonia in the top stream and the phenols in the bottom stream. The integrated nanofiltration/steam stripping process yields three streams, each one enriched in one of the three major pollutants, which can then be subjected to specific treatments, and allows very significant energy savings.
3.3.2 Integrated process for the fractionation of the ammoniacal wastewaters

At present, coke plants, namely Siderurgia Nacional S.A. (Portugal), treat their ammoniacal wastewaters by feeding them into a stripping column (Figure 7), where the ammonia and cyanides go to the top stream of the column, which is then fed into a burner, while the phenols go to the bottom stream of the column and are then discharged.
This approach creates severe environmental problems in terms of air pollution (formation of NOx) and of aquatic pollution, due to the phenol content. Therefore, in order to obtain a cleaner process, a fractionation step such as nanofiltration can be integrated into the scheme of Figure 7, allowing, on the one hand, the confinement of the cyanides in the permeate stream, which can be further treated by a destructive process, and, on the other hand, the concentration of ammonium and phenols in the concentrate stream, which is then fed into the stripping column; this integrated process is shown in Figure 8. The volume reduction of the stream fed into the column also yields a very important reduction in steam consumption, leading to considerable energy savings (Figure 9).
[Figure 7: scheme of the steam stripping column fed with the ammoniacal wastewater (NH4+: 7.1-8.7 g/l; CN-: 75-276 mg/l; phenol: 85-185 mg/l; Qcirc: 16 m3/h); NH3 and CN- leave in the top stream, and the bottom stream (NH4+: 0.004-1.3 g/l; CN-: 7.4-9.6 mg/l; phenol: 13-148 mg/l; Qcirc: 17.9 m3/h) is discharged.]

Figure 7. Scheme of the process used in Siderurgia Nacional S.A. for coke plant wastewater treatment.
[Figure 8: scheme in which the coke plant wastewater is fed to a nanofiltration unit (RR = 40%); the cyanide-rich permeate goes to a specific treatment, while the concentrate is fed to the steam stripping column, whose top stream (NH3) goes to a burner and whose bottom stream (phenol) goes to a specific treatment.]

Figure 8. Scheme of the Nanofiltration/Steam Stripping process proposed for coke plant wastewater treatment [15].

[Figure 9: steam consumption (kg/h) plotted against the NF recovery rate (RR, 0-60%); the data follow the linear fit y = -2950.7x + 2999.6, with R2 = 1.000.]

Figure 9. Steam consumption as a function of the NF recovery rate.
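The linear fit reported in Figure 9 can be used directly to estimate the steam demand of the stripping column for a given NF recovery rate. The sketch below is a minimal illustration of that use; the monetary value of the saved steam is not computed here, since the unit steam cost is not given in the chapter.

# Steam consumption of the stripping column as a function of the NF recovery
# rate, using the linear fit reported in Figure 9 (x = recovery rate as a
# fraction, result in kg/h of steam).
def steam_consumption(recovery_rate: float) -> float:
    return -2950.7 * recovery_rate + 2999.6

base = steam_consumption(0.0)   # column fed with the whole wastewater stream
for rr in (0.0, 0.2, 0.4):
    demand = steam_consumption(rr)
    print(f"RR = {rr:.0%}: {demand:7.1f} kg/h of steam "
          f"({base - demand:6.1f} kg/h saved)")
# At RR = 40% the steam demand drops from about 3000 to about 1820 kg/h,
# i.e. a saving of roughly 1180 kg/h of steam.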

The NF results show that the cyanide concentration in the concentrate is extremely dependent on the recovery rate (RR), and that maximal cyanide removal is achieved for a RR of 40%, as shown in Figure 10.
[Figure 10: ammonium concentration (g/l) and phenol and cyanide concentrations (mg/l) in the NF concentrate plotted against the recovery rate (0-60%), in concentration mode.]

Figure 10. Variation of the ammonium, phenol and cyanide content in the concentrate as a function of the recovery rate - concentration mode.

3.3.3 Economic evaluation

The economic evaluation of the NF operation considered a wastewater flow rate of 384 m3 of effluent/day, with the plant working 365 days/year and 24 hours/day. The labor requirement was estimated at 0.2 man-year, with a cost of 1000 € per month.
The different parameters and costs of the NF operation are presented in the following items.

NF Unit Parameters
Recovery Rate                          40%       Membrane Area (m2)               397
Permeate Flow rate (m3/h)              6.4       Permeate Flux (l/m2/h)           16.1
No. of cleanings/week                  1         Cleaning agent (kg/cleaning)     20
Membrane Lifetime (years)              3         Feed Flow rate (l/min)           9.2
Pump Efficiency                        70%       Pressure (bar)                   30
Pump Energy (kWh)                      3.3       Circulation Energy (kWh)         17.4

Economical Parameters
Project Lifetime (years)               7         Interest Rate (%)                15
Membrane Cost (€/m2)                   50        Energy Cost (€/kWh)              0.05
Labor Cost (€/month)                   1000      Cleaning Agent Cost (€/kg)       5

Investment Costs (€)
Pumps + Electrical Inst.               19803
Modules + Installation                 29700
Membrane                               10142
Total (€)                              59645
Investment Annualized Costs (€/year)                14336
Investment Annualized Costs (€/m3 of permeate)      0.256

Operation Costs (€/m3 of permeate)
Electricity                            0.161
Membrane Replacement                   0.101
Chemicals                              0.093
Labor                                  0.050
Maintenance (2% of Inv.)               0.021
Total (€/year)                         23883
Total (€/m3 of permeate)               0.427

NF Operation Total Costs (€/year)      38219

The NF annualized costs at a RR of 40%, considering investment and operation costs, are, as shown above, 38219 €/year.
The stripping column steam savings for a NF RR of 40% are, as mentioned before, 136160 €/year; therefore, the integrated process presents annual net savings of about 98000 €, or 0.698 €/m3 of effluent treated.
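The annualized figures above can be reproduced with a standard capital-recovery calculation. The Python sketch below is a minimal check using only the parameters listed in this section; the 136160 €/year steam saving is taken as given, and the small differences with respect to the reported totals come from rounding of the per-m3 operating cost.

# Capital-recovery check of the NF economic evaluation (values from the text).
interest, lifetime = 0.15, 7                  # interest rate, project lifetime (years)
investment = 19803 + 29700 + 10142            # pumps, modules and membrane (EUR)

# Capital recovery factor converts the investment into equivalent annual payments.
crf = interest * (1 + interest) ** lifetime / ((1 + interest) ** lifetime - 1)
annualized_investment = investment * crf      # about 14336 EUR/year, as reported

permeate = 384 * 365 * 0.40                   # m3/year of permeate at RR = 40%
operation = 0.427 * permeate                  # close to the reported 23883 EUR/year

nf_total = annualized_investment + operation  # close to the reported 38219 EUR/year
steam_savings = 136160                        # EUR/year, taken from the text
net_savings = steam_savings - nf_total        # roughly 98000 EUR/year

effluent = 384 * 365                          # m3/year of effluent treated
print(f"CRF = {crf:.4f}")
print(f"NF total cost = {nf_total:8.0f} EUR/year")
print(f"Net savings   = {net_savings:8.0f} EUR/year "
      f"({net_savings / effluent:.3f} EUR per m3 of effluent)")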

3.4 Cork processing industry wastewater

3.4.1 Introduction

Membrane pressure-driven processes are playing a major role in wastewater treatment due to their capability of removing organic matter over a wide range of sizes, going from small solutes, with the use of nanofiltration/reverse osmosis (NF/RO), and macromolecules and colloids, with the use of ultrafiltration (UF), to suspended matter, with the use of microfiltration (MF). The wastewaters from the food, pulp and paper, cork and many other industries are very complex mixtures of vegetal extracts covering a wide range of molecular weights and very often presenting colloidal behavior, which is associated with severe flux declines. The cork processing wastewater characteristics are presented in Table 6.
Table 6. Physico-chemical characterization of cork processing wastewaters [16].

Characteristic            Value
pH                        4.9
TOC                       3350 mg C/l
Average colloid size      383.0 nm
Zeta potential            -13.2 mV
Total polyphenols         0.958 g/l (as gallic acid)

3.4.2 Integrated process for the treatment of the cork processing wastewaters

The integration of UF in the treatment of the cork processing wastewaters allows the concentration of the polyphenolic/tannin fraction of the wastewaters in a concentrate stream that can then be used in different industries, such as the leather industry and the wood and cork agglomerate industries. Along with this valorization of by-products, the permeate stream (purified water) can be recycled to the process, and therefore the total amount of discharges will be considerably reduced. The proposed treatment is presented in Figure 11.

Figure 11. Role of Ultrafiltration in the treatment of cork processing wastewater.

Due to the colloidal behavior of the solutes present in these wastewaters, the UF permeate fluxes are drastically low when compared with the ones obtained with pure water. In order to minimize this problem, different pre-treatments can be integrated before UF to reduce the amount of fouling agents fed to the membranes, and therefore much higher permeate fluxes can be obtained with this integrated process [16], which is shown in Figure 12.
Figure 12. Pre-treatments/Ultrafiltration integrated process for the treatment of cork processing wastewater.

4. CONCLUSIONS

The work described in this paper shows that, in the four case studies presented, the emergent technologies can successfully accomplish the final goal of a sustainable environment with zero discharge, waste minimization, process water recycling and valorization of by-products.
In the corrugated board industry, the implementation of a water use optimization plan made it possible to achieve an efficient use of water through an adequate management strategy; the Zero Discharge concept was successfully implemented in the four industrial plants of the paperboard sector in Portugal, which represent more than 80% of the national production. In the dairy industry, membrane technology, namely ultrafiltration, nanofiltration and reverse osmosis, led to the valorization of cheese whey and second cheese whey as protein and lactose concentrates, respectively; simultaneously, the purified permeate waters can be recycled as process water, aiming at zero discharge. Regarding the coke industry, the integration of nanofiltration with conventional steam stripping allows the fractionation of cyanide- and phenol-contaminated ammoniacal wastewaters, confining these contaminants in separate streams for specific treatments, while significant energy savings are simultaneously achieved. In the cork industry, the ultrafiltration of the wastewaters led to permeate water recycling and to a potential valorization of the tannin concentrates for the cork and wood agglomerate industries.
REFERENCES
1. Pols, H. B. and Harmsen, G. H., “Industrial Wastewater Treatment Today and
Tomorrow”. Water Science and Technology, 30 (3), 109-117, 1994.
2. Bhamidimarri, R. and Shilton, A., “How Appropriate are “Appropriate Waste
Management Technologies”? - Defining the Future Challenge”. Water Science and
Technology, 34 (11), 173-176, 1996.
3. Papalimmeou, F., The Legislation Concerning Water Resources Management and
Protection. In: Water Pollution III: Modelling, Measuring and Prediction, L. C. Wrobel
and P. Latinopoulos (ed), Computational Mechanics Publications, Boston, pp. 441, 1995.
4. Hertz, D. W. et al., Status Report on the Clean Process Advisory System: New Process
Design Tools for Environmental Sustainability. Presented at the 1994 AIChE Summer
National Meeting, Denver Colorado, American Institute of Chemical Engineers, New
York, August 14-17, 1994.
5. Snoeyink, V. L. and Jenkins, D., Water Chemistry. New York: John Wiley and Sons, 1980.
6. Process Water Treatment and Reuse. Chemical Engineering Progress, April 1993, pp. 21-35.
7. Jacobsen, B.; Petersen, B.; Hall, J. E. “Are EU member state’s data on wastewater
collection and treatment comparable?”, European Water Pollution Control, 7, pp.19,
1997.
8. Duarte, E. A., Neto, I., Alegrias, M., Barroso, R., “Appropriate Technology” for
pollution control in corrugated board industry – the Portuguese case. Water Science and
Technology, Vol.38, nº 6, pp.45-53, 1998.
9. Goldblatt, M. E. et al., Zero Discharge: What, Why and How. Chemical Engineering
Progress, April 1993
10. Jayaprakasha, H. M. and Brueckner, H., “Whey protein concentrate: A potential
functional ingredient in food industry”, J. Food Sci. Technol. (Mysore), 36(3) 189-204,
1999.
11. McIntosh, G. H., Royle, P. J., Le Leu, R. K., Regester, G. O., Johnson, M. A., Grinsted,
R. L., Kenward, R. S. and Smithers, G. W., “Whey proteins as functional food
ingredients?”, Int. Dairy J., 8(5-6), 425-434, 1998.
12. Magueijo, V., Minhalma, M., Queiroz, D., Geraldes, V., Macedo, A. and de Pinho, M.
N., “Reduction of wastewaters and valorisation of by-products from “Serpa” cheese
manufacture using Nanofiltration”, Water Science & Technology, 52 (10-11), 393–399,
2005.
13. Fernandez, J., Vega, A., Coca, J. and Allan, G. G., “Sugar-cellulose composites. VI.
Economic evaluation of lactose production from cheese whey for use in paper”, J. Sci.
Food Agric., 82(10), 1224-1231, 2002.
14. Morr, C. V. and Barrantes, L., “Lactose-hydrolysed Cottage cheese whey nanofiltration
retentate in ice cream”, Milchwissenschaft, 53(10), 568-572, 1998.
15. Minhalma, M. and de Pinho, M. N., “Integration of Nanofiltration/Steam Stripping for
the Treatment of Coke Plant Ammoniacal Wastewaters”, Journal of Membrane Science,
242, 87-95, 2004.
16. Minhalma, M. and de Pinho, M. N., “Flocculation/flotation/ultrafiltration integrated
process for the treatment of cork processing wastewaters”, Environmental Science and
Technology, 35, 4916-4921, 2001.
INTEGRATED WATER MANAGEMENT

Ramiro Neves1, José S. Matos2, Luís Fernandes1 and Filipa S. Ferreira2

1 Secção de Ambiente e Energia, Dept. Engª Mecânica do IST, Instituto Superior Técnico,
Universidade Técnica de Lisboa, Av. Rovisco Pais, 1049-001 Lisboa, Portugal,
ramiro.neves@ist.utl.pt, lfernandes.maretec@ist.utl.pt
2 Secção de Hidráulica e Recursos Hídricos e Ambientais, Dept. Engª Civil e Arquitectura,
Instituto Superior Técnico, Universidade Técnica de Lisboa, Av. Rovisco Pais, 1049-001
Lisboa, Portugal, jsm@civil.ist.utl.pt, filipam-ferreira@hotmail.com

Abstract: This paper presents an overview of the development and application of state-of-the-art integrated water modelling tools to study water pollution of urban or agricultural origin, from source to final destination, and of the research carried out at IST in the framework of integrated water management. The modelling tools are used for computing the urban load in a sub-catchment of the Lisbon metropolitan area and for assessing the trophic level of the Tagus estuary and its relation with urban and agricultural loads. The strategy for modelling development at IST is also described, showing that modelling can be an important contribution to integrated water management. Results have shown that modelling the functioning of wastewater treatment plants is a mechanism for managing urban wastewater loads and that the trophic level in the Tagus estuary is controlled by light penetration and not by nutrients. As a consequence, a reduction of the nutrient loads of urban origin or a 50% reduction of the agricultural nutrient load would have no benefits in terms of trophic activity.

Key words: Integrated Management, Modelling, Receiving Waters, Wastewater.

1. INTRODUCTION

Water constitutes one of the most important limiting factors for the
development of Society and, as a consequence, its management takes
priority in the whole World. In the European Union, water management has been, directly and indirectly, the subject of multiple directives, among which stand
out the (i) Nitrates Directive, (ii) Urban Waste Water Directive, (iii) the
Drinking Water Directive, (iv) Bathing Waters Directive and (v) Water
Framework Directive. The Dangerous Substances Directive, the Shellfish
Directive and the Habitats Directive are examples of other directives whose
implementation requires the knowledge of the water dynamics.
The multiplicity of legal instruments regulating water management is a consequence of the variety of roles in which water is involved, namely as a nutrient, a habitat, a leisure resource, a raw material for industry, a transport vehicle and a final destination for residues. The multiplicity of institutions and organisms involved in water management is a natural outcome of the different perspectives on the use of water, but also of technological limitations, to whose resolution the R&D activities of Instituto Superior Técnico (IST) have contributed, especially those developed in the framework of interdisciplinary integrated projects, which help to optimize solutions and reduce the high costs involved in pollution control.
Pollution of urban origin reaches the environment through a drainage network as point discharges, the effluent treatment level before discharge depending on the dimension of the town and on the receiving waters (according to the Urban Waste Water Treatment Directive). Industrial discharges are subject to specific legislation, namely when dangerous substances are involved.
Pollution of agricultural origin has diffuse characteristics, reaching the environment through surface run-off and/or groundwater. The impacts of agricultural activities on the environment are normally due to the leaching of nutrients and, eventually, of toxic substances (normally pesticides and/or herbicides), and to soil erosion.
Eutrophication due to nutrient excess is nowadays, in Europe, the main concern in terms of water quality management, requiring the integrated management of nutrients of agricultural and urban origin, also taking into consideration atmospheric nitrogen deposition.
The increase of trophic activity associated with eutrophication may cause changes in species and anoxic situations which can endanger habitats.
Reservoirs and estuaries are systems with high residence times of water and especially of particulate matter, therefore constituting the areas most susceptible to eutrophication, especially reservoirs, where the residence time is of the order of one year, while in estuaries it can vary from days to months. Thus, the trophic level of reservoirs essentially depends on nutrient availability, while in estuaries it can also be limited by the residence time and by the light availability associated with sediment dynamics.
Eutrophication management in reservoirs and estuaries requires the determination of the maximum nutrient loads that these water
bodies can receive and the ability to control nutrient generation in the basin, which depends on the intensity of the sources and on the retention capacity of the basin, itself dependent on biogeochemical processes in the soils and in the rivers. Thus, integrated water management requires the ability to know the water movement and the biogeochemical processes occurring in the water bodies subject to higher risks (reservoirs and estuaries), but also the movement and the biogeochemical processes occurring between the sources and the receiving waters. These processes are normally simulated with basin models, where urban areas are treated as point sources. Discharges of urban origin depend on the effluent interception capacity and on the efficiency of the Waste Water Treatment Plants (WWTPs).
In this paper, the state of the art of integrated water modelling and the tools developed and/or used at IST are described, using the Tagus estuary as a case study, for which integrated management is particularly important, given the dimension of the urban discharges (corresponding to about 2.5 million equivalent inhabitants) and also the dimension of the load from the Tagus river, whose basin is the biggest in the Iberian Peninsula, draining an important region from the agricultural and urban point of view, especially in Spain.

2. URBAN WASTEWATER MANAGEMENT

2.1 General considerations

The present concept of urban sewerage dates back some 200 years. European cities had grown at a rate and to an extent that were no longer sustainable, due to an internal handling of water and waste that created foul conditions in general and an unacceptable risk of waterborne diseases in particular. The development of communal, holistic approaches to the handling of water in cities has been an indisputable success according to the paradigms that governed city development for more than a century. The cities became well regulated. A certain standard, with paved streets, gutters and sidewalks, subterranean water supply and drainage pipes, and nicely contained rivers and lakes with stone or concrete walls, still dominates the appearance of the European city [1]. With the established classical concept of sewerage, waterborne diseases in the city were controlled.
Meanwhile, the demands of society have evolved, including the promotion of more sustainable approaches, in terms of better performance with respect to resources, ethics and economics; new architectural features of water-related structures in cities; the necessity to control impacts on the global environment; and a broadening of transparency in decision processes.
These new demands lead to new challenges that are being addressed with new tools and new knowledge: analytical tools for a large spectrum of chemicals and pollution parameters; new database technology, including GIS (geographical information systems) and DSS (decision support systems); and new computer simulation tools.
The assessment of the environmental performance of urban
wastewater systems is often a crucial issue, particularly in the developing
countries of the World. In Europe, this aspect assumes a special relevance
in view of the objectives set by the Water Framework Directive that aims
to achieve a good ecological status of all water bodies. The assessment is
an important step to optimize the performance of urban wastewater
systems and to evaluate proper rehabilitation measures.
To properly operate and manage urban drainage systems, numerical
models may be indispensable [2]. Moreover, urban drainage components,
including sewer systems and wastewater treatment plants (WWTP),
should be dealt with jointly, providing a holistic and more sustainable
approach. In fact, the integrated operation of the sewer network and the
WWTP may be required to reduce total emissions to the receiving waters [3]. Therefore, over more than a decade, several integrated modelling approaches have been developed, some of them also including the receiving waters [4, 5, 6].
Modelling approaches, and particularly integrated ones, are nevertheless seldom applied by practitioners for planning urban wastewater systems, mainly due to lack of data or deficient knowledge.

2.2 Modelling the performance of sewer systems

The deterministic modelling of water motion in sewer networks is undoubtedly one of the success stories in the field. The application of the unsteady open-channel flow model, based on the Saint Venant equations, allowed an accurate description of the hydraulics, to the extent that “if the simulation does not fit the results very well, then the information about the system may be faulty, rather than the model” [7]. To make the Saint Venant equations applicable to surcharged flows, Preissmann introduced the concept of a hypothetical open slot at the top of the pipe.
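For reference, the one-dimensional Saint Venant equations mentioned above can be written, in a commonly used form (not reproduced in the original text), as a mass balance and a momentum balance:

\[
\frac{\partial A}{\partial t} + \frac{\partial Q}{\partial x} = q_l, \qquad
\frac{\partial Q}{\partial t} + \frac{\partial}{\partial x}\!\left(\frac{Q^2}{A}\right)
+ g A \frac{\partial h}{\partial x} = g A \left(S_0 - S_f\right),
\]

where A is the wetted cross-sectional area, Q the discharge, h the water depth, q_l the lateral inflow per unit length, S_0 the bottom slope, S_f the friction slope and g the gravitational acceleration. The Preissmann slot simply adds a narrow fictitious open channel on top of the closed pipe so that these free-surface equations remain valid when the pipe is surcharged.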
Since the solution of the Saint Venant equations (or their approximations) is computationally demanding, simpler flow routing models have been developed. These hydrological models generally respect the continuity equation but replace the conservation of momentum with some conceptual relationship. The underlying concept is a cascade of reservoirs in series, with the water being routed downstream. Due to their simplicity, reservoir cascade approaches allow rapid simulation; on the other hand, effects such as backwater and pressurized flows cannot be simulated, at least not directly. This constitutes a serious limitation, in particular for looped or flat networks. One of the main advantages of these models is that the approach is easily extended to additionally consider transport phenomena.
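A minimal illustration of the reservoir-cascade idea is sketched below; it is a generic linear-reservoir routing scheme (outflow assumed proportional to storage), not the formulation of any particular commercial package, and the numerical values are purely illustrative.

# Linear reservoir cascade routing: a minimal sketch of the hydrological
# flow-routing concept described above (outflow assumed proportional to storage).
def route_cascade(inflow, n_reservoirs=3, k=600.0, dt=60.0):
    """Route an inflow hydrograph (m3/s, one value every dt seconds) through
    n_reservoirs identical linear reservoirs with storage constant k (s)."""
    storages = [0.0] * n_reservoirs              # storage in each reservoir (m3)
    outflow = []
    for q_in in inflow:
        for i in range(n_reservoirs):
            q_out = storages[i] / k              # linear reservoir law: Q = S / k
            storages[i] += (q_in - q_out) * dt   # continuity: dS/dt = Qin - Qout
            q_in = q_out                         # outflow feeds the next reservoir
        outflow.append(q_in)
    return outflow

# Example: a short rectangular inflow pulse is attenuated and delayed.
hydrograph = [1.0] * 10 + [0.0] * 50
print([round(q, 3) for q in route_cascade(hydrograph)])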
In sewer simulation, hydrological flow routing methods are seldom applied for the prediction of hydrodynamics alone, but usually in connection with the simulation of water quality [4].
Since the early 1970s, the most frequent modelling approach used to simulate pollutant transport in sewer systems takes into account four main steps: pollutant accumulation, pollutant wash-off, pollutant transport and pollutant processes.
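The accumulation and wash-off steps are often described with simple exponential formulations of the kind used in SWMM-type models; the sketch below illustrates them with purely illustrative coefficient values (none of the parameters are taken from this paper).

import math

# Generic exponential build-up / wash-off sketch (illustrative coefficients only).
def buildup(days_dry, b_max=50.0, k_b=0.4):
    """Pollutant mass accumulated on the catchment surface (kg/ha) after a dry
    period, approaching the asymptotic maximum b_max at rate k_b (1/day)."""
    return b_max * (1.0 - math.exp(-k_b * days_dry))

def washoff(mass, runoff_rate, c1=0.05, c2=1.2, dt_h=1.0):
    """Mass washed off (kg/ha) during dt_h hours of runoff (mm/h), proportional
    to the available mass and to a power of the runoff rate."""
    rate = c1 * runoff_rate ** c2 * mass         # kg/ha per hour
    return min(rate * dt_h, mass)                # cannot remove more than is there

mass = buildup(days_dry=7)                       # accumulation over a dry week
for hour, q in enumerate([2.0, 5.0, 3.0, 1.0]):  # simple storm hyetograph (mm/h)
    removed = washoff(mass, q)
    mass -= removed
    print(f"hour {hour}: washed off {removed:5.2f} kg/ha, remaining {mass:5.2f}")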
Simulations of the hydrology and hydraulics of sewer systems have been well accepted, especially with respect to flooding and to hydraulic loads on treatment plants and receiving waters, as well as the simplified simulation of pollutant transport and of the pollution discharged from combined sewer overflows. Perhaps the best known available models are SWMM (Storm Water Management Model, from the US Environmental Protection Agency), MOUSE (Modelling Urban SEwer Systems, developed by the Danish Hydraulic Institute), INFOWORKS (developed by Wallingford Software) and others, such as HYDRA, SewerCAD, XP-SWMM, FLUPOL and SAMBA.

2.3 Wastewater treatment modelling

The modelling of the wastewater treatment subsystem is quite different from the modelling of sewer systems in two respects: first, the underlying hydraulics can nearly always be approximated crudely and, second, the modelling is built up around unit processes. The mathematical description of the unit processes usually requires the specification of large numbers of components and of numerous interactions. A matrix form for the presentation of the model reactions was introduced [4] which has become standard in all aspects of water quality modelling.
This overview of unit processes is limited to some of the most important ones (activated sludge and clarifiers).
The modelling of the activated sludge process has clearly drawn most of the unit-process modelling effort since the 1950s, and many different approaches have been explored. Since the groundbreaking work of the IAWPRC Task Group on Mathematical Modelling of the Activated Sludge Process in the early 1980s, most model development work has been geared around what is
called the industry standard suite of Activated Sludge Models [9]. These
models have been shown to adequately describe the behavior of nitrogen removal and of biological and chemical phosphorus removal processes, particularly in terms of oxygen demand, sludge production and nitrogen/phosphorus removal. More recently, refinements of the models were presented in which storage processes are included. These models have also led to the introduction of simulation software in consulting and engineering companies and have been a driving force for a more detailed understanding of the processes, leading to considerably improved operation of treatment plants.
Clarifiers act on particulate matter that one wants to prevent either from entering the plant (primary clarification) or from leaving the system (secondary or final clarification). Another objective of such unit processes is thickening, either to increase the biological activity in the bioreactors or to prepare for waste sludge treatment. Models for these systems are classified according to their spatial resolution, going from simple 0-dimensional to complex 3-dimensional models that require the application of computational fluid dynamics. The 0-dimensional models only separate a particulate-rich stream from a (nearly) particulate-free stream and have no volume, reflecting the assumption that no accumulation of mass occurs in the clarifier. The most popular clarifier models that can reasonably describe both the separation process and the dynamic mass accumulation in the clarifier are the so-called 1D models. Since usually only about 10 layers are applied, the common approach is in fact a reactors-in-series approach rather than a discretization of a 1D partial differential equation.
Any clarifier model contains a settling velocity function, describing the dependence of the settling velocity on the local concentration (settling is increasingly hindered with concentration above a certain threshold value), with the sludge volume index as an indicator of the settling capacity. The empirical model of Takács [10] is currently the most widely applied one.
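The Takács settling velocity function cited above is usually written as a double-exponential expression of the local solids concentration; the form below is the commonly quoted one, reproduced here for reference, with the notation being an assumption of this summary rather than taken from [10]:

\[
v_s(X) = \max\!\left(0,\; \min\!\left(v_0',\; v_0\left(e^{-r_h (X - X_{\min})} - e^{-r_p (X - X_{\min})}\right)\right)\right),
\]

where X is the local solids concentration, X_min the non-settleable fraction of the solids, v_0 and v_0' the theoretical and practical maximum settling velocities, r_h the hindered-settling parameter and r_p the parameter associated with the settling of slowly settling, dispersed particles at low concentrations.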
Models are available to simulate the performance of wastewater treatment plants (based on activated sludge or on biofilms). Some of the most important models in terms of application are EFOR (developed by the Danish Hydraulic Institute), STOAT (developed by the Water Research Center), SASSPRO (developed by Science Traveler International), BIOWIN (from EnviroSim Associates, Ltd.) and GPS-X (from Hydromantis, Inc.).

2.4 Integrated modelling

Even though one of the first mentions of the idea of integrated modelling was made by Beck [11] and the first integrated model was applied 20 years ago [12], it took until the early 1990s for the concepts to start being disseminated on a larger scale. Whereas early approaches considered only the total emissions from the sewer system and the treatment plant, the work of Schütze [13] and the work of Vanrolleghem [14] were the first to include deterministic models of the total system. These studies revealed the importance of considering both the treatment plant effluent and the combined sewer overflow (CSO) discharges for a proper assessment of the impacts of storm events on the receiving water body.
The Danish Hydraulic Institute (DHI) and the Water Research Center (WRc) developed an “Integrated Catchment Simulator” (ICS) in a large EU-funded “Technology Validation Project”. ICS is basically a graphical interface for setting up and running integrated models with feed-forward and feed-back of information. The present ICS version includes existing models for sewers (MOUSE), rivers (MIKE 11), wastewater treatment plants (STOAT) and coastal areas. During the course of this project, the fairly complex constituent models were linked in various stages, first in a sequential way and later in a simultaneous way. The complexity of the sub-modules, however, currently limits the application of ICS.
The simulator platform WEST follows a different pathway. Although originally developed for wastewater treatment modelling, it can be seen as a general simulation environment. The concept limits the description of water motion and transport processes in the elements, but allows the more or less free implementation of different conversion models for the different elements (representing catchments, CSO structures, reactors and clarifiers). WEST is predominantly an environment for the development of fast surrogate models for the purpose of long-term simulation.
SIMBA® is a simulation platform running on top of MATLAB™/SIMULINK™. Models are available for sewer systems, treatment plants and rivers. The general principle is similar to the network concept already presented for WEST; however, the use of the general-purpose simulation environment MATLAB™/SIMULINK™ allows users to add their own modules to fit the actual modelling problem. Thereby, the distinction between model developer and model user is largely removed. This system is also a convenient tool for the optimization of the overall performance of the system.
Basically, it can be stated that today a number of tools are available which allow the urban wastewater system to be considered in simulation as what it indeed is - one single system [4].
Nevertheless, due to the complexity of these systems, numerical models generally require a large amount of data in order to build the physical representation of the system and to calibrate and validate all the significant model parameters. Data requirements include the characterization of the catchment surface (e.g., impervious area, ground slopes and land use), data on the sewer system characteristics (e.g., geometry and dimensions of pipes and structures, storage volumes, pumping capacities) and hydraulic loads (namely from dry-weather flow and runoff). Errors or omissions in the database contribute to model structure uncertainty, which is seldom accounted for and may lead to incorrect decisions if models are not properly calibrated.
Also due to this complexity and to the commonly severe lack of data, models, and particularly integrated approaches, are seldom applied by practitioners for planning urban drainage systems [15]. Furthermore, there is commonly an incompatibility between modelling time requirements and the time demands of decision makers.
In view of these limitations, a simplified integrated concept for assessing and grading the environmental performance of urban drainage systems was developed at IST/UTL [16]. The Integrated Simplified Approach (ISA) focuses on situations in which the application of complex models is particularly difficult or involves a high level of uncertainty. Considering its simplicity, the ISA concept should be especially applied in cases of data scarcity and during the initial phases of planning processes. It can be considered a management support tool intended to assess the integrated environmental performance of urban wastewater systems (including combined, separate or partially separate sewers and WWTPs). The ISA concept can be applied to simple drainage basins or to basins in series or in parallel with the sewer lines, and has already been applied to the Lisbon wastewater system.
In section 3 of this paper, a case study is presented in which a detailed integrated approach was followed, using MOUSE for the transport in sewers and EFOR for the treatment processes. The models were calibrated with real data.

3. CASE STUDY: THE URBAN DRAINAGE SYSTEM OF S. JOÃO DA TALHA

3.1 System characteristics

The urban drainage system of S. João da Talha serves the civil parishes
of Bobadela, S. Iria da Azóia and S. João da Talha, in the municipality of
Loures. The system includes a wastewater treatment plant (WWTP) and two
main gravity interceptors, namely the South Interceptor and the North
Interceptor.
These interceptors were built under very unfavorable conditions, due to the characteristics of the foundation soils and to the high phreatic levels, and have suffered subsequent differential settlements. Recent topographic surveys demonstrate the existence of uneven slopes along the interceptors, different from the designed ones, and some sewer stretches even slope upwards. Consequently, self-cleansing velocities are seldom attained, leading to the deposition of sediments and to the frequent occurrence of surcharges.
The interceptors transport the effluents of combined and separate
domestic sewer systems. Therefore, combined sewer overflows (CSO) take
place during rain storms, discharging into the Tagus estuary and contributing
to the pollution of receiving waters.
The North Interceptor drains the effluent of most of the region's industries. Its length is nearly 3.8 km and it presents an initial stretch of 315 mm diameter, intermediate stretches of 400, 600 and 800 mm diameter and, after the connection with the South Interceptor, a small stretch of 1000 mm diameter that leads to the WWTP entrance. The South Interceptor is around 2 km long and its diameters vary between 400 and 600 mm. Each interceptor has a weir through which the overflows are discharged into the receiving waters.
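As an illustration of why gentle or reversed slopes compromise self-cleansing, the sketch below estimates the full-bore velocity of an 800 mm circular sewer with the Manning formula; the slopes, the roughness coefficient and the 0.6 m/s self-cleansing guideline used here are illustrative assumptions, not data from the S. João da Talha interceptors.

import math

# Manning full-bore velocity check for a circular sewer. The slopes, roughness
# and self-cleansing threshold below are illustrative assumptions, not data
# from the S. Joao da Talha system.
def full_bore_velocity(diameter_m, slope, n=0.013):
    """Manning formula v = (1/n) R^(2/3) S^(1/2), with R = D/4 for a full pipe."""
    hydraulic_radius = diameter_m / 4.0
    return (1.0 / n) * hydraulic_radius ** (2.0 / 3.0) * math.sqrt(slope)

for slope in (0.0002, 0.001, 0.003):       # assumed longitudinal slopes (m/m)
    v = full_bore_velocity(0.8, slope)     # 800 mm stretch of the interceptor
    flag = "OK" if v >= 0.6 else "below a typical 0.6 m/s self-cleansing guideline"
    print(f"slope = {slope:.4f}: v = {v:.2f} m/s ({flag})")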
As illustrated in Figure 1, the S. João da Talha wastewater treatment plant is located in Bobadela, between the national road EN 10 and the railway. The WWTP, operating since 1997, was designed to serve 130,000 equivalent population (e.p.) for the design period. Nowadays, more than 65% of the treated wastewater has an industrial origin (IST, 2005).

Figure 1. Location of S. João da Talha wastewater treatment plant.



The WWTP is an activated sludge plant that includes the following treatment
stages for the liquid phase:

- preliminary treatment (after the wastewater enters the treatment plant
  collection wet-well, it is pumped by six Archimedes screw pumps and enters
  the preliminary treatment, which includes screening, sand, grit and fat
  removal, flow measurement and a homogenisation tank);
- physical-chemical treatment and primary settling;
- biological treatment by activated sludge;
- secondary settling;
- final discharge into the Tagus estuary.

Sludge is treated in thickeners, anaerobic mesophilic digesters and mechanical
centrifuges, and is subsequently applied to land.

3.2 Field experiments in S. João da Talha WWTP

The experimental work was carried out in order to characterize, quantitatively
and qualitatively, the wastewater in the S. João da Talha WWTP during dry
weather. Two experimental campaigns were carried out, on 12/13 and 26/27
January 2005. The campaigns included the collection of wastewater samples at
the following sections: the entrance of the WWTP (SA1 (CJ1)), downstream of
the primary treatment (SA5), the final effluent (SA7), the sludge supernatant
stream (SAE), the aeration tanks (SL6.1) and the recirculation stream (SLR).
Samples were collected at 22 h, 0 h, 3 h, 5 h, 10 h, 12 h, 14 h, 16 h, 18 h
and 20 h, and the following quality parameters were determined: temperature,
pH, dissolved oxygen (DO), conductivity, COD, BOD5, TSS, total nitrogen,
nitrites, nitrates, Kjeldahl nitrogen, total phosphorus and total coliforms
(TC). At the same time, the effluent, influent, recirculation and supernatant
flows were continuously measured.
Figure 2 presents the main analytical results of the experimental campaign
carried out on 12/13 January 2005.

Figure 2. Graphical representation of the analytical results of the 12/13 January 2005
experimental campaign: BOD5, COD, TSS and TC versus time, for the sampling points
SA1 (CJ1), SA5, SA7 and SAE.

3.3 Integrated modelling of the S. João da Talha drainage system

The mathematical simulation of the environmental and hydraulic performance of
the interceptor system was carried out with the MOUSE modelling package for
urban drainage and sewer systems, developed by the Danish Hydraulic Institute
(DHI). This program computes unsteady flows in pipe networks and models both
the hydrological and the hydrodynamic aspects of urban drainage systems.
Initially, a detailed physical characterization of all the components of the
drainage system (including sewers, manholes and overflow weirs) was made. The
drainage catchments were also described, including parameters such as
catchment area, population served, percentage of impervious area, time of
concentration and the nodes to which the catchments are connected. Besides the
North and South Interceptors, the model of the system included the final
stretch that connects the treatment plant collection wet-well to the stretch
located immediately upstream of the screening equipment. The weir wall located
in the collection wet-well and the final sewer that discharges the treated
effluent (or the wastewater that exceeds the WWTP capacity) into the Tagus
river were also simulated. In the node representing the Tagus estuary, the
variation of the outlet water level due to tidal effects was taken into
account.
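As a concrete illustration of the hydrological description above, the following minimal Python sketch (not MOUSE code; the catchment values and node name are hypothetical) turns a catchment record of this kind into a runoff hydrograph using a simple time-area method.

from dataclasses import dataclass

@dataclass
class Catchment:
    """Hypothetical catchment record, loosely mirroring the parameters listed above."""
    name: str
    node: str              # node of the network receiving the runoff
    area_ha: float         # catchment area [ha]
    imperviousness: float  # impervious fraction [0-1]
    tc_min: float          # time of concentration [min]

def time_area_hydrograph(c, rain_mm_per_min, dt_min=1.0):
    """Spread effective rainfall on the impervious area uniformly over tc."""
    n_cells = max(1, round(c.tc_min / dt_min))
    contrib_m2 = c.area_ha * 1e4 * c.imperviousness / n_cells
    flow = [0.0] * (len(rain_mm_per_min) + n_cells)     # runoff at the node [m3/s]
    for t, i_mm in enumerate(rain_mm_per_min):
        q_cell = (i_mm / 1000.0 / 60.0) * contrib_m2    # m3/s from one time-area cell
        for k in range(n_cells):
            flow[t + k] += q_cell
    return flow

# Example: 12 ha catchment, 40% impervious, tc = 10 min, 20 min block rain of 0.5 mm/min.
catch = Catchment("C1", "node-1", 12.0, 0.40, 10.0)
hydrograph = time_area_hydrograph(catch, [0.5] * 20)
print(f"peak runoff at {catch.node}: {max(hydrograph):.3f} m3/s")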

Figure 3 presents a schematic representation of the interceptor system model.
The zoomed-in detail shows the final sections of the North and South
Interceptors, next to the WWTP, and includes the Archimedes screw pumps
(stretch ETAR – OE), the general by-pass of the WWTP (stretch ETAR – OE-jus)
and the final discharge into the Tagus estuary (stretch OE-jus – Cx.1).

Figure 3. Schematic representation of the interceptor systems’ model.

Figure 4 and Figure 5 illustrate the longitudinal profiles of the North and
South Interceptors. These profiles include the simulated water levels in the
branches at a given model time instant.
Figure 4. Longitudinal profile of the North Interceptor.


Figure 5. Longitudinal profile of the South Interceptor.

The mathematical simulation of the WWTP performance was carried out with the
EFOR software, developed in the 1980s and 1990s by a group of Danish
consultants (Krüger TO/S and Emolet Date) in collaboration with the Technical
University of Denmark. In December 2000 the software was incorporated into the
DHI package. In the present case study, EFOR was used in an integrated way
with the MOUSE model.
The EFOR program comprises blocks that can be interconnected by links and that
represent the inflow to the WWTP, reactors, settlers, dosing units (which
allow organic or chemical additives to be added to the activated sludge
system), pumps, valves, rotors and diffusers, outlets and the excess sludge
leaving the system. The characteristics of the influent wastewater can be
introduced by the user or edited (based on predefined types) and are subjected
to mass balances: values not explicitly specified are estimated through
algorithms that consider the relationships between the different parameters.
The program allows the implementation of control loops for aeration, excess
sludge, sludge recirculation and chemical additive dosage. The controllers may
be configured to activate or deactivate a control device in response to the
values measured by sensors associated with the WWTP units. Different types of
controllers may be used, such as timer, step, on/off and PID (proportional,
integral, derivative) controllers, as illustrated in the sketch below.
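The sketch applies the two simplest controller types mentioned above, on/off and PID, to a toy dissolved-oxygen (DO) balance in an aeration tank; the tank dynamics, set-points and gains are assumptions of this illustration and do not reproduce the EFOR implementation.

def simulate(controller, dt=60.0, hours=6, do_init=1.0):
    """Integrate a crude DO balance, dDO/dt = kLa(u)*(DOsat - DO) - OUR."""
    do, series = do_init, []
    for _ in range(int(hours * 3600 / dt)):
        u = controller(do, dt)                    # aeration intensity in [0, 1]
        kla = 8.0 * u / 3600.0                    # transfer coefficient [1/s]
        our = 1.8 / 3600.0                        # oxygen uptake rate [g O2/m3/s]
        do += dt * (kla * (9.0 - do) - our)       # DOsat assumed 9 mg/l
        series.append(do)
    return series

def on_off(setpoint=2.0, band=0.3):
    state = {"on": True}
    def control(do, dt):
        if do > setpoint + band: state["on"] = False
        if do < setpoint - band: state["on"] = True
        return 1.0 if state["on"] else 0.0
    return control

def pid(setpoint=2.0, kp=1.5, ki=0.002, kd=0.0):
    mem = {"i": 0.0, "prev": None}
    def control(do, dt):
        e = setpoint - do
        mem["i"] += e * dt
        d = 0.0 if mem["prev"] is None else (e - mem["prev"]) / dt
        mem["prev"] = e
        return min(1.0, max(0.0, kp * e + ki * mem["i"] + kd * d))
    return control

print("on/off final DO:", round(simulate(on_off())[-1], 2), "mg/l")
print("PID    final DO:", round(simulate(pid())[-1], 2), "mg/l")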
To simulate the biological reactors of the S. João da Talha WWTP, the CNDP
model was used. This model is based on the IWA models ASM-1, ASM-2 and ASM-2d
and is the only model in EFOR that takes into account the dosage of chemical
additives. The primary and secondary settlers were simulated with a simple
two-layer model and a flux model, respectively. The model of the WWTP,
presented in Figure 6, was developed taking into account the physical
characteristics of each treatment unit and its equipment, as well as the
operating criteria and methodologies implemented in this WWTP.

Figure 6. Schematic representation of the WWTP model.

The model was run for the periods coincident with the experimental campaigns
and for the rain event that occurred on 30 October 1988. The simulation
results include flows in the links, pollutant concentrations and process rates
in all the simulated units. Figures 7, 8 and 9 present the simulation results
obtained from 22:00 on 12 January 2005 to 22:00 on 13 January 2005. Figure 7
refers to the WWTP inflow (Inlet1), final effluent flow (Outlet1),
recirculation flow (SS1->AS1) and excess sludge flows (WS1 – secondary sludge;
WS2 – primary sludge). Figure 8 presents the variation of the DO and TSS in
the aeration tank (i.e., MLSS). Figure 9 refers to the final effluent
characteristics in terms of the following parameters: COD, TSS, total
phosphorus, DO and total nitrogen.

Figure 7. Influent, effluent, recirculation and excess sludge flows.



Figure 8. Variation of the DO and TSS in the aeration tank.

Figure 9. Final effluent COD, TSS, total phosphorus, DO and total nitrogen
concentrations.

3.4 Conclusions

The S. João da Talha case study has demonstrated the feasibility of integrated
modelling of the performance of sewer systems and treatment plants, with
acceptable simulation of flows and pollutant concentrations along the
treatment units.
Nevertheless, particular difficulties were encountered in simulating the
suspended solids in the secondary clarifier. Modelling this case study was
expected to be especially difficult, taking into account the large industrial
fraction of the influent.

4. MODELLING RECEIVING WATERS

In the previous chapter, the models for the simulation and management of urban
wastewater, and their application to the S. João da Talha system, were
described. Drainage network models solve a problem that is two-dimensional at
the scale of the urban area but locally one-dimensional (if streets and sewers
are considered as lines). As a consequence, these models rely on substantial
empirical knowledge and are normally developed by institutions of an applied
character, giving rise to commercial modelling packages. The same occurs with
WWTPs, where treatment involves well-known biogeochemical and separation
processes and where success depends on practical details whose study has a
strong experimental, empirical component.
The difficulties associated with the simulation of receiving waters are
related to the great number of processes involved, to their spatial and
temporal variability, and to the effects of contaminants on the biota, both
directly and indirectly through habitat changes. As a consequence of this
complexity, the management tools require the simulation of processes, and
their development is therefore normally associated with research institutions.
Hydrodynamic modelling started in the early 1960s, with the birth of
computing, a decade in which the first temporal discretization methods for
flows with hydrostatic pressure were published [17, 18] and developed for
two-dimensional, vertically integrated models. In the 1970s the number of
applications multiplied and extensive research on numerical methods was
carried out, namely on ways to minimize the numerical diffusion introduced by
the solution of the advection terms (e.g. [19, 20]). Three-dimensional models,
necessary to simulate oceanic circulation, developed strongly in the 1980s,
benefiting from the increase in computing capacity and from breakthroughs in
turbulence modelling based on work carried out since the 1970s, of which Rodi
[21] was one of the main pioneers. In the 1990s hydrodynamic models were
consolidated and several models with great visibility emerged, e.g. POM [22]
and MOM [23], but also models from European schools, e.g. the GHER model [24].
Benefiting from technological advances in both hardware and software (e.g.
compilers, data management, graphical computation), the second half of the
1990s witnessed the dawn of integrated models coupling modules developed by
several authors. Turbulence modelling packages like GOTM [25] constitute one
of the first examples of this integration, and coupling GOTM to other models
constitutes a second-level integration example.
Together with the development of hydrodynamic models, ecological models were
also developed. Among the pioneering models one can mention WASP, developed at
the EPA [26], and the BOEDE model, developed at NIOZ [27]. These were box
models that initially used a time step of one day, the short-term variability
of the flow (e.g. tidal) being accounted for using diffusion coefficients.
Ecological models improved greatly during the 1980s and 1990s, benefiting from
scientific and technological progress, and have been coupled to physical
(hydrodynamic) models, thus generating the present integrated models.
Current research on modelling is oriented towards operational modelling,
integrating different disciplines and assimilating as much field data as
possible, with special emphasis on remote sensing.
Modelling at UTL followed the world trends and benefited from substantial
investments in computing systems in the 1980s. The development of the MOHID
system (http://www.mohid.com) was initiated at that time [28] as a 2D
hydrodynamic model and subsequently evolved into an integrated modelling
system for tidal flow in estuaries, being progressively generalized to waves
[29], water quality [30], three-dimensional flows [31], new numerical methods
[32] and an extended set of open boundary conditions [33], and finally
reorganized in an integrated perspective in order to accommodate alternative
modules for different processes [34]. This evolution made it possible to
couple alternative modules to compute biogeochemical and water quality
processes [35, 36, 37] and to broaden the scope to flow through porous media
[38], water flow in river basins [39] and ocean circulation [40].
This model is a working tool of the environmental modelling group of the
MARETEC research centre, having been used in more than 30 research projects,
half of them with European funding, and currently has around 500 registered
users on its website.

4.1 Integrated modelling

An ideal integrated modelling system should consider the water cycle


from the moment water is evaporated from the ocean until it returns to it
through the rivers, and should also consider the biogeochemical processes
which occur during this path, from the atmosphere to the ocean itself.
Presently, there are models that study the different compartments of the water
cycle, and their integration allows stepping towards the ideal integrated
model. Every time two models of adjacent compartments are integrated, one
boundary condition is eliminated; boundary conditions are normally a source of
uncertainty in the models. Thus, an integrated model should include a
meteorological model, a basin model (including surface waters, the vadose zone
and aquifers), a model for estuaries and coastal areas, a model for ocean
circulation and an urban area model, as described in chapter 2. When such a
model does not exist, the coupling is made by assuming that the fluxes are
determined exclusively by one of the compartments (e.g. the meteorological
model provides winds, heat fluxes and precipitation to the watershed and ocean
circulation models). The basin model produces river flows which are used as a
boundary condition in reservoir and estuarine models. In estuaries subject to
tides, the downstream boundary condition is more complex because the flow
there is reversible; the study of estuaries therefore demands the dynamic
coupling of ocean and estuarine models. Imposing boundary conditions where the
flow is reversible requires high resolution models nested into large scale
models with a coarser computational grid. Figure 10 presents an application
using nested models to simulate the flow in the Western and Eastern Scheldt
(The Netherlands): a coarse grid model simulates the southern area of the
North Sea, computing boundary conditions to be imposed at the maritime
boundary of the estuaries.
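The one-way nesting idea can be made concrete with the small sketch below: water levels stored by a coarse outer model are interpolated in time and space onto the open-boundary nodes of the finer estuarine model. The tidal signal, grids and time steps are invented for the example; this is not the MOHID nesting code.

import numpy as np

x_coarse = np.arange(0.0, 60.0, 10.0)     # coarse-grid positions along the sea boundary [km]
t_coarse = np.arange(0.0, 25.0, 1.0)      # times at which the coarse solution is stored [h]
# Stand-in for the stored coarse solution: an M2-like wave (purely illustrative).
eta_coarse = 1.5 * np.cos(2 * np.pi * t_coarse[:, None] / 12.42
                          - x_coarse[None, :] / 50.0)        # shape (time, space)

def boundary_level(x_km, t_h):
    """Level imposed at a fine-grid boundary node: interpolate in time, then in space."""
    eta_t = np.array([np.interp(t_h, t_coarse, eta_coarse[:, j])
                      for j in range(len(x_coarse))])
    return float(np.interp(x_km, x_coarse, eta_t))

# Fine estuarine model: boundary nodes every 1 km, evaluated at t = 6.1 h.
for x in (12.0, 13.0, 14.0):
    print(f"x = {x:4.1f} km  eta = {boundary_level(x, 6.1):+.3f} m")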
Figure 11 schematically represents the process integration structure of the
MOHID model, which has two main modules, one for the water column and another
for the sediments. Between these two modules, and between the water column and
the atmosphere, there are interface modules. Between the water column and the
sediments the interface is dynamic, allowing information to pass in both
directions. The figure also presents the processes included in each module, as
well as the atmosphere, whose processes can be simulated by a meteorological
model.
Figure 12 shows the wind velocity and temperature fields used to force MOHID,
calculated by the MM5 model operated at the Secção de Ambiente e Energia of
DEM/IST (http://meteo.ist.utl.pt). The dynamic coupling of these two models is
in progress and will in the future allow improving the meteorological and
oceanic forecasts associated with small-scale processes.

Figure 10. Example of an application of the MOHID model in the Scheldt estuaries (The
Netherlands) and in the southern North Sea, using a system of nested models to impose
boundary conditions at the sea open boundary.

Figure 11. Schematic representation of the MOHID module structure: the water column is
represented in the upper part and the sediments below. For each subsystem the main modules
of the model are indicated.

Figure 12. Example of a wind velocity and temperature field calculated by the MM5 model
operated at Secção de Ambiente e Energia do DEM/IST (http://meteo.ist.utl.pt).

Figure 13. Schematic representation of the processes simulated by the hydrologic model of
MOHID.

Figure 13 schematically represents a hydrographic basin, the water fluxes and
the equations that describe the flow in each component of the basin (surface
run-off, rivers and soil). The model is forced at the surface by the
atmosphere module (precipitation, radiation, heat and evapotranspiration) and
uses topographic information, soil properties and land use to calculate the
fluxes and the properties of the water reaching the reservoirs and estuaries
downstream.
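A minimal bucket-type sketch of such a basin water balance is given below; the storage capacity, drainage coefficient and forcing values are hypothetical, and the formulation is deliberately far simpler than the MOHID hydrologic module it merely illustrates.

def bucket_step(soil_mm, rain_mm, pet_mm, capacity_mm=150.0, k_drain=0.05):
    """Advance a one-cell soil store by one day; return (soil, runoff, drainage)."""
    infiltration = min(rain_mm, capacity_mm - soil_mm)
    runoff = rain_mm - infiltration                 # saturation-excess surface runoff
    soil = soil_mm + infiltration
    et = min(pet_mm * soil / capacity_mm, soil)     # evapotranspiration limited by storage
    soil -= et
    drainage = k_drain * soil                       # slow drainage to the river network
    soil -= drainage
    return soil, runoff, drainage

soil = 60.0                                           # initial storage [mm]
forcing = [(0, 3), (25, 2), (40, 1), (0, 3), (0, 3)]  # daily (rain, PET) in mm
for day, (rain, pet) in enumerate(forcing):
    soil, q_surf, q_base = bucket_step(soil, rain, pet)
    print(f"day {day}: soil = {soil:6.1f} mm, runoff = {q_surf:4.1f} mm, baseflow = {q_base:4.1f} mm")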

4.2 The Tagus estuary: an example of a MOHID application

The Tagus estuary is one of the largest in Europe and is subject to important
urban and agricultural loads (transported by the Tagus and Sorraia rivers). It
has therefore attracted the attention of the scientific community and of
environmental managers, resulting in large amounts of data and a great number
of open questions, which makes it an excellent case for mathematical
modelling.
In the Tagus, the model has been applied in the framework of national and
international research projects and of consulting projects for companies and
state authorities, among them the Instituto da Água (the Portuguese national
water authority), SIMTEJO and SANEST. The study of trophic processes and
nutrient dynamics in the estuary, with the aim of assessing eutrophication
risks, is particularly well suited to illustrate the potential of integrated
modelling.

Figure 14 shows the current velocity field during ebb. The maximum velocities
occur in the outlet channel, giving rise to an ebb jet that, together with the
vertical mixing processes, controls the mixing of Tagus river water with ocean
water. Based on the hydrodynamics, the main ecological processes occurring in
the estuary were simulated, as well as the fate of the nutrients loaded into
the estuary. In order to integrate the results, the estuary was divided into
10 boxes and the fluxes across the interfaces of those boxes were integrated
over one year.

Figure 14. Velocity field in the Tagus estuary during ebb

Figure 15 shows the computed nitrate and phytoplankton fluxes across the box
interfaces shown in the figure, integrated over one year [36]. The figure
shows that the quantity of nitrate exported by the estuary over one year
(15 300 tons/year) is almost the same as the amount imported (14 900
tons/year, of which 11 600 tons/year from the Tagus river). It also shows that
the estuary is a net producer of phytoplankton (around 7 000 tons of carbon,
corresponding to roughly 2 000 tons of nitrogen). The combined analysis of
these results, together with those for ammonia and particulate organic
nitrogen, shows that the estuary imports nitrogen as nitrate and ammonia and
exports it as phytoplankton and dissolved organic nitrogen.
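The sketch below illustrates, with the whole-estuary figures quoted above, the kind of annual bookkeeping obtained by integrating the fluxes across the box interfaces; reducing the budget to two variables is a simplification made only for this illustration.

# Annual whole-estuary budget (tons of nitrogen per year), values quoted above / [36].
budget = {
    "nitrate":         (14_900, 15_300),   # (imported, exported)
    "phytoplankton-N": (0, 2_000),         # net production (~7 000 t C/yr)
}

for variable, (imported, exported) in budget.items():
    net = imported - exported
    role = "sink" if net > 0 else "source"
    print(f"{variable:16s} in {imported:6d}  out {exported:6d}  net {net:+6d} t N/yr"
          f" -> net {role}")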

Figure 15. Nitrate and phytoplankton fluxes in the Tagus estuary during 1 year simulated
with the MOHID model [36].

Figure 16 shows nitrate field measurements as a function of salinity (data
from the 2004/2005 estuary monitoring programme promoted by SIMTEJO). The
figure shows an approximately linear evolution of nitrate along the salinity
gradient, following a trend line with negative slope between a river
concentration of 1.5 mg N/l and a sea concentration of 0.2 mg N/l. In the
lower salinity range the points tend to be below the trend line, indicating
uptake, while in the higher salinity range they lie above the line, indicating
regeneration, which is consistent with the model results. The nearly
conservative behaviour of nitrate in the estuary is a consequence of the
limitation of primary production by light, whose penetration in the water
column is restricted by the turbidity associated with fine sediment
resuspension on the tidal flats, which account for up to 30% of the estuary's
area.
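The reasoning behind this interpretation can be made explicit with a small sketch of the conservative mixing line between the two end-members quoted above; the marine salinity of 36 and the sample values are assumptions introduced only to show how deviations from the line are read as uptake or regeneration.

RIVER_N, SEA_N, SEA_S = 1.5, 0.2, 36.0          # mg N/l, mg N/l, marine salinity (assumed)

def conservative_nitrate(salinity):
    """Nitrate expected from pure mixing of the river and sea end-members."""
    return RIVER_N + (SEA_N - RIVER_N) * salinity / SEA_S

# Hypothetical observations: (salinity, measured nitrate in mg N/l).
samples = [(5.0, 1.20), (10.0, 1.05), (25.0, 0.75), (32.0, 0.40)]

for s, measured in samples:
    expected = conservative_nitrate(s)
    reading = ("uptake (below the mixing line)" if measured < expected
               else "regeneration (above the mixing line)")
    print(f"S = {s:4.1f}: expected {expected:.2f}, measured {measured:.2f} -> {reading}")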

Figure 16. Nitrate in the Tagus estuary as a function of salinity. The linear trend suggests a
conservative behaviour, with some uptake in low salinity areas and regeneration in higher
salinity areas.

The model was used to study management scenarios, leading to the conclusion
that there is no advantage in removing nutrients of urban origin, because
primary production is limited not by nutrients but by light availability. A
scenario with a 50% reduction of the nutrient loads from the river (e.g.
through a change in agricultural practices) was also tested, showing that this
reduction is not sufficient to alter the trophic activity of the estuary.

5. FINAL REMARKS

In this paper the work done at IST towards the integrated management of water
is described, with special attention to urban wastewater modelling and to the
eutrophication of inland and coastal surface waters. Particular attention was
also paid to the issue of loads of urban origin and to the ability of the
receiving waters to receive and assimilate these loads without creating
eutrophication risks.
The text does not describe in depth the capacities of the presented models or
the capacities available at IST on this subject. However, it illustrates the
potential of integrated water management and the contribution of IST towards
this objective.
The paper also illustrates the advantage of using modelling tools in water
management. The case study of the Tagus estuary shows that integrated
modelling is one of the most efficient ways to contribute to a sustainable
management of the estuary, namely in terms of nutrient loads. MOHID has also
been used to study other management scenarios, namely regarding
microbiological contamination of the water and heavy metal contamination of
the sediments, two areas where the interaction with urban and industrial
effluent management is particularly important.

REFERENCES
1. Harremöes, P., Integrated urban drainage, status and perspectives. Water Science &
Technology, Vol 45, Nº 3, pp 1-10. IWA Publishing, 2002.
2. Di Pierro F., Djordjević S., Kapelan Z., Khu S.T., Savić D. and Walters G.A., Automatic
calibration of urban drainage model using a novel multi-objective genetic algorithm.
Water Science and Technology, 52(5), 43–52, 2005.
3. Seggelke K., Rosenwinkel K.-H., Vanrolleghem P.A. and Krebs P., Integrated operation
of sewer system and WWTP by simulation-based control of the WWTP inflow. Water
Science and Technology, 52(5), 195-203, 2005.
4. Rauch W., Bertrand-Krajewski J.-L., Krebs P., Mark O., Schilling W., Schütze M. and
Vanrolleghem P.A., Deterministic modelling of integrated urban drainage systems.
Water Science and Technology, 45(3), 81–94, 2002.

5. Erbe V. , Frehmann T., Geiger W.F., Krebs P., Londong J., Rosenwinkel K.-H. and
Seggelke K., Integrated modelling as an analytical and optimisation tool for urban
watershed management. Water Science and Technology, 46(6-7), 141–150, 2002.
6. Schütze M., Butler D. and Beck M.B., Modelling, Simulation and Control of Urban
Wastewater Systems. Springer Verlag; ISBN 1-85233-553-X, 2002.
7. Harremöes, P. and Rauch, W., Optimal design and real time control of the integrated
urban runoff system. Hydrobiologia, Nº 410, pp 177-184, 1999.
8. Henze, M.; Grady, C.P.L.; Gujer, W.; Marais, G.V.R. and Matsuo, T., Activated sludge
model Nº 1. IAWQ Scientific and Technical Report Nº 1, London, ISSN: 1010-707X,
1987.
9. Henze, M.; Gujer, W.; Mino, T. and van Loosdrecht, M., Activated sludge Models
ASM1, ASM2, ASM2d and ASM3. IWA Scientific and Technical Report Nº 9, London,
UK, 2000.
10. Takács, I.; Patry, G.G.; Nolasco, D., A dynamic model of the clarification-thickening
process. Wat. Res. 25 (10), 1263-1271, 1991.
11. Beck, M.B., Dynamic Modelling and Control Applications in Water Quality
Maintenance. Wat. Res. 10, pp 575-595, 1976.
12. Gujer, W.; Krejei, V.; Schwarzenbach, R. and Zobrist, J., Von der Kanalisation ins
Grundwasser – Charakterisierung eines Regeneignisses im Glattal. GWA, 63(7), pp 298-
311, 1982.
13. Schütze, M.; Butler, D. and Beck, B., Development of a framework for the optimization
of runoff, treatment and receiving waters. 7th Int. Conf. Urban Storm Drainage.
Hannover, 9-13, pp 1419-1425, 1996.
14. Vanrolleghem, P.A.; Fronteau, C and Bauwens, W., Evaluation of design and operation
of the sewage transport and treatment system by an EQO/EQS based analysis of the
receiving water immission characteristics. Proc. Pp 14.35-14.46, WEF Conference Urban
Wet Weather Pollution, Québec, Canada, 1996.
15. Erbe V. and Schütze M., An integrated modelling concept for immission-based
management of sewer system, wastewater treatment plant and river. Water Science and
Technology, 52(5), 95-103, 2005.
16. Ferreira, F.; Matos, J.; Teles, S., An Integrated approach for preliminary assessment of
the environmental performance of urban wastewater systems. Water, Science &
Technology, submitted, 2006.
17. Leendertse, J. J. Aspects of a computational model for long-period water-wave
propagation. Rand Corporation, Santa Monica, California, RM-5294-PR. 165 pp., 1967
18. Heaps, N.S. A two-dimensional numerical sea model. Philosophical Transactions of the
Royal Society of London, Series A, 1969
19. Spalding, D.B. A novel finite difference formulation for differential expressions involving
both first and second derivatives. Int. J. Numer. Methods in Engineering, 4:551-559,
1972.
20. Leonard, B. P. A stable and accurate convective modelling procedure based on quadratic
upstream interpolation. Comput. Meth. Appl. Mech. Eng., 19, 59–98, 1979
21. Rodi, W., The Prediction of Free Turbulent Boundary Layers by Use of a Two-equation
Model of Turbulence, PhD Thesis, Imperial College, University of London, UK, 1972
22. Blumberg, A. F. and G. L. Mellor. A description of a three-dimensional coastal ocean
circulation model. Three-Dimensional Coastal Ocean Models, ed. N. Heaps. Vol. 4, 208
pp. American Geophysical Union, 1987
23. Pacanowski, R. C., K. W. Dixon and A. Rosati: GFDL Modular Ocean Model, Users
Guide Version 1.0, GFDL Tech. Rep., 2, 46 pp., 1991

24. Nihoul, J.C.J., Deleersnijder, E., and Djenidi, S. Modelling the general circulation of
shelf seas by 3D k - epsilon models. Earth Science Reviews, 26 pages 163-189, 1989
25. Burchard, H., K. Bolding, and M. R. Villarreal, GOTM - a general ocean turbulence
model. Theory, applications and test cases, Tech. Rep. EUR 18745 EN, European
Commission, 1999
26. Di Toro, D.M., Fitzpatrick, J.J., and Thomann, R.V. 1983. Water Quality
Analysis Simulation Program (WASP) and Model Verification Program (MVP)
Documentation. Hydroscience, Inc. Westwood, NY. USEPA Contract No. 68-01-3872.
27. Ruardij, P., and J. W. Baretta. The Ems-Dollart Ecosystem Modelling Workshop.
BOEDE Publ. en Versl. No. 2, Texel, 1982
28. Neves, R. J. J. - Étude Expérimentale et Modélisation des Circulations Transitoire et
Résiduelle dans l’Estuaire du Sado, Ph. D. Thesis, Univ. Liège, 371 pp., 1985 (in
French)
29. Silva, A.J.R., Modelação Matemática Não Linear de Ondas de Superfície e de Correntes
Litorais, Tese apresentada para obtenção do grau de Doutor em Engenharia Mecânica.
IST, Lisboa, 1991 (in Portuguese)
30. Portela, L.I., Mathematical modelling of hydrodynamic processes and water quality in
Tagus estuary, Ph.D. thesis, Instituto Sup. Técnico, Tech. Univ. of Lisbon, 1996. (in
Portuguese)
31. Santos, A.J.P. Modelo hidrodinâmico tridimensional de circulação oceânica e estuarina.
Tese de doutoramento. Instituto Superior Técnico, Universidade Técnica de Lisboa, 273
pp., Lisboa, 1995 (in Portuguese)
32. Martins, F. Modelação Matemática Tridimensional de escoamentos costeiros e
estuarinos usando uma abordagem de coordenada vertical genérica. Universidade
Técnica de Lisboa, Instituto Superior Técnico. Tese de Doutoramento, 2000 (in
Portuguese)
33. Leitão, P., Integração de Escalas e de Processos na Modelação do Ambiente Marinho,
Universidade Técnica de Lisboa, Instituto Superior Técnico. Tese de Doutoramento,
2003 (in Portuguese)
34. Braunschweig, F., P. Chambel, L. Fernandes, P. Pina, R. Neves, The object-oriented
design of the integrated modelling system MOHID, Computational Methods in Water
Resources International Conference, Chapel Hill, North Carolina, USA, 2004
35. Trancoso, A., Saraiva, S., Fernandes, L., Pina, P., Leitão, P. and Neves, R., Modelling
Macroalgae using a 3D hydrodynamic ecological model in a shallow, temperate estuary,
Ecological Modelling, 2005
36. Saraiva, S., Pina, P., Martins, F., Santos, M., Braunschweig, F., Neves, R., EU-Water
Framework: dealing with nutrients loads in Portuguese estuaries, Hydrobiologia, 2006
(accepted for publication)
37. Mateus, M., A Process-Oriented Biogeochemical Model for Marine Ecosystems
Development. Numerical Study and Application. Universidade Técnica de Lisboa,
Instituto Superior Técnico. Tese de Doutoramento (submitted), 2006
38. Galvao, P., Chambel-Leitao, P., Neves R. and Leitao P., A different approach to the
modified Picard method for water flow in variably saturated media, Computational
Methods in Water Resources, Part 1, Developments in Water Science, Volume 55,
Elsevier, 2004
39. Braunschweig, F., Neves, R., Catchment modelling using the finite volume approach,
final report of the project http://www.tempQsim.net, Instituto Superior Técnico, 2006

40. Leitão, P., Coelho, H., Santos, A., Neves, R. et al., Modelling the main features of the
Algarve coastal circulation during July 2004: a downscaling approach. Journal of
Atmospheric and Ocean Science, 2006 (submitted)
PART VII

PUBLIC HEALTH, FOOD QUALITY AND SAFETY
FOOD SAFETY CRISIS MANAGEMENT AND
RISK COMMUNICATION
The example of Bovine Spongiform Encephalopathy in
Portugal

Virgilio Almeida
Faculdade de Medicina Veterinária, Universidade Técnica de Lisboa, Pólo Universitário da
Ajuda, Avenida da Universidade Técnica, 1300-477 Lisboa, Portugal, vsa@fmv.utl.pt

Abstract: This paper discusses the combination of factors that fuelled the BSE crisis in
Portugal and highlights the major changes that the BSE epidemic forced upon the
beef chain and on products of bovine origin. Emphasis is placed on the dilemma of
communicating one's way out of a food safety crisis.

Key words: BSE, risk, communication, consumer.

1. BSE BACKGROUND

Bovine Spongiform Encephalopathy (BSE) was diagnosed for the first time in
England in 1986 as a fatal neurological condition [1], and it is probably the
single infectious disease that most contributed to placing food safety and the
international trade of live animals and products of animal origin at the top
of the European and world political agenda.
The challenge was, and remains, very high: a European-level epidemic of an
emergent disease, caused by an unconventional agent and spread by the trade of
contaminated meat and bone meal (MBM), which does not produce a measurable
inflammatory response that could allow the in vivo detection of infected
bovines or the induction of artificial protection through vaccination. The
scale of the hazard grew when the agent, adapted to cattle, crossed the
species barrier and infected first felines and later human beings.
Portugal was a victim of its own strategy to control the epidemic of
Contagious Bovine Pleuropneumonia (CBPP) that occurred in the northern regions
of Entre-Douro e Minho and Trás-os-Montes in the early 1980s. CBPP is included
in the list of diseases notifiable to the World Organization for Animal Health
(O.I.E.) and its eradication is considered a priority because the disease can
spread rapidly, a proportion of bovines become carriers of Mycoplasma mycoides
subsp. mycoides, and the available vaccines do not elicit good immunity and
make the use of serological tests impracticable. The recommended eradication
strategy is therefore a combination of serological screening and the sacrifice
of reactors.
Producers needed to replace the slaughtered bovines and many were forced to
buy replacement stock, mainly heifers. Where a herd-level CBPP prevalence
higher than 30% was detected, stamping out was performed, followed by
disinfection of the premises and quarantine, with payment of compensation. In
some parishes of the Entre-Douro e Minho Region, total cattle depopulation was
carried out. As the national supply of replacement bovines, mainly dairy
heifers, could not cope with the demand in either quantity or quality,
Portuguese farmers decided to import. During 1985-89, 8648 dairy heifers were
imported from England, 51.8% of which went to dairy herds in the Entre-Douro e
Minho Region [2].
Remarkably, this option was only feasible due to the 1983 imposition of "Milk
Quotas" by the European Economic Community (EEC). This policy was set up to
curb the milk surplus and the increasing cost of processing and storing
powdered milk for which there was no demand on the international market. The
immediate result was that EEC dairy producers needed to cull bovines to keep
total milk production within their quotas. This opened an "opportunity window"
for Portuguese dairy farmers to buy heifers of good genetic potential at
reasonable prices.

2. BSE EPIDEMIC

In 1990 the National Veterinary Laboratory (LNIV) confirmed the first BSE
case, in a dairy cow imported from England. Another six cases, all in dairy
cows imported from England, followed. From 1993 onwards, BSE was diagnosed in
indigenous cattle.
It is ironic that at the root of the Portuguese BSE epidemic lies the need to
buy replacement bovines created by the control measures for another epidemic
(CBPP) and by an EEC policy to regulate the milk surplus. Between 1990 and
2005, 1001 BSE cases were notified in Portugal, and only one case of variant
Creutzfeldt-Jakob Disease (vCJD) [3]. As illustrated in Figure 1, the epidemic
entered its regression phase in 1999. The main birth cohort is that of 1994,
with 257 BSE cases.

These results reflect (i) the suitability of the sequence of Public Health and
Animal Health prevention and control measures applied by Portugal and/or by
the European Union as scientific knowledge was produced and disseminated, and
(ii) the efficacy, since 1994, of the implementation in Portugal of a legal
framework supporting a strict combination of surveillance and control
measures.

3. THE RESPONSE TO THE PUBLIC AND ANIMAL HEALTH THREAT

The European and Portuguese scientific and technical communities showed a very
high capacity for research, innovation, technology transfer and flexibility in
coping with the epidemic:

(i) New diagnostic tests were developed, tested and commercialized. Their
availability raised in Portugal the need to set up trained abattoir teams to
collect brain samples and perform the diagnostic test; to install at the
abattoirs the equipment needed to run the test; and to reinforce the freezing
capacity of the plants, because the test result takes 48 hours. Since January
2001, only bovine carcasses that test negative can be approved for human
consumption. This represents 100% coverage by a very costly and unprecedented
strategy to protect consumers.
(ii) A very exhaustive active monitoring scheme was designed, set up and
supervised at European level. More than 41 million bovines have been tested
since 2001. The monitoring targets the major risk groups, such as emergency
slaughters, fallen stock or bovines with clinical signs of disease at the ante
mortem examination at the abattoir. In 2004, 11 049 822 bovines were tested in
the 25 EU member states [4], of which 115 017 in Portugal. The National
Sanitary Authority showed good engagement with the cattle industry, setting up
a network to collect fallen stock from farms so as to ensure the continuous
study of a representative sample of this risk group. Until 2001 there were no
public or private firms offering carcass removal services and the immediate
submission of biological samples to diagnostic laboratories.
(iii) Powerful information systems were developed to secure the traceability
of live bovines and of products of bovine origin. In Portugal the National
Cattle Identification and Recording System (SNIRB) has been operational since
1999. It allows tracing back the feed cohorts of a single cow with BSE, and
its precision allows the rapid detection of bovines that may be incubating
BSE. At EU level, the setting up of these traceability circuits culminated in
the approval of a compulsory EU labelling system for bovine meat, in force
since September 1, 2000 (Regulation (EC) nº 1760/2000 of the European
Parliament and Council; Commission Regulation (EC) nº 1825/2000 of August 25;
Law nº 323-F/2000 of December 20; Dispatches nº 25 958-B/2000 and nº
10 818/2001). These measures make reliable information available to the
consumer, increasing the transparency of the supply chain.

Figure 1. Timing of major control measures during the BSE epidemic in Portugal (Nº of cases
per year of death)

4. RISK COMMUNICATION EMBARRASSMENTS

However, the risk communication strategy (or the absence of one) did not meet
the high standards of scientific and technical performance described in the
previous paragraphs. Risk communication is an interactive process of exchange
of information and opinion on risk among risk assessors, risk managers and
other interested parties [5].
Risk communication is a key piece in the demanding puzzle of risk analysis.
Europe and Portugal accumulated gross errors of risk communication regarding
the BSE epidemic. This failure left room for ignorance. We were not capable of
dealing with the media, which amplified fears, and we could not protect our
beef production chain, which was temporarily shut out of the international
market. If we consider the main factors influencing consumer attitudes, namely
(i) distrust of institutions, (ii) increasing use of and dependence upon
advanced technologies, (iii) realization of the lack of consequences at a
global scale, (iv) comprehension of how our options and behaviours contribute
to the hazard probability and (v) lack of confidence in the risk assessment
(http://www.ca.uky.edu/agripedia), we realize the urgency of establishing
pluridisciplinary teams of professionals from the Natural and Social Sciences
(Figure 2) to reduce the odds of raising, during risk communication, a
"nebula" of suspicion that a speculative opinion or an alarmist position may
easily turn into a wave of consumer panic.

Figure 2. Pluridisciplinary approach of risk communication.

In fact, the BSE epidemic in Portugal is an appropriate example of consumer
attitudes towards risk communication. The first six cases of BSE, confirmed by
the National TSE Reference Laboratory, were kept secret by the Chief
Veterinary Office [6]. The revelation of this situation by the media gave rise
to a widely publicised political controversy, which was brought to an end by
the Government declaring that the evidence of BSE was not valid. This
underscores the weaknesses of separating agricultural and medical science, and
of allowing the Ministry of Agriculture to protect the interests of both food
consumers and the farming industry. When the first BSE case in an indigenous
cow was confirmed in 1993, the politicians were forced to declare that BSE was
present in the indigenous cattle population. This profusion of contradictory
signals led to a 61.3% drop in the consumption of beef and bovine meat
products [7]. Three years later, Will and co-workers first described variant
Creutzfeldt-Jakob disease [8]. Again the politicians were forced to announce
that some bovine tissues might not be safe, after having claimed that they
were totally safe. Consumers felt doubly deceived, and the consumption of beef
and bovine meat products fell by 51% [9]. After all, the most likely cause of
vCJD is exposure to the BSE agent, most plausibly through dietary
contamination by affected bovine central nervous system tissue
(www.who.int/mediacentre/factsheets/fs180/en/).

5. CRISIS AS AN OPPORTUNITY

Meanwhile, the country delayed by four years (1990-1993) the implementation of
preventive measures to contain a probable epidemic that was underestimated.
Indeed, the shape of the epidemic curve reveals two waves (Figure 1): a first
wave, from 1990 up to 1997, with a very smooth rise in the number of cases,
triggered by the initial external BSE challenge; and a second wave, from 1998
onwards, marked by a sharp increase in the number of cases, reaching a peak in
1999. The second wave was the outcome of the progressive manufacture and
circulation of BSE-contaminated domestic MBM, because Portuguese rendering
plants operated atmospheric batch or continuous atmospheric processes that
could not destroy the BSE agent [10]. From 2000 the epidemic began to decline,
due to the ban on MBM in ruminant feed enforced in July 1994.
If this measure had been imposed in 1990: (i) the BSE incidence would not have
reached 200 cases per million bovines over 24 months old in 1999; (ii)
consumer exposure to the BSE agent would have been strongly reduced (a
stochastic simulation model developed at the Veterinary Epidemiology &
Economics Research Unit (UISEE) of the Lisbon Veterinary Faculty estimates
that until 2001 an infectivity equivalent to at least 298 clinical BSE bovines
entered the food chain in Portugal [11]); and (iii) the export of meat and
live cattle from Portugal would not have been banned by the European
Commission in November 1998. The EU embargo was lifted only six years later
(September 2004), but since Portuguese beef producers target above all the
internal market, the negative impact of the prohibition was minor. It is
rather interesting that the risks of BSE/vCJD perceived by Portuguese
consumers led part of them to look for beef from local breeds raised in
extensive production systems located on the large plains of the South or in
the mountain areas of the Northeast. This demand encouraged breeders'
associations to apply for Protected Designation of Origin (PDO) labels. Seven
beef PDO labels have been certified, but their total supply is very limited
(≈1.5 tonnes in 2000). With demand exceeding supply, and with BSE cases being
detected by the active monitoring programme throughout Europe, Brazilian
exports of beef to Portugal increased rapidly.

6. CONCLUSIONS

The BSE crisis began as an epidemic of a novel disease. It was followed by a
domino sequence of ambivalent risk management and poor risk communication. The
disease was then faced with a very powerful combination of tools, from the
testing of all bovines over 30 months old slaughtered for human consumption,
to the removal of SRM from the feed chain, to the use of sophisticated
electronic tracking systems. The reduction in beef consumption gave clear
signals to the market, stimulating the supply of PDO-labelled beef. In the
meantime, South American countries like Brazil, very aggressive in
agribusiness, took advantage of this "opportunity window" and got hold of a
niche market. The epidemic is expected to die out in Portugal by 2009 [12] and
no further vCJD cases have been confirmed in the country.
Finally, it should be emphasized that the Veterinary Authorities now have
available a collection of tools, and their human resources have developed a
set of skills, that will certainly be precious in coping with future food
safety emergencies.

ACKNOWLEDGEMENTS

Dr.Telmo Nunes and all the members of the Veterinary Epidemiology &
Economics Research Unit (UISEE) of the Centre for Interdisciplinary
Research in Animal Health, Lisbon Veterinary Faculty, Lisbon Technical
University.

NOTES
1. The following EU Concerted Actions supported part of the scientific results
mentioned in this paper:
2. FAIR 98-6056: Setting up of multicentric epidemiological databases and biological
sample banks for small ruminant scrapie.
3. FAIR 98-7021: Establishment of a European network for the surveillance of ruminant
TSE and the standardization and harmonization of the process and criteria for
identification of suspect cases.
4. SRTSNETWORK: European Network for Surveillance and Control of TSE in Small
Ruminants.

REFERENCES
1. Wells, G.A., Scott, A.C., Johnson, C.T., Gunning, R.F., Hancock, R.D., Jeffrey, M.,
Dawson, M. and Bradley, R. (1987). A novel progressive spongiform encephalopathy in
cattle. Vet Rec, 121, pp 419-420
2. Almeida, V., Nunes, T., Vaz, Y., Neto, I., Melo, M. and Louzã, A.C. (2002). BSE in
Portugal – a 12 year epidemic. Proceedings of the 10th International Symposium for
Veterinary epidemiology and Economics, 17-21 November 2003, Viña del Mar, Chile,
pp 849.
3. Direcção-Geral da Saúde. Ministério da Saúde. Variante da Doença de Creutzfeldt-
Jakob. Press release, 9 June 2005.
4. Health & Consumer Protection Directorate-General (2005) Report on the monitoring and
testing of ruminants for the presence of transmissible spongiform encephalopathy (TSE)
in the EU in 2004. European Communities, 2005, pp 3-4
5. Joint FAO/WHO Expert Consultation (1998) The Application of Risk Communication
to Food Standards and Safety Matters, Joint FAO/WHO Expert Consultation. Rome,
Italy, 2-6 February 1998, pp.6
6. Gonçalves, M., 2000. The importance of being European: The Science and Politics of
BSE in Portugal. Science, Technology & Human Values 25: 417-448.
7. Almeida, J.F. et al (2001). Resumo 2001 do II Inquérito Nacional “Os Portugueses e o
Ambiente”, OBSERVA - Observatório Permanente do Ambiente, Sociedade e Opinião
Pública, pp 8-9
8. Will, R.G., Ironside, J.W., Zeidler, M., Cousens, S.N., Estibeiro, K., Alperovitch, A.,
Poser, S., Pocchiari, M., Hofman, A. and Smith, P.G. (1996). A new variant of
Creutzfeldt-Jakob disease in the UK. Lancet, 347, pp 921-925
9. Almeida, MDV and Graça P. (2000) A BSE e as atitudes dos consumidores. In: Cultura
científica e participação pública. Mª Eduarda Gonçalves (ed). Celta Editora, Oeiras 2000,
pp 243-254
10. Almeida V. (2005). Encefalopatia espongiforme bovina. Parecer para Agência
Portuguesa de Segurança Alimentar, 2005. (available for download at
www.agenciaalimentar.pt).
11. Nunes, T. (2003). Potencial de exposição do consumidor Português ao agente da
encefalopatia espongiforme bovina no período de 1987 a 2001. Dissertação de Mestrado
Mestrado de Saúde Pública Veterinária, Faculdade de Medicina Veterinária,
Universidade Técnica de Lisboa, pp 54-60
12. Report of the EFSA Working Group on the determination of the BSE risk status of
Portugal, Annex to The EFSA Journal (2004) 143, pp 9.
DEBARYOMYCES HANSENII, A SALT LOVING
SPOILAGE YEAST

Catarina Prista1 and Maria C. Loureiro-Dias2


Instituto Superior de Agronomia, Universidade Técnica de Lisboa, Calçada da Tapada,
1349-017 Lisboa, Portugal

1 cprista@isa.utl.pt, 2 mcdias@isa.utl.pt

Abstract: Debaryomyces hansenii is a very peculiar spoilage microorganism: this yeast


shows a good performance under concentrations of sodium chloride which
prevent growth of most microorganisms. Here we report aspects of this
behaviour and present data which support the theory that the salt loving nature
of D. hansenii can be explained by the capability of the membrane potassium
carriers to transport potassium into the cells, even in the presence of high
concentrations of sodium.

Key words: Debaryomyces hansenii, yeast, sodium tolerance, spoilage.

1. INTRODUCTION

Yeasts constitute a group of microorganisms that became famous through


their representative Saccharomyces cerevisiae. These organisms have long
been utilized to ferment the sugars of cereals and fruits to produce wine,
beer, other alcoholic beverages and in baking industry to raise dough. One
can say that this yeast, together with wheat, constitute certainly one of the
main pillars of western civilization. Since yeasts were discovered by Pasteur,
and in particular during the last half a century, the existence of many
different yeast species was recognized, with more than one thousand species
being considered today. Yeasts are unicellular fungi that can be found in a
wide variety of natural habitats. They are common on plant leaves, flowers
and fruits, soil and salt water. Yeasts are also found on the skin surface and
in the intestinal tracts of warm-blooded animals. In particular, yeasts are


very frequent in food environments, where they can behave either as productive
agents or be responsible for spoilage.

1.1 Food preservation strategies

Since the dawn of agriculture, man dealt with the necessity of preserving
food. Cereals can be easily preserved: nature produces dry seeds that can
wait for the right weather conditions to germinate. Man just had to follow
the philosophy of nature to preserve the grains, but the situation can be very
different with other crops. Grapes, for example, are produced in large
quantities during a very short period. Yeasts, like S. cerevisiae, cooperated
with man to solve the problem of grape must preservation: very rapidly they
can convert very perishable sugars into ethanol. Only carbon dioxide is
released, while most of the nutrients, and in particular the energy, stay in
the wine: during alcoholic fermentation only approximately 5% of the free
energy of the sugar is lost, 95% remaining in the ethanol.
Food preservation involves essentially two strategies: either microbes are
eliminated by sterilization, or harsh conditions that generate stress are
created, preventing microbial growth. Stress conditions include low water
activity achieved by the addition of salt or sugar, or by drying, as in salted
codfish, jams and dried fruits like raisins. Traditionally, lactic acid bacteria
also cooperated in the generation of harsh acidic conditions for the
preservation of milk (yogurt, cheese), meat (sausages, ham and chouriços)
and vegetables (pickles, olives), inventing our modern delicacies. Compared
with these methodologies, only very recently did the introduction of low
temperatures and artificial food preservatives represent a revolution in the
capability of mankind to manage food.

1.2 Spoilage yeasts

Yeasts are important contaminants causing spoilage in foods with high


and medium sugar content (e.g. fruit concentrates, sugar syrups, jams,
honey) and in drinks (e.g. wines, fruit juices) [6]. In this kind of
environment, bacterial growth is restricted due to low water activity (caused
by high sugar or salt concentrations), low pH and/or the addition of acidic
preservatives, whereas yeast growth is favoured, spoiling the food. Yeasts may
therefore be responsible for heavy economic losses in the food industry. An
important point about spoilage yeasts is that they are not vectors of disease
and are not involved in safety problems, but they should still be taken into
account where food preservation is concerned.
Among spoilage yeasts, a peculiar group, whose best representative is
Debaryomyces hansenii, grows in the presence of high concentrations of NaCl.
During the last decade a considerable effort has been put into the study and
comprehension of the mechanisms mediating salt tolerance in cell-walled
eukaryotic organisms in general and in spoilage yeasts in particular [10,14].
Saccharomyces cerevisiae, a moderately tolerant yeast, has been considered a
model in these studies, and a substantial amount of information concerning the
processes involved in salt tolerance is now available [11,12,19].

2. DEBARYOMYCES HANSENII AMONG SALT-TOLERANT MICROORGANISMS

To overcome the toxicity of sodium and exhibit a good performance under high
salt concentrations, most microorganisms use the strategy of keeping a low
intracellular concentration of sodium (the "sodium excluders"). A few, in
particular some Halobacteria, require high intracellular concentrations of
sodium for normal enzymatic activities (the "sodium includers") [17]. Although
mechanisms of sodium extrusion are present in D. hansenii, several authors
have reported unusually high intracellular salt concentrations in this yeast
[8, 14]. From this point of view, D. hansenii may, to some extent, be
considered a "sodium includer" yeast; still, an important role is reserved for
the production and intracellular retention of compatible solutes, glycerol in
particular [1, 4, 9].
The peculiar behaviour of this yeast, together with its ubiquity in salty
environments, justified its selection by the Génolevures consortium for genome
sequencing and annotation, now available at http://cbi.labri.fr/Genolevures/.
The reported data led to the conclusion that D. hansenii has a high coding
capacity among yeasts, amounting to 79.2% of the genome, with a putative
number of 6906 detected coding sequences [7].

2.1 What is so special about Debaryomyces hansenii?

In S. cerevisiae, while growth in mineral medium is completely inhibited by
1.5 M sodium chloride, potassium chloride has only a weak inhibitory effect at
the same concentration. In contrast, D. hansenii was able to grow at sodium
chloride concentrations up to 2.5 M, and its growth was even stimulated by
0.5 M salt. In this yeast the inhibitory effect of NaCl was identical to that
of KCl, indicating that no specific toxic effect of sodium is involved [14].
Both yeasts were able to grow at potassium concentrations as low as 50 µM, but
under these conditions, while the growth of S. cerevisiae was completely
inhibited by 0.6 M NaCl, the growth of D. hansenii was stimulated by NaCl at
concentrations up to 1 M [14].
Under several stress conditions, the salt-loving nature of D. hansenii was
even more evident [3]. This feature was especially apparent when D. hansenii
and S. cerevisiae were grown close to their maximum growth temperature. At
34 ºC, sodium chloride clearly stimulated the growth of D. hansenii and
inhibited that of S. cerevisiae (Fig. 1). D. hansenii, incapable of growing at
34 ºC in the absence of salt, grew with a doubling time of 7 hours in 1 M
NaCl, whereas the same shift in NaCl concentration doubled the doubling time
of S. cerevisiae. The protective effect of salt on D. hansenii, well
illustrated in this experiment, is of great significance in food environments.
The strategy of preservation often involves the simultaneous use of several
stress agents. For D. hansenii, salt cannot be considered a stress agent; on
the contrary, it has a protective effect against other preservation strategies
[3].

Figure 1. Effect of NaCl on growth of D. hansenii and of S. cerevisiae at 34 ºC, a supra-optimal temperature.

2.2 Is Debaryomyces hansenii especially apt to get rid of sodium?

It has been known for some years that sodium extrusion is a fundamental process for yeasts like S. cerevisiae when grown in the presence of salt [10], and even for more osmotolerant yeasts, like Pichia sorbitophila and Zygosaccharomyces rouxii [5, 20]. In the search for the basis of salt tolerance in D. hansenii, the sodium efflux process has been studied in some detail.
The existence of sodium efflux processes in D. hansenii was confirmed
by using radioactive 22Na+ and lithium (a sodium transport analogue) and
measuring cation extrusion [14]. However, a stronger efflux process in D.
hansenii than in S. cerevisiae was not observed. Two genes specifically
involved in salt extrusion were identified in D. hansenii. These genes code
for Na+-ATPases and they were cloned and characterized. Because of their
homology with the ENA genes from D. occidentalis and S. cerevisiae, they
were designated DhENA1 and DhENA2 [2] (Fig. 2). Northern analysis
showed that DhENA1 was expressed in the presence of high NaCl
concentrations, while the expression of DhENA2 also required high pH.
Heterologous expression of the genes in a mutant of S. cerevisiae lacking the
sodium efflux systems and sensitive to NaCl, recovered sodium tolerance
and the ability to extrude the cation [2]. It is important to stress that this
recovered tolerance was still far from the tolerance level of D. hansenii.
Therefore, the conclusion was that sodium extrusion alone is insufficient to
explain the high salt tolerance of D. hansenii.

2.3 What is peculiar with the uptake of potassium in Debaryomyces hansenii?

In the first reported work on ion fluxes in D. hansenii [13], potassium and sodium retention were determined and, on the basis of long-term
transport experiments, the authors concluded that the ratio of potassium to
sodium is higher in D. hansenii than in S. cerevisiae. An additional
observation in the same report was that higher NaCl concentrations were
required to inhibit the total uptake of potassium in D. hansenii than in S.
cerevisiae.
More recently, the kinetic parameters of rubidium (a potassium transport
analogue) uptake in D. hansenii were determined and it was concluded that
this transport system was not more efficient than the one in S. cerevisiae
[15]. It was also shown that at pH 4.5, 50 mM NaCl activated the transport
of rubidium (potassium) in D. hansenii, while the effect was opposite in S.
cerevisiae. These results fit with results published previously by Norkrans
indicating that the NaCl concentration required to inhibit the total uptake of
potassium was higher in D. hansenii than in S. cerevisiae [13].
Very recent results obtained by Prista and Loureiro-Dias show the existence of genes orthologous to the TRK (for TRansport of K+) and HAK (for High Affinity K+ transporters) genes [16] already reported in other yeasts (see [18] for a review). These two transporters (Fig. 2) seem to be very similar to their homologues from D. occidentalis. These gene products could be responsible for all or most of the K+ or Na+ influx previously described by different authors. Kinetic studies of growth and K+ uptake indicate that sodium does not prevent the uptake of K+.

[Figure 2 schematic: plasma membrane transporters Trk1 and Hak1 (K+/Na+ and K+ uptake), Ena1,2 (Na+ efflux ATPases) and Pma1 (H+-ATPase), with ATP/ADP turnover on the cytosol side.]

Figure 2. Schematic representation of transporters involved in cation fluxes in D. hansenii. Sequences of genes corresponding to all the proteins have been identified.

In the framework of these studies we identified what is certainly a very important mechanism of halotolerance. In most organisms, sodium is a
competitive inhibitor of potassium uptake. This means that when the
concentration of sodium is high, sodium enters the cells instead of
potassium, creating a situation of potassium starvation. Organisms do not
grow, because potassium is not available. The important achievement with
D. hansenii is the recognition that transport of potassium is only moderately
affected by sodium (in some cases it is even stimulated). Potassium
starvation does not occur and the cells present a good performance in high
sodium concentrations.
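This contrast can be made concrete with standard Michaelis-Menten transport kinetics under competitive inhibition, v = Vmax[K+]/(Km(1 + [Na+]/Ki) + [K+]). The Python sketch below is a minimal illustration, not a model fitted to D. hansenii data: all kinetic constants are hypothetical and were chosen only to show how a low Ki for sodium (strong competition, the situation in most organisms) collapses potassium uptake at high NaCl, whereas a very high Ki (weak competition, a D. hansenii-like transporter) leaves it almost untouched.

    def k_uptake(k_mM, na_mM, vmax, km_mM, ki_na_mM):
        """Michaelis-Menten K+ uptake with Na+ as a competitive inhibitor."""
        return vmax * k_mM / (km_mM * (1.0 + na_mM / ki_na_mM) + k_mM)

    K, NA = 0.05, 500.0        # 50 uM K+ and 0.5 M Na+, as orders of magnitude
    VMAX, KM = 10.0, 0.05      # hypothetical kinetic constants, arbitrary units

    v_reference = k_uptake(K, 0.0, VMAX, KM, ki_na_mM=1.0)        # no salt
    v_typical = k_uptake(K, NA, VMAX, KM, ki_na_mM=1.0)           # strong competition
    v_tolerant = k_uptake(K, NA, VMAX, KM, ki_na_mM=10_000.0)     # weak competition

    print(f"No NaCl:                        v = {v_reference:.2f}")
    print(f"0.5 M NaCl, low Ki (typical):   v = {v_typical:.4f}  (K+ starvation)")
    print(f"0.5 M NaCl, high Ki (tolerant): v = {v_tolerant:.2f}  (uptake preserved)")

With these illustrative numbers, uptake at 0.5 M NaCl drops by more than two orders of magnitude in the sensitive case but barely changes in the tolerant one.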

2.4 New perspectives

The recent publication of the whole genome sequence of D. hansenii offered new possibilities for an integrated approach to the understanding of halotolerance/halophily. So far, no genes specifically responsible for the increased halotolerance of D. hansenii have been found. Probably, besides improved potassium uptake, halotolerance requires the cooperative effect of several factors, and we expect that new interesting genes will be found. The development of molecular tools for the manipulation of D. hansenii genes is an urgent task. New clues on halotolerance in D. hansenii will bring new perspectives on the roles of D. hansenii in food environments, negative in some cases and positive in others, and the control of both food spoilage and of cheese and sausage fermentations by D. hansenii will certainly be improved.

ACKNOWLEDGEMENTS

This work was partially supported by Fundação para a Ciência e a Tecnologia (Project POCTI 2000/BIO/32749). C.P. is a post-doctoral fellow (SFRH/BPD/20263/2004) from FCT, Portugal.

REFERENCES
1. Adler L, Blomberg A, Nilsson A. “Glycerol metabolism and osmoregulation in salt-
tolerant yeast Debaryomyces hansenii”, Journal of Bacteriology, 162, pp. 300-306, 1985.
2. Almagro A, Prista C, Benito B, Loureiro-Dias MC, Ramos J. “Cloning and expression of
two genes coding for sodium pumps in the salt-tolerant yeast Debaryomyces hansenii”.
Journal of Bacteriology, 183, pp. 3251-3255, 2001.
3. Almagro A, Prista C, Castro S, Quintas C, Madeira-Lopes A, Ramos J, Loureiro-Dias
MC. “Effects of salts on Debaryomyces hansenii and Saccharomyces cerevisiae under
stress conditions”. International Journal of Food Microbiology, 56, pp. 191-197, 2000.
4. André L, Nilsson A, Adler L. “The role of glycerol in osmotolerance of the yeast
Debaryomyces hansenii”. Journal of General Microbiology, 134, pp. 669-677, 1988.
5. Bañuelos MA, Ramos J, Calero F, Braun V, Potier, S. “Cation/H+ antiporters mediate
potassium and sodium fluxes in Pichia sorbitophila. Cloning of the PsNHA1 and
PsNHA2 genes and expression in Saccharomyces cerevisiae”. Yeast, 19, pp. 1365-1372,
2002.
6. Deak T, Beuchat LR. Handbook of Food Spoilage Yeasts, Boca Raton, CRC Press, 1996.
7. Dujon B, Sherman D, Fischer G, Durrens P, Casaregola S, Lafontaine I, De Montigny J,
Marck C, Neuveglise C, Talla E, Goffard N, Frangeul L, Aiglem M, Anthouard V,
Babour A, Barbe V, Barnay S, Blanchin S, Beckerich JM, Beyne E, Bleykasten C,
Boisrame A, Boyer J, Cattolico L, Confanioleri F, De Daruvar A, Despons L, Fabre E,
Fairhead C, Ferry-Dumazet H, Groppi A, Hantraye F, Hennequin C, Jauniaux N, Joyet P,
Kachouri R, Kerrest A, Koszul R, Lemaire M, Lesur I, Ma L, Muller H, Nicaud JM,
Nikolski M, Oztas S, Ozier-Kalogeropoulos O, Pellenz S, Potier S, Richard GF, Straub
ML, Suleau A, Swennen D, Tekaia F, Wesolowski-Louvel M, Westhof E, Wirth B,
Zeniou-Meyer M, Zivanovic I, Bolotin-Fukuhara M, Thierry A, Bouchier C, Caudron B,
Scarpelli C, Gaillardin C, Weissenbach J, Wincker P, Souciet JL. “Genome evolution in
yeasts”. Nature, 430, pp. 35-44, 2004.
8. Gonzalez-Hernandez JC, Cardenas-Monroy CA, Peña A. “Sodium and potassium
transport in the halophilic yeast Debaryomyces hansenii”. Yeast, 21, pp. 403-412, 2004.
9. Gori K, Mortensen HD, Arneborg N, Jespersen L. “Expression of the GPD1 and GPP2
orthologues and glycerol retention during growth of Debaryomyces hansenii at high
NaCl concentrations”. Yeast, 22, pp. 1213-22, 2005.
10. Haro R, Bañuelos MA, Quintero FJ, Rubio F, Rodríguez-Navarro A. “Genetic basis of sodium exclusion and sodium tolerance in yeast. A model for plants”. Physiologia Plantarum, 89, pp. 868-874, 1993.
11. Hirasawa T, Nakakura Y, Yoshikawa K, Ashitani K, Nagahisa K, Furusawa C, Katakura
Y, Shimizu H, Shioya S. “Comparative analysis of transcriptional responses to saline
stress in the laboratory and brewing strains of Saccharomyces cerevisiae with DNA
microarray”. Applied Microbiology and Biotechnology, 70, pp. 346-57, 2006.
12. Hohmann S. “Osmotic stress signalling and osmoadaptation in yeasts”. Microbiology
and Molecular Biology Reviews. 66, pp. 300-372, 2002.
13. Norkrans B, Kylin A. “Regulation of the potassium to sodium ratio and of the osmotic
potential in relation to salt tolerance in yeasts”. Journal of Bacteriology, 100, pp. 836-
845, 1969.
14. Prista C, Almagro A, Loureiro-Dias MC, Ramos J. “Kinetics of cation movements in Debaryomyces hansenii”. Folia Microbiologica (Praha), 43, pp. 212-214, 1998.
15. Prista C, Almagro A, Loureiro-Dias MC, Ramos J. “Physiological basis for the high salt tolerance of Debaryomyces hansenii”. Applied and Environmental Microbiology, 63, pp. 4005-4009, 1997.
16. Prista C, González-Hernández JC, Ramos J, Loureiro-Dias MC. “Potassium transport systems in Debaryomyces hansenii”. Yeast, 22, pp. S184, 2005.
17. Rengpipat S, Lowe SE, Zeikus JG. “Effect of extreme salt concentrations on the
physiology and biochemistry of Halobacteroides acetoethylicus”. Journal of
Bacteriology, 170, pp. 3065-3071, 1988.
18. Rodríguez-Navarro A. “Potassium transport in fungi and plants”. Biochimica Biophysica
Acta, 1469, pp. 1-30, 2000.
19. Wadskog I, Adler L. “Ion homeostasis in Saccharomyces cerevisiae under NaCl stress”. In Yeast Stress Responses (Topics in Current Genetics, Hohmann S and Mager WH, Eds.), pp. 201-239, Springer, 2003.
20. Watanabe Y, Miwa S, Tamai Y. “Characterization of Na+/H(+)-antiporter gene closely
related to the salt-tolerance of yeast Zygosaccharomyces rouxii”. Yeast, 11, pp.829-38,
1995.
THE NEW DISEASES AND THE OLD AGENTS
The Veterinarian Perspective

Yolanda Vaz and Telmo Nunes


Faculdade de Medicina Veterinária, Universidade Técnica de Lisboa, Avenida da
Universidade Técnica, 1300-477 Lisboa, Portugal, yvaz@fmv.utl.pt, tnunes@fmv.utl.pt

Abstract: This is a changing world, and Man is at the origin of many changes with an impact on animal and human health. Among them are the development of fast round-the-world transport, industrialization and its environmental effects, the increasing complexity of the food chain, urbanization, and the technological development that allows the manipulation of disease agents and has created the present information-centered society. These changes influence the re-emergence of animal and human diseases, which results, among other reasons, from microbial adaptation and change, the infection of new populations, the use of new means of dispersion, the expansion of vectors, or even from improved diagnostic capacity and better community awareness. In this work some of these aspects are discussed and the research contribution of the Faculty of Veterinary Medicine of Lisbon is briefly described.

Key words: Re-emerging diseases, zoonosis research

1. INTRODUCTION

We live in a changing world, and Man’s touch can be seen in many of the changes that influence animal and human health.
In the majority of cases the impact of those changes has been positive, allowing an unprecedented success of the human species. But the “reverse of the coin” is also unavoidable.
The development of transport around the world, for tourism and for the trade of products of animal origin, allows a fast exchange of pathogens between continents. Intensification and industrialization have resulted in a longer and more complex food chain, where raw materials come
from different origins, additives are extensively used, and production and consumption occur very far away from each other. The risk of contamination of food with biological and chemical agents might therefore increase if proactive safety methods are not correctly used. Another aspect of progress is the change in eating habits, such as the increasing consumption of ready-to-eat meals (both refrigerated and frozen), fast food, exotic food and home-delivered food. Industrialization has also originated important climatic changes, being responsible for the increased production of gaseous residues with a greenhouse effect. Global warming has, in some regions, created a favorable environment for the expansion of habitats suitable for the development of some disease agents and for the survival of insect and arthropod vectors of disease. Urbanization and the increasing density of the human population in certain areas invade and transform natural environments and generate high concentrations of waste, favoring the development of synanthropic animal species with potential for zoonosis transmission, such as rats and mice, pigeons and seagulls, cockroaches, flies and mosquitoes. Current technological development allows an easy manipulation of infectious agents with the potential to be used as biological weapons. On the other hand, this development has originated new diagnostic tools and sophisticated laboratory techniques that, together with the development of communication technologies, have enabled the identification and surveillance of several diseases, some of them perceived as “emerging diseases”, and have increased public awareness of these issues. Another result of technological development is the existence of a complex arsenal of treatments against disease. Agents, however, have also evolved and adapted, making resistance to antibiotics and other medicinal products a present public health problem.
Other changes also have a great impact in turning old agents into new diseases: armed conflicts, with the destruction of social organization and the loss of capacity for intervention by medical and veterinary services; the invasion and alteration of natural habitats, as in deforestation and irrigation; the global growth of the human population and the associated poverty and lack of proper nutrition and hygiene; and the aging of the population in certain areas and the increase of other immuno-compromised groups for other reasons.
The Veterinary Faculty of Lisbon (FMV) develops research projects in the field of animal health and veterinary public health, among other areas. Research groups are organized within an Interdisciplinary Animal Health Research Centre – CIISA. This Centre has developed research projects on some of the problems described, trying to help find solutions for national and international problems. These initiatives are referred to throughout this work.

2. CHANGES WITH IMPACT ON DISEASE OCCURRENCE

2.1 The development of transports and FMD

Migration in search of better conditions for survival has always existed. Tourism is a recent industry and still shows an increasing tendency.
According to the World Tourism Organization, 451 million international tourist arrivals were recorded in 1990. In 2004 this number increased to 763.2 million, of which Europe accounted for 416 million, representing a business of 263 billion Euros. It is expected that by 2010 this number will increase to 1,006 million, and these figures do not include population migration. Air transportation has a share of 43% and road transport 45%; the remainder is boat and rail transport [1].
Trade in agricultural products also shows an increasing trend, with a transaction value of 783 billion USD in 2004, 9% more than in 2000 [2].
The increased exchange of people, animals and products around the world increases the risk of introduction and re-introduction of diseases into countries and continents. That was the case of the large epidemic of Foot and Mouth Disease (FMD), caused by the O Pan-Asian type virus of the family Picornaviridae, genus Aphthovirus, in the United Kingdom (UK) in 2001. The last occurrence of this disease in the UK had been in 1967-68 and, although the disease was occurring in many countries of the world, public awareness was very low prior to 2001. The UK epidemic started in a pig finishing unit and the most likely source of infection was the feeding of animals with contaminated meat or meat products inadequately processed [3]. A risk assessment analysis on illegal imports, carried out by the UK Veterinary Laboratories Agency and independent consultants, estimated that around 7,500 tonnes of illegal meat are imported annually into the United Kingdom and that around 95 kg of this illegal meat could be contaminated with FMD virus. Between 20 and 680 g of this contaminated meat could be ingested by susceptible livestock, making outbreaks of this important disease possible [4]. In the 2001 epidemic, a report from the Royal Society counted 2,030 outbreaks spread across the country, with about 6 million animals culled (4.9 million sheep, 0.7 million cattle and 0.4 million pigs), either to combat the spread of the disease or as a direct consequence of disease control measures (so-called 'welfare' slaughter). This number is believed to be higher, maybe up to 10 million, if young animals killed 'at foot' and not counted for compensation purposes were included. The foot-and-mouth epidemic had serious consequences for agriculture, for tourism (in both city and country) and for other rural industries [5]. The UK FMD epidemic also affected other countries, where a further 401,000 animals were eliminated: in The Netherlands the figure was around 285,000 animals, in France around 63,000 and in Ireland 53,000 [6]. The possibilities for the introduction of the contaminated meat or meat product into the UK are illegal shipment on a commercial scale or personal imports from areas where similar viruses were circulating (South Africa and the Far East), but this remains unconfirmed. However, the commercial distribution channels helped the fast spread of the disease within the UK and to neighboring countries and were an important risk factor for the extent of the epidemic [3].
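The attenuation implied by the risk-assessment figures quoted above can be followed step by step in the short Python calculation below. It simply re-expresses the published point estimates from [4] as fractions of the preceding step; it is not the method used by the Veterinary Laboratories Agency.

    # Re-expressing the point estimates quoted from [4] as fractions.
    illegal_meat_kg = 7_500 * 1_000                    # ~7,500 tonnes per year
    contaminated_kg = 95                               # possibly carrying FMD virus
    ingested_kg_low, ingested_kg_high = 0.020, 0.680   # reaching susceptible livestock

    print(f"Contaminated fraction of illegal meat: "
          f"{contaminated_kg / illegal_meat_kg:.1e}")
    print(f"Fraction of contaminated meat ingested by livestock: "
          f"{ingested_kg_low / contaminated_kg:.1e} "
          f"to {ingested_kg_high / contaminated_kg:.1e}")

With these figures, only about one part in 100,000 of the illegal meat is expected to carry virus, and well under 1% of that to reach susceptible livestock, yet the 2001 epidemic suggests that such a small residual flow can be enough to seed an outbreak.
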
Influenza viruses are old agents affecting animal and human populations. The virus’s capacity for mutation has originated strains with different pathogenic effects and host ranges, but birds are considered the reservoir of most of the subtypes. H5N1 was recognized in humans for the first time in 1997, associated with exposure to infected live poultry [7]. Large outbreaks in poultry and their geographical spread, especially since 2003, together with other factors such as fast virus mutation, mortality in waterfowl, transmission between mammals of the same species (humans, tigers in the zoo, experimental cats) and extensive viral infection in the organism (almost all organs are infected in cats), increased the concern of veterinary and medical authorities about a virus mutation that could make airborne human-to-human transmission possible, originating a pandemic. The transmission of this idea to the public raised strong concern that has already affected the consumption of poultry products, even in the absence of the disease, despite the efforts of national and international official entities responsible for public health to inform the public about the risk posed by the consumption of those products. One of the responses of health authorities was an increase in the level of surveillance of the disease in both wild and domestic bird populations, because it is believed that its control in the animal population is the best possible approach at present. If a pandemic ever starts, the fast and extensive network of transports will be responsible for the dissemination of the disease around the world within a few months [8].
In the field of emergency diseases, FMV has been involved in an
international project on African Swine Fever, a disease classified in the same
list by the Office International des Epizooties (OIE). The project -“African
Swine Fever (ASF): improved diagnostic methods and understanding of
virus epidemiology and virus-host interactions”, reference QLK2-CT-2001-
02216 - is funded by the Programme Quality of Life and Living Resources,
EU. The FMV team is coordinating the project.
The Public Health and Epidemiology Group of FMV is at present working on a project, funded by the Calouste Gulbenkian Foundation, for the development of the existing surveillance system and of epidemiological investigation and predictive models that could help decision making in case of outbreaks of the disease.

2.2 Industrialization and diseases important to animal and public health

Industrialization has many effects on the interactions of the epidemiological triad of host, agent and environment. Only two aspects are referred to here: some direct effects on the intricate flow of the food chain, and the indirect effect caused by pollution and global warming.

2.2.1 Industrialization and the food chain

The evolution of the food chain has increased its complexity (variability of sources of inputs), its length (number of operators, distribution over long distances) and also the volume of production from a single source, leading to an increased risk of dissemination of pathogenic agents inadvertently present in food, either through contaminated raw materials or through process contamination.
Examples of such problems are the large outbreaks of E.coli O157 in
Japan, Scotland, Sweden (1996) and Wales (2005), the European crisis of
dioxin contamination of poultry meat and products and the occurrence of
nitrofurans also in poultry, in Portugal. Bovine Spongiform Encephalopathy
(BSE) is another example of a negative result of a change in technological
processing of animal feed and food chain complexity but it will be addressed
later when aspects related to public awareness are discussed.
From May to August 1996, 10,000 cases of E. coli O157:H7 were reported in Japan, distributed across 14 clusters. One of the outbreaks affected 6,309 students, among other people (92 staff, 160 family members), in 62 primary schools in the city of Sakai [9]. Investigation of this outbreak suggested industrially produced radish sprouts as the most likely cause. E. coli O157:H7 can cause bloody diarrhea and hemolytic uremic syndrome, and in this outbreak two children died. Also in 1996, an outbreak with 396 cases and 11 deaths was reported in Scotland, with meat products as the source of infection [10], and another in Sweden, affecting 110 persons (50% younger than 5 years old); the source of this outbreak was not identified but it was probably a common source with nationwide distribution [11]. In Wales, the 2005 outbreak affected 40 schools and 158 people and the origin was traced to contamination of drinking water from an intensive bovine farm [12]. Industrialization of animal production, rearing animals in intensive systems with high animal densities, has a strong environmental impact if effluents are not recycled, treated and disposed of correctly.
Dioxins are byproducts of several combustion processes, with carcinogenic effects on health. Their concentrations in the environment have been consistently decreasing, at least in some countries of Europe, resulting from the control of industrial pollution of air and water [13]. The most important route of human exposure to dioxins is food consumption (95-98%), and the food chain can be contaminated by the pollution present in the environment where animals and plants grow or through the accidental introduction of contaminated materials, as in the case of the dioxin incident in Belgium in 1999. In this case a tank of recycled fats used to produce animal feeds was accidentally contaminated by approximately 100 L of an oil containing 50 kg of PCBs. Thirty farms were affected, with poultry poisoning resembling the classic chick edema disease, and the diagnosis led to the tracing and removal from the food chain of contaminated poultry products, leading to a major political and food crisis [14]. However, subsequent studies confirmed that the incident was too limited in time and in scale to have increased the PCB/dioxin body burden of the general population [14].
In Portugal, the identification of nitrofuran residues in poultry meat originated a crisis with serious economic impact. The origin of this type of problem is usually the illegal commercialization and/or use by the poultry industry of antimicrobials not authorized for food animals. As a consequence of the crisis of 2002, 176 farms were placed under official control (171 poultry farms, 1 pig farm, 2 rabbit and 2 fish farms), 1.5 million birds were destroyed and over 250 tons of poultry meat were removed from the market [15]. The investigation of the problem by the national veterinary authority concluded that 90% of the positive samples analyzed showed amounts of nitrofurans below those producing a beneficial biological effect (10 µg/kg), and that these could have originated from accidental or cross contamination (except in water) [15]. Only in 5 farms were levels greater than 100 µg/kg of a non-authorized product - furaltadone - found. Better controls were implemented in feed processing plants as well as on farms, and the level of official surveillance of residues was also increased [15].
In FMV, the Food Technology and Safety Group is presently involved in an international project, “Assessment and improvement of safety of traditional dry sausages from producers to consumers (TRADI-SAUSAGE)”, reference QLRT-2001-02240, funded by the EU and coordinated by the INRA-France team. One of the research lines of this project is the identification of potential risks and the development of HACCP methods for this product.
Another project related to microbial risks in the food chain is the
“Management of risks associated with the presence of Listeria
monocytogenes in sheep cheese”, AGRO nº292, financed by national
funding. The project is coordinated by the High School of Agriculture of
UTL (ISA) and has the participation of FMV and ANCOSE, a farmers
association.
The Toxicology Group also has projects concerning the identification of
chemical risks in food. One of them is “The interference of antibiotic
residues in yogurt production”, financed internally by CIISA.

2.2.2 Industrialization, pollution and diseases

Industrial pollution can have a direct effect on health. This is the case of mercury pollution, originating from the paper, plastics, battery and other industries, which “travels” along the food chain of aquatic animals, reaching, in some edible species of fish, concentrations of public health importance. Plankton is contaminated with inorganic mercury and transfers it to herbivorous fish, where it is transformed into organic molecules - methyl mercury - which further accumulates in carnivorous fish such as swordfish, much used and appreciated by Portuguese consumers. Methyl mercury, as a result of the mother's exposure, causes profound mental retardation, cerebral palsy, seizures, spasticity, tremors and incoordination, and eye and hearing damage in the unborn baby. Organic mercury passes into breast milk as well [16].
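The step-wise accumulation described above is multiplicative, which is why top predators such as swordfish can reach concentrations of public health concern. The Python sketch below is a toy illustration of this biomagnification chain; the step factors are entirely hypothetical and are not based on measured Portuguese data.

    # Purely illustrative biomagnification chain for methyl mercury.
    # All step factors are hypothetical; they only show the multiplicative effect.
    relative_conc = 1.0                        # concentration in water, arbitrary units
    chain = [
        ("plankton", 1_000),                   # assumed water -> plankton factor
        ("herbivorous fish", 10),              # assumed plankton -> herbivore factor
        ("carnivorous fish (swordfish)", 10),  # assumed herbivore -> carnivore factor
    ]
    for trophic_level, factor in chain:
        relative_conc *= factor
        print(f"{trophic_level}: ~{relative_conc:,.0f} x the level in water")
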
FMV collaborates with other national institutions IPIMAR (Sea Research
Institute) and ASAE (Authority for Food Safety) in a project of “Evaluation
of the risk of mercury ingestion by fish consumption”, which aims at
developing different consumption and associated risk scenarios for
Portuguese consumers.
Another aspect of industrial pollution is the production of gaseous residues with a greenhouse effect, inducing global warming. A warmer environment is favorable to the expansion of suitable habitat for certain water- and foodborne diseases and for vector- and rodentborne diseases [16]. Examples of agents showing greater survival in the environment in past years are Cryptosporidium and Giardia in fresh waters, and Vibrio vulnificus and some enteroviruses in marine environments, responsible for diarrhea and debilitating disease.
Among the diseases transmitted by rodents, leptospirosis, salmonellosis and some viruses (hantaviruses) are of importance. Vector-transmitted diseases account for an extended list including the West Nile, Dengue, Yellow Fever and Blue Tongue viruses as well as rickettsial, bacterial (Lyme disease) and parasitic diseases (malaria, leishmaniosis).
Yellow fever, a viral disease transmitted by mosquito bite, has been present in Sub-Saharan Africa and South America for a long time and re-emerged in recent years, constituting one of the main causes of disease outbreaks in Africa. According to the WHO, the population at risk in that continent alone rises to 610 million [17]. The disease has a high fatality rate and the risk of major urban epidemics is increasing [18].
The Parasitic Diseases Group is also developing a project on cryptosporidiosis, “Diagnosis of Cryptosporidium in bivalves and fish, genetic characterization and impact on public health”.
Three research groups of FMV, Infectious Diseases, Parasitic Diseases and Epidemiology, collaborate with the official veterinary services in a programme of “Entomologic surveillance of Blue Tongue”. This viral disease, transmitted by biting midges (Culicoides sp.), was present in Portugal from 1956 to 1960. No more cases in ruminants, the natural reservoirs, were observed up to 2004, when the disease re-entered the country.
FMV collaborates, through the Parasitic Diseases Group, in a research
project on the “Immunopathogenicity of canine leishmaniosis”, funded by
FCT (Portuguese Foundation for Science and Technology) and coordinated
by IHMT (Institute of Hygiene and Tropical Medicine of Lisbon).

2.3 Urbanization turning old agents into new diseases

World-wide statistics indicate that in 1950, 29% of the population lived in urban areas. This figure increased to 37.2% in 1975 and 48.3% in 2003, and it is expected that in 2015, 53.5% of the world’s population will live in urban environments [19]. This trend implies the occupation of natural habitats and an increased concentration of human waste. These factors are known to favor the development of animal species that find food, water and shelter in these areas. The synanthropic species, those which approach and colonize humanized habitats, are best represented by rats, pigeons and seagulls, and are potential reservoirs of leptospirosis, typhus, cryptococcosis and, again, cryptosporidiosis.
Several projects on leptospirosis are being developed at FMV: two Master’s theses, “The application of serological and molecular biology techniques to the diagnosis of leptospirosis” and “Contribution to the study
of canine leptospirosis in S.Miguel Island, Azores” and one PhD work
developed with the contribution of several institutions, the Angolan Health
Ministry, Universidade Nova de Lisboa and FMV, with the title “Evaluation
of the prevalence of leptospirosis in urban and peri-urban areas of Luanda
Province”.

2.4 Technological development: pros and cons

Technological development and the possibility of manipulating biological agents are recent, and have opened the way to their use both to heal and control disease and, at the same time, as biological weapons.

2.4.1 Technological development, diagnosis and the increased awareness of re-emerging diseases

The capacity to master these agents has allowed the production of diagnostic tests, vaccines and other tools. The development of diagnostic capacity, together with the development of communication technologies, has helped to circulate fast and kicky information on disease occurrence. The mass media have conquered the world and television has become a major maker of public opinion. Between 2000 and 2005 an increase of 182% in internet users was observed, and at present the internet is used for news, as a scientific library, for surveillance programmes, etc. In consequence, part of the perception of “new diseases” is also due to old agents which are being identified and monitored with much more powerful techniques, this knowledge being made available to the public. Education and awareness of the population can result in high and unnecessary levels of concern, but have been useful in forcing social investment in the promotion of health. One example could be the transmissible spongiform encephalopathies (TSEs).
The origin of BSE is still not completely clarified, and the crossing of the species barrier by scrapie, an old and well known agent, remains a possibility. In any case, the introduction of a new technique for the preparation of meat and bone meal allowed the recycling in the food chain of prion-contaminated material, originating a severe animal health problem. In Portugal, the country most affected by the disease after the UK, 1006 confirmed cases have been identified since 1993 [20]. Infected and birth-cohort-related animals have been slaughtered and measures to protect animal and human health have been taken regarding animal feeding (total ban on the incorporation of animal protein in ruminant feed), guarantee of meat origin (traceability of meat and testing of all animals over 30 months entering the food chain) and meat preparation (removal of risk materials). Up to the present, 153 human cases have been notified in the UK and 1 in Portugal, fewer victims than many zoonoses cause in these countries. However, public awareness of BSE was very strong and its reflection in the sharp drop of bovine meat consumption was very hard on the economy of the sector. This public reaction originated profound political and organizational changes in the European Union and influenced trade relations worldwide.
Still regarding TSEs, technological development has allowed the recognition of “atypical” scrapie strains, recently identified first in Norway, in 1998 [21], and then in almost all countries of the EU, including Portugal. Increasing knowledge and the development of diagnostic tools are among the possible explanations for this “new” discovery.
FMV developed two lines of epidemiological research and public health risk evaluation in TSEs, collaborating with the national veterinary authority (DGV) and the national veterinary laboratory (LNIV), and has participated in European networks since 1999 (Project FAIR 7021 and the SRTSE Network). FMV has also collaborated with the European Commission DG SANCO in the development of the European system of TSE notification. A Master’s thesis was also presented in 2003 with the title “Potential for the exposure of Portuguese consumers to the BSE agent from 1987 to 2001”.

2.4.2 Bioterrorism resulting from technological development

The knowledge of infectious agents and the capacity to store and grow them under controlled conditions have originated a new public health threat – bioterrorism. Anthrax, tularemia, smallpox and botulinum toxin are some of the possible agents to be used as biological weapons.
Although Bacillus anthracis is a recognized cause of mortality of wild and domestic animals in Africa, and also of humans [22], its importance reached the public in the crisis of the letters contaminated with white powder that affected the United States in 2001, most of them only anthrax hoaxes and threats [23], like those that occurred in Portugal. Experts from the WHO and the US Congressional Office of Technology Assessment estimate high mortality if anthrax spores are released by aerosol in big cities, with lethality matching or exceeding that of a hydrogen bomb [24], very different from the actual sporadic occurrence of naturally transmitted disease.

2.4.3 Technological development and the struggle against disease agents

Technology has also allowed the development of chemical tools for disease control and treatment, antibiotics being one of the most important inventions. The industrial production of these products has generalized their use, and often their misuse, originating bacterial resistance, which represents a serious problem in therapeutics, with loss of efficacy. According to the European Medicines Agency (EMEA), the use of antibiotics in animal production in 1997 was extensive, with a minimum in Austria of 22 mg/kg of meat and a maximum in the UK of 190 mg/kg; Portugal presented 80 mg/kg of meat [25]. This situation is being addressed by limiting the antibiotics that can be used in animal production, by the surveillance of residues in animals entering the food chain and by the implementation of good practices for the use of chemicals in animal production. The emergence of bacterial resistance (multi-resistant tuberculosis, resistant foodborne pathogens) is also a case of known organisms acquiring new characteristics that do not allow the same control approach.
FMV is currently involved in several projects such as the “Survey of methicillin-resistant Staphylococcus of animal origin” and the “Antibioresistance of Pseudomonas spp. of veterinary origin”, developed by the Microbiology and Immunology Group.
The Veterinary Public Health and Epidemiology group was also involved
in the project “Bovine Tuberculosis in Alentejo – a persistent zoonosis”,
reference AGRO nº125, funded by a national agriculture line of research
investment. The project is coordinated by the Regional Directorate of
Agriculture of Alentejo (DRAAL).

3. CONCLUSION: FACING RE-EMERGING DISEASES

Several causes were pointed out for old agents originating new diseases: new populations are affected (hypothetically BSE), the agent uses new means of dispersion (Foot and Mouth Disease), incidence and geographical spread increase (Blue Tongue), diagnostic capacity improves (atypical scrapie), and microbes adapt and change (antibiotic-resistant bacteria).
As consequences of re-emerging diseases, degradation of animal health is observed, followed by increased risks to public health and to food security as well as other socio-economic impacts. A reduction of biodiversity can also result from disease in domestic and wild animals.
In response to these threats it is necessary not only to keep developing diagnostic and monitoring capacities and preventive and curative tools (vaccines, medicines) but, above all, to adopt pro-active measures along the food chain and in other types of animal-human interaction. Hygiene, accreditation and documentation of processes are necessary. It is also necessary to further promote international cooperation, “globalizing” surveillance, prevention and disease control, and stimulating regional integration, interdisciplinarity, public-private cooperation and the coordinating role of international bodies [26, 27, 28]. The Manhattan Principles on “One World-One Health” were launched on September 29, 2004, at a meeting with health experts from around the world, in the view that the health of wildlife, people and domestic animals is inextricably linked and should be addressed as such, with the necessary integration of several professions, in order to “ensure the biological integrity of the Earth for future generations” [29]. Investments in research and in risk communication are two other fundamental aspects of controlling and definitively winning the struggle against certain infectious diseases.
FMV is an institution devoted to the higher education of future veterinarians and to the continuous training of veterinarians in practice. This also involves the development of research projects, especially in cooperation with other institutions at various levels, from the primary production of domestic animals and the industry of products of animal origin to the official services and international organizations. Companion and wild animal populations have also been themes of interest for research. The social recognition of the public health importance of the veterinary profession makes investment in research in these fields of knowledge necessary, an investment which FMV has tried to accomplish and hopes to develop even further in the future.

REFERENCES
1. World Tourism Organization 2005. http://www.world-tourism.org/facts/menu.html
2. WTO 2005. http://www.wto.org/english/res_e/statis_e/its2005_e/its05_toc_e.htm
3. DEFRA. “Origin of the UK Foot and Mouth Disease epidemic in 2001”. June 2002.
http://www.defra.gov.uk/footandmouth/pdf/fmdorigins1.pdf
4. Meredith M. “Meat Smuggling - Risk Assessment and Control”. AASV News Archive, 2nd April 2003. http://www.aasv.org/news/story.php?id=534
5. DEFRA. Foot and Mouth Disease. http://footandmouth.csl.gov.uk/
6. Meredith M. “Exotic Disease Responses Re-assessed”. AASV News Archive, 20th
December 2002. http://www.aasv.org/news/story.php?id=391
7. Mounts AW, Kwong H, Izurieta HS, Ho Y, Au T, Lee M, Buxton Bridges C, William
SW, Mak HK, Katz JM, Thompson WW, Cox NJ, Fukuda K. “Case-control study of risk
factors for avian influenza A (H5N1) disease, Hong Kong, 1997”. Journal of Infectious
Diseases, 180(2), pp. 505-508, August 1999.
8. WHO “Responding to the avian influenza pandemic threat – Recommended strategic
actions”. Communicable Diseases Surveillance and Response. Global Influenza
Program. WHO/CDS/CSR/GIP/2005.8.
9. WHO “Food safety - Enterohaemorrhagic Escherichia coli infection, Japan”. Weekly
Epidemiological Record, 35, pp. 267-268, 1996.
10. WHO “Food safety - Outbreak of Escherichia coli O157 infection, United Kingdom
(Scotland)”. Weekly Epidemiological Record, 50, pp. 384, 1996.
11. Ziese T, Anderson Y, de Jong B, Löfdahl S, Ramberg M. “Outbreak of Escherichia coli
O157 in Sweden”. Euro Surveillance, 1(1), pp. 2-3, 1996.
12. FSA. Food Safety Agency Wales. Response to E.coli report. Jan 2006. http://www.food.gov.uk/news/newsarchive/2006/jan/ecolicmo
13. Buckley-Golder D. “Compilation of EU Dioxin Exposure and Health data. Summary
Report 1999”. http://europa.eu.int/comm/environment/dioxin/pdf/summary.pdf
14. Bernard A, Fierens S. “The Belgian PCB/dioxin incident: a critical review of health risks
evaluations.” International Journal of Toxicology, 21(5), pp.333-340. Sep-Oct 2002
15. DGV. Resíduos de Nitrofuranos em Portugal – Relatório Final 2003. http://www.agenciaalimentar.pt/index.php?module=ContentExpress&func=display&ceid=56&meid=-1
16. Patz JA, McGeehin MA, Bernard SM, Ebi KL, Epstein PR, Grambsch A, Gubler DJ,
Reiter P, Romieu I, Rose JB, Samet JM and Trtanj J. “The Potential Health Impacts of
Climate Variability and Change for the United States: Executive Summary of the Report
of the Health Sector of the U.S. National Assessment”. Environmental Health
Perspectives, 108(4) pp. 301-304, April 2002
17. WHO “Yellow Fever: a current Threat”. http://www.who.int/csr/disease/yellowfev/impact1/en/index.html
18. WHO “The Yellow Fever situation in Africa and South America in 2004”. Weekly Epidemiological Record, 80(29), pp. 250-256, July 2005. http://www.who.int/wer/2005/wer8029.pdf
19. U.N. Human Development Reports Statistics. http://hdr.undp.org/statistics/data/indic/indic_43_1_1.html
20. DGV. Official Veterinary Authority of Portugal – Annual reports on BSE 2005.
21. Benestad SL, Sarradin P, Thu B, Schonheit J, Tranulis MA, Bratberg B. “Cases of
scrapie with unusual features in Norway and designation of a new type, Nor98”.
Veterinary Record, 153(7), pp.202-208, August 2003.
22. OIE Working Group for Wildlife Diseases. http://www.oie.int/eng/press/en_050308.htm
23. FBI Press Release, 20 December 2001. http://www.fbi.gov/pressrel/pressrel01/anthraxhoax122001.htm
24. Inglesby TV, Henderson DA, Bartlett JG, Ascher MS, Eitzen E, Friedlander AM, Hauer
J, MacDade J, Osterholm MT, O’Toole T, Parker G, Perl TM, Russel PK, Tonat K.
“Anthrax as a Biological Weapon – Medical and Public Health Management”. Journal of
American Medical Association, 281(18), pp. 1735-1745, May 1999.
25. EMEA “Antibiotic resistance in the EU associated with therapeutic use of veterinary
medicines”. Report on the qualitative risk assessment by the Committee of Veterinary
Medical Products. July 1999.
26. Van de Venter T. “Prospects for the future: emerging problems – chemical/biological”.
Conference on International Food Trade Beyond 2000: Science-Based Decisions,
Harmonization, Equivalence and Mutual Recognition. Melbourne, Australia, 11-15
October 1999.
27. Gibbs EP. “Emerging zoonotic epidemics in the interconnected global community”.
Veterinary Record, 157, pp.673-679, November 2005.
28. Marano N, Arguin P, Pappaioanou M, King L. “Role of Multisector Partnership in
Controlling Emerging Zoonotic Diseases”. Emerging Infectious Diseases, 11(12), pp.
1813-1814, December 2005.
29. Wildlife Conservation Society. “The Manhattan Principles on “One World – One
Health”. http://www.oneworldonehealth.org/sept2004/owoh_sept04.html
THE SHARING OF URBAN AREAS BY MAN
AND ANIMALS
Pleasure and risk

Armando C. Louzã
Faculdade de Medicina Veterinária, Universidade Técnica de Lisboa, Av. da Universidade
Técnica, Lisboa 1300-477, Portugal, email: louza@fmv.utl.pt

Abstract: The interaction and close contact between humans and some animal species date from immemorial ages. The demographic evolution of human populations and the tendency towards urban concentration have radically changed the type of relation with, and the enjoyment by man of, the different animal species that have followed him into urban environments. At the same time, there is an increase in the biological hazards and other dangers resulting from such interaction. In Portugal, there are few scientific publications related to the adaptation of animal species to urban areas and even fewer concerning the physical, mental or social impact on human health. Estimates of companion animal populations or of synanthropic animal species are difficult to obtain. Only through the commercialization of veterinary drugs is it possible to have an approximation of the size of the dog and cat populations. These data show that in the last five years there was an increase in the size of both populations, of 5.2% and 12.5%, respectively. There have also been efforts to calculate the population of pigeons in the Lisbon urban area. Various social and economic indicators are presented and discussed, suggesting a gradual awareness of these problems by feed and drug companies, dog and cat breed associations and council authorities. The Lisbon Veterinary Faculty has been leading the research on animal diseases in urban areas. Over forty scientific and technical publications have been published addressing zoonotic diseases (leishmaniasis, toxoplasmosis, cryptosporidiosis, echinococcosis, helminthiasis, salmonellosis), or mammary tumors and antibiotic resistance, using urban dog and cat populations as models. From the analysis of the results and conclusions of such studies it is possible to verify that environmental alterations of natural habitats and the consequent behavioural changes in individuals and animal populations have increased the risk of physical and biological hazards for citizens. It is also stressed that there is a need for all private and public institutions to participate in the information and education of animal owners in order to reduce the physical and biological risks originated by companion
animals and of citizens to respect and collaborate with public health authorities
on promoting and maintaining a better urban environment.

Key words: Interaction, Man and Animals, Urban areas, Scientific Research, Risk.

1. INTRODUCTION

Interaction and close companionship between humans and some animal species date from immemorial ages. In primeval societies, in historical ages and during most of the last millennium, the expression of this particular relationship had a clearly rural character.
As a result of the industrialization of society, the demographic evolution of human populations and the tendency towards urban concentration have radically changed both the type of relation with, and the enjoyment by man of, the different animal species that have followed him through this environmental change [1]. One of the results of such interaction was an increase in the biological hazards and other dangers originating from animals.
As a consequence of the growing concentration of human populations in urban areas, the direct contact of most citizens with food animal species ceased. Only the dog and cat species, at different times, followed the urban transition and acquired a closer social status that in some cases has been converted into true familial integration. The interactions between people and companion animals are normally symbiotic, and this interspecies relationship is described as the human-companion animal bond [2].
In contrast, a few wild animal species adapt themselves to urban ecosystems, living in residential areas and becoming synanthropic or domiciliated [3], although maintaining their natural behaviour.

2. HUMAN-ANIMAL INTERACTION

The present interaction of man and animals in urban zones is a result of two universal but contradictory features. The presence and contemplation of animal species such as pigeons, other birds or butterflies universally generates a feeling of enjoyment in citizens. At another level, the need for the close proximity and enjoyment of companion animals, mainly dogs and cats, is a generalized option sought by many urban residents.
In contrast, humans might show fear of these and other animal species that can bite, scratch, kick or intimidate, might transmit diseases, or cause repulsion, such as rats, cockroaches or reptiles.
The pleasure of sharing urban areas with animals is substantiated by frequent and evident human expressions of satisfaction. These include the gratitude and delight of contemplating synanthropic animal species as a reminder of the lost memory of natural landscapes. There is also the human need for companionship, associated with the species’ gregarious instinct, which is reciprocated by the candid friendship of the animal. Sometimes, such mutual feelings lead to an exaggerated humanization of the companion animal. However, owners often choose companion animals for other reasons, such as protection or social status, when certain breeds become fashionable. Examples of such wrong choices are common in Portugal, with dogs of medium to large size breeds being kept in small apartments, without the minimal conditions for the necessary physical exercise. Also, dangerous breeds (e.g. Rottweiler or Pit Bull Terrier) are frequently used and socialized as companion animals in confined places. In parallel, a growing demand for exotic species or breeds is observed, without owners having sufficient knowledge of their behavioural characteristics and physiological needs.
Such conduct might become a serious urban problem of dog intimidation and aggression and might promote psychological and behavioural disturbances in both owner and animal. Furthermore, small animal practitioners report clients with unhealthy psychological fixations on their companion animal, sometimes with a corresponding detachment from neighbours, family and friends. Usually, such animals suffer from pathologies of their normal behaviour and physiology. Obesity and diabetes are also common disturbances, as well as transmissible diseases such as candidiasis, a mycosis common in humans.

2.1 The acquisition and use of companion animals

The option to live with a companion animal and to share our confined urban space and our domestic intimacy with it should be addressed as a serious human issue. Often, the reasons behind the acquisition and enjoyment of a companion animal are not the most adequate or rational.
Frequently, the option to choose a certain dog or cat breed is based on short-term and/or inadequate grounds. Such inappropriate and hazardous conduct could give way to risks mainly affecting children, aged people and immunocompromised individuals.
The choice of certain animal species not usually kept as companion animals in urban environments (e.g. rodents, reptiles) has been regarded as a risk of introducing less common endemic zoonotic agents (e.g. Salmonella in tortoises). Such risk is even greater if exotic species, such as monkeys or marsupials, are involved.
The ownership of companion animals has gradually become more diverse and complex. Traditionally, citizens used to prefer to have as pets species like dogs, cats, cold water and tropical fish, and ornamental and singing cage-birds. Presently, other species such as laboratory animals (mice, rats, hamsters, guinea pigs or rabbits), or less accepted species such as arthropods (e.g. scorpions, ants), reptiles (e.g. snakes, lizards), batrachians (frogs, toads), or even exotic species (turtles, vipers), are found in some houses and apartments of urban areas.
Such diverse and hazardous ownership promotes the acquisition of animals without proper knowledge of the species’ behaviour or the use of adequate animal containment measures. News of attacks on children or joggers, often involving potentially hazardous dog breeds left to wander loose, is becoming frequent.

2.2 Alternatives to recover livestock memory

The population in urban areas, and in particular the young generations, have been able to recall the interaction with the livestock that used to live in the proximity of humans through different alternatives.
Presently, opportunities to enjoy, to learn about and to establish direct contact with food animal species or with wild autochthonous, migratory or exotic birds are being offered in metropolitan areas, usually in places reproducing natural habitats and allowing for direct contact with nature. Examples of this are the proliferation of pedagogic parks, which allow not only knowledge of domestic species other than companion animals but also promote direct contact with these species through feeding, grooming or socializing. Such interaction helps young urbanized generations to recover the rural memory of farm animals.
Zoological gardens, public parks and natural reserves not only have a positive environmental impact on the maintenance and promotion of green areas but also allow for the recognition of natural ecosystems and give urban residents the opportunity to learn about wild species.

2.3 Urban animal population dimension

Estimates of companion animal populations or of synanthropic animal species are difficult to obtain and rarely available.
In the case of companion animal species in Portugal, only through the commercialization of veterinary drugs has it been possible to obtain an approximation of the size of the dog and cat populations. It is estimated that presently the dog population is 1.7 to 2 million animals and the cat population 0.8 to 1 million animals. There was also, in the last five years, an increase in the size of both populations, of 5.2% and 12.5%, respectively. Confirming the present growth of the companion animal market is the global sales value of 55 million Euros reported by the animal feed sector for 2003.
Efforts to calculate the population of pigeons in the Lisbon urban area have also been made. Unofficial figures varying from 25,000 to 170,000 birds indicate the difficulty of assessing such an animal population.

2.4 Abandonment of companion animals

It is known that the financial resources of individuals and families can influence the demand for the acquisition, and/or the abandonment, of companion animals. The present economic environment imposes serious constraints on a wide range of Portuguese social groups, favoring animal abandonment.
Data kindly supplied by the Urban Hygiene Division of the Lisbon Council (Table 1) allow a look at the problem of abandonment of cats and dogs in the urban area of Lisbon city over a period of three years.

Table 1. Animal movement in Lisbon County kennel.*

                                 2002           2003           2004
Type of movement              Dogs   Cats    Dogs   Cats    Dogs   Cats
Captured stray animals         300    803     959   1318    1028   1381
Abandoned at kennel            752    134     223     63     524    525
Total                         1052    937    1182   1381    1552   1907
Animals given for adoption     185    115     258     80     411    407

* Data kindly provided by the Urban Hygiene Division of the Lisbon Council
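
To make the trend in Table 1 easier to read, the short Python sketch below recomputes, from the table values alone, the share of animals that left the kennel through adoption in each year; the data structure is simply one possible encoding of the table.

    # Values transcribed from Table 1 (Lisbon County kennel).
    kennel = {
        2002: {"dogs": (1052, 185), "cats": (937, 115)},
        2003: {"dogs": (1182, 258), "cats": (1381, 80)},
        2004: {"dogs": (1552, 411), "cats": (1907, 407)},
    }
    for year in sorted(kennel):
        for species, (total, adopted) in kennel[year].items():
            print(f"{year} {species}: {100 * adopted / total:.1f}% given for adoption")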

The figures related to the abandonment of dogs and cats seem to confirm the growing tendency of Lisbon citizens towards the ownership of companion animals. This can also be correlated with the evident expansion of registered dog and cat breeders. As an example, the 76 Portuguese registered breeders of 18 different cat breeds have already organized 122 market exhibitions in the last fifteen years, nine of them in 2005 [4]. This activity is even more important among dog breeders, with 29 market exhibitions foreseen for 2006. Underlining the interest in this companion species, the national dog breeder association aggregates 48 breed clubs, 26 working dog clubs and 3 regional clubs [5].
Confirming such interest in companion animals, in the last five years (2000-2005) the number of pet shops has shown an increase of 37.7%.

2.5 Adaptation of synanthropic animal populations

Pigeons (Columba livia) and birds in general, among other synanthropic animal populations, are regarded with special fondness and sympathy by the majority of citizens.
However, an excess of these animal populations, very frequent in our urban areas, may have a negative impact on environmental hygiene. Nesting, defecation and bird carcasses, when excessive, can become a serious problem for council departments dealing with monument cleaning and conservation or with city hygiene, for owners of buildings or cars facing serious soiling problems, or simply for anyone doing domestic laundry. Often some of these animal populations, mainly pigeons owing to their resilience and tameness, easily colonize our buildings and monuments because they can find feed, water and shelter "ad libitum". Council authorities have great difficulty in preventing the feeding of pigeons in city squares and parks, especially by children and senior citizens. This, together with the carelessness of many urban food businesses, may sometimes turn an animal population that used to be pleasant and friendly into a plague.
A significant increase has also been observed in the number of certain avian species, usually migratory, probably due to climatic changes or to the attractive conditions of feed and shelter found in our cities. For example, different seagull species (Larus sp.) are gradually becoming resident populations and, being highly competitive, menace other indigenous birds. In certain areas of Portugal, seagulls are considered an urban plague because of the size of the population and its negative impact on the welfare of citizens and damage to property. Deficient removal of urban solid waste of domestic and industrial origin and poor hygiene of public areas are important factors associated with the build-up of such avian populations. Free access to open-air garbage dumps located in many regions across the country has become a major seagull feed source.
Only a few animal species have adapted well to the present urban environment. Some of them have been highly successful even if not accepted by humans. This is the case of species usually considered as plagues, either because of excessive populations, resulting in spoilage, annoyance and inconvenience, or because of the risk of transmission of zoonotic diseases. Mice and rats, cockroaches, and fleas and other blood-sucking arthropods (e.g. mosquitoes, sand flies and ticks) are examples of such plagues, which are the target of council, industrial or domestic control programmes.

2.6 Therapeutic use of animal species

The changes involved in living in and adapting to urban environments by the different animal species, including man, and their behavioural, health and social consequences are not sufficiently addressed in Portugal. Increased population density and the gradual disappearance of original ecosystems are the main changes of reference for both human and animal populations. Clinical observations by veterinary practitioners often relate the impact of those alterations to the pathologies exhibited by their patients. At another level, recent developments have been acknowledged involving the use of animals to treat or reduce the effects of certain human illnesses [2].
The main animal species concerned are dogs, cats, horses and dolphins. In urban areas, the use of guide dogs by blind people is well known and documented. The expanding use of small dog breeds or cats to accompany and interact with aged people is also a common feature in senior households and in rest homes. Companion animals can also encourage physical activity and social interaction between citizens; an example is the two daily walks required of dog owners and the attendant opportunity to meet and socialize with neighbours.
At another level, a number of associations dealing with certain physical disabilities and mental pathologies (autism, Down syndrome, epilepsy, psychiatric illness and other central nervous system deficiencies) have gradually been introducing dogs, horses or dolphins as a successful training and education tool to alleviate, compensate for or rehabilitate these deficiencies.

2.7 Animal risks to urban public health

The risks to humans of sharing urban spaces with animals can be grouped into a few main items according to the nature of their hazardous potential: i) the transmission of pathogenic microbial agents causing disease in humans; ii) animal aggression in its various forms (biting, scratching, kicking, pecking, stinging); iii) the environmental pollution caused by animals defecating, urinating or vomiting in public spaces such as sidewalks, streets, gardens, playgrounds or green fields; and iv) the nuisance of noise, smell and parasites (fleas, ticks, mites) that companion animals (e.g. dogs, cats, birds) can create for neighbours and for citizens using public areas.

3. FMV CONTRIBUTION TO ASSESS HEALTH RISKS FROM URBAN ANIMAL POPULATIONS

Among the different research questions addressed in relation to animal species in urban areas, the impacts on public health and on the environment are the most frequently studied. Over the last decade, the risk of infectious agents of companion or synanthropic animals being transmitted to humans has been the main target. A few studies have also been carried out on the use of companion animals as models in comparative medicine, or on the use of animals as biomarkers of environmental pollution, whether for urban pollutants such as lead and chromium or for the animal contribution to the spread of antibiotic-resistance factors.
At the Faculty of Veterinary Medicine of Lisbon (FMV), twenty-eight research projects and ten master's and doctoral dissertations, either concluded or under way, deal with infections or diseases of companion, sport or synanthropic animals. A special line of comparative pathology research is dedicated to the study of mammary tumours in bitches. There are also research projects on behavioural disturbances of companion animal species.
Over forty scientific and technical publications on animals in urban areas have appeared in national and international journals. The main issues studied are those most commonly seen or most seriously prevalent in the country. Diseases such as canine leishmaniasis, cryptosporidiosis in companion animals, zoonotic mycoses such as cryptococcosis, dermatophytosis and candidiasis, dog and cat nematode infections with zoonotic potential, echinococcosis-hydatidosis, salmonellosis and other zoonoses in synanthropic birds, and antimicrobial resistance of staphylococci were among those most often addressed [6].

4. CONCLUSION

From the analysis of the results and conclusions of these studies, it is possible to note that environmental alterations of natural habitats and the consequent behavioural changes in individuals and animal populations have increased the risk of physical and biological hazards for citizens. Close contact with pets is also a major source of zoonotic hazard. At another level, feeding errors may lead to serious metabolic diseases in animals and to greater expenses for their owners; frequent examples are obesity and dermatological pathologies.
Taking into consideration the facts presented and drawing an analogy with what has been happening in other European countries [1, 2], it is probable that the future evolution of human-animal interaction in Portugal will be marked by the expansion of both companion and synanthropic animal populations.
It is anticipated that one of the factors supporting better enjoyment and use of animal species in urban areas will be a gradual increase in the knowledge that animal owners, and citizens in general, have of the characteristics and behaviour of the different animal species. There are also signs of greater awareness of ethical and legally binding obligations among the urban population.
The role played by companion animal breeder associations, practising veterinarians, the animal feed and drug industries, the audio-visual media and the various enterprises and professionals dealing with live animals or animal-related products is becoming central to the information and education of everyone interacting with animals.
A significant development of therapeutic applications of the human-companion animal interaction is envisaged, mainly in the areas of rehabilitation from physical and mental illness and of geriatric support.
A new legal framework concerning the ownership, use, welfare and identification of companion animals has recently been established [7]. Explicit obligations to be fulfilled by companion animal owners are being implemented and require particular attention from those using public places. In parallel, the urban population is much more aware of animal welfare and environmental hygiene issues, gradually becoming a positive factor pressing for change towards more adequate and civilized conduct. It is foreseen that such improvements will decisively support the reduction of risks originating from animals interacting with humans in urban areas.

ACKNOWLEDGEMENTS

Most of the data referred to in this paper were only available thanks to the kind help of a number of colleagues and of persons responsible for public and private services dealing with companion animals and domiciliated species. I wish especially to thank Dr. Luisa Costa Gomes, Dr. Ana Moura, Dr.
Henrique Simas, Dr. Marta Santos, Dr. Telmo Nunes, Prof. Yolanda Vaz,
Prof. Ilda Rosa, Prof. Miguel Saraiva Lima and Prof. José Robalo Silva for
their valuable contributions.

REFERENCES
1. Swabe J. Animals, Disease and Human Society, London, Routledge, 1999.
2. Ormerod EJ, Edney TB, Foster SJ, Whyham MC. "Therapeutic applications of the
human-companion animal bond", Veterinary Record, 157, 689-691, 2005.
3. WHO/WSAVA. Guidelines to reduce human health risks associated with animals in urban areas, Geneva, WHO, 1981.
4. Clube Português de Felinicultura (2006). (http://www.cofelinicultura.web.pt)
5. Clube Português de Canicultura (2006). (http://www.cpc.pt/index/index.php)
6. CIISA/FMV. Five-year Report of CIISA. Lisbon, FMV, 2003
7. Decretos-Lei 312, 313, 314, 315. DR N.290, I-A Serie, Lisboa, Portugal, Dez. 17, 2003,
8436-8473.
PART VIII

HEALTH AND SPORT SCIENCES


PHYSICAL ACTIVITY AND
CARDIORESPIRATORY FITNESS
With Special Reference to the Metabolic Syndrome

Luís Bettencourt Sardinha


Laboratório de Exercício e Saúde, Faculdade de Motricidade Humana, Universidade Técnica
de Lisboa, Estrada da Costa, 1499-006 Lisboa, Portugal, email: lsardinha@fmh.utl.pt

Abstract: Physically inactive adults have a higher incidence of cardiovascular and total mortality. Unfit subjects also tend to have higher mortality rates. The metabolic syndrome increases with age and tends to increase cardiovascular mortality. Higher levels of physical activity and cardiorespiratory fitness in children, adolescents and adults improve metabolic syndrome features. Current physical activity guidelines for children, adolescents and adults lack evidence-based health-related criteria. There are biological, developmental, health and quality of life reasons for promoting physical activity in children and adolescents. However, the evidence base for these criteria and for the best means of promoting physical activity in children is scarce. Data from accelerometer studies suggest that the majority of children up to their mid-teens meet the recommended 60 minutes a day of moderate-intensity physical activity. These studies have improved our capacity to measure several dimensions of physical activity. However, there remains some debate about the recommended levels of light, moderate and vigorous physical activity needed to improve energy balance and metabolic health and to prevent overweight and obesity. Data from the European Youth Heart Study with objectively measured physical activity (proportional actigraphy) suggest new recommendations based on metabolic health and the metabolic syndrome, i.e. the clustering of metabolic cardiovascular risk factors such as elevated blood pressure, obesity, dyslipidemia, and disturbed insulin and glucose metabolism.

Key words: Physical activity, cardiorespiratory fitness, metabolic syndrome, children, adolescents, adults, recommendations.


1. EPIDEMIOLOGY OF PHYSICAL ACTIVITY AND CARDIORESPIRATORY FITNESS – TOTAL AND CARDIOVASCULAR MORTALITY

Long-term prospective follow-up studies have assessed the relative risk of death from any cause and from specific diseases associated with physical inactivity [1–3] and low cardiorespiratory fitness [4, 5]. These epidemiological studies provided the scientific evidence for current physical activity guidelines for adults.
Both men and women who reported increased levels of physical activity
were found to have reductions in relative risk (by about 20%–35%) of death
[1, 6]. Recent investigations have revealed even greater reductions in the risk
of death from any cause and from cardiovascular disease. For instance, being
fit or active was associated with a greater than 50% reduction in risk [7].
Furthermore, an increase in energy expenditure from physical activity of
1000 kcal (4200 kJ) per week or an increase in physical fitness of 1 MET
(metabolic equivalent) was associated with a mortality benefit of about 20%.
Physically inactive middle-aged women (engaging in less than 1 hour of
exercise per week) experienced a 52% increase in all-cause mortality, a
doubling of cardiovascular-related mortality, and a 29% increase in cancer-
related mortality, compared with physically active women [8]. These relative
risks are similar to those for hypertension, hypercholesterolemia and obesity,
and they approach those associated with moderate cigarette smoking.
Moreover, it appears that people who are fit but have other risk factors for
cardiovascular disease may be at lower risk of premature death than people
who are sedentary with no risk factors for cardiovascular disease [9, 10].
In male drivers and conductors of London buses [2], a lower annual total
incidence of coronary heart disease was found among conductors compared
with their driver colleagues (1.9 per 1000 per year in conductors compared with 2.7 per 1000 per year in drivers). When sudden deaths alone were
examined, the comparison was even more striking: deaths among drivers
were more than twice as high.
In the Harvard Alumni Health Study, Paffenbarger and associates surveyed the physical activity and health of nearly 17,000 Harvard alumni to investigate all-cause mortality [3]. Questionnaire data were used to quantify exercise in terms of caloric expenditure. One report from this seminal work focused on graded levels of total activity, with no fewer than eight subdivisions within the range of <500 kcal week⁻¹ to >3500 kcal week⁻¹. Risk reduction from all forms of activity was apparent across the full range of activities. The difference between almost no activity and high total physical activity appeared to be greater independently of vigorous sports. However, Paffenbarger noted that participation in vigorous sports would be expected to be most common among alumni expending ≥2000 kcal week⁻¹ [3].
After genetic and other familial factors are taken into account, leisure-
time physical activity is associated with reduced mortality [11]. Maintaining
or taking up light or moderate physical activity reduces mortality and heart
attacks in older men with and without diagnosed cardiovascular disease.
These results support public-health recommendations for older sedentary
people to increase physical activity, and for active middle-aged people to
continue their activity into old age [12]. Furthermore, bicycling to work
decreased risk of mortality by approximately 40%, after multivariate
adjustment including leisure-time physical activity. Among moderately and highly active persons, sports participants experienced only half the mortality of non-participants. Leisure-time physical activity was inversely
associated with all-cause mortality in both men and women in all age groups.
Benefit was found from moderate leisure time physical activity, with further
benefit from sports activity and bicycling as transportation [13].
A computer assisted literature search was performed to examine the
association of physical activity with all cause mortality in women [14]. It
was concluded that, by adhering to current guidelines for physical activity
and expending about 4200 kJ of energy a week, women can postpone
mortality. The mean magnitude of benefit experienced by women is similar
to that seen in men. Although earlier studies have been conducted primarily
in men, this review showed that there is convincing evidence that physical
activity can also avert premature mortality in women. Accumulating at least
30 minutes of moderate intensity physical activity on most days of the week
can postpone mortality in women, as well as men.
Cardiorespiratory fitness is a phenotype that is dependent on genetics and
individual level of physical activity. One of the most relevant
epidemiological studies looking at the effects of cardiorespiratory fitness on
mortality is the Aerobics Center Longitudinal Study, performed at the
Cooper Institute of Aerobics Research in Dallas. This long-standing research
project provides the most comprehensive data regarding the relative role of
maximal oxygen consumption, regardless of gene-interaction effects. One of
the published studies provides strong data supporting the hypothesis that a
low level of cardiorespiratory fitness reduces life expectancy [4]. The all-
cause mortality rates of the lowest-fitness subjects were higher. An important finding was that the age-adjusted mortality rate dropped dramatically from the lowest to the intermediate fitness levels and levelled off near the highest fitness levels. This means that the relationship between cardiorespiratory fitness and age-adjusted mortality is not linear. Based on this finding, it is important to highlight that, to achieve favourable values in this outcome variable (age-adjusted mortality), high levels of the exposure variable (cardiorespiratory fitness) are not needed. As suggested in Figure 1, a minimum level of cardiorespiratory fitness can be defined, and it is about 10 METs for men and 9 METs for women. These fitness levels can be attained by virtually all people who engage in a regular physical activity program.
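For reference, these MET thresholds correspond to the oxygen uptake values indicated in Figure 1 through the standard convention, assumed here and not stated explicitly in the chapter, that 1 MET is approximately 3.5 mL·kg⁻¹·min⁻¹ of oxygen consumption:

\[
10~\mathrm{METs} \times 3.5~\mathrm{mL\,kg^{-1}\,min^{-1}} = 35~\mathrm{mL\,kg^{-1}\,min^{-1}},
\qquad
9~\mathrm{METs} \times 3.5~\mathrm{mL\,kg^{-1}\,min^{-1}} = 31.5~\mathrm{mL\,kg^{-1}\,min^{-1}}.
\]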

[Figure 1 plots age-adjusted mortality (per 10 000 persons/year) against VO2max (METs) for men and women, with thresholds of about 10 METs (35 mL/kg/min) for men and 9 METs (31.5 mL/kg/min) for women.]
Figure 1. Age-adjusted mortality in men and women according to cardiorespiratory fitness levels. Adapted from reference [4].

In a meta-analysis of published data, the relative roles of physical activity and cardiorespiratory fitness in cardiovascular and coronary heart disease mortality were assessed [15]. This comprehensive analysis suggested that there is a dose-response relationship between both physical activity and cardiorespiratory fitness, and mortality. As depicted in Figure 2, the gradient was found to be steeper for cardiorespiratory fitness.
Based on the current scientific evidence, it is well established that active individuals have high levels of cardiorespiratory fitness and that, in controlled experimental trials, increases in exercise result in increases in fitness. Policies, legislation and communities should promote sustainable physical activity interventions and environments. The community should also look to the business sector for ideas, and perhaps even partnerships, on how to broadly market evidence-based interventions. Evidence strongly supports the conclusion that physically active men and women have higher survival rates and live longer. A similar conclusion can be
drawn for cardiorespiratory fitness. Some of the physiological mechanisms
that induce this survival effect may be related to the metabolic syndrome or
to some of its features. Some classical cardiovascular risk factors seem to
cluster in this atherogenic syndrome. Physical activity and cardiorespiratory
fitness may have independent effects on these features or may have some
form of interaction to induce positive health effects in children, adolescents and adults.
[Figure 2 plots the relative risk of cardiovascular disease (upper panel) and coronary heart disease (lower panel) against the percentile (0–100%) of cardiorespiratory fitness and of physical activity; p<0.02.]
Figure 2. Dose-response mortality relative risk for physical activity and cardiorespiratory fitness based on a meta-analysis of published studies. Adapted from reference [15].

2. THE METABOLIC SYNDROME – DEFINITION ISSUES

Over the past two decades, a striking increase in the number of people
with the metabolic syndrome has taken place worldwide. This increase is
associated with the global epidemic of obesity and diabetes. With the
elevated risk not only of diabetes but also of cardiovascular disease, there is
an urgent need for strategies to prevent the emerging global epidemic.
The metabolic syndrome can be identified according to a variety of
criteria. Table 1 indicates the different criteria for the metabolic syndrome
definition. For adolescents the definition uses normative data for abdominal
obesity and hypertension [16]. The rationale behind the new IDF classification was to provide a more useful definition that could be applied across several research programs and clinical groups, thus enabling appropriate comparisons between studies and standardizing clinical diagnoses. Furthermore, it was intended to be a better predictor of adverse risk outcomes [17]. The WHO criteria emphasize the importance of insulin resistance as an underlying etiology for the metabolic syndrome and are more accepting of pharmacological interventions [18]. The NCEP (ATP III) criteria rely on readily accessible clinical criteria and assume that overweight and decreased physical activity are responsible for the majority of metabolic syndrome cases in western society [19].

Table 1. Metabolic syndrome definitions.

Metabolic syndrome definition in adolescents
- Abdominal obesity: waist circumference ≥ 90th percentile (NHANES III)
- Low HDL cholesterol: ≤ 1 mmol/l
- Hypertriglyceridemia: ≥ 1.3 mmol/l
- Hypertension: ≥ 90th percentile for height, age, gender
- High fasting glucose: ≥ 6 mmol/l

IDF definition of metabolic syndrome
- Central obesity: waist circumference > 80 cm
Together with at least two of the following components:
- Raised triglyceride level: > 1.7 mmol/l OR treatment for this abnormality
- Reduced HDL cholesterol: < 1.29 mmol/l OR treatment for this abnormality
- Hypertension: raised arterial pressure > 130/85 mmHg OR antihypertensive medication
- Diabetes: raised fasting plasma glucose (> 5.6 mmol/l) OR previously diagnosed type 2 diabetes

WHO definition of metabolic syndrome
- Clinically diagnosed diabetes OR high fasting glucose (fasting plasma venous glucose > 6.1 mmol/l) OR insulin resistance (highest quarter of the HOMA score)
Together with at least two of the following components:
- Hypertension: raised arterial pressure > 140/90 mmHg OR antihypertensive medication
- Dyslipidaemia: raised plasma triglycerides (> 1.7 mmol/l) OR low high-density lipoprotein cholesterol (< 1.0 mmol/l)
- Central or general obesity: waist-to-hip ratio > 0.85 in women OR body mass index > 30 kg/m2

NCEP (ATP III) definition of metabolic (insulin resistance) syndrome
Any three (or more) of the following:
- High fasting glucose (fasting plasma venous glucose > 6.1 mmol/l)
- Hypertension: raised arterial pressure > 130/85 mmHg OR antihypertensive medication
- Raised plasma triglycerides (> 1.7 mmol/l)
- Low high-density lipoprotein cholesterol (< 1.0 mmol/l)
- Central obesity (waist circumference > 88 cm)
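To make the counting rule of the NCEP (ATP III) definition concrete, the short sketch below (in Python, with hypothetical variable names; it is an illustration of the table above rather than a validated clinical tool) checks how many of the five criteria a subject meets, using the women's waist cut-off listed in Table 1.

def has_metabolic_syndrome_ncep(fasting_glucose_mmol_l, systolic_bp, diastolic_bp,
                                on_bp_medication, triglycerides_mmol_l,
                                hdl_mmol_l, waist_cm):
    """Return True if three or more NCEP (ATP III) criteria from Table 1 are met."""
    criteria = [
        fasting_glucose_mmol_l > 6.1,                                # high fasting glucose
        systolic_bp > 130 or diastolic_bp > 85 or on_bp_medication,  # hypertension or medication
        triglycerides_mmol_l > 1.7,                                  # raised triglycerides
        hdl_mmol_l < 1.0,                                            # low HDL cholesterol
        waist_cm > 88,                                               # central obesity (women's cut-off)
    ]
    return sum(criteria) >= 3

if __name__ == "__main__":
    # Hypothetical subject: high glucose, raised blood pressure and triglycerides
    print(has_metabolic_syndrome_ncep(6.4, 138, 84, False, 1.9, 1.1, 86))  # True (3 criteria met)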

The WHO definition places more emphasis on insulin resistance as an underlying mechanism for the metabolic syndrome. Thus, one might focus on reducing obesity and increasing physical activity to improve insulin sensitivity, as the NCEP definition requires. However, the WHO definition is also more accepting of the possible use of insulin-sensitizing interventions, particularly pharmacological ones, in nondiabetic subjects. The IDF emphasizes abdominal obesity as the major marker. Even though a thorough analysis of the main issues related to these definitions is outside the scope of this review, it is important to recognize that there is currently an ongoing
very problematic in relation to CHD risk/prediction. One of the problems in
these definitions is that one can be diagnosed as not having the syndrome
and hence in a clinical sense diagnosed as 'healthy' but still have marked
hypertension or marked dyslipidaemia.
Aside from the discussion about the most relevant criteria for the metabolic syndrome definition, or about the need for such a definition at all, it is known that the clustering of its features tends to increase with age in both sexes.
prevalence of the metabolic syndrome as defined by NCEP in NHANES III
is <10% in individuals aged 20–29 years, 20% in individuals aged 40–49
years, and 45% in individuals aged 60–69 years [21]. According to these
figures, it may be useful to suggest that the estimated prevalence of the
metabolic syndrome is an individual's age minus 20. As indicated in Figure
3, the metabolic syndrome tends to increase cardiovascular mortality. In a
study of slightly more than 1,000 males from Kuopio, Finland, Lakka et al.
[22] showed a 3.5-fold increase in risk for cardiovascular disease mortality.

Figure 3. Cardiovascular disease mortality in subjects with and without the metabolic
syndrome. Adapted from reference [22].

In children and adolescents, the definitions are even more problematic, because it is plausible to consider that the syndrome may require a diagnosis of clinical importance. Although studies with pediatric populations have examined metabolic syndrome abnormalities [23, 24], science-based health-related cut-off points for the various components of the metabolic syndrome have not been defined for children and adolescents. In the European Youth Heart Study, a continuous summary variable for the metabolic syndrome based on a Z-score was developed from the following metabolic risk indicators: an obesity marker, insulin, glucose, total cholesterol, HDL-cholesterol, LDL-cholesterol, triglycerides and blood pressure [25]. This score therefore only applies to the population under study, as it is expressed in SD units relative to the sample mean. The metabolic syndrome Z-score was calculated by averaging the Z-scores of the measured variables.
This approach seems appropriate for examining associations of relative variance, such as the relationship between physical activity and this summary score, that is, for analysing the relationship between objective exposures and outcomes. However, it leaves out some clinically related issues. The choice between a continuous and a categorical approach depends very much on the research question: the former seems more appropriate for correlational studies and the latter for prevalence estimation and clinical analysis.
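As an illustration of the averaging procedure described above, the sketch below (Python, with hypothetical names and toy data) standardizes each risk indicator against the sample mean and standard deviation and then averages the resulting Z-scores for each subject; details of the actual EYHS analysis, such as the sign convention for protective variables like HDL-cholesterol or any age and sex adjustment, are not reproduced here.

import statistics

def z_scores(values):
    """Standardize a list of values against its own sample mean and SD."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def metabolic_syndrome_score(indicators):
    """indicators: dict mapping indicator name -> list of sample values.
    Returns one averaged Z-score per subject (continuous summary variable)."""
    standardized = {name: z_scores(vals) for name, vals in indicators.items()}
    n_subjects = len(next(iter(standardized.values())))
    return [statistics.mean(standardized[name][i] for name in standardized)
            for i in range(n_subjects)]

if __name__ == "__main__":
    # Toy sample of four subjects (values are illustrative only)
    sample = {
        "waist_cm": [58.0, 62.0, 70.0, 75.0],
        "systolic_bp": [100.0, 105.0, 112.0, 118.0],
        "triglycerides": [0.6, 0.8, 1.1, 1.4],
    }
    print(metabolic_syndrome_score(sample))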

3. PHYSICAL ACTIVITY AND CARDIORESPIRATORY FITNESS IN RELATION TO THE METABOLIC SYNDROME

The etiology of cardiovascular diseases is complex and involves a spectrum of risk factors. Even though the clinical expression of most cardiovascular diseases occurs during adulthood, it is well recognized that the risk factors may incubate during childhood and adolescence, and cardiovascular disease is increasingly recognised as a paediatric problem. Cardiovascular risk factors such as high blood pressure, obesity, insulin, and serum lipids and lipoproteins tend to cluster during childhood, and this clustering has been shown to be a better measure of cardiovascular health in children [26] and adults [27]. An important finding is that longitudinal data suggest that variables associated with the metabolic syndrome track moderately well from adolescence to adulthood [28].
The energy expenditure related to physical activity and the maximum
oxygen consumption related to cardiorespiratory fitness induce selective
changes in some of the metabolic syndrome features in children and adults.

3.1 Effects of physical activity and cardiorespiratory fitness on metabolic syndrome features in children

High-quality data and scientific reports on this theme are scarce. Accurate assessments of physical activity and cardiorespiratory fitness in children are difficult to perform, and only recently have scientists been able to measure physical activity objectively in large samples.
The European Youth Heart Study (EYHS) is a multi-centre, international
study, addressing the prevalence and aetiology of cardiovascular disease risk
factors, including physical activity, in children aged 9 and 15 years. Physical
activity data were collected from selected areas in four European countries –
Denmark (city of Odense), Portugal (island of Madeira), Estonia (city and
county of Tartu) and Norway (city of Oslo). Physical activity was
objectively assessed for 4 consecutive days, which included both weekend
days. Four days of measurement was selected as the optimal balance
between obtaining a sufficiently long measurement period to ensure a
representative measure of the child’s habitual activity and the logistical
limitations of a large field-based study [29]. Clustering tendencies were
evident in children and adolescents with low cardiorespiratory fitness. An
inverse relationship was found between cardiorespiratory fitness and
cardiovascular risk factors in children and adolescents. In one of the EYHS
reports, it was found that cardiorespiratory fitness was the variable with the
highest and most consistent association with the cardiovascular risk factors
in children [30].
Physical activity was inversely associated with metabolic risk and the
potential beneficial effect of physical activity may be greatest in children
with low fitness. As suggested by Figure 4, physically active children with
high cardiorespiratory fitness have a lower metabolic syndrome Z-score than those who are less physically active and have low cardiorespiratory fitness. In other words, unfit children should be a main target for behavioural interventions to increase physical activity. This Z-score approach defines metabolic risk clustering on a continuous scale, which avoids the less sensitive and more error-prone dichotomous analysis. The interaction
between objectively measured physical activity and cardiorespiratory fitness
indicates that the former behaviour and the latter physiological attribute
should be considered as different exposures in children.

Figure 4. Relationship between quartiles (Q1-Q4) of physical activity and the metabolic syndrome Z-score, stratified by cardiorespiratory fitness. Adapted from reference [30].

From the pooled EYHS database, a graded relation between cardiorespiratory fitness and waist circumference, sum of skinfolds, and
systolic blood pressure was found in children and adolescents (aged 9 and
15) [31]. This study was the first to show that there is a curvilinear graded
relation between cardiorespiratory fitness and markers of the metabolic
syndrome such as waist circumference and systolic blood pressure, in
children and adolescents. The greatest difference in these health variables
was observed between low and moderate fitness levels. Systolic and diastolic
blood pressure showed a curvilinear relation with cardiorespiratory fitness,
and this physiological parameter explained 2% of the variance in systolic
blood pressure. Figure 5 indicates this graded relation between cardiorespiratory fitness and systolic blood pressure in 9-year-old boys. Based on the shape of this relationship, the greatest benefit may be achieved by increasing fitness from a low to a moderate level. The strongest
relation was seen for waist circumference. Waist circumference is a
morphological marker of abdominal adiposity, which has been shown to
relate to an unfavourable cardiovascular risk profile in children and
adolescents. Again, it is important to direct action towards those children and
adolescents who are the least physically fit.

Figure 5. Relationship between cardiorespiratory fitness expressed in Watts/kg and systolic blood pressure in 9-year-old boys. Adapted from reference [31].

Data from the EYHS suggest that children who are physically active for
>2 h/d at an intensity level broadly equivalent to walking were significantly
leaner than those who accumulated <1 h physical activity/d at the same
intensity level [32]. This may imply that ≥60 min of moderate physical activity each day, similar to the current recommendation for health-related physical activity in young people, is the minimal amount required to improve body composition in children. Figure 6 shows the graded relationship
between the amount of physical activity and the sum of five skinfolds. A
significant difference was observed between those who accumulated >2 h of
moderate and vigorous physical activity per day and those who accumulated
<1 h/d at this intensity level. This finding may suggest that it is necessary to
accumulate >2 h of moderate and vigorous physical activity per day in order
to induce a more pronounced reduction in trunk and total body adiposity.
During and after sexual maturation, anthropometric indices of obesity are
associated with elevated blood pressure levels [33]. One of the most
deleterious effects of paediatric obesity is related to the risk to
cardiovascular health, including its contribution to hypertension, insulin
resistance, dyslipidemia, type II diabetes, and low-grade systemic
inflammation, all of which may accelerate vascular disease and coronary
atherosclerosis.

[Figure 6 shows the sum of skinfolds (mm) across physical activity categories (<60 min/d, 60-90 min/d, >90-120 min/d, >120 min/d); * p<0.05.]
Figure 6. Geometric mean sum of five skinfold thicknesses stratified by time spent in moderate and vigorous physical activity. Adapted from reference [32].

This graded relationship between physical activity and trunk and total adiposity clearly suggests the important role of increased energy expenditure in preventing obesity. As in adults, increased adiposity in children and adolescents is associated with a worsening of several metabolic and cardiovascular health variables. Although chronic diseases such as cardiovascular disease, hypertension and dyslipidemia are infrequent in youth, some risk factors have indeed been shown to be present in young samples and to track from childhood to adult life [34].
In the Portuguese sample of the EYHS, body composition was assessed
with dual-energy x-ray absorptiometry (DXA). The results of this study
showed that DXA derived total body fatness variables and physical activity
estimates are associated with features of the metabolic syndrome in boys and
girls aged 9 years [35]. However, this association may be mediated by
central adiposity, suggesting that, early in life, the central fatness phenotype
has an atherogenic role that needs to be prevented with energy-balanced
strategies, including increased physical activity. As depicted in Figure 7, the
correlations between physical activity dimensions (total counts and
counts/min) and a metabolic syndrome Z-score were significant. The novel
finding was that, when controlling for central fat, these relationships were no longer significant. This suggests that the beneficial effect of physical activity on some features of the metabolic syndrome may be mediated by changes in trunk fatness, which further emphasises the role of abdominal obesity in metabolic health.

[Figure 7 shows coefficients of correlation between the metabolic syndrome Z-score and physical activity (total counts and counts/min), unadjusted and adjusted for central fat; * p<0.05.]
Figure 7. Coefficients of correlation between a metabolic syndrome Z-score and total counts and counts/min in 9-year-old girls and boys. Adapted from reference [35].

It is well recognized that physical activity is important for normal growth, development of functional capacities and overall well-being. More
specifically, within the scope of metabolic health, new evidence is emerging that further supports the importance of physical activity for healthy growth. Intervention studies in children add more detailed evidence on the role of physical activity in improving inflammatory markers associated with the pathogenesis of atherosclerosis. Twenty-five overweight
children were assessed for brachial artery flow-mediated dilation (FMD),
nitroglycerin-induced dilation, C-reactive protein, lipids, glucose, insulin,
oral glucose tolerance, body composition, cardiorespiratory fitness and blood
pressure [36]. Twenty of these children were equally and randomly assigned
to either 8 weeks of stationary cycling or to a non-exercising control group.
After the intervention, significant improvements were observed in the exercise group compared with the control group for cardiorespiratory fitness, HDL cholesterol, and FMD area under the curve. It was concluded that 8
weeks of aerobic exercise improves fitness, HDL cholesterol, and
endothelial function in overweight children. Other intervention studies with
children and adolescents also indicate that different physical activity
protocols improve features of the metabolic syndrome through some
identified anti-atherogenic mechanisms [37, 38].

3.2 Effects of physical activity and cardiorespiratory fitness on metabolic syndrome features in adults

The information regarding the influence of physical activity and cardiorespiratory fitness on several features of the metabolic syndrome in adults is more comprehensive than what is available for children. The cross-sectional associations of leisure-time physical activity and cardiorespiratory fitness with the metabolic syndrome were investigated in a population-based sample of 1069 middle-aged men [39]. Men with low levels of at least moderate-intensity (≥4.5 MET) leisure-time physical activity were 60% more likely to have the metabolic syndrome than those engaging in 3.0 h·wk⁻¹ or more. Low-intensity (<4.5 METs) leisure-time physical activity was not associated with the metabolic syndrome. Men with a cardiorespiratory fitness <29.1 mL·kg⁻¹·min⁻¹ were almost seven times more likely to have the metabolic syndrome than those with a cardiorespiratory fitness >35.5 mL·kg⁻¹·min⁻¹. It was concluded that a sedentary lifestyle and especially poor
cardiorespiratory fitness were associated with the metabolic syndrome. A
novel suggestion was the inclusion of a low cardiorespiratory fitness as a
feature of the metabolic syndrome. In an ethnically diverse sample of women,
the adjusted odds ratio for the metabolic syndrome was 0.18 (95%
confidence interval, 0.33 to 0.90) for women in the highest category of
moderate-intensity physical activity, compared with women in the lowest
category [40]. Similar associations were observed for the metabolic
syndrome with vigorous-intensity physical activity and maximal treadmill
duration as a surrogate of cardiorespiratory fitness.
In a prospective study with 9007 men and 1491 women, baseline
cardiorespiratory fitness was measured as duration in a maximal treadmill
test [41]. It was found that age-adjusted incidence rates were significantly
lower across incremental thirds of fitness in men and women, and that low
cardiorespiratory fitness is a strong and independent predictor of incident
metabolic syndrome in both genders. It is interesting to note that, besides
cardiorespiratory fitness, muscular strength was found to be inversely
associated with metabolic syndrome incidence, independent of age and body
size [42]. This finding further suggests the importance of combined exercise
programs to increase both cardiorespiratory fitness and muscular strength for
primary prevention of metabolic syndrome. This effect of different types of
exercise seems to be independent of macronutrient intake. In a cross-
sectional study, with adjustment for macronutrient intake from 3-day dietary
records, cardiorespiratory fitness was inversely associated with the
prevalence of the metabolic syndrome [43]. As indicated in Figure 8, the trend was similar in normal-weight and overweight subjects.

Figure 8. Odds of prevalent metabolic syndrome by fitness groups stratified by body mass
index (BMI) categories after adjustment for covariables. Adapted from reference [43].

In a sample of 19 223 men aged 20 to 83 years, the relative risks of all-cause and cardiovascular mortality were 1.29 and 1.89, respectively, for men with the metabolic syndrome compared with healthy men [44]. After the inclusion of cardiorespiratory fitness, the associations were no longer significant. The
relative risks comparing unfit with fit men for all-cause mortality were 2.18
in healthy men and 2.01 in men with the metabolic syndrome, whereas the
relative risks for cardiovascular mortality for unfit vs. fit men were 3.21 in
healthy men and 2.25 in men with the metabolic syndrome. The upper panel
of Figure 9 shows that the relative risks of the metabolic syndrome for all-
cause and cardiovascular mortality were different when they were
unadjusted or adjusted for cardiorespiratory fitness. The lower panel
indicates a significant dose-response relationship between cardiorespiratory
fitness and mortality in men with the metabolic syndrome. Taken together,
these data provide convincing evidence for physical activity having a
protective effect against mortality risk in men with the metabolic syndrome.
Intervention studies have noted significant reductions in metabolic risk
factors associated with the metabolic syndrome including hyperinsulinemia,
hyperlipidemia, hypertension, abdominal obesity, and selected inflammatory
markers. As was also found in children and adolescents, physical activity
and cardiorespiratory fitness have selective beneficial effects on the metabolic syndrome features of adults [45–47]. However, it should be recognized that most benefits are transient, which further suggests the need for a lifelong commitment to physical activity.

Figure 9. Upper panel: Relative risks of all-cause and cardiovascular disease (CVD) mortality
associated with the metabolic syndrome before and after the inclusion of cardiorespiratory
fitness (CRF) as a covariate in 19 223 men aged 20 to 83 years from the Aerobics Center
Longitudinal Study. Lower panel: Relative risks of all-cause and cardiovascular disease (CVD) mortality in men diagnosed as having the metabolic syndrome across baseline tertiles of CRF. Numbers atop bars indicate the number of deaths. Adapted from reference [44].

4. SUMMARY

The metabolic syndrome, although only more recently defined and investigated, exhibits a prevalence that is increasing steadily and epitomizes
the integrative nature of modern chronic disease, given its endocrine,
metabolic, and cardiovascular underpinnings. Most notable is the relation
between physical activity, cardiovascular fitness and the metabolic
syndrome, considering information showing that the mortality risk in
sedentary vs. physically active and unfit vs. fit men and women with the
metabolic syndrome is similar to the same comparison in healthy
counterparts. Many of the studies discussed, although not designed to
investigate the effect of exercise and/or diet on the metabolic syndrome per
se, have shown amelioration of risk factors comprising the metabolic
syndrome, including insulin resistance, blood pressure, lipid levels,
inflammation, and endothelial dysfunction. Clustering tendencies were
evident in children, adolescents, and adults with low cardiorespiratory fitness. An inverse dose-response relationship was found between physical
fitness and cardiovascular risk factors in children, adolescents, and adults.
Physical fitness was the variable with the highest and most consistent
association with the cardiovascular risk factors in children and adults.

REFERENCES
1. Paffenbarger RS, Wing AL, Hyde RT. Physical activity as an index of heart attack risk in
college alumni, Am J Epidemiol, 108, pp. 161–175, 1978.
2. Morris JN, Heady JA, Raffle PAB, Roberts CG, Parks JW. Coronary heart disease and
physical activity of work, Lancet, 21, pp. 1053–1057, 28., pp. 1111–1120, 1953.
3. Paffenbarger RS, Laughlin MD, Gima AS, Black RA. Work activity of longshoremen as
related to death from coronary heart disease and stroke, N Engl J Med, 282, pp. 1109–
1114, 1970.
4. Blair SN, Kohl HW III, Paffenbarger RS, Clark DG, Cooper KH, Gibbons LW. Physical
fitness and all-cause mortality: a prospective study of healthy men and women, JAMA,
262, pp. 2395–2401, 1989.
5. Blair SN, Kohl HW, Barlow CE. Paffenbarger RS, Gibbons LW, Macera CA. Changes
in physical fitness and all-cause mortality, JAMA, 273, pp. 1093–1098, 1995.
6. Macera CA, Hootman JM, Sniezek JE. Major public health benefits of physical activity,
Arthritis Rheum, 49, pp. 122-128, 2003.
7. Myers J, Kaykha A, George S, Abella J, Zaheer N, Lear S, Yamazaki T, Froelicher V.
Fitness versus physical activity patterns in predicting mortality in men, Am J Med, 117,
pp. 912–918, 2004.
8. Hu FB, Willett WC, Li T. Adiposity as compared with physical activity in predicting
mortality among women, N Engl J Med, 351, pp. 2694–2703, 2004.
9. Blair SN, Kampert JB, Kohl HW, Barlow CE, Macera CA, Paffenbarger RS, Gibbons
LW. Influences of cardiorespiratory fitness and other precursors on cardiovascular
disease and all-cause mortality in men and women, JAMA, 276, pp. 205–210, 1996.
10. Wessel TR, Arant CB, Olson MB, Johnson BD, Reis SE, Sharaf BL, Show LJ, Handberg
E, Sopko G, Kelsey SE, Pepine CJ, Merz NB. Relationship of physical fitness vs body
mass index with coronary artery disease and cardiovascular events in women, JAMA, 292,
pp. 1179-1187, 2004.
11. Urho MK, Jaakko K, Seppo S, Markku K. Relationship of leisure-time physical activity
and mortality: the Finnish twin cohort, JAMA, 279, pp. 440-444, 1998.
12. Wannamethee SG, Shaper AG, Walker M. Changes in physical activity, mortality, and
incidence of coronary heart disease in older men, Lancet, 351, pp. 1603-1608, 1998.
13. Andersen LB, Schnohr P, Schroll M, Hein HO. All-cause mortality associated with
physical activity during leisure time, work, sports, and cycling to work, Arch Intern Med,
160, pp. 1621-1628, 2000.
14. Oguma Y, Sesso HD, Paffenbarger RS, Lee IM. Physical activity and all cause mortality
in women: a review of the evidence, Br J Sports Med, 36, pp. 162 – 172, 2002.
15. Williams PT. Physical fitness and activity as separate heart disease risk factors: a meta-analysis, Med Sci Sports Exerc., 33(5), pp. 754–761, 2001.
16. Cook S, Weitzman M, Auinger P, Nguyen M, Dietz WH. Prevalence of a metabolic
syndrome phenotype in adolescents: findings from the Third National Health and
Nutrition Examination Survey, 1988-1994, Arch Pediatr Adolesc Med., 157, pp. 821-
827, 2003.
17. The IDF Consensus Worldwide Definition of the Metabolic Syndrome, International
Diabetes Federation, 2005.
18. Alberti KG, Zimmet PZ. Definition, diagnosis and classification of diabetes mellitus and
its complications. Part 1: diagnosis and classification of diabetes mellitus provisional
report of a WHO consultation, Diabet Med, 15, pp. 539–553, 1998.
19. Executive Summary of The Third Report of The National Cholesterol Education
Program (NCEP) Expert Panel on Detection Evaluation and Treatment of High Blood
Cholesterol in Adults (Adult Treatment Panel III), JAMA, 285, pp. 2486–2497, 2001.
20. Kahn R, Buse J, Ferrannini E, Stern M. The metabolic syndrome. Time for a critical
appraisal: joint statement from the American Diabetes Association and the European
Association for the Study of Diabetes, Diabetes Care, 28(9), pp. 2289-2304, 2005.
21. Ford ES, Giles WH, Dietz WH. Prevalence of the metabolic syndrome among US
Adults: findings from the Third National Health and Nutrition Examination Survey,
JAMA, 287, pp. 356-359, 2002.
22. Lakka HM, Laaksonen DE, Lakka, TA, Niskanen LK, Kumpusalo E, Tuomilehto J,
Salonen JT. The metabolic syndrome and total and cardiovascular disease mortality in
middle-aged men, JAMA, 288, pp. 2709-2716, 2002.
23. Druet C, Dabbas M, Baltakse V, Payen C, Jouret B, Baud C, Chevenne D, Ricour C,
Tauber M, Polak M, Alberti C, Levy-Marchal C. Insulin resistance and the metabolic
syndrome in obese French children, Clin Endocrinol (Oxf), 64(6), pp. 672-678, 2006.
24. Invitti C, Maffeis C, Gilardini L, Pontiggia B, Mazzilli G, Girola A, Sartorio A,
Morabito F, Viberti GC. Metabolic syndrome in obese Caucasian children: prevalence
using WHO-derived criteria and association with nontraditional cardiovascular risk
factors, Int J Obes (Lond), 30(4), pp. 627-633, 2006.
25. Riddoch CJ, Edwards D, Page A, Froberg K, Andersen SA, Wedderkopp N, Brage S, Cooper A, Sardinha LB, Harro M, Klasson-Heggebo L, van Mechelen W, Boreham C, Ekelund U, Andersen LB. The European Youth Heart Study – Cardiovascular disease risk factors in children: Rationale, aims, study design, and validation of methods, J Phys Act Health, 2, pp. 115-119, 2005.
26. Andersen LB, Wedderkopp N, Hansen HS, Cooper AR, Froberg K. Biological
cardiovascular risk factors cluster in Danish children and adolescents: the European
Youth Heart Study, Prev Med, 37(4), pp. 363-367, 2003.
27. Aizawa Y, Kamimura N, Watanabe H, Aizawa Y, Makiyama Y, Usuda Y, Watanabe T,
Kurashina Y. Cardiovascular risk factors are really linked in the metabolic syndrome:
this phenomenon suggests clustering rather than coincidence, Int J Cardiol., 109(2), pp.
213-218, 2006.
28. Andersen LB, Hasselstrom H, Gronfeldt V, Hansen SE, Karsten F. The relationship
between physical fitness and clustered risk, and tracking of clustered risk from
adolescence to young adulthood: eight years follow-up in the Danish Youth and Sport
Study, Int J Behav Nutr Phys Act., 1(1), pp. 1-9, 2004.
29. Riddoch CJ, Bo Andersen L, Wedderkopp N, Harro M, Klasson-Heggebo L, Sardinha
LB, Cooper AR, Ekelund U. Physical activity levels and patterns of 9- and 15-yr-old
European children, Med Sci Sports Exerc., 36(1), pp. 86-92, 2004.
30. Brage S, Wedderkopp N, Ekelund U, Franks PW, Wareham NJ, Andersen LB, Froberg
K. Features of the metabolic syndrome are associated with objectively measured physical
activity and fitness in Danish children: the European Youth Heart Study (EYHS),
Diabetes Care, 27(9), pp. 2141-2148, 2004.
31. Klasson-Heggebo L, Andersen LB, Wennlof AH, Sardinha LB, Harro M, Froberg K,
Anderssen SA. Graded associations between cardiorespiratory fitness, fatness, and blood
pressure in children and adolescents, Br J Sports Med., 40(1), pp. 25-29, 2006.
32. Ekelund U, Sardinha LB, Anderssen SA, Harro M, Franks PW, Brage S, Cooper AR,
Andersen LB, Riddoch C, Froberg K. Associations between objectively assessed
physical activity and indicators of body fatness in 9- to 10-y-old European children: a
population-based study from 4 distinct regions in Europe (the European Youth Heart
Study), Am J Clin Nutr., 80(3), pp. 584-590, 2004.
33. Freedman DS, Perry G. Body composition and health status among children and
adolescents, Prev Med, 31, pp. 34-53, 2000.
34. Goran MI, Malina RM. Fat distribution during childhood and adolescence: implications
for later health outcomes, Am J Hum Biol, 11, pp. 187-188, 1999.
35. Quitério A, Ornelas R, Sardinha LB. Abdominal adiposity measured by DXA mediates
differently in boys and girls the relationship between physical activity levels and
metabolic syndrome features, Obesity Research, 13, A 91, 2005.
36. Kelly AS, Wetzsteon RJ, Kaiser DR, Steinberger J, Bank AJ, Dengel DR. Inflammation,
insulin, and endothelial function in overweight children and adolescents: the role of
exercise, J Pediatr, 145(6), pp. 731-736, 2004.
37. Watts K, Beye P, Siafarikas A, Davis EA, Jones TW, O'Driscoll G, Green DJ. Exercise
training normalizes vascular dysfunction and improves central adiposity in obese
adolescents, J Am Coll Cardiol, 43(10), pp. 1823-1827, 2004.
38. Nassis GP, Papantakou K, Skenderi K, Triandafillopoulou M, Kavouras SA,
Yannakoulia M, Chrousos GP, Sidossis LS. Aerobic exercise training improves insulin
sensitivity without changes in body weight, body fat, adiponectin, and inflammatory
markers in overweight and obese girls, Metabolism, 54(11), pp. 1472-1479, 2005.
39. Lakka TA, Laaksonen DE, Lakka HM, Mannikko N, Niskanen LK, Rauramaa R, and
Salonen JT. Sedentary lifestyle, poor cardiorespiratory fitness, and the metabolic
syndrome, Med Sci Sports Exerc, 35, pp. 1279–1286, 2003.
40. Irwin ML, Ainsworth BE, Mayer-Davis EJ, Addy CL, Pate RR, Durstine JL. Physical
activity and the metabolic syndrome in a tri-ethnic sample of women, Obes Res, 10, pp.
1030–1037, 2002.
41. LaMonte MJ, Barlow CE, Jurca R, Kampert JB, Church TS, Blair SN. Cardiorespiratory
fitness is inversely associated with the incidence of metabolic syndrome: a prospective
study of men and women, Circulation, 112(4), pp. 505-512, 2005.
42. Jurca R, Lamonte MJ, Barlow CE, Kampert JB, Church TS, Blair SN. Association of
muscular strength with incidence of metabolic syndrome in men, Med Sci Sports Exerc.,
37(11), pp. 1849-1855, 2005.
43. Finley CE, LaMonte MJ, Waslien CI, Barlow CE, Blair SN, Nichaman MZ.
Cardiorespiratory fitness, macronutrient intake, and the metabolic syndrome: the
Aerobics Center Longitudinal Study, J Am Diet Assoc, 106(5), pp. 673-679, 2006.
44. Katzmarzyk PT, Church TS, Blair SN. Cardiorespiratory fitness attenuates the effects of
the metabolic syndrome on all-cause and cardiovascular disease mortality in men, Arch
Intern Med, 164, pp. 1092–1097, 2004.
45. Katzmarzyk PT, Leon AS, Wilmore JH, Skinner JS, Rao DC, Rankinen T, Bouchard C.
Targeting the metabolic syndrome with exercise: evidence from the heritage Family
Study, Med Sci Sports Exerc, 35, pp. 1703–1709, 2003.
46. Lavrencic A, Salobir BG, Keber I. Physical training improves flow-mediated dilation in
patients with the polymetabolic syndrome, Arterioscler Thromb Vasc Biol, 20, pp. 551–
555, 2000.
47. Gill JM, Malkova D. Physical activity, fitness and cardiovascular disease risk in adults:
interactions with insulin resistance and obesity, Clin Sci (Lond), 110(4), pp. 409-425,
2006.
ERGONOMICS: HUMANS IN THE CENTRE OF
INNOVATION

Anabela Simões¹ and José Carvalhais¹

¹Faculdade de Motricidade Humana, Universidade Técnica de Lisboa, Estrada da Costa, 1495-688 Cruz Quebrada, Portugal, asimoes@fmh.utl.pt

Abstract: Human-centred design of innovative systems in the field of Transports is


crucial for safety and efficiency issues. For utilitarian or working purposes, a
system is designed and developed in such a way it should fit the human
characteristics and the function it is designed for. Even automatic and complex
systems are controlled by people, meaning that their features should be
integrated from the design phases until the implementation and functioning in
order to ensure a well succeeded life cycle. The value of the contribution of
Ergonomics to fit systems and tasks to human is well recognised, particularly
concerning the physical features of humans and machines. However,
technological development and innovation are bringing about new problems
and, consequently, new research needs. Actually, the main research questions
related to the human interaction with modern and complex systems focus on
the human information processing and cognitive functioning, stressing
decision making, attention, mental workload and fatigue, which are
responsible for the success of any task performance but could compromise the
expected results and safety. The study of human activity and behaviour allows
us to understand human diversity and variability, the instability of the human
activity over time, due to fatigue, health, ageing, etc., as well as the evolution
of human expertise and people’s motivation and commitment in their tasks
performance. This article will focus on these new research questions applied to
the field of TRANSPORTS, where the main technological innovation is
represented by the introduction of Intelligent Transport Systems (ITS).
Ongoing and recently carried out projects will be referred to in order to frame and explain
the following research topics: the cognitive resources in performing additional
tasks to driving, fatigue and drowsiness in driving and in traffic control rooms,
as well as ITS and special needs of older drivers.

Key words: Human Factors; Human-centred design; Human-system interaction; Mental workload; Fatigue; Stress; ITS.


1. INTRODUCTION

Human-centred design takes into account human variability in terms of


individual structural and functional features. For utilitarian or working
purposes, a system is designed and developed in such a way that it fits
human characteristics and the function it is designed for. Even automatic and
complex systems are controlled by people, meaning that their features should
be integrated from the design phases until the implementation and
functioning in order to ensure a successful life cycle. Actually, people
interact with systems, which should be easy to use, safe, comfortable,
reliable, accessible, in a word, user-friendly. Moreover, the limits of
human functional abilities should be considered in the design of working
schedules or during the performance of demanding and prolonged tasks,
such as long distance driving. Human-system interaction, with a particular
purpose and aiming at a targeted result, is realized through the performance of
specific tasks and sub-tasks. By means of a complex process, developed
from perception to action, systems, machines and equipment are
controlled, driven, manipulated, producing good and expected results, but,
sometimes, unexpected and undesirable events may occur.
The study of human activity and behaviour allows us to understand
human diversity and variability, the instability of the human activity over
time, due to fatigue, health, ageing, etc., as well as the evolution of human
expertise and people’s motivation and commitment in their tasks
performance. Reasons for failures, errors, violations or, simply, bad results,
should be investigated and studied in order to identify their causes and prevent
similar events in the future. Human factors that explain and are responsible for human behaviour and performance are frequently overlooked or undervalued.
The value of the contribution of Ergonomics in fitting systems and tasks to
humans is well recognised, particularly concerning the physical features of
humans and machines. However, the technological development and
innovation are bringing about new problems and, consequently, new
research needs. Actually, the main research questions related to human
interaction with modern and complex systems focus on human information
processing and cognitive functioning, stressing decision making, attention,
mental workload and fatigue, which are responsible for the success of any
task performance but could compromise the expected results and safety.
This article will focus on these new research questions applied to the
field of TRANSPORTS, where the main technological innovation is
represented by the introduction of Intelligent Transport Systems (ITS). The
success of ITS depends on a human-centred design and accurate studies of
the effects of their use in common or new tasks, such as driving vehicles,
using a public transport system or operating in a traffic control room. Human
variability, particularly concerning ageing, is a major concern as well, due to


the existing demographic projections. Therefore, several European projects
are focusing on these human factors aspects, aiming at explaining the
influence of the systems use on behaviour of different categories of users in
order to identify particular needs that should be integrated in the systems
design.
Ongoing and recently carried out projects will be referred to in order to frame and
explain the following research topics: cognitive resources in performing
additional tasks to driving, fatigue and drowsiness in driving and in traffic
control rooms, as well as ITS and special needs of older drivers

2. CONCEPTUAL FRAMEWORK

2.1 Human-system interaction

Human-system interaction can be viewed as a dialogue between an


individual and a system as the action produced by him or her will generate a
reaction from the system, and consecutive reactions in both directions until the
accomplishment of the interaction purpose. When interacting with a system,
an individual performs a task or a set of tasks following an external stimulus
or a particular intention. Having a particular purpose, the person reacts
producing an activity that should be viewed as a complex process starting
with the perception of an internal or external stimulus and ending with the
action on the system interface after the appropriate information processing at
a central level. In order to ease these interactions, systems should be user-
friendly and therefore designed with target people in mind. This principle is
valid for any kind of system, machine, equipment, etc., i.e., any tool to be
used by people to achieve specific purposes.
Modern technological systems are characterized by increasing
complexity as they can be dynamic or even intelligent, being modified by
use, and are usually operated in more complex environments, imposing
complex and sometimes concomitant tasks. Therefore, the cognitive
demands related to the interactions with modern and complex systems are
very high and most of these interactions have safety concerns. This means
that a committed error or omission could have disastrous consequences. In
order to prevent undesirable events, the systems design should take into
consideration human structural and functional characteristics, as well as individual limits for processing information.

2.2 Human performance

The concept of human performance represents individual ability to


understand, act or identify, which is explained by the interaction of three
factors: expertise, motivation and environment [5]. The lack of one of these
factors, such as motivation, could compromise human performance despite
the presence of an adequate environment and considerable expertise; in the same way, a high motivation and an adequate environment will not be enough to ensure a good performance when the required expertise does not exist. In a
labour context, technical and organisational conditions should allow the
expression of the workers’ expertise and motivation. The success of any
activity depends therefore on the will shared by a group of individuals to
accomplish an assigned mission.
Understanding human performance means that one must be aware of the
importance of human factors involved in the performance of a task and the
technical and organisational conditions for its performance. The result of an
action depends on the context and circumstances of action, which can
influence human perception, information processing or decision making.
Aiming at optimising human-system interactions, the adequate conditions for
the task performance should be identified and provided, eliminating any
elements that could lead to a poor performance. Human factors are therefore
variables of the human functioning, either influenced by environmental
conditions, or equipments and tools, or task features or established
organisational conditions. This knowledge about human characteristics and
functioning should lead the design of any working or utilitarian system,
particularly in the cases of safety concerns regarding people and the
environment.
Nowadays, society, organisations and the related technological
development impose new tasks and different demands in terms of accurate
and efficient actions performed in useful time. In order to ensure a good
cooperation between humans and technologies, and, consequently, the
expected success of human-system interactions, factors that can affect
human performance should be identified and understood. The diversity of
human characteristics and its variability over time do not fit the frequent
assumption of a stable and constant human activity for the design of working
systems or the management of industrial processes. Human factors should
therefore be viewed as variables that influence individual abilities and,
consequently, human performance. Then, the success of performed actions,
as well as committed errors or omissions, recoveries, difficulties or the
sequence of factors leading to an accident should be taken into consideration
in order to eliminate any constraints and achieve the expected results.

2.3 Human variability

In the field of Ergonomics, human variability is a very uncomfortable


reality in the context of systems design as it reflects the instability of human
activity. The average “man” does not exist, neither in structural nor in functional terms. Therefore, a system must be designed in such a way as to fit the features
of a target population. This means that it must be flexible enough to fit the
diversity of potential users and their variability over time, resulting from
ageing, variations in health conditions, fatigue, workload, etc.
Time represents a factor of variability in various terms:

- the effect of ageing on individual functional abilities;
- the cumulated experience as a way of compensating for age-related functional declines;
- the temporal structure of tasks and its effects in terms of the time allocated for task performance.

Age-related functional declines are very variable in their modalities,


intensity and timing, leading to a greater variability in older people. Due to
the recent demographic projections, the design of working and utilitarian
systems should take into consideration the characteristics of older users.
Recent policies in the EU tend to extend active life [2], meaning that we will
have older workers in our labour force and each person will spend more than
40 years at their jobs. Therefore, the working design (workstations, tools and
schedules) must become adaptable to the worker’s needs in any period of his
or her career. In the context of the design of utilitarian systems, sooner or
later, a great part of potential customers will be 60 years or older [13]. This
will be particularly true in the context of driving, where mobility needs are
increasing, older drivers represent the most growing segment of the driving
population and the introduction of intelligent transport systems (ITS) could
have a potential compensating effect but should be designed to fit the
functional characteristics and limitations of older people [9].

2.4 Ergonomics and human factors

According to the International Ergonomics Association [18], Ergonomics


(or Human Factors) has been defined as “the scientific discipline concerned
with the understanding of interactions among humans and other elements of
a system, and the profession that applies theory, principles, data and methods
to design in order to optimize human well-being and overall system
performance”. This definition integrates both dimensions of Ergonomics:
theory and practice. Actually, practice represents the roots of Ergonomics, as its theoretical constructs emerged from real needs that required knowledge from the human sciences to be sought and applied. Nowadays, ergonomics practice is carried out by trained and certified professionals who apply Ergonomics theory and methods to the design of systems, equipment, tools, etc., fitting them and the corresponding tasks to humans.
As an applied science, the object of Ergonomics – human activity
(professional or utilitarian) – should be viewed in relation to its context of
application and specific purposes. As the optimisation of human-system interactions is the general objective of Ergonomics, the criteria of safety,
comfort and efficiency represent the major concerns in everyday practice. In
order to achieve this objective, Ergonomics envisages two types of approach:

- An intervention on the system, process or product, aiming at:
  - fitting the structural and functional characteristics of the potential users and the envisaged tasks;
  - eliminating any factors of discomfort or risk;
- An intervention on the individuals, in order to provide them with appropriate training, technical support or documentation to ease the human-system interactions.

The ergonomic analysis, centred on the task demands and performance,


will allow us to identify the focus of the intervention: on the system, the
person or both. Designing systems with people in mind brings economic
advantages represented by an increase in well-being: more comfort, safety
and systems efficiency. In a labour context, improvements in productivity
and reduction of costs, particularly those related to absenteeism and
accidents, are the major benefits of Ergonomics.
As an applied science, important developments of Ergonomics have
occurred in various fields of application that are nowadays well represented in publications and scientific meetings: Software Ergonomics,
Industrial Ergonomics, Hospital Ergonomics, Transport Ergonomics, etc.
These applications allow us to identify particular tasks and technical and
organisational conditions that represent the specific invariants of the context.
In addition, the expertise in a particular field of application leads to a clear
and effective integration of the Ergonomics criteria and methods in the
systems design.

2.5 Human-centred design

A human-centred design approach is a participative design process


setting up three different entities: the system under development (the
artefact), the task to be performed and the potential user [1]. In this process,
both the artefact and task are progressively defined and developed according
to the available technology and specific requirements for the task
performance. Once a first prototype of the artefact and a version of the task are available, users are asked to test them in order to allow for an activity analysis in a real situation. This procedure, representing the methodological
approach of Ergonomics, will allow us to identify system requirements for
improvement and the user’s needs for the successful and safe use of the
system, including needs for training. In addition, a user profile will be
defined and, by means of an iterative process, the system will be
successively tested and improved according to the data collected in the
performed tests.
The interaction with innovative technological systems imposes particular
demands and creates new needs for training. Working in a traffic control
room or driving a vehicle in complex traffic conditions imposes sustained
cognitive demands that lead to high mental workload levels. In these
situations, the limited processing resources could be overloaded and create
opportunities for errors, omissions or accidents. Similarly, in monotonous
situations, particularly during the night, the high demands in terms of
sustained attention, resulting from fighting drowsiness, lead to declines in task performance, allowing for errors and omissions that could lead to
accidents. The National Highway Traffic Safety Administration (NHTSA)
has identified driver inattention as a causative factor in 25–30% of crashes
[10]. According to the authors, an inattentive driver may be temporarily
distracted by something inside or outside the vehicle, may be drowsy or
fatigued, or may simply have his or her mind on something other than
driving. Crashes involving drivers who have fallen asleep at the wheel are
especially likely to result in serious or fatal injuries. In this context, the
human-centred design approach should be extended to the environment
design, working schedules and dissemination of recommendations for safety.
In addition, preventive countermeasures should be produced on the basis of
current errors, omissions and violations. To this end, tests with users should integrate the human error approach, aiming at collecting useful data for error prevention.
Human-centred design requires therefore a multidisciplinary approach,
involving teams composed of experts from different scientific and
technological fields (engineering, human factors/ergonomics, social sciences
and the particular application domain), as well as potential users that should
participate in iterative usability testing.

2.6 Applications to the field of Transports

The field of Transports is nowadays characterised by a rapid and


intensive technological development due to the emergence of Intelligent
Transport Systems (ITS), changing the uses and the practices in transport.
The formal definition of ITS – and the one that U.S. DOT (Department of Transportation) uses to guide its program developments – is contained in the National ITS Program Plan [4], a strategic document written jointly by ITS
America and U.S. DOT with cooperation and input from a wide cross
section of the public. According to the above-referred document, “ITS
improve transportation safety and mobility and enhance productivity through
the use of advanced communications technologies. ITS encompass a broad
range of wireless and wire line communications-based information and
electronics technologies. When integrated into the transportation system's
infrastructure, and in vehicles themselves, these technologies relieve
congestion, improve safety and enhance American productivity”.
Although the introduction of these systems aims at fitting the mobility
needs of people and goods and ensuring safety, a human-centred design
based on deep research on human factors involved in the performance of the
corresponding tasks is a main requirement. One of the goals of such systems
integrated in vehicles is to present information to the driver so that it is
quickly understood and not distracting [4]. Human factors and ergonomics
research is being performed on this issue aiming at ensuring road safety. One
current effort is to create guidelines for the development and implementation
of ITS with the goal of creating safe, efficient and effective interactions
between the driver and the system.
In the different transport contexts, vehicles are equipped with advanced
driver assistance systems (ADAS) and in-vehicle information and
communication systems (IVIS) that lead to significant changes in the driving
task and could give rise to safety problems as they affect the driver’s ability
to process the amount of information required to make adequate driving
decisions in useful time. In traffic control rooms of the different modes of
transport, operators are dealing with new information and communication
systems that introduce new job demands and, consequently, new training
needs. Moreover, the increasing transport networks and mobility needs lead
to traffic congestions and transport operations over 24 hours a day. Finally,
the human variability and some people’s mobility impairments impose the
study of solutions for ensuring mobility rights to each citizen despite their
limitations to human activity. In addition, the recent demographic
projections regarding the ageing of modern societies introduce new research
topics on the specific needs and characteristics of this group of people.

Although the interfaces of the existing ITS have been developed


complying with human factors and ergonomic design guidelines, human-
system interaction, as well as the joint integration of these systems in the vehicle, requires
more research addressing the driver’s mental workload, attention availability
and situation awareness to safely perform multiple tasks in driving situations
at different levels of complexity.

3. RESEARCH NEEDS

Together with the increasing complexity of road environment and traffic


density, the emergence of ITS introduces changes in common transport and
travelling tasks, such as driving a vehicle. These changes give rise to safety
problems as they affect the driver's ability to process the amount of
information required to make adequate driving decisions in useful time. The
use of in-vehicle information and communication systems (IVIS) brings about
human-machine interactions as additional tasks to driving, which require
deep research on the driver’s cognitive activity, particularly, the driver’s
mental workload and attention availability to safely perform multiple tasks
in driving situations at different levels of complexity. The introduction of
these systems may increase the mental workload imposed by driving as the
corresponding human-system interaction represents an additional task,
requiring drivers to switch their attention between the system and the driving
task. This fact has the potential to disrupt the overall performance,
particularly in people who cannot rapidly switch their focus of attention.
According to Moss and Triggs [8], driving performance varies inversely with switching time. Kahneman, Ben-Ishai and Lotan (cited by the same authors)
found a significant correlation between attention switching time and accident
rate.
The study of the driver’s cognitive activity while interacting with IVIS is
essential for safety improvement as it allows us:

- to understand driver behaviour;
- to create a knowledge basis to guide the systems design and integration;
- to define conditions and priorities for the safe use of each system.

Human variability among drivers, particularly in terms of age and


experience, is another major concern in this context. Moreover, age-related
perceptive and cognitive declines could have negative effects on the driving
task performance, but older drivers develop some compensatory strategies
based on previous experience. However, the above-referred changes in the
driving task, resulting from the interaction with IVIS, could inhibit any
compensatory behaviour due to the lack of previous experience. As the
number of older drivers has increased significantly, any study in this field
should include a sample of drivers aged 60 years and over.
The amount of visual information coming from a complex and dynamic
environment and from in-vehicle visual displays could conflict with the
driver's cognitive resources to process it correctly and safely in useful time.
Auditory inputs from IVIS could be useful but they could as well have a
negative impact on driving. Actually, an excessive flow of vocal information exchanged with the driver increases the driver's mental workload, as it represents an additional task that consumes resources that are
necessary to drive safely. In this case, the interaction with IVIS should be
subjected to a definition of priorities according to traffic conditions and the
relevance of the information to the driving task.
In the field of public transport (PT), ITS are bringing about several
changes regarding travellers and professionals. Public transport systems tend
to be more accessible as most barriers are being removed and useful
information provided in real time allows for increased mobility. In this
context, some research needs concern the systems interfaces and the type
and quality of the information displayed. However, the needs of mobility
impaired people for information regarding the use of PT systems are a major
concern in terms of providing useful information on transport modes, trip
planning, ticketing, accessibility facilities, etc. Transport professionals in
public transport operators also face particular changes regarding the use
of new information and communication technologies in vehicles and in
traffic control rooms. The implementation of these systems creates new
needs for training and new criteria for personnel selection, which should be
clearly identified. However, from a human factors perspective, a major research problem concerns the 24-hour society and the related working
schedules. Actually, contemporary society imposes continuous operations
under time pressure, but the human operator is not prepared to sustain continuous activity throughout the day. The conflict between human capabilities and the 24-hour society leads to driver fatigue and stress, which can be the cause of many accidents.
All the above-referred research needs can be expressed in a set of topics to
be studied in the following particular scenarios: different transport modes
(road, rail, air and maritime), different driving environments and
corresponding vehicles, traffic control rooms in different contexts and public
transport systems. The object of research is the invariant in different
scenarios: human factors determining the subject’s activity, behaviour and
performance. With the general purpose of optimising human-system
interactions, research is carried out to find explanations for critical
behaviours and to study adequate solutions that can be converted into design
guidelines or safety countermeasures.

3.1 Research topics

Ongoing and already carried out research at the Ergonomics Department


of UTL/FMH is being developed in the frame of National and European
funded projects covering the following topics:

- Mental workload, fatigue, stress and drowsiness in different scenarios:
  - light railway system (technical, environmental and organisational working conditions of drivers and traffic controllers);
  - long distance passenger and goods transportation systems (technical, environmental and organisational working conditions of drivers).
- Cognitive resources in performing additional tasks to driving:
  - mental workload and task performance (using a navigation system and hands-free mobile telephone);
  - errors and omissions (causes and effects);
  - human variability (older and novice drivers).
- Road safety:
  - identification of training needs related to the use of ITS (different groups of drivers);
  - human factors aspects in road safety campaigns;
  - evaluation methodology to assess fitness to drive (elderly drivers and drivers with cognitive impairments resulting from stroke).
- Accessibility of public transport systems:
  - classification of mobility impaired user groups and identification of their special needs;
  - inventory of use cases for the development of a travel information system for mobility impaired users.

In the case of R&D projects, a usability evaluation is integrated in the


methodology but it does not constitute a research topic as it just represents a
methodological tool for the system development.

4. RESEARCH PROJECTS

4.1 National research projects

4.1.1 In-Vehicle Information and Communication Systems and


Driver Behaviour: Analysis and Evaluation (2006-2008)

This project is being carried out and is funded by the National


Foundation for Science and Technology (FCT). It is centred on the study of
the impact of multiple visual and auditory inputs from in-vehicle information
systems (IVIS) on the driver behaviour, aiming at:

- assessing the driver's mental workload while interacting with IVIS in different categories of roadway configuration and traffic density;
- identifying the categories of traffic situations that represent a higher risk for driving together with the performance of an additional task;
- defining priorities for the interaction with IVIS, as a function of the relevance of the information for the driving task;
- identifying the changes in the driving task produced by the interaction with IVIS that should be considered for training and licensing procedures.

In addition, the project aims to identify age-related differences in the


driving task performance based on the driver cognitive resources for
information processing and decision-making as a function of the traffic
situation.
Two experiments will be carried out in 2006: one in a driving simulator
equipped with a simulated navigation system and a hands-free mobile
telephone and the other in an instrumented vehicle equipped with an actual
navigation system and a hands-free mobile telephone. Experimental
protocols have been set up to collect the relevant data for mental workload
assessment and driving performance analysis. The expected results could be
expressed in terms of useful recommendations for road safety improvement
that should be provided to the automotive industry and transport authorities,
as well as identified needs for further research. In addition, the results of this
research will be compared with those from similar research projects, on the basis of the expected knowledge sharing within COST Action 352 and the
HUMANIST NoE.

4.1.2 Ergonomic Study of Long Distance Coach and Truck Drivers


in Portugal (2004-2005)

Working conditions of long distance drivers are influenced by time-


related constraints, as well as driving conditions, in terms of intensive traffic
or high monotony. Time pressures imposed by both transport systems lead
these drivers to exceed the allowed driving durations. In addition, driving long
distances, in monotonous conditions and often during the night, represent
working constraints that are recognized to increase drivers’ workload and
fatigue. Desmond & Hancock [3] suggest a model of fatigue based on the
perceptual-motor capability involved in the task performance for prolonged
periods. This model makes a distinction between active and passive fatigue,
the first one resulting from continuous and prolonged task-related
perceptual-motor adjustment. Due to this type of adjustment, active fatigue
is a frequent occurrence in driving, particularly in complex driving
environments with high attentional demands. Passive fatigue, in contrast, is
related to situations where the person appears to do nothing for long periods,
like driving in monotonous situations, particularly during the night.
Therefore, both driving for prolonged periods in monotonous situations and
driving in a complex environment lead to fatigue. In addition, driver stress is
considered a safety problem, the association between stress and accident risk
being well established [6]. According to Mathews & Desmond [7], overload
of attention may be a problem for the unstressed driver, but stress factors,
such as worry and fatigue, tend to impair functional attentional efficiency, so
that the stressed driver becomes especially vulnerable to overload.
Since truck and coach drivers are exposed to both situations, as they drive long distances in highly variable environments, and given the great lack of data regarding these professional groups in Portugal, a study of their
working conditions was carried out. This study aimed at identifying the main
working constraints related to the performance of their multiple tasks. In
addition, adequate strategies for managing fatigue, drowsiness and stress
should be developed and disseminated. For this purpose, questionnaires and
interviews were applied to a sample of drivers from each context (73 truck
drivers and 156 coach drivers). In a second phase, one vehicle from each
context was instrumented with video cameras in order to record the driver
behaviour during the driving task performance. Due to constraints imposed
by companies, only a very small sample participated in this second phase (1
truck driver recorded during a complete working week and 8 coach drivers
recorded during an entire working day). The collected data were analysed
after self-confronting interviews.

4.1.3 Fatigue and Mental Workload of Railway Drivers and Traffic


Controllers (2003-2004)

This study was developed in a light railway network in Portugal. It was


focused on two different working contexts: the tramway drivers and the
operators of the traffic control centre. The purpose was to provide health and
safety guidelines for both working situations, based on an ergonomic
analysis of the exposure to stress and fatigue factors. The first analysis step
consisted of the application of a questionnaire as a guide for interviews
aiming at a subjective assessment of working schedules and identification of
any constraints. After this characterisation, a task analysis was
carried out in a real working situation, based on video recordings. The clear
identification of fatigue symptoms and the establishment of its relation with
job demands provided grounds for health and safety recommendations on
several issues.
Regarding tramway drivers, the core of recommendations addressed the
organisation of working schedules, as well as specific proposals for the
location of particular devices in the cabin and specifications for the
organisation and placement of railway signs. For traffic controllers, the most
critical aspect concerned the high variability of tasks and its implications
for job demands. As for the systems used in traffic regulation, such as
communication devices, recommendations were provided to improve their
adequacy to this high variability of working scenarios. The analysis carried
out on safety and operation procedures led to a change in management's concerns in this regard and to a search for more adequate procedure structures.
Knowing that the major constraints related to both working contexts are
inevitable, the goal was to provide means for their minimization and control,
mostly through a more efficient management of working schedules and
workstations improvement.

4.2 European research projects

4.2.1 Human Centred Design for Information Society Technologies –


HUMANIST Network of Excellence (2004-2008)

Human factors and cognitive engineering expertise exist in Europe but


are scattered. To address this fragmentation of research capacities,
HUMANIST gathers the most relevant European research institutes involved
in Road Safety and Transport to contribute to the eSafety initiative and to
improve road safety by promoting human centred design for ITS (IVIS and
ADAS). This integration will allow us to increase societal benefits of ITS


implementation, to harmonise ITS approaches among Member States, to
react quickly to any new technological developments and to face
international challenges by producing state-of-the-art research,
identifying knowledge gaps and avoiding redundancy of research activities.
The goal of HUMANIST is to create a European Virtual Centre of
Excellence on HUMAN centred design for Information Society
Technologies applied to Road Transport with a coherent joint program of
activities combining research, integrating and spreading activities. Integrating
research activities will allow us to manage and consolidate the NoE structure
by promoting the mobility of researchers, optimising the pool of existing
experimental infrastructures and setting up electronic tools (common
database, web-conference, e-learning) for knowledge sharing. Spreading
Activities will allow the wide dissemination of knowledge from HUMANIST, by
organising debates with RTD projects on eSafety and relevant stakeholders,
promoting harmonisation with standardisation and pre-normative bodies,
setting up training programmes, and by promoting and disseminating
research results to a wide audience.
HUMANIST is sponsored and supported by ECTRI and FERSI
networks.

4.2.2 ASK-IT (2004-2008)

The ASK-IT integrated project aims to develop an Ambient Intelligence


(AmI) space for the integration of functions and services for Mobility
Impaired (MI) people across various environments, enabling the provision of
personalised, self-configurable, intuitive and context-related applications and
services. Within it, MI people related infomobility content is collected,
interfaced and managed within Subproject 1, encompassing transport,
tourism and leisure, personal support services, work, business and education,
social relations and community building related content. Other subprojects
(SP2, SP3, SP4 and SP5) complete the ASK-IT project. The work carried out
within ASK-IT SP1 comprises the definition and modelling of infomobility
needs of MI people.
The main activities developed in this project were the definition of user groups and of use cases. For the first activity, it was assumed that the concept of
MI extends beyond the traditional definitions of disability and ageing. It
refers to any activity limitation that prohibits the free movement of a person.
In order to define different User Groups (UG) of MI people, some important
references were used, such as International Classification of Functioning,
Disability and Health [14], TELSCAN [11], INCLUDE [17] and Senior
Watch Projects [20], as well as age categories proposed by Waller, 1991
[12]. Use Cases (UC) can be defined as a collection of possible scenarios for
the system use [15]. UC are a powerful tool to preview and analyze the functionality of a system and can be effective if they are developed in a systematic and coherent manner [16]. The main concern in extracting UC
for ASK-IT was the development of a methodology to support it. The
Unified Modelling Language™ (UML®) from the Object Management
Group was used to specify, visualize and model UC [19]. The main
outcome from this activity was a Use Cases Model comprising two
complementary parts: UC descriptions and UML UC diagrams. Both
activities developed within SP1 represent important tools for the system
development, as all contents are defined on the basis of the categories
of users and the scenarios for the system use.
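
As a purely illustrative sketch (not reproduced from the ASK-IT deliverables), a textual use-case description of the kind mentioned above can be captured as a simple structured record before being formalised as a UML diagram; every name and field value below is a hypothetical placeholder.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class UseCase:
    """Minimal structure for a textual use-case description (illustrative only)."""
    name: str
    primary_actor: str
    goal: str
    preconditions: List[str] = field(default_factory=list)
    main_scenario: List[str] = field(default_factory=list)

# Hypothetical example loosely inspired by the travel-related content described above.
trip_planning = UseCase(
    name="Plan an accessible trip",
    primary_actor="Wheelchair user",     # one possible MI user group
    goal="Obtain a door-to-door route using accessible transport modes",
    preconditions=["A user profile with mobility restrictions is registered"],
    main_scenario=[
        "User enters origin, destination and departure time",
        "System filters transport services by accessibility facilities",
        "System returns a personalised, step-by-step travel plan",
    ],
)
print(trip_planning.name, "-", trip_planning.primary_actor)
```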

4.2.3 COST Action 352 – The Influence of In-Vehicle Information


Systems on Road Safety (2004-2008)

The main objective of the Action is to enhance road safety through the
proper use of In-Vehicle Information Systems (IVIS). In order to create a
scientific basis for the development of a safety evaluation methodology,
rules for driver education and training, as well as road traffic and vehicle
equipment legislation in the relevant area, this Action aims, more
specifically, at:

- Establishing the effects of increasing amounts of information available to drivers through IVIS;
- Demonstrating how they contribute to driver distraction in road environments where outside information is normally provided.

Different groups of drivers (novice, experienced, elderly and


professional) and the misuse of IVIS are taken into consideration in the
research studies that are being carried out in the consortium. The first
publication issued from this Action will be edited by INRETS in 2006.

4.2.4 Campaigns and Awareness-raising Strategies in Traffic safety


– CAST (2006-2008)

The CAST proposal is in the form of a Specific Targeted Research


Project aimed at meeting the European Commission's needs for enhancing
traffic safety by means of effective road safety campaigns. CAST will
develop an evaluation tool and a design tool for road safety mass media
campaigns. These tools will enable the EC to design and to implement such
campaigns and to evaluate their (isolated) effect on traffic accidents and
other performance indicators. CAST will then validate and exploit these
tools by testing the evaluation tool on an EU-funded campaign (Euchires)
and by using the design tool to design and implement a pan-European
campaign to support the implementation of a measure that will have recently been adopted by the EU at that time.
More precisely, CAST will develop two tools: an evaluation tool aiming
at isolating the effects of road safety campaigns on road crashes and other
outcome variables (various safety performance indicators like safety
awareness or seatbelt wearing rates, prevalence of drink driving, etc.); and a
design tool for road safety campaigns based on existing research and new
results produced in CAST. The evaluation tool will enable a thorough
evaluation of a single campaign, while the development of a tool for the joint evaluation of several campaigns, i.e. meta-analysis, is considered to be
beyond the scope of this proposal. The design tool will be developed as a
manual consisting of clear guidelines to design and implement a road safety
campaign.

5. EXPECTED RESULTS

Although each project has its particular domain and specific objectives and results, it can be said that the expected results from the research carried out
at the Ergonomics Department of UTL/FMH in the field of Transports will
have important positive impacts at the following levels:

- Road safety improvement, resulting from:
  - recommendations for regulations and the definition of priorities for the interaction with IVIS, depending on the complexity of the traffic situation and the driver's limitations for information processing;
  - identification of useful compensatory behaviours of older drivers and their special needs regarding the safe and controlled use of ITS;
  - identification of training needs of different groups of drivers related to the use of ITS.
- Improvement of working conditions of operators in transport systems, by means of providing recommendations for:
  - adequate working schedules of drivers and traffic controllers;
  - stress and fatigue management;
  - drowsiness management and prevention.
- Public transport accessibility, resulting from:
  - providing information to PT operators on the different categories of travellers and their special needs regarding each step of the travel chain, in order to remove existing physical and informational barriers;
  - design and development of user-friendly and inclusive information systems according to the functional limitations of each category of potential users and travel-related requirements.

6. CONCLUDING REMARKS

In this changing technological environment characterising the Transports


sector, the applied research in the domain of Human Factors and Ergonomics
is essential for a safe, comfortable and effective mobility of all citizens and
also for economic reasons. If human factors recommendations are taken into account from an early stage of system design, several problems will be avoided, since late corrections are more expensive and less effective.
Actually, solutions in a late phase of a system development will be limited,
more expensive and take more time. In addition, the involvement of human
factors experts in R&D projects should be continuous as the implementation
of a new system and its regular functioning will require a final evaluation
and, eventually, some adjustments.
This approach should be extended to different modes of transport and
other professionals like passenger attendants. In the context of transports,
two main reasons justify and impose the need for research on human factors:
(1) the technological development and the continuous need for human adaptation to a complex and changing world; (2) the 24-hour society and the
working schedules with their impacts on health and safety, as well as on
productivity.
Finally, a related research topic that has not yet been included, due to
laboratory limitations is situation awareness (SA) in different applications
involving complex tasks in dynamic environments: driving vehicles and
traffic control. The methodological approach in SA will allow us to explain
critical errors and omissions in task performance. However, this
research topic is being introduced in projects that are about to start.
Safety concerns, together with other specific aims like increased mobility and productivity, lead researchers, systems designers and industry to take into account the human structural and functional characteristics, as well as human diversity and variability. Placing humans at the centre of innovation is a major requirement for the success of any new product in the market.

REFERENCES

Publications
1. Boy, G., Ingénierie Cognitive: IHM et Cognition. Hermès Science. Paris, 2003
2. Commission of the European Communities, Report requested by the Stockholm
European Council: Increasing labour force participation and promoting active ageing,
Brussels, 2002
3. Desmond, P. & Hancock, P. – Active and Passive Fatigue States. Hancock, P. &
Desmond, P. (Eds.) Stress, Workload and Fatigue, LEA Publishers, New Jersey, 2001,
pp. 5-33
4. Euler, G. W. and Robertson, H. D. (Eds.), National ITS program plan: Intelligent
Transportation Systems, Washington D.C., ITS America, 1995
5. Keravel, F., La Fiabilité Humaine et Situation de Travail, Masson, Paris, 1997
6. Mathews, G., A Transactional Model of Driver Stress. Hancock, P. & Desmond, P. (Eds.) Stress, Workload and Fatigue, LEA Publishers, New Jersey, 2001, pp. 133-163
7. Mathews, G. & Desmond, P., Stress and driving performance: implications for design and training. Hancock, P. & Desmond, P. (Eds.) Stress, Workload and Fatigue, LEA Publishers, New Jersey, 2001, pp. 211-231
8. Moss, S. A. & Triggs, T. J., Attention Switching Time: A Comparison between Young
and Experienced Drivers. Noy, Y. I. Ergonomics and Safety of Intelligent Driver
Interfaces, LEA Publishers, New Jersey, 1997, pp. 381-392
9. OECD, Ageing and Transport: Mobility Needs and Safety Issues, OECD, Transport,
Paris, 2001
10. Stutts, J., Knipling, R R., Pfefer, R., Neuman, T R., Slack, K L. and Hardy, K.K., A
Guide for Reducing Crashes Involving Drowsy and Distracted Drivers, NCHRP Report
500, Vol. 14, TRB, Washington D.C., 2005
11. TELSCAN Project (1997): “Inventory of ATT System Requirements for Elderly and
Disabled Drivers and Travelers”; Deliverable 3.1, WP 3
12. Waller, P. (1991): The Older Driver. Human Factors, vol. 33(5), pp. 499-505
13. World Health Organization (2002), Active Ageing: A Policy Framework, Geneva
14. World Health Organization, International Classification of Functioning, Disability and
Health: ICF, WHO, 2001

Websites
15. Alistair Cockburn website
http://alistair.cockburn.us/crystal/articles/o/ucai/usecasealternateintro.html
16. Ferg, S. (2003) “What’s wrong with Use Cases” in http://www.ferg.org/papers/ferg--
whats_wrong_with_use_cases.html
17. INCLUDE project website http://www.stakes.fi/include/incc301.html
18. International Ergonomics Association (IEA) website http://www.iea.cc/ergonomics/
19. Object Management Group website http://www.omg.org
20. SeniorWatch project website http://www.seniorwatch.de
DEVELOPMENTS IN BIOMECHANICS OF
HUMAN MOTION FOR HEALTH AND SPORTS

Jorge A.C. Ambrósio1 and João M.C.S. Abrantes2


1 Instituto Superior Técnico, Universidade Técnica de Lisboa, 1049-001 Lisbon, Portugal, e-mail: jorge@dem.ist.utl.pt
2 Faculdade de Motricidade Humana, Universidade Técnica de Lisboa, 1495-688 Cruz Quebrada, Portugal, e-mail: jabrantes@fmh.utl.pt

Abstract: The characterization of the human motion has a fundamental importance in


scientific areas as diverse as medicine, sports, physical therapy, vehicle
dynamics or general engineering applications. One of the most challenging
problems consists in the evaluation of the internal forces in the human body
and in their control, including muscles, ligaments or anatomic joints, without
using intrusive techniques. Experimental procedures based on photogrammetry
and force platforms are used to collect the data required for the numerical
methods, based on inverse dynamics and optimization techniques, to calculate
the internal forces. A discussion of the formulations used in the context of the
state of the art and of the applications presented for sports and health sciences cases helps in appraising the collaborative work reported here.

Key words: Gait analysis, Joint moments of forces, Redundant muscle forces, Clinical
analysis, Multibody dynamics.

1. INTRODUCTION

The study of the human motion activities for health, sports, entertainment
or for the design of supporting equipment requires the use of experimental
and numerical tools that allow identifying the mechanisms of force
transmission and their control inside the human body [1]. Sports sciences
rely on the knowledge of the kinetics of the human body to devise better
training practices and to help athletes to excel [2]. Designers of
entertainment and health supporting equipment require that the structures
and mechanisms of the human body are well understood and their limits


identified so that the equipment can contribute to improving human


performance or that its functional specifications are compatible with safe
limits. In medicine as well, biomechanics supports the development of new treatments and may eventually help in planning surgeries [3].
A good example concerns the support that biomechanical models,
analysed in the context of the human motion, provide to the understanding of
pathological gait associated with cerebral palsy. Typical effects of cerebral palsy are pathological motion patterns due to range-of-motion restrictions caused
by muscle shortening and/or constant muscle contraction and/or joint
acampsia. Surgical interventions, such as tendotomy, myotomy or tendon
transfer, all used to release a dynamical or fixed contraction of muscle tissue,
neurotomy to relax a spastic palsy and osteotomy are chosen if classical
therapies fail or are insufficient. Most of the surgical approaches are
irreversible which creates the need for methods that help surgeons to assess a
priori the consequences of changes in the biological system either leading to
alternative therapy approaches avoiding a surgical intervention or, if
intervention is unavoidable, reducing stress for the patient by an optimized
operation planning [3]. Current diagnosis methods have such a large
tolerance interval (approx. 30%-40%), that computer simulations can
provide a major contribution if dynamics can be predicted with tolerances of
10%-20%. This situation is illustrative of what aims to be achieved with
more flexible, reliable and efficient tools for biomechanical analysis.
The research work in biomechanics is typically multidisciplinary,
requiring that knowledge in applied mechanics, physiology, experimental
mechanics, and medical sciences or sports sciences is used simultaneously.
Research teams at the Mechanical Engineering Institute and at the Faculty of Human Motricity, both of the Technical University of Lisbon, have joined efforts and developed long-standing collaborative work. The expertise of these
groups includes, in a complementary form, experimental procedures,
numerical methods and modeling for biomechanics and applications to
sports and health sciences, exemplified by the research described here [4-8].
Firstly the methods typically used in biomechanics, including the general
formulations for the equations of motion, their numerical solution, the
experimental data acquisition and its treatment are described and
contributions by the research group highlighted. A fundamental part of the
research in biomechanics consists in the construction of biomechanical
models. Advanced models of the human body, including the relevant muscle
apparatus for selected tasks, are also developed and described. Then,
different applications of the methodology developed by the research team to
clinical cases and to sports activities are presented and discussed. Some of
the advances in the state of the art, for which the team is responsible, are also
highlighted and perspectives for future development are presented.

2. METHODS AND PROCEDURES

The analysis carried out in the biomechanics of motion requires that a set
of equilibrium equations for the models developed is available and that
proper solution methods are used. Furthermore, the data required for any
study needs to be collected consistently with the methodologies used.

2.1 Equilibrium equations

The biomechanical model is represented by a multibody system. In this


approach the anatomical segments are the rigid bodies; the muscles, tendons and ligaments are responsible for some of the internal forces; and the anatomical
joints are modeled by kinematic joints. For a multibody system the
kinematic joints are described by a set of algebraic equations in the form [9]

\Phi(q, t) = 0 \qquad (1)

where q is the generalized coordinates vector and t is the time variable. The
anatomical joints are typically modeled as time independent constraints
while the prescribed motion of the joints or the length variation of the
muscles are examples of time dependent constraints.

Figure 1. Joint actuator associated with the knee joint: the prescribed angle θ(t) is measured between the vectors s_i^P and s_j^P fixed to the adjacent bodies i and j.

A joint moment-of-force is a kinematic constraint that constitutes a


simplified representation of the lumped moment caused by all muscles that
cross a particular anatomical joint. Its mechanical model is a joint actuator,
represented for the knee in Figure 1, where the angle between two adjacent
bodies about the axis of the joint is a function of time. The kinematic
constraint associated to the joint moment-of-force is [7]

\Phi^{(joint,1)}(q, t) \equiv (s_i^P)^T s_j^P - s_i^P\, s_j^P \cos\theta(t) = 0 \qquad (2)



Figure 2. Muscle actuators defined with two or more points (e.g. the tensor fasciae latae and the semimembranosus).

where vectors s_i^P and s_j^P are fixed to the adjacent bodies of the joint and θ(t) is a prescribed function of time obtained experimentally.
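
As an illustrative sketch only (not the authors' implementation), the residual of the angle-driver constraint in Equation (2) can be evaluated numerically for given body-fixed vectors and a prescribed joint angle; the vectors and the 30-degree knee angle used below are hypothetical values.

```python
import numpy as np

def joint_angle_constraint(s_i, s_j, theta):
    """Residual of the joint-angle driver, Eq. (2):
    s_i^T s_j - |s_i||s_j| cos(theta) = 0 when the angle between the two
    body-fixed vectors equals the prescribed angle theta."""
    return float(s_i @ s_j - np.linalg.norm(s_i) * np.linalg.norm(s_j) * np.cos(theta))

# Hypothetical vectors fixed to the thigh and shank, with the knee flexed 30 degrees.
theta = np.radians(30.0)
s_thigh = np.array([0.0, -0.4, 0.0])
s_shank = np.array([0.0, -0.4 * np.cos(theta), 0.4 * np.sin(theta)])
print(joint_angle_constraint(s_thigh, s_shank, theta))  # ~0.0: constraint satisfied
```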
Muscles, such as those for the lower extremity muscle apparatus shown in Figure 2, are also introduced in the equations of motion of the multibody system as kinematic constraints described as point-to-point kinematic driver actuators [7]. For a muscle actuator, with origin and insertion located in points n and m of rigid bodies i and j, as depicted by Figure 3, the constraint is

\Phi^{(MA,1)}(q, t) \equiv (r_m - r_n)^T (r_m - r_n) - L_{nm}^2(t) = 0 \qquad (3)

where r_m and r_n are the global position vectors of the origin and insertion points, respectively, and L_nm(t) is the total muscle length.
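
A similar minimal sketch, again with hypothetical coordinates and a hypothetical prescribed length, shows how the residual of the muscle length driver in Equation (3) can be evaluated; a zero residual means the current origin-to-insertion distance matches the prescribed muscle length.

```python
import numpy as np

def muscle_length_constraint(r_m, r_n, L_nm):
    """Residual of the muscle length driver, Eq. (3):
    (r_m - r_n)^T (r_m - r_n) - L_nm^2 = 0 when the origin-insertion
    distance equals the prescribed muscle length L_nm."""
    d = np.asarray(r_m) - np.asarray(r_n)
    return float(d @ d - L_nm**2)

# Hypothetical origin/insertion positions (m) and a prescribed length (m).
r_origin = np.array([0.10, 0.90, 0.05])
r_insertion = np.array([0.12, 0.66, 0.08])
print(muscle_length_constraint(r_insertion, r_origin, 0.2427))  # close to 0 if the length matches
```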

Figure 3. Muscle actuator defined between points n and m of rigid bodies i and j, with global position vectors r_n and r_m, muscle length L(t) and global reference frame OXYZ.

Differentiating Equation (1) with respect to time yields the velocity constraint equation. After a second differentiation with respect to time the acceleration constraint equation is obtained,

\Phi_q \ddot{q} = \gamma \qquad (4)

in which Φ_q is the Jacobian matrix of the constraint equations, q̈ is the acceleration vector and γ is the right-hand side of the acceleration equations. Using the Lagrange multipliers technique, the equations of motion for a multibody system, written together with the second time derivative of the constraint equations (4), are [9]

\begin{bmatrix} M & \Phi_q^T \\ \Phi_q & 0 \end{bmatrix} \begin{Bmatrix} \ddot{q} \\ \lambda \end{Bmatrix} = \begin{Bmatrix} g \\ \gamma \end{Bmatrix} \qquad (5)

where M is the global system mass matrix, containing the mass and moments of inertia of all bodies, and g is the generalized force vector that contains all external forces and moments applied on the system and all internal forces that are not due to kinematic constraints. Ground reaction forces exemplify external forces, while ligament forces and contact between anatomical segments exemplify internal forces. The vector of Lagrange multipliers, λ, is physically related to the joint reaction forces and to the muscle and joint moments-of-force by [9]

g^{(c)} = -\Phi_q^T \lambda \qquad (6)

The moments-of-force are obtained from the definition of the kinematic


constraint through their relation with the Lagrange multipliers expressed by
Equation (6). Also for the muscle actuators, a Lagrange multiplier is
associated to each muscle of the locomotion apparatus. The physical
dimension of this multiplier, used in the context of these actuators, is a force
per unit of length.
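
A minimal numerical sketch (with arbitrary, purely illustrative values, not taken from the chapter) of how the augmented system (5) can be assembled and solved for the accelerations and Lagrange multipliers, and of how the multipliers are then mapped to constraint reaction forces through Equation (6):

```python
import numpy as np

# Hypothetical 2-coordinate system with one constraint, for illustration only.
M = np.diag([2.0, 3.0])             # global mass matrix
Phi_q = np.array([[1.0, -1.0]])     # constraint Jacobian (1 constraint x 2 coordinates)
g = np.array([0.0, -9.81 * 3.0])    # applied generalized forces
gamma = np.array([0.0])             # right-hand side of the acceleration equation (4)

# Assemble and solve the augmented system (5) for accelerations and multipliers.
n, m = M.shape[0], Phi_q.shape[0]
A = np.block([[M, Phi_q.T], [Phi_q, np.zeros((m, m))]])
b = np.concatenate([g, gamma])
x = np.linalg.solve(A, b)
q_ddot, lam = x[:n], x[n:]

# Constraint reaction forces recovered from the multipliers, Eq. (6).
g_c = -Phi_q.T @ lam
print(q_ddot, lam, g_c)
```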

Figure 4. Describing more complex muscle actuators: a multi-point muscle path is split into two-point segments m1, m2 and m3, with the insertion point of each segment coinciding with the origin of the next (I_m1 = O_m2, I_m2 = O_m3).

Muscle actuators defined with more than two points, such as the one
displayed in Figure 4, are described using multiple two-point muscle
actuators, labeled respectively m1, m2 and m3 in Figure 4. The Lagrange
multipliers λ_m1, λ_m2 and λ_m3, calculated by using Equation (5) and appearing
in the reaction force Equation (6), are associated with muscle actuators m1, m2
and m3, respectively. Because a muscle must have a constant force per unit
of length from its origin to its insertion, the Lagrange multipliers associated
with each segment of the muscle must be equal. In the tensor fasciae latae
the Lagrange multipliers must be λ_m1 = λ_m2 = λ_m3 = λ_TFL [10].
To set up the inverse dynamics problem let it be assumed that the
prescribed motion of the model is fully known and consistent with the
kinematic constraints of the biomechanical model. Furthermore, let it be
assumed that the force vector is partitioned into a vector of known forces
g_known and a vector of unknown forces g_unknown. Let the vector of unknown
forces be represented by

g_{unknown} = C^{T}\, f_{unknown}    (7)

where matrix C is used to map the space of the forces into the space of
coordinates that describes the system and, consequently, its structure is
dependent on the particular type of force applied. In Equation (5) the only
unknowns are the Lagrange multipliers λ and the unknown applied forces
f_unknown. Therefore, the first line of Equation (5) is now re-written in the form

\begin{bmatrix} \Phi_{q}^{T} & -C^{T} \end{bmatrix} \begin{Bmatrix} \lambda \\ f_{unknown} \end{Bmatrix} = \left\{ -M\ddot{q} + g_{known} \right\}    (8)

The solution of Equation (8) is obtained for a finite number of time
instants, which depends on the sampling required for the solution and,
eventually, on the sampling of the system kinematics. The solution obtained
for a particular time instant is fully independent from the solution obtained
for any other time instant. Furthermore, this form of the inverse dynamic
analysis requires that any unknown moment is applied about a known axis.
Therefore, the modeling of anatomical joints through spherical joints is not
possible when using this formulation.
The solution of the linear system of Equations (8) is unique if the number
of independent kinematic constraints and unknown forces is equal to the
number of coordinates of the biomechanical system. Otherwise, the solution
is not unique due to the redundant set of forces and/or constraints used and
the solution of the problem has to be obtained by defining suitable criteria
and using optimization methodologies. In what follows it is always assumed
that the muscle actions and joint moments-of-force are represented by
kinematic constraints.
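
A minimal sketch of how Equation (8) can be solved frame by frame is given below, assuming the constraint Jacobian, the mass matrix, the force-mapping matrix, the accelerations and the known forces have already been assembled for each sampled instant; the names are illustrative and the snippet is not the authors' implementation.

import numpy as np

def solve_inverse_dynamics(Phi_q, C, M, qdd, g_known):
    # Phi_q   : (nc, n) constraint Jacobian
    # C       : (nf, n) matrix mapping the unknown forces to the coordinates
    # M       : (n, n) global mass matrix
    # qdd     : (n,) accelerations consistent with the constraints
    # g_known : (n,) known applied forces
    A = np.hstack((Phi_q.T, -C.T))        # [Phi_q^T  -C^T] of Equation (8)
    b = -M @ qdd + g_known                 # right-hand side of Equation (8)
    x = np.linalg.solve(A, b)              # unique only when nc + nf = n (square, full rank)
    nc = Phi_q.shape[0]
    return x[:nc], x[nc:]                  # Lagrange multipliers, unknown forces

# When the system is redundant the linear solve is replaced by the
# optimization procedure discussed in Section 4.3.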

2.2 Data acquisition and processing

Inverse dynamic analysis of a biomechanical system requires the
knowledge of its motion and all external applied forces. The motion of the
system is described by the kinematic information necessary to define the
position and orientation of each anatomical component during the analysis
period. The external applied forces, obtained using force plates, provide all
information necessary for the construction of the system force vector.
The motion of the system consists of the trajectories of a set of anatomical
points located at the joints and extremities of the subject under analysis, as
depicted in Figure 5. In the present work, these curves are obtained through a
digitization process in which the images collected by four video cameras are
used to reconstruct the three-dimensional coordinates of the anatomical
points. This reconstruction process uses the Direct Linear Transformation
(DLT) [11, 12] to convert the two-dimensional coordinates of the video
images into three-dimensional Cartesian coordinates.
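
The reconstruction can be sketched as a linear least-squares problem in which each calibrated camera contributes two equations per anatomical point; the snippet below illustrates the standard 11-parameter DLT back-projection and is not the specific code used in this work.

import numpy as np

def dlt_reconstruct(dlt_params, image_points):
    # dlt_params   : list of 11-element DLT parameter vectors, one per camera
    # image_points : list of (u, v) coordinates of the same point in each camera
    A, b = [], []
    for L, (u, v) in zip(dlt_params, image_points):
        # Standard DLT equations rearranged as A [x y z]^T = b
        A.append([L[0] - u * L[8], L[1] - u * L[9], L[2] - u * L[10]])
        A.append([L[4] - v * L[8], L[5] - v * L[9], L[6] - v * L[10]])
        b.append(u - L[3])
        b.append(v - L[7])
    xyz, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return xyz                              # least-squares 3D position of the point

# With four cameras the system has eight equations for three unknowns, so the
# redundancy also attenuates part of the digitization error.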

Figure 5. Set of anatomical points.

The motion reconstruction process, by nature, introduces high frequency
noise into the trajectory curves of the reconstructed anatomical points. This
noise occurs due to several factors, such as digitization errors and the finite
resolution of the two-dimensional images. In order to make these trajectories
suitable for use in the inverse dynamic analysis, a filtering procedure is
applied with the objective of reducing the noise levels. A second-order,
low-pass Butterworth filter [13, 14], applied with the zero-phase-shift
technique and with properly chosen cut-off frequencies, reduces the noise
levels and smooths the trajectory curves.
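
A possible realization of this step with the standard SciPy routines is sketched below (the zero phase shift is obtained by filtering the signal forward and backward with filtfilt); this is an illustration under assumed parameter names, not the authors' code.

from scipy.signal import butter, filtfilt

def smooth_trajectory(x, fs, fc, order=2):
    # x  : sampled trajectory of one coordinate of an anatomical point
    # fs : sampling frequency [Hz];  fc : cut-off frequency [Hz]
    b, a = butter(order, fc / (fs / 2.0))   # low-pass design, cut-off normalized by Nyquist
    return filtfilt(b, a, x)                # forward-backward pass -> zero phase shift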
In the present work three force plates are used in the measurement of the
ground reaction forces. In Figure 6, the apparatus of the gait lab, with the
three force plates and the four video cameras, is presented. The ground
reaction forces are obtained independently for each foot during the trial. In
order to reduce the noise levels in the external force and center-of-pressure
curves, the second-order low-pass Butterworth filter is applied with cut-off
frequencies ranging from 10 to 20 Hz for the forces and from 3 to 5 Hz for the
center-of-pressure curves. The choice of the correct cut-off frequency is
based on a residual analysis [14].
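
The residual analysis can be sketched as follows: the root-mean-square difference between the raw and the filtered signal is computed for a range of candidate cut-off frequencies and the cut-off is chosen at the knee of the resulting curve. This is a simplified illustration of the procedure described in [14], with hypothetical names.

import numpy as np
from scipy.signal import butter, filtfilt

def residual_curve(x, fs, candidate_fc, order=2):
    # RMS residual between the raw and the zero-phase filtered signal
    # for each candidate cut-off frequency in candidate_fc [Hz].
    residuals = []
    for fc in candidate_fc:
        b, a = butter(order, fc / (fs / 2.0))
        residuals.append(np.sqrt(np.mean((x - filtfilt(b, a, x)) ** 2)))
    return np.array(residuals)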

Figure 6. Overall apparatus of the gait lab (top view: three force plates and four video
cameras).

2.3 Biomechanical model

A biomechanical model suitable for human motion analysis requires that
the relevant anatomical segments are described, that the muscle forces are
represented and that the skeletal-muscle apparatus is included. A model of
the human body is defined using 16 anatomical segments, whose rigid
bodies are presented in Table 1 and illustrated in Figure 7. The model has 44
degrees-of-freedom that correspond to 38 rotations about 26 revolute joints
and 6 universal joints, plus 6 degrees-of-freedom that are associated with the
free body rotations and translations of the base body [7].

Table 1. Physical characteristics of anatomical segments and rigid bodies for the 50th-
percentile human male, with reference to Figure 7.
Description  Body i  Length Li (m)  CM location di (m)  di (m)  Mass mi (kg)  Moments of inertia (Ixx/Iyy/Izz)i (10-2 kg·m2)
Lower Torso 1 0.275 0.064 0.094 14.200 26.220/13.450/26.220
Upper Torso 2 0.294 0.101 0.161 24.950 24.640/37.190/19.210
Head 3 0.128 0.020 0.051 4.241 2.453/2.249/2.034
R Upper Arm 4 0.295 0.153 - 1.992 1.492/1.356/0.248
R Lower Arm 5 0.250 0.123 - 1.402 1.240/0.964/0.298
Hand 13 0.185 0.093 0.045 0.489 0.067/0.146/0.148
L Upper Arm 6 0.295 0.153 - 1.992 1.492/1.356/0.248
L Lower Arm 7 0.376 0.180 - 1.892 1.240/0.964/0.298
Hand 14 0.185 0.093 0.045 0.489 0.067/0.146/0.148
R Upper Leg 8 0.434 0.215 - 9.843 1.435/15.940/9.867
R Lower Leg 9 0.439 0.151 - 3.626 1.086/3.830/3.140
Foot 15 0.069 0.271 0.035 1.182 0.129/0.128/2.569
L Upper Leg 10 0.434 0.215 - 9.843 1.435/15.940/9.867
L Lower Leg 11 0.439 0.151 - 3.626 1.086/3.830/3.140
Foot 16 0.069 0.271 0.035 1.182 0.129/0.128/2.569
Neck 12 0.122 0.061 - 1.061 0.268/0.215/0.215

Figure 7. Biomechanical model with 16 anatomical segments: (a) Topology of the model; (b)
Reference for the length and center of mass of each anatomical segment.

This is a general-purpose biomechanical model that can be applied to any
type of dynamic analysis. Due to its kinematic structure, in which no spherical
joints are used, it can be applied in inverse dynamic analysis. The model
presented is only one of many that can be used for the biomechanical analysis
of different human motion tasks. However, its application to a particular
individual requires that its anatomical segments are properly scaled.

3. CLINICAL APPLICATIONS

The two fundamental requirements for gait are stability, which means the
ability to maintain the equilibrium in an upright position, and locomotion,
which results in an activity that sequences the steps and the stride. Although
these two requirements are fundamental for gait, there are other factors that
contribute to achieving the associated objectives. The musculo-skeletal system
needs a functional morphology of bones and anatomical joints that is in
good shape and acceptable levels of muscular strength. The muscle tone
results from the central nervous system at the sub-cortical level. This tone
needs to be high enough so that the gravity force can be counteracted, but
not so high that lengthening is prevented, so that the displacement of the
extremities, or intersegmental motion, can take place. The muscle nerves
allow for antagonist and agonist actions that lead to skilled movements
according to certain objectives. The regulation of locomotion depends on
some important aspects such as vision, especially
when some sensory inputs are altered. Figure 8 presents a case where the
subject has no problems at the cognitive level but shows some control
problems, both in the synchronism between the antagonist and agonist
actions and in the muscle tone. The muscle spasticity leads to the inability to
control the angular velocity between anatomical segments. In this case, the
crutches supplement the support capabilities during the different phases of
the gait.

Figure 8. Patient with a spastic gait in the setup of the gait laboratory.

The knee angles, presented in Figure 9 for a reference subject (normal) and
for the spastic gait, characterize the control of the knee joint. Notice that the
slope of the angular velocity curve corresponding to the heel-in instant has
opposite signs for the normal and the spastic gait, which is associated with
the lack of control between the agonist and antagonist actions.

Figure 9. Knee angle variation for the spastic gait with relation to the normal gait.

Figure 10. Particular characteristics of the knee angle during a gait cycle (knee angle vs. %
of gait cycle for the normal and spastic gaits: delayed peak flexion, reduced excursion, long
unstable stance phase and the normal and spastic toe-off instants).

When comparing the intersegmental knee angles, defined according to
Figure 10, it is visible that for the pathological gait the support phase is
longer, i.e., it is about 70% of the gait cycle instead of the usual 60%. This
suggests an instability in the support phase that is translated into a quicker
swing phase. The reduction of the angle excursion is typical of the spasticity
of the muscle groups that act upon the knee rotational motion. This reduced
excursion is not only a result of the lack of control of the maximum flexion
but also of the subject's inability to perform an appropriate extension. For
the normal gait a positive variation of the angular velocity leads to the ability
to generate enough intersegmental forces that result in adequate force levels
and damping in the support. The opposite is observed for the spastic subject,
for whom the damping of the support does not take place and the complete
lower extremity acts as a block.
Figure 11. Net moment of force at the ankle joint during a gait cycle (ankle moment in Nm
vs. % of gait cycle for the normal and spastic gaits, with the respective toe-off instants).
Figure 12. Power generated and absorbed at the ankle joint during a gait cycle (ankle power
in Watt vs. % of gait cycle for the normal and spastic gaits).

The angular velocities are directly associated with the net joint moments-of-
force that act in each of the joints of the lower limbs, as depicted in Figure
11, which represents the ankle moment-of-force during the stride. In the first
part of the support phase it is observed that, for the subject with the normal
gait, the anterior muscles of the leg dominate, while in the spastic subject the
posterior leg muscle system dominates during the whole support phase, which
in any case is longer for the spastic gait, without, however, reaching the
usual values during push-off.
Figure 12 shows the association between the ankle moment-of-force and
the angular velocity. In a first phase of both gait cases there is energy
absorption, characterized by a negative power, while in a second phase the
joint power is positive, which means that energy is being produced.
However, the trend in the ankle power must be observed together with the
behavior of the ankle moment-of-force. The energy absorption phase of the
spastic subject is not due to an effective control of the angular velocity and
joint moments-of-force but, instead, due to an inversion of the angular
displacement of the ankle and to a high dominance of the posterior muscles
of the leg. The spastic subject grounds the anterior part of the foot first. The
reference subject controls the ankle angular displacement in such a way that,
while the foot approaches the ground, the heel makes contact first. In the
final phase of the support both the reference and the spastic subjects develop
some power, with the difference that for the subject with normal gait such
power is associated with the ankle extension and the work of the posterior
muscles of the leg, while in the spastic subject the ankle almost does not
extend and the muscles of the leg remain contracted.
The case presented here illustrates in brief that the analysis of the
different parameters measured and calculated by using the methodologies
described provides valuable information for the clinical examination of
patients. This should not be understood as having the potential to replace
clinical exams, but simply as being a very promising procedure to supply
important diagnosis tools that can support medical exams.

4. MUSCLE FORCE SHARING PROBLEM

The studies presented up to this point use a model for the locomotion
apparatus where the muscle actions are lumped about the anatomical joints
through the net moments-of-force. However, more complex models for the
human motion require a realistic description of the skeleton-muscle
apparatus and of its dynamics. The individual muscle model and the muscle
apparatus used in the complete biomechanical model are described here.

4.1 Muscle dynamics

The dynamics of muscle tissue can be divided into activation dynamics
and muscle contraction dynamics [15], as indicated in Figure 13. The
activation dynamics generates a muscle tissue state that transforms the
neural excitation produced by the central nervous system into activation of
the contractile apparatus. The activation dynamics, although not
implemented in this work, describes the time lag between the neural signal
and the corresponding muscle activation [10].

Figure 13. Dynamics of muscle tissue (neural signal → activation dynamics → muscle
activation → contraction dynamics → muscle force).

The muscle contraction dynamics requires that a mathematical model of
the muscle is introduced. In the present work the Hill muscle model is
applied to the simulation of the muscle contraction dynamics. The model,
depicted in Figure 14, is composed of an active Hill contractile element
(CE) and a passive element (PE). Both elements contribute to the total
muscle force F^m(t).

Figure 14. Contraction dynamics using a Hill-type muscle model.



In the Hill muscle model, the contractile properties of the muscle tissue
are controlled by its current length l^m(t), rate of length change l̇^m(t) and
activation a^m(t). The force produced by the active Hill contractile element,
for muscle m, is

F^{m}_{CE}(a^{m}(t), l^{m}(t), \dot{l}^{m}(t)) = \frac{F^{m}_{l}(l^{m}(t))\, F^{m}_{\dot{l}}(\dot{l}^{m}(t))}{F^{m}_{0}}\, a^{m}(t)    (9)

where F^m_0 is the maximum isometric force and F^m_l(l^m(t)) and F^m_{l̇}(l̇^m(t)) are
two functions that represent the muscle force-length and force-velocity
dependency, respectively [10, 15]. These two functions are approximated
analytically by [10]

F^{m}_{l}(l^{m}(t)) = F^{m}_{0} \exp\left\{ -\left[ \frac{9}{4}\left( \frac{l^{m}(t)}{l^{m}_{0}} - \frac{19}{20} \right) \right]^{4} - \frac{1}{4}\left[ \frac{9}{4}\left( \frac{l^{m}(t)}{l^{m}_{0}} - \frac{19}{20} \right) \right]^{2} \right\}    (10)

and

F^{m}_{\dot{l}}(\dot{l}^{m}(t)) = \begin{cases} 0 & \dot{l}^{m}(t) < -\dot{l}^{m}_{0} \\ F^{m}_{0}\left( 1 + \dfrac{\arctan\left( 5\,\dot{l}^{m}(t)/\dot{l}^{m}_{0} \right)}{\arctan(5)} \right) & -\dot{l}^{m}_{0} \le \dot{l}^{m}(t) \le \dot{l}^{m}_{0}/5 \\ \dfrac{\pi F^{m}_{0}}{4\arctan(5)} + F^{m}_{0} & \dot{l}^{m}(t) \ge \dot{l}^{m}_{0}/5 \end{cases}    (11)

where l^m_0 is the muscle resting length and l̇^m_0 is the maximum contractile
velocity above which the muscle cannot produce force. The passive element
is independent of the activation and it only starts to produce force when
stretched beyond its resting length l^m_0. The force produced by the passive
element is approximated by [7]:

F^{m}_{PE}(l^{m}(t)) = \begin{cases} 0 & l^{m}(t) < l^{m}_{0} \\ \dfrac{8 F^{m}_{0}}{(l^{m}_{0})^{3}} \left( l^{m} - l^{m}_{0} \right)^{3} & l^{m}_{0} \le l^{m}(t) \le 1.63\, l^{m}_{0} \\ 2 F^{m}_{0} & l^{m}(t) \ge 1.63\, l^{m}_{0} \end{cases}    (12)

Equation (12) shows that the force produced by the passive element is
only a function of the muscle length, its value being completely determined
during the total time of the analysis. Since the force produced by the passive
element is not an unknown, it is treated here as an external force, which is
directly applied to the rigid bodies interconnected by the muscle. The forces
produced by the contractile element are the only unknown forces. Note that,
if the Lagrange multiplier represents the muscle activation, the associated
muscle force is calculated using Equation (8).
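
Equations (9)-(12) translate directly into code. The sketch below, in Python, evaluates the total force of one Hill-type muscle for a given activation, length and rate of length change; the parameter names are illustrative and the snippet does not reproduce the authors' implementation.

import numpy as np

def hill_muscle_force(a, l, l_dot, F0, l0, l0_dot):
    # a, l, l_dot : activation, current length and rate of length change
    # F0, l0      : maximum isometric force and resting length
    # l0_dot      : maximum contractile velocity
    s = 9.0 / 4.0 * (l / l0 - 19.0 / 20.0)
    F_l = F0 * np.exp(-s ** 4 - 0.25 * s ** 2)                      # Eq. (10)

    if l_dot < -l0_dot:                                             # Eq. (11)
        F_v = 0.0
    elif l_dot <= l0_dot / 5.0:
        F_v = F0 * (1.0 + np.arctan(5.0 * l_dot / l0_dot) / np.arctan(5.0))
    else:
        F_v = np.pi * F0 / (4.0 * np.arctan(5.0)) + F0

    if l < l0:                                                      # Eq. (12)
        F_pe = 0.0
    elif l <= 1.63 * l0:
        F_pe = 8.0 * F0 / l0 ** 3 * (l - l0) ** 3
    else:
        F_pe = 2.0 * F0

    return F_l * F_v / F0 * a + F_pe                                # Eq. (9) plus passive term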

4.2 A model for the locomotion apparatus

A muscle locomotion apparatus with thirty-five muscle actuators is used
to simulate the right lower extremity intermuscular coordination. The muscle
apparatus and a brief description of each muscle action [16] are presented in
Table 2. The physiological information regarding the muscle definition is
obtained from the literature [17, 18] and compiled in a muscle database. The
whole muscle apparatus is presented in Figure 15.

Figure 15. Lower extremity muscle apparatus.

Table 2. List and description of the lower extremity muscle apparatus [7].
Nr Muscle Name Muscle Action
1 Adductor Brevis Adducts, flexes and helps to laterally rotate the thigh.
2 Adductor Longus Adducts and flexes the thigh; helps to laterally rotate the hip.
3 Adductor Magnus Thigh adductor; superior horizontal fibers also help to flex the
thigh, while vertical fibers help extend the thigh.
4 Biceps Femoris Flexes the knee, and rotates the tibia laterally; long head extends
(long head) the hip joint.
5 Biceps Femoris Flexes the knee, and rotates the tibia laterally; long head extends
(short head) the hip joint.
6 Extensor Digitorum Extend toes 2 – 5 and dorsiflexes ankle.
Longus
7 Extensor Hallucis Extends great toe and dorsiflexes ankle.
Longus
8 Flexor Digitorum Flexes toes 2 – 5; also helps in plantar flexion of ankle.
Longus
9 Flexor Hallucis Flexes great toe, helps to supinate ankle; weak plantar flexor of
Longus ankle.
10 Gastrocnemius Plantar flexor of ankle.
(lateral head)
11 Gastrocnemius Plantar flexor of ankle.
(med. head)
12 Gemellus Rotates the thigh laterally and helps to abduct the flexed thigh.
(inf. and superior)
13 Gluteus Maximus Major extensor of hip joint; rotates laterally the hip; superior
fibers abduct the hip; inferior fibers tighten the iliotibial band.
14 Gluteus Medius Abductor of thigh; anterior fibers help to rotate hip medially;
posterior fibers help to rotate hip laterally
15 Gluteus Minimus Abducts and medially rotates the hip joint.
16 Gracilis Flexes the knee, adducts the thigh, helps to medially rotate the
tibia on femur.
17 Iliacus Flexes the torso and thigh with respect to each other.
18 Pectineus Adducts the thigh and flexes the hip joint.
19 Peroneus Brevis Everts foot and plantar flexes ankle.
20 Peroneus Longus Everts foot and plantar flexes ankle; helps to support the
transverse arch of the foot.
21 Peroneus Tertius Dorsiflexes, everts and abducts foot.
22 Piriformis Lateral rotator of the hip joint; helps abduct the hip if it is flexed.
23 Psoas Flex the torso and thigh with respect to each other.
24 Quadratus Femoris Rotates the hip laterally; also helps adduct the hip.
25 Rectus Femoris Extends the knee.
26 Sartorius Flexes and laterally rotates the hip joint and flexes the knee.
27 Semimembranosus Extends the thigh, flexes the knee, and also rotates the tibia
medially, especially when the knee is flexed.
28 Semitendinosus Extends the thigh and flexes the knee, and also rotates the tibia
medially, especially when the knee is flexed.
29 Soleus Powerful plantar flexor of ankle.
30 Tensor Fasciae Lata Helps stabilize and steady the hip and knee joints by putting
tension on the iliotibial band of fascia.
31 Tibialis Anterior Dorsiflexor of ankle and invertor of foot.
32 Tibialis Posterior Principal invertor of foot; adducts foot, plantar flexes ankle,
helps to supinate foot.
33 Vastus Intermedius Extends the knee.
34 Vastus Lateralis Extends the knee.
35 Vastus Medialis Extends the knee.

A set of data for the muscles described in Table 2, including the
coordinates of their insertion points, the physiological cross section area and
the coordinates for the via points of curved muscles, is found in [18].

4.3 Solution of the redundant problem by optimization

The human muscle system is highly redundant and, consequently, there is
an infinite set of possible solutions for the muscle forces that lead to the same
motion. The aim of the optimization techniques is to find, from all the possible
solutions, the one that minimizes a prescribed objective function, subject to
a certain number of restrictions or constraints, as:

\begin{aligned} \text{minimize } & F_{0}(b) \\ \text{subject to: } & f_{j}(b) = 0, \quad j = 1, \ldots, nec \\ & f_{j}(b) \ge 0, \quad j = nec+1, \ldots, ntc \\ & b^{lower}_{i} \le b_{i} \le b^{upper}_{i}, \quad i = 1, \ldots, nsv \end{aligned}    (13)

where the vector of the unknown parameters, or design variables, is b, with
the components b_i bounded respectively by b_i^lower and b_i^upper, F_0(b) is the
objective or cost function to minimize and f_j(b) are constraint equations
that restrain the state variables. In Equation (13), nsv represents the total
number of design variables and ntc the total number of constraint equations,
of which nec are of the equality type.
A cost function must reflect the inherent physical activity or pathology
and include relevant physiological characteristics and functional
properties, such as the maximum isometric force or the electromyographic
activity [28]. Some of the most commonly used cost functions are:

F_{0}(b) = \sum_{m=1}^{nma} \left( F^{m}_{CE} \right)^{2}    (14)

designated the sum of the squares of the individual muscle forces; when
applied to the study of human locomotion, this cost function is considered to
fulfill the objective of energy minimization. This cost function does not
include any physiological or functional capabilities [19];

F_{0}(b) = \sum_{m=1}^{nma} \left( \sigma^{m}_{CE} \right)^{3}    (15)

known as the sum of the cubes of the average individual muscle stresses. This
cost function, introduced by Crowninshield and Brand [20], is based on a
quantitative force-endurance relationship and on experimental results. It
includes physiological information, namely the value of the physiological
cross sectional area of each muscle, and it is reported to predict co-activation
of muscle groups in a more physiologically realistic manner [19]. The
interested reader may find a list of other suitable objective functions and
their description in references [18, 19].
Different optimization packages can be used in the solution of the
optimization of the redundant muscle forces, such as DOT 5.0 [21] or the NAG
library [22]. A successive quadratic programming algorithm is used, the
optimization problem being subject to linear and/or nonlinear constraints.
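
A minimal sketch of this static optimization, using the sequential (successive) quadratic programming solver available in SciPy and the cost function of Equation (14), is given below; the moment-arm matrix and the net joint moments are assumed to come from the inverse dynamic analysis, and all names are illustrative.

import numpy as np
from scipy.optimize import minimize

def share_muscle_forces(A, tau, F_max):
    # A     : (nj, nm) matrix relating muscle forces to joint moments (tau = A @ F)
    # tau   : (nj,) net joint moments obtained from the inverse dynamic analysis
    # F_max : (nm,) upper bounds on the individual muscle forces
    nm = F_max.size
    cost = lambda F: np.sum(F ** 2)                        # Equation (14)
    moment_balance = {'type': 'eq', 'fun': lambda F: A @ F - tau}
    bounds = [(0.0, fm) for fm in F_max]                   # muscles can only pull
    result = minimize(cost, np.zeros(nm), method='SLSQP',
                      bounds=bounds, constraints=moment_balance)
    return result.x

# Using the cost function of Equation (15) only changes the objective, e.g.
# cost = lambda F: np.sum((F / pcsa) ** 3) for a given vector of physiological
# cross-sectional areas pcsa.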

5. APPLICATION TO SPORTS ACTIVITIES

A 23-year-old male, with a height of 168 cm and a body mass of 68 kg,
carried out several jumps, as shown in Figure 16, from which one was
selected as representative of the task. In each trial, the body motion and the
force plate data were recorded and their synchronization ensured. The
coordinates of all anatomical points were manually digitized. To avoid
discrepancies in the estimation of the muscle forces, a technique proposed by
Silva and Ambrósio was applied to make the kinematic data consistent [7].

Figure 16. Views from the four cameras at different instants of the analysis and the
position of the 23 anatomical points.

The time characteristics of the measured ground reaction forces are
shown in Figure 17. They are, as expected, quite different from the
characteristic data obtained in a gait analysis [8]. The maximal vertical
reaction is almost four times larger than its equivalent in gait. The peak
value of the medial-lateral component is seven times higher than its analogue
in gait. The peak value of the anterior-posterior component is even larger
than the one occurring during jumps performed on a trampoline [23].

Figure 17. Components of the ground reaction force: (a) vertical; (b) anterior-posterior and
medial-lateral.

The results obtained by the inverse dynamic analysis for the net
moments-of-force of the right leg during the jump, depicted in Figure 18,
clearly show a considerable loading of the ankle joint of the supporting leg.
For this reason, a strong activity of the plantar-flexor muscles spanning this
joint is expected. There is also a very large peak of the net torque at the hip
joint occurring at the beginning of the contact phase of the foot with the
ground, which happens at t = 0.08 s. It may be anticipated that oscillations of
the time-force characteristics of some muscles spanning the knee and hip
joints are observed as a consequence of this fact.

Figure 18. Normalized net torques at the sagittal plane of the major joints of the lower
extremity (ankle, knee and hip, in Nm/kg).

A static optimization approach is used for the solution of the muscle
force distribution in the supporting leg of the jumper. The time
characteristics of the forces of selected muscles are presented in Figure 19.
The vertical lines on each picture indicate the beginning and the end of the
take-off phase, happening at the instants 0.08 s and 0.24 s, respectively. At
the start of the analysis the jumper is in an airborne trajectory. The trunk is
in a forward-lean position; the hamstrings, the short head of the biceps femoris
(with a small time delay) and the gluteus maximus are working hard to raise
the trunk to an upright position, which is achieved when the foot contacts the
ground. At this moment very strong contractions of glutei and vasti muscles
occur to neutralize the impact forces generated during ground contact. The
gluteus maximus works as a hip extensor and the vasti group extends the
knee joint. It is worth pointing out that the rectus femoris, as a muscle that
spans two joints, is not involved in this action.

Figure 19. Muscle force distribution in the supporting leg of the jumper (force-time curves
for the iliopsoas, hamstrings, gluteus maximus, biceps femoris short head, gluteus medius,
rectus femoris, gluteus minimus, vasti, adductor magnus and triceps surae).

There is also a noticeable activity of the hamstrings that, as antagonists of the
knee extensors, stabilize the movement during the application of external
loads. Simultaneously, the calf muscles, i.e., the gastrocnemius and the
soleus, generate a powerful ankle plantarflexion in order to push the body
forward. The common action of all the muscles mentioned above also
stretches the lower leg out, which is very useful when carrying the maximal
ground reaction force in the middle of the support phase. A remarkable
excitation of the hip adductors at this time stabilizes the movement of the
jumper and helps to bring the legs together in the air later. The final activity
of the glutei, biceps femoris short head, iliopsoas and hamstrings causes
knee bending and thigh raising in order to achieve an appropriate airborne
position.

Figure 20. Activation level, force-length and force-velocity dependencies for the soleus.

During the take-off phase there are considerable changes in the activation
levels of the muscles spanning the knee joint, particularly in the case of the
weaker knee flexors. The soleus, the most powerful plantarflexor of the
ankle, having relatively short fibers with a length of about 3 cm, is extremely
sensitive to the kinematic data used to estimate its instantaneous length and
velocity. It can be argued that the soleus, although fully activated, as shown
in Figure 20, cannot produce a sufficient amount of force. As a result, the two
heads of the gastrocnemius work at the edge of their eccentric possibilities,
as observed in Figure 19. The intensified activity of the vasti muscles, also
verified, leads to bigger oscillations as they rotate the shank in the same
direction as that governed by the soleus.
The application to the jump shows how the force data obtained by the
analysis of the task relate to the exercise. Both muscle forces and
activations are fundamental quantities that allow the analyst to devise
training strategies to strengthen particular muscle groups, if that is the case,
or to find techniques that make better use of the athlete's potential. The same
type of data can also provide information on the effectiveness of physical
rehabilitation therapies when collected periodically during the evolution of a
particular treatment. A final remark concerns the difficulties encountered
with the manual digitization of the kinematic data. Certainly, the type of
analysis illustrated in this application case can only be used for clinical
applications if an easy-to-follow automatic procedure is devised first.

6. CONCLUSIONS

The aspects relevant to the biomechanical analysis of different types of
human motricity tasks have been described, based on their methodological
aspects, and applied to cases of normal and pathological gait. The work,
which reflects the collaboration between two groups at the Technical
University of Lisbon, emphasized the procedures required to acquire and
treat the experimental data, the methods developed to perform the analysis
of determinate and redundant biomechanical systems, the construction of
appropriate biomechanical models and their application to different aspects
of the life sciences. Although the techniques developed and applied
constituted, at the time of their publication, the state of the art, there are
challenges that are worth tackling today. The treatment of the internal forces
of the human body as an optimization problem, with better physiological
fidelity, is certainly one of them. The improved biofidelity of the
biomechanical models through more correct anatomical joints, instead of the
current mechanical joints, is another issue that needs to be handled. A better
description of contacts will play a role in this issue. The correct integration
in the numerical models of data collected experimentally is another topic of
major interest, improving the correlation between models and human
subjects.

ACKNOWLEDGEMENTS

The work described is the result of a long collaboration between the
research teams headed by the authors, supported by the Portuguese
Foundation for Science and Technology (FCT). The authors gratefully
acknowledge the support of FCT through projects PBIC/C/CEG/2348/95 and
POCTI/P/EME/14040/2001, the collaboration of the Centro de Medicina de
Reabilitação do Alcoitão and the contributions of M. Silva, M. Kaplan, I.
Barbosa, A. Czaplicki, J. Jacinto, M. Motez, O. Jesus, coworkers in the
research described here.

REFERENCES
1. Nigg B., Herzog W. Biomechanics of the Musculo-Skeletal System, Toronto, Canada,
John Wiley and Sons, 1994.
2. Abrantes J. Proceedings of the ISBS’96 XIV International Symposium on Biomechanics
in Sports, Funchal, Portugal, June 25-29, 1996.
3. Strobach D, Kecskeméthy A, Steinwender G, Zwick B “A Simplified Approach for
Rough Identification of Muscle Activation Profiles Via Optimization and Smooth Profile
Patches”, Proc of ECCOMAS Thematic Conference Multibody Dynamics, (Goicolea J,
Cuadrado J, Garcia Orden J, eds.), Madrid, Spain, June 21-24, 2005, pp. 1-17.

4. Silva M., Ambrósio J., “Kinematic Data Consistency in the Inverse Dynamic Analysis of
Biomechanical Systems”, Multibody System Dynamics, 8, pp. 219-239, 2002
5. Ambrósio J., Lopes G., Costa J., Abrantes J., “Spatial Reconstruction of the Human
Motion Based on Images from a Single Stationary Camera”, Journal of Biomechanics,
34, pp. 1217-1221, 2001
6. Silva M., Ambrósio J., “Sensitivity of the Results Produced by the Inverse Dynamic Analysis
of a Human Stride to Perturbed Input Data”, Gait and Posture, 19, pp. 35-49, 2004
7. Ambrósio J., Silva M. “A Biomechanical Multibody Model with a Detailed Locomotion
Muscle Apparatus”, Advances in Computational Multibody Systems (J. Ambrósio, ed.),
Dordrecht,The Netherlands, Springer, pp. 155-184, 2005
8. Czaplicki A., Silva M., Ambrósio J., Jesus O., Abrantes J., “Estimation of the Muscle
Force Distribution in Ballistic Motion Based on Multibody Methodology”, Computer
Methods in Biomechanics and Biomechanical Engineering, (Accepted 2006).
9. Nikravesh P, Computer-Aided Analysis of Mechanical Systems, Englewood-Cliffs, New
Jersey, Prentice-Hall, 1988.
10. Silva M., Human Motion Analysis Using Multibody Dynamics and Optimization Tools.
Ph.D. Dissertation, Instituto Superior Técnico, Technical University of Lisbon, Lisbon,
Portugal, 2003.
11. Abdel-Aziz, Y., Karara, H., "Direct Linear Transformation from Comparator
Coordinates into Object Space Coordinates in Close-Range Photogrammetry”,
Proceedings of the Symposium on Close-range Photogrammetry, Falls Church, Virginia,
1971, pp. 1-18.
12. Giakas, G., Baltzopoulos, V., “A Comparison of Automatic Filtering Techniques
Applied to Biomechanical Walking Data”, J. Biomechanics, 30, pp. 847-850, 1997.
13. Winter, D., Biomechanics and Motor Control of Human Movement, 2nd Ed., Toronto,
Canada, John Wiley & Sons, 1990.
14. Silva, M., Ambrósio, J., “Consequências da Filtragem e Consistência Cinemática dos
Pontos Anatómicos na Dinâmica Inversa do Corpo Humano” (Consequences of Filtering
and of the Kinematic Consistency of the Anatomic Points in the Inverse Dynamics of the
Human Body), Proceedings of the VI National Congress in Applied and Computational
Mechanics, 17-19 April, Aveiro, Portugal, 2000, pp. 167-180.
15. Zajac F., “Muscle and tendon: properties, models, scaling, and application to biome-
chanics and motor control”, Critical Reviews in Biomedical Engineering, 17, pp. 359-
411, 1989.
16. Richardson M., Lower Extremity Muscle Atlas, http://www.rad.washington.edu/atlas2/,
University of Washington - Department of Radiology, Washington, 2001.
17. Yamaguchi G., Dynamic Modeling of Musculoskeletal Motion, Boston, Massachussetts,
Kluwer Academic Publishers, 2001.
18. Carhart M., Biomechanical Analysis of Compensatory Stepping: Implications for
Paraplegics Standing Via FNS, Ph.D. Dissertation, Department of Bioengineering,
Arizona State University, Tempe, Arizona, 2000.
19. Tsirakos D., Baltzopoulos V., Bartlett R., “Inverse Optimization: Functional and
Physiological Considerations Related to the Force-Sharing Problem”, Critical Reviews in
Biomedical Engineering, 25, pp. 371-407, 1997.
20. Crowninshield R, Brand R., “Physiologically Based Criterion of Muscle Force
Prediction in Locomotion”, Journal of Biomechanics, 14, pp. 793-801, 1981.
21. Vanderplaats R&D, DOT – Design Optimization Tools – USERS MANUAL – Version
5.0, Colorado Springs, Colorado, 1999.
22. V. Numerics, IMSL FORTRAN Numerical Libraries – Version 5.0, Microsoft Corp., 1995.
23. Blajer W., Czaplicki A., “Modeling and Inverse Simulation of Somersaults on the
Trampoline”, Journal of Biomechanics, 34, pp. 1619-1629, 2001.
PART IX

URBANISM, TRANSPORTS,
ARCHITECTURE AND DESIGN
URBANISATION TRENDS AND URBAN
PLANNING IN THE LISBON METROPOLITAN
AREA

João Cabral1, Sofia Morgado2, José Luís Crespo3 and Carine Coelho4
Faculdade de Arquitectura, Universidade Técnica de Lisboa, 1349-055 Lisboa, Portugal

1jcabral@fa.utl.pt, 2smorgado@fa.utl.pt, 3jcrespo@fa.utl.pt, 4carinecoelho@gmail.com

Abstract: This article analyses the relationship between urbanisation processes and the
functions and role of planning, using the case study of the Lisbon
Metropolitan Area (LMA). It is based on two research projects on the LMA,
one already finished and another one still in progress, developed by research
teams in four schools of the UTL. The information comes from two different
levels of analysis: i) the study of urbanisation processes through different
decades for the identification of tendencies and for an understanding of the
development of the urban metropolitan condition, and ii) the study of different
planning proposals and regulations for assessing the efficacy of the urban and
governance regulatory system. The article is organised in four parts. The first
part summarises and defines concepts for analysing urbanisation processes and
trends. The second part refers to the results and the conclusions arising from
the study of land use dynamics and changes taking place in the LMA through
different periods. In the third part these results are evaluated in face of
municipal planning proposals and land use regulations. The last part tries to
draw conclusions based on convergences and divergences between
urbanisation dynamics and conditions for planning control and planning
practice, particularly in terms of the role of public investment policies.

Key words: Metropolitan area, Lisbon, urbanisation trends, urban planning, infrastructural
networks, urban density, centrality.


1. INTRODUCTION

The relationship between urbanisation processes and the functions and
role of planning has always been a classical object of research amongst
urban planners and planning professionals. The most developed approaches
and published material revolve around the question of the capacity for the
effective intervention of public policies and of the role of planning in the
formation of cities. This capacity is also associated with the model of
rational planning developed and consolidated by the modern movement and
in the period of post-war reconstruction under the welfare state.
The growing intensity and size of the urbanised world, combined with
diversified and fragmented processes of urbanisation and a relative
inefficacy of the planning bodies and instruments, has raised the question of
the viability and justification of traditional systems of regulation. Thus, what
is at stake for an understanding of the relationship between urbanisation and
planning are forms of reading, analysing and evaluating urban dynamics,
namely in terms of the availability and allocation of resources and of the
involvement of users and consumers.
Recent studies and authors have, however, identified concepts and sets of
criteria for evaluating and proposing alternative urban development models
which can also help to structure adequate urban management policies and
instruments [14, 18]. The concepts of centrality, density and urban ecology
can constitute some of those indicators.
This article is based on material resulting from two recent research
projects on the Lisbon Metropolitan Area (LMA), one already finished1 and
another one still in progress2. Both studies provide relevant contributions to
the questions raised above.
The study of urbanisation processes through different decades allows, on
the one hand, the identification of tendencies which are critical for an
understanding of the development of the urban condition and the
sustainability of the territories. On the other hand, the analysis of different
planning proposals, regulations and their impacts allows for an evaluation of
the efficacy of the urban and governance regulatory system.
In the LMA it is possible to verify that urbanisation processes have been
determined by the characteristics of the Estuaries, particularly the Tagus,
shaping the spatial and functional structure of the metropolitan area. The
mapping and surveys of spatial changes throughout different periods
provide the critical information. It is possible to identify a clear evolution
from a structure based on road and railway axes converging into the city of
Lisbon and on the infrastructural ring along the Tagus, to a network system
emerging after the 1990s resulting from the construction of the highway and
freeway road system.

The improvements in mobility levels and the development of the
productive sectors allowed for a dissemination of central uses and functions,
simultaneously more dispersed and accessible. As a result, the location of
central functions in newly developed areas in outer rings with high levels of
connectivity originated a constellation of centralities combined with
dispersed, low-density forms of urban growth.
From a different perspective, the comparison between the analysis of land
use changes in the LMA and the proposals and expectations expressed in the
planning instruments, namely the zoning regulations of the 18 municipal
master plans, shows some contradictions. These contradictions refer to
differences between planning objectives and proposals and the land use
changes resulting from a straight application of the normative regulations
predominantly based on zoning criteria. The use of other indicators, namely
urban density, levels of infrastructure and formation of centralities, suggests
that alternative forms of public policy evaluation must be used, as national
and corporative public investment has been determinant in shaping urban
dynamics regardless of development control by the municipal planning
systems.
This article is organised in four parts. The first part summarises and
defines concepts for analysing urbanisation processes and trends. The
second part refers to the results and the conclusions arising from the study of
land use dynamics and changes taking place in the LMA through different
periods. In the third part these results are evaluated in face of municipal
planning proposals and land use regulations. The last part tries to draw
conclusions based on convergences and divergences between urbanisation
dynamics and conditions for planning control and planning practice,
particularly in terms of the role of public investment policies in urban
formation and the use of indicators and relevant information for the
formulation of adequate urban development models.

2. ANALYTICAL CONCEPTS

The development of industrialisation, followed by the growth of a service
and an information economy, shaped the complex and dynamic evolution of
the LMA. As part of this evolution, the urban and rural boundaries of the
region were negotiated and geographic imperatives determined the
formation of territorial and polycentric settlements.
When the urban phenomenon was contained, one could talk of two
different systems, the city or the urban and the natural or the rural. The city
was the centrality, the principal locus of density – urban, multifunctional,
demographic, where infrastructural axes converged. The city today is an

overlapping of different layers and the product of different development
approaches and policies, simultaneously urban and non-urban. The city re-
invents itself under hybrid mixed uses and meanings and extends into the
natural and rural spaces which were its life support and geographical
setting.
Metropolis is now the name of an identity we try to interpret – a
combination of urban and non-urban systemic outcomes. The resulting
dynamics demand, therefore, a constant evaluation for interpreting changes,
use of concepts, new techniques and analytical instruments and tools for
planning intervention.
The work produced by different authors [17] and the study of the
evolution of diverse metropolitan areas and urban regions, namely the LMA,
can be combined to define a common approach and to identify three
different components in the formation of metropolitan territories:

• the local geographical character and conditions;
• the urbanisation process, through its urban infrastructures and the
shape and uses produced by human settlements;
• the planning and policy making process, through the normative and
regulatory framework.

These three components, although influential in the creation of a
polycentric pattern, are not enough to reflect the richness and complexity of
today's cities. Other concepts, which provide further analytical tools for
understanding the urban-ness of places, can be advanced [18]:

• Territory [4], land, water and landscape establishing infrastructural
and central place conditions;
• Infrastructural axes and centralities [13] promoting conditions of
urban intensity and synergies;
• Densities [11] determining urban and building concentration,
diversity of land uses, styles of life, types of network and
infrastructures.

When reading urbanisation trends in the LMA all these aspects must be
considered. The study of the urban in the metropolitan context has to be
considered in relation to non-occupied spaces [12]. On the other hand,
changes have to be understood looking into series developed in previous
studies over long periods of time [7].

3. URBANISATION TRENDS IN THE LISBON METROPOLITAN AREA 1965-2001

One of the aims of this study is to put into perspective the process of
urbanisation of the Lisbon Metropolitan Area in order to understand ongoing
trends. As a result, the methodology followed is a consequence of previous
research, based on time scales and systematic cartography3.
Starting with previously produced cartography (1965, 1992, and 2001) it
is possible to identify several types of urbanisation processes reflecting the
urban morphology, the development of infrastructure systems and the
dominant land uses.
From the late 19th century to the mid-20th century, the spatial
configuration created by the infrastructures – railways, national roads,
docklands and harbours – induced significant dynamics in the still incipient
metropolitan area of Lisbon. The opportunities created by the development
of a system of infrastructures promoted an intensive use of the territory and
originated a pre-metropolitan configuration. As a result, a productivist land
use model and a new urban conceptualization, with metropolitan value,
started to be developed.
The cartography of 1965 shows a radial structure converging into Lisbon
and the Tagus Estuary, reinforcing its current status as metropolitan centre,
national centre and European centrality. Regarding land use, large
specialised areas emerged, namely industrial ones, dependent on a direct link
to Lisbon and, especially, on the Port of Lisbon infrastructural ring around
the Tagus Estuary.
In the 1990s, a consistent metropolitan structure gains shape, combined
with the renewal of land use opportunities. Democracy in 1975 and the entry
of Portugal into the European Union in 1986 opened new political and
economic perspectives. The tertiarisation and the changes in the productive
system promoted the obsolescence of industrial and dock areas in central
places and the emergence of new forms of centrality, correlated with a
knowledge-based economy and integrated in a highway network.
In the Lisbon Metropolitan Area – a metropolitan association of
municipalities institutionalised in 1991 – a polynucleated system of
alternative urban centres begins to develop, supported and integrated by an
increasingly dense transport and road network system.
Thus, the radial structure developed into a network system of motorways,
creating opportunities for new growth areas along the main axes established
by the train in previous periods. As a result, functional complementarities
start to develop between urban centres and the metropolitan centre – Lisbon
and the Tagus Estuary.

Figure 1. Lisbon Metropolitan Area 1965.

Source: Dinâmicas de Uso e Ocupação do Solo da Área Metropolitana de Lisboa 1940-2001,


Pedro George, Sofia Morgado, CCDRLVT-FA.UTL, Lisboa 2004.

As a consequence, new land uses and functions, previously located in the
traditional centres, emerged in high-connectivity areas and external rings,
together with new dense and specialised uses, such as shopping centres and
malls. Under high mobility patterns, dimmer urban tissues gained central
functions, competing with Lisbon and shaping a polycentric metropolitan
network.
As the 20th century ends, European metropolises reach an advanced
development stage, both in terms of achieving high levels of spatial and
functional articulation between centres and of promoting innovative uses in
their territories.
Thus, in 2001-06, the Lisbon metropolis gains and consolidates its shape,
influence and role in the conurbation of Lisbon – Galiza, along the Atlantic
coast. This new reality points towards even more complex transformations,
including several concatenated metropolitan formations.
This increasing and extensive urban growth, together with the integration
in higher network systems, determines a drastic decrease of continuous
unoccupied space. At the same time, a landscape homogenisation gains
shape, contributing to undifferentiated territories with poor urban references.

Figure 2. Lisbon Metropolitan Area 1992.

Source: Dinâmicas de Uso e Ocupação do Solo da Área Metropolitana de Lisboa 1940-2001,


Pedro George, Sofia Morgado, CCDRLVT-FA.UTL, Lisboa 2004.

Figure 3. Lisbon Metropolitan Area 2001.

Source: Dinâmicas de Uso e Ocupação do Solo da Área Metropolitana de Lisboa 1940-


2001, Pedro George, Sofia Morgado, CCDRLVT-FA.UTL, Lisboa 2004.

4. URBANISATION TRENDS AND MUNICIPAL PLANNING

4.1 Zoning regulations and urbanisation trends

The comparison of the urban land uses defined in the different Municipal
Master Plans produced in the mid-1990s with the 2001 maps of urban
occupation provides information on the relevance of zoning ordinances in
determining urbanisation trends (Figure 4).
The urban occupation patterns observed in the urban and/or to-be-
urbanised land use areas show, however, variations between municipalities.
Four tendencies could be identified: conformity, consolidation, full
occupation and pour over. The conformity between the zoning defined in the
Municipal Master Plans (Plano Director Municipal - PDM) and the urban
areas; the consolidation of the urban space defined in the proposed spatial
categories of the different Master Plans; a progressive occupation of the
urban vacant land; and the pour over of the urban space beyond what was
planned, occupying areas classified for other land uses.

Figure 4. Urban occupation and urban land uses in the Lisbon Metropolitan Area PDM-
Planos Directores Municipais anos 90 – Classes de Espaço.

Source: Projecto Totta/UTL/01 – Dinâmicas de Localização, Transformação do Território e


Novas Centralidades na Área Metropolitana de Lisboa: que papel para as políticas públicas?

The observed patterns of urbanisation are distinct and contradictory:
intensive, extensive, fragmented and diffuse. An intensive pattern in the
capital city and surrounding areas associated with the first roads; extensive
patterns along the river, road and railway transport links on either side of the
Tagus; fragmented patterns, mainly in the northern area of the LMA, the
underlying factor being a greater mobility associated with cars and physical
(relief) issues that do not allow a continuity of the urban space, namely in
the Sintra and Mafra Municipalities; and diffuse patterns with a dilated
character, mainly in the southern area of the LMA, partly based on the
building of illegal lots from the 1960s and 1970s that are a clear feature of
the landscape.
An evaluation of the first generation of Master Plans concludes,
generally speaking, that there was an oversizing of the land under urban and
urbanised categories, with spaces classified for urban use in excess of
urbanisation needs.
As such, we can find an uncoordinated urban expansion where land
remains allocated to urban development for a long time, possibly benefiting
from a valuation that makes it inaccessible for quick municipal intervention.
In theory, the Municipal Master Plan should be a model and a vision for
urban development planning. What is happening is that the municipality
relies on zoning ordinances for the control of urban growth and for
evaluating planning applications, disregarding other comprehensive
planning tools, such as urbanisation plans and detailed plans, which are the
adequate instruments to foresee and implement adequate urban development
models.

4.2 Planning objectives and planning practice

As part of the planning objectives established in their Municipal Master
Plans (which include zoning ordinances), municipalities propose the creation
of Operative Planning and Management Units (OPMU) for special planning
areas. To spatially develop the municipal land, the following planning
instruments are used in the OPMU: Urbanisation Plans and Detailed Plans.
Table 1 lists the municipalities of the LMA with information on the
inclusion of planning objectives (general planning instruments – GPI) in
their Municipal Master Plans as well as the number of urbanisation and
detailed plans to be implemented. It also lists the state of those plans:
general planning instruments (GPI), when referring to the OPMU and to the
implementation of the planning instruments (UP and DP); the different
OPMU; the Urbanisation Plan – Planned, reviewed or revoked (UP-P); the
Urbanisation Plan – Materialised (UP-M); the Detailed Plan – Planned,
reviewed or revoked (DP-P); and the Detailed Plan – Materialised (DP-M).

Table 1. Existence and number of planning instruments (planned and implemented).


Municipalities  GPI¹  OPMU¹  UP-P¹  UP-M²  DP-P¹  DP-M²
LMA - North
Amadora Y 4 - 1 - 13
Azambuja Y - - - - -
Cascais Y - - 1 2 13
Lisboa Y 1 6 7 16 13
Loures/Odivelas Y 8 - 1 - 7
Mafra Y - 1 1 - 2
Oeiras Y 23 - 5 - 8
Sintra Y - 2 2 2 2
V. F. Xira Y - - - - 22
LMA – South
Alcochete N 5 6 - 7 8
Almada Y 15 1 1 3 4
Barreiro Y 177 1 - 86 2
Moita Y - - - - 3
Montijo Y - 7 - 6 1
Palmela Y 2 - 2 - 1
Seixal Y 9 - - 1 14
Sesimbra Y 11 5 3 4 2
Setúbal Y 2 - - 3 8
1
Source: Municipal Master Plans (PDM) (18 municipalities of the LMA)
2
Source: DGOTDU

According to these figures, we can conclude that normative planning
activity in the LMA, through the creation of OPMU and the implementation of
other urban plans (UP and DP), has been rather weak. Half of the
municipalities produced neither an urbanisation plan nor a detailed plan. For
the detailed plans the figures are higher, though still weak, as municipalities
used this planning instrument mainly to resolve critical situations such as the
legalisation of informal settlements. The northern municipalities of the LMA
have materialised more plans, while the southern local authorities have drawn
up a larger number of urbanisation and detailed plans.

5. CONCLUSIONS AND PROSPECTS FOR FURTHER RESEARCH

This article has reported on two levels of findings from a research project
looking into the relation between location trends, urbanisation dynamics and
the formation of centralities in the LMA. The two levels relate to the changes
in urban trends and urban forms over four decades and to the role of municipal
planning in the last decade in influencing and determining those changes.
Two main tendencies were detected. In terms of urbanisation trends the
LMA is becoming an increasingly urbanised region, under diversified and
fragmented patterns along lines and corridors following the development of
the main roads and transport infrastructures. In terms of the role of urban
planning the evidence shows a relative incapacity of municipal regulation to
implement an urban development model consistent with the planning
principles foreseen by the legislation and the planning system.
Thus, two different levels of challenges for making planning more
efficient and for further research can be identified. One level relates to the
type of information and indicators planners need for reading and
understanding new urban and land use trends and dynamics for plan and
policy formulation and implementation. Another level relates to an
evaluation of current planning practices and governance competences, so
that the planning system becomes a useful tool for the formulation of urban
development models accountable to long term public and community
interests.
Three aspects of urban planning and development processes, in their
articulation with public investment and policies, illustrate these two
interconnected levels of research. Two aspects concern the effective
interactions of changes in urban densities and urban patterns with regional and
national road and transport infrastructures. A third aspect relates to the
capacity regional plans have (and have not) shown in interpreting urban trends
for an adequate policy response, namely through a perception of the formation
of centralities in the LMA.
Firstly, in terms of urban densities, according to the Statistics
Office/DGOTDU, the percentage of the resident population living in
“Predominantly Urban Areas”⁴ was around 70% in 2001. The highest levels
were found in the Metropolitan Areas, where a growth of 9% had been
registered since 1991.
At the same time as this process of urban growth took place, population
growth occurred along the main transport axes. If we relate population density
to the configuration of the road network defined in the National Road
Master Plan 2000, we find a direct link between density and the road
network layout [3].
The transport axes (rail in a first phase and road afterwards) had an important
structuring role in the LMA, promoting processes of urban expansion and
the development of new settlements. In terms of the distribution of urban
land use areas, it is possible to identify a number of tendencies. There is,
however, an influential matrix associated with the railway (a finger-like pattern) and
its stations that still today bears an important mark on the land, observed
through population density [16].

Figure 5. Population density and transport infrastructure in the Lisbon Metropolitan Area, Census 2001, INE.
Source: Projecto Totta/UTL/01 – Dinâmicas de Localização, Transformação do Território e Novas Centralidades na Área Metropolitana de Lisboa: que papel para as políticas públicas?

In the northern part of the LMA there are three main axes: i) a North-South
axis, following the bank of the River Tagus and the large road and rail
corridors; ii) an East-West axis, following the coastal area of the Tagus
estuary; iii) a Southeast-Northwest axis, following the IC19.
The southern part of the LMA presents the following urban lines: i) along
the Tagus banks, made up of industrial and/or urban settlements; ii) originating
in Almada/Setúbal and following the central axis of the peninsula, with growth
mainly along the main road axes; iii) peripheral, mainly Northeast-Southwest
[19].
Thus, an increase in accessibility, largely supported by public investment in
roads and transport infrastructures, has led to structural changes in the LMA:
i) it gave settlements greater mobility and interaction within the metropolitan
space; ii) it allowed circulation in the LMA to be carried out outside Lisbon,
favouring a possible multi-polarisation of the metropolitan space through
direct relations between peripheral municipalities; iii) it structured the
morphology of the metropolitan space as a continuous built space, opening the
way to the integration of a long-distance metropolitan system [20].

Figure 6. Connectivity and infrastructures.
Source: Projecto Totta/UTL/01 – Dinâmicas de Localização, Transformação do Território e Novas Centralidades na Área Metropolitana de Lisboa: que papel para as políticas públicas?

The second aspect is associated with the central role that large road and
transport infrastructures have had in shaping urbanisation forms and trends, as
shown in Figure 6. Urbanisation forms developed in strong association with
the implementation of road and rail infrastructures, similar to what Benton
MacKaye called liquid planning (Keller Easterling, 1999), which characterised
planning in the United States in the early twentieth century.
Under these conditions, the most important network interfaces have a
critical role in the location and formation of central places and urban uses, as
well as in the location of smaller urban centres. Accordingly, the 1965 map
shows that radial urban patterns coincide with the railway lines and the first
motorways, while suburbanisation processes developed around the train
stations. The railway line to Sintra is a good example.
In the 1990s the rapid development of the road network, highways and
motorways, as part of the National Road Plan, created conditions for high
levels of mobility within the LMA, integrating previously marginal and
peripheral locations into the urban network system. The emergent network,
with multimodal characteristics, becomes apparent in the 2001 maps of the
LMA through a constellation of connective nodes, multimodal interfaces,
rail stations of different types and motorway exits, representing a clear
metropolitan condition with polycentric characteristics.

Figure 7. Centralities – PROT-AML 2003.
Source: Projecto Totta/UTL/01 – Dinâmicas de Localização, Transformação do Território e Novas Centralidades na Área Metropolitana de Lisboa: que papel para as políticas públicas?

Thirdly, this polycentric tendency has been identified by different
metropolitan planning studies and plans. However, although all three
territorial plans produced for the LMA envisaged and mapped different
strategies for polycentric development (1964: PDRL – Plano Director da
Região de Lisboa; 1990-04: PROT-AML – Plano Regional de Ordenamento
do Território; 2003: PROT-AML – Plano Regional de Ordenamento do
Território), only the last one was approved and ratified, becoming a statutory
document. Still, since there is no articulation between the municipal and the
metropolitan planning levels, it is not possible to say that there is an
effective territorial strategy for the metropolitan area.

NOTES
1. Pedro George (coord) and Sofia Morgado, Dinâmicas de Uso e Ocupação do Solo da
Área Metropolitana de Lisboa 1940-2001 FA-UTL (Faculdade de Arquitectura –
Universidade Técnica de Lisboa), July 2004.
2. Projecto Totta/UTL/01 (2004-2006) Dinâmicas de Localização, Transformação do
Território e Novas Centralidades na Área Metropolitana de Lisboa: que papel para as
políticas públicas? Project financed by the Colégio de Estudos Integrados - UTL with
the participation of four institutions at the Universidade Técnica de Lisboa – Clara
Mendes (FA) (coord.), Romana Xerez (coord. ISCSP), Manuel Brandão Alves (coord.
ISEG – CIRIUS), Fernando Nunes da Silva (coord. IST – CESUR) and João Cabral,
Pedro George, Sofia Morgado, José Luís Crespo, Carine Coelho (FA).
3. George, Pedro; Morgado, Sofia, 2004, Dinâmicas de Uso e Ocupação do Solo da Área
Metropolitana de Lisboa 1940-2001, Protocolo CCDR-LVT, Comissão de Coordenação
e Desenvolvimento Regional de Lisboa e Vale do Tejo / FA-UTL, Faculdade de
Arquitectura, Universidade Técnica de Lisboa, mimeographed, Lisbon, July 2004;
included in La Explosión de la Ciudad, Barcelona, 2004.
4. Predominantly urban areas: the urban Freguesias (an administrative unit smaller than a
municipality, equivalent to a parish); the semi-urban Freguesias contiguous to urban
ones, according to functional/planning orientations and criteria; the semi-urban
Freguesias meeting functional/planning criteria; and the Freguesias containing the
municipal seat, with more than 5 000 inhabitants. Urban Freguesias are those with a
population density above 500 inhabitants/km² or including a settlement with a resident
population of 5 000 or more inhabitants (INE-DGOTDU, 1998, pp. 8-9).

REFERENCES
1. AAVV (ed. Ángel Martín Ramos), Lo urbano en 20 autores contemporáneos, Escuela Técnica Superior de Arquitectura de Barcelona, Edicions UPC, Barcelona, 2004.
2. AAVV (ed. Antonio Font; scientific coord. Francesco Indovina, Nuno Portas, Antonio Font), L'explosió de la ciutat. Morfologies, mirades i mocions sobre les transformacions territorials recents en les regions urbanes de l'Europa Meridional, Collegi d'Arquitectes de Catalunya-COAC/Forum Universal de les Cultures, Barcelona, 2004.
3. Costa E, Silva G, Costa N. "Estratégias de Povoamento e Políticas de Expansão dos Aglomerados Urbanos". Paper presented at the conference Ordenamento do Território e Revisões do PDM, ANMP, Évora, 21-22 October 2003.
4. Dematteis, Giuseppe, Progetto implicito. Il contributo della geografia umana alle scienze del territorio, Franco Angeli, Milano, 1995.
5. Easterling, Keller, Organization Space. Landscapes, Highways and Houses in America, MIT Press, Cambridge, Massachusetts/London, England, 1999.
6. Gaspar J. "Economic Restructuring and New Urban Form". Finisterra, XXXIV, n.º 67-68, pp. 131-152, 1999.
7. George, Pedro; Morgado, Sofia, «Dinâmicas do Uso e Ocupação do Solo da Área Metropolitana de Lisboa 1940-2001», Pós – Revista do Programa de Pós-graduação em Arquitetura e Urbanismo da FAUUSP, Faculdade de Arquitetura e Urbanismo da Universidade de São Paulo, Nº 18, December 2005.
8. Graham, Stephen; Marvin, Simon, Splintering Urbanism. Networked Infrastructures, Technological Mobilities and the Urban Condition, Routledge, London/New York, 2001.
9. GUST, The Ghent Urban Studies Team (dir. Dirk de Meyer, Kristiaan Versluys), The Urban Condition: Space, Community and Self in the Contemporary Metropolis, 010 Publishers, Rotterdam, 1999.
10. INE-DGOTDU, Tipologia das Áreas Urbanas, INE-DGOTDU, Lisboa, 1998.
11. Lang, Robert E., Edgeless Cities. Exploring the Elusive Metropolis, Brookings Institution Press, Washington D.C., 2003.
12. Morgado, Sofia, Protagonismo de la ausencia. Interpretación urbanística de la formación metropolitana de Lisboa desde lo desocupado, Departament d'Urbanisme i Ordenació del Territori – Universidade Politécnica da Cataluña, Barcelona, 2005.
13. Pavia, Rosario, Babele. La città della dispersione (Babele/7), Meltemi editore, Roma, 2002.
14. Portas, Nuno; Domingues, Álvaro; Cabral, João, Políticas Urbanas – tendências, estratégias e oportunidades, Fundação Calouste Gulbenkian, Lisboa, 2003.
15. Rodrigues D. "Pressão construtiva nas áreas metropolitanas em Portugal". Regiões e Cidades na União Europeia: Que Futuro, Actas do VI Encontro Nacional da APDR, Vol. 1, pp. 307-326, 1999.
16. Salgueiro T. "Lisboa, Metrópole Policêntrica e Fragmentada". Finisterra, XXXII, n.º 63, pp. 179-190, 1997.
17. Secchi, Bernardo, La città nel ventesimo secolo, Editori Laterza, Roma-Bari, 2005.
18. Sieverts, Thomas, Cities without Cities. An Interpretation of the Zwischenstadt, Spon Press/Routledge, London/New York, 2003.
19. Silva A, Vala F. "Acessibilidades e Construção na Área Metropolitana de Lisboa, 1991-2001". Revista de Estudos Regionais – Região de Lisboa e Vale do Tejo, INE-DRLVT, 2º Semestre 2001, n.º 3, pp. 25-40, 2001.
20. Silva E. "Cenários da Expansão Urbana na Área Metropolitana de Lisboa". Revista de Estudos Regionais – Região de Lisboa e Vale do Tejo, INE-DRLVT, 2º Semestre 2002, n.º 5, pp. 23-41, 2002.
21. Soja, Edward W., Postmetropolis. Critical Studies of Cities and Regions, Blackwell Publishing, Oxford, 2000.
22. Viganò, Paola, La città elementare (Biblioteca di Architettura Skira/7), Skira Editore, Milan, 1999.
TECHNICAL, ECONOMICAL AND
ORGANIZATIONAL INNOVATION IN
TRANSPORT SYSTEMS

José Manuel Viegas


Instituto Superior Técnico, Universidade Técnica de Lisboa, Av. Rovisco Pais 1049-001,
email: viegas@civil.ist.utl.pt

Abstract: This paper presents the research that is being undertaken by the Transport
Infrastructure and Systems Group of CESUR – IST, under the author’s
leadership. There are three main research streams: Modelling and innovation
in urban mobility management; Organization and technology as instruments
for efficiency gains in large scale transport systems; and, Quality and safety of
transport infrastructure. Most of this research is developed through PhD
dissertations. This paper briefly provides the main motivation, methodology
and expected results of the different research studies, as well as the labs and
international networks that underpin this research effort.

Key words: Transport, Infrastructure, Systems, Innovation, Quality, Modelling, Safety

1. INTRODUCTION

The Transport Infrastructure and Systems Group of CESUR, the Centre for
Urban and Regional Systems at the Department of Civil Engineering and
Architecture of Instituto Superior Técnico, is an emerging young
multidisciplinary team, currently (January 2006) formed by 3 Professors, 4
Assistant Lecturers and 12 graduate students under the author's leadership.
Research is structured along three main thematic fields: Modelling and
Innovation in Urban Mobility Management; Organization and Technology as
instruments for efficiency gains in Large Scale Transport Systems; and
Quality and Safety of Transport Infrastructure. A large part of the research
activity is carried out through PhD dissertations.


Urban mobility is the area of greatest activity, with projects on the
improvement of mathematical modelling techniques, on new organizational
solutions in favour of modal shift, on the integration of technologies to provide
more efficient use of road space, on the combination of land-use and transport
endowments needed to promote the recovery of population levels in city centres,
on urban logistics, and on the adoption of a systems approach to achieve
higher overall quality of the mobility system.
Research about efficiency gains in large scale transport systems includes
one project that analyzes the technical and business model requirements for
wide scale adoption of processes of partial refurbishment of private cars
(replacing components where quicker technological innovation has occurred)
in order to reduce emissions and environmental impacts, another project
focused on the development of instruments that could enhance intermodal
freight transport in support of air cargo services, another project on
management of railway marshalling yards for gains of productive efficiency
of the railway system, and one project that studies the conditions and
incentives needed to promote modal transfer of freight from road transport to
short sea shipping.
In the infrastructure field, a project is being developed on the quality
assessment of road drainage design and equipment, two more projects are on
road pavements, one investigating the impacts of skid resistance and texture
on traffic safety, and another dedicated to evaluation of life cycle costs for
different road pavement construction and maintenance strategies, and one
more aiming at the specification of rail stiffness in high speed railway
systems in order to minimize life cycle costs.
Associated with these research activities, two laboratories are under
development, and the group is strongly engaged with several important
international networks – TRANSPORTNET and EURNEX – besides having
hosted in 2005 the main international Conference on Competition and
Ownership in Land Transport and being a finalist candidate to host the
World Conference on Transportation Research in 2010.
The following sections provide an abstract of the main research work.

2. MODELLING AND INNOVATION IN URBAN MOBILITY MANAGEMENT

2.1 Quality management in urban mobility systems

Careful analysis of Urban Mobility reveals that consistent and effective
policies can only be well defined and implemented if the various components
of the system and their interrelations are considered. The definition of the
Urban Mobility System (UMS) goes far beyond the provision of public
transport and should entail all services, infrastructure and traffic management
that, as a whole, enable citizens to satisfy their mobility requirements. The
complexity and diversity of dimensions of the conurbation and of the agents
involved in a UMS imply focusing the analysis of its performance on the
symbiotic relationship between its main components [1].
Quality factors and processes should be set up in a coherent
organizational framework, providing adequate interaction mechanisms for
policies and intervening institutions. The research work used the observation
of several cities around the world to confirm that quality improvements made
at company and service levels are insufficient to ensure a significant
improvement of the performance of the UMS. This objective was pursued by
decoupling, observing and understanding interactions among the different
elements of the system and between these and the surrounding environment.
The research concluded by stating the need for a holistic approach to
urban mobility management and by presenting a model along those lines,
largely based on four main pillars:

- the recognition that urban mobility is a system nested within a hierarchy of systems;
- the fact that within that system a high diversity of agents exists and interacts with considerable autonomy of action, which requires a clear and sound institutional design to clarify decision levels, roles and functions, in particular an indispensable steering function at system level;
- the fact that this diversity, together with functional interdependencies (e.g. between land use and transport, or between individual and collective transport), brings high complexity to the system, so that its qualified sustainability can only be achieved by providing the system with structural consistency (i.e. vertical, horizontal and cross-effects accruing from the previous) as a pre-condition for its success;
- the recognition that urban mobility systems behave as living systems and as such have universal properties (i.e. robustness, efficiency, dynamic diversity) that find a specific materialization in each urban area, providing it with a satisfactory system configuration.

2.2 Zoning and information loss in transportation planning studies

Most applications of transport modelling require a predefined partitioning
of the study area into zones, allowing the use of discrete (instead of
continuous) models. One of the most used applications is Traffic Assignment,
from which traffic load estimates are obtained, and this model has given the
zones their name of Traffic Assignment Zones (TAZ). These zones are
used for the specification of the Origin-Destination matrix, representing the
number of trips that start in one zone and end in another, from which the paths
adopted by travellers are estimated and the ensuing traffic loads computed.
Several criteria have been used to devise TAZs: homogeneity of land-
use, continuity, compactness, or administrative criteria that facilitate
gathering of data on population and land-use.
As in any other problem where a continuous reality is given a discrete
representation, there is a loss of information. In this case this means
geographical information, as any location within a zone is represented by a
single point, its centroid. On top of that, because the number of trips in each
cell of the Origin-Destination matrix is obtained by sampling, there is also a
statistical error involved. The problem is that, in general, these two types of
error move in opposite directions with zone size: as zones get bigger and thus
have larger numbers of trips starting or ending there, the geographical error
increases and the statistical error decreases. However, when the starting and
ending points of trips are geocoded, gains in one dimension may be obtained
without losses in the other, and this is where optimization is possible.
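
As a purely illustrative sketch of this trade-off (not the algorithm developed in the project, which runs in Visual Basic over a GIS), the two error components of a candidate zoning can be scored as follows, assuming geocoded trip ends and using 1/sqrt(n) as a crude proxy for the relative sampling error of each zone:

    import math

    def zoning_errors(zones):
        """Score a candidate zoning.

        `zones` maps a zone id to the list of geocoded trip-end points (x, y),
        in metres. The geographical error is the mean distance of trip ends to
        their zone centroid; the statistical error is proxied by 1/sqrt(n).
        Both measures are illustrative assumptions, not the project's metrics.
        """
        geo, stat = [], []
        for points in zones.values():
            n = len(points)
            if n == 0:
                continue
            cx = sum(x for x, _ in points) / n
            cy = sum(y for _, y in points) / n
            geo.append(sum(math.hypot(x - cx, y - cy) for x, y in points) / n)
            stat.append(1.0 / math.sqrt(n))
        if not geo:
            return 0.0, 0.0
        return sum(geo) / len(geo), sum(stat) / len(stat)

    # Merging zones lowers the statistical score but raises the geographical one;
    # an optimisation over candidate partitions trades the two against each other.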
Professional practice has mostly ignored this dilemma and adopted more
or less intuitive approaches: none of the Transportation and Spatial Analysis
GIS packages gives a solution to this pressing need. All the transportation
model suites that we have looked at take for granted the inclusion of a TAZ
file, but never mention how the user should have defined the zones and their
boundaries.
You et al. [2, 3] were the first to present an algorithm for this problem.
They consider the distance between zones and the area dimension to assess the
TAZ. Horner & Murray [4] used Thiessen polygons and transportation costs
obtained from the Euclidean distance between each zone's centroids. Chang et
al. [5] take into account the zoning structure and the details of the
transportation network to define TAZs.
Ding [6] and You et al. [2] created small routines that can be compiled
and run in commercial GIS packages; nevertheless, their applications never
saw widespread use nor have they been included in the main lists of GIS
routines (e.g. ESRI's list of extensions, or the VISUM software).
In this project we developed an approach and an algorithm that
automatically produce an optimal set of TAZs for a range of numbers of zones
defined by the user. That range is largely determined by the level of detail
needed in the results of the transport study. The algorithm is written in
Visual Basic, working on a GIS (Geographical Information System) basis.
An application is being made to the Lisbon Metropolitan Area, for which
multiple earlier studies defined their own zoning schemes, thus allowing a
comparative assessment of results.

2.3 The Intermittent Bus Lane (IBL) – Demonstration in Lisbon

The concept of the Intermittent Bus Lane (IBL) was introduced by Viegas in
1996 [7] as an innovative approach to achieve bus priority. The IBL consists
of an urban street lane in which the status of a given section changes
according to the presence or not of a bus in its spatial domain: when a bus is
approaching such a section, the status of that lane is changed to BUS lane,
and after the bus moves out of the section it becomes a normal lane again,
open to general traffic. Therefore, when bus services are not very frequent,
the road capacity available for general traffic is not greatly reduced, and bus
priority can still be obtained.
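
A minimal sketch of the section-status rule (the actual control logic and optimal parameters are those derived in [8]; the activation distance below is an arbitrary placeholder):

    def ibl_section_status(bus_position_m, section_start_m, section_end_m,
                           activation_distance_m=150.0):
        """Return 'BUS' while a detected bus is within the activation window of a
        lane section, and 'GENERAL' otherwise. Positions are measured along the
        street; the 150 m activation distance is an illustrative value only."""
        if bus_position_m is None:          # no bus detected on the street
            return "GENERAL"
        if section_start_m - activation_distance_m <= bus_position_m <= section_end_m:
            return "BUS"                    # section temporarily reserved for the bus
        return "GENERAL"                    # section released to general traffic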
The underlying motivation for the development of the IBL concept is
rather simple: (1) the productive effectiveness and regularity of buses are
strongly improved whenever they can run unimpeded by general traffic;
(2) providing (permanent) bus lanes in streets served by only a few buses
per hour is rather inefficient, because the corresponding space could be
serving a higher number of passengers in private cars; (3) but if those streets
are congested, those few buses will lose a lot of time, which could be
avoided if a bus lane were activated in each section only when strictly
necessary (i.e. when the bus is approaching).
The theoretical development of the IBL has already been concluded [8]:
the relationships among the information on bus location, the general traffic
flow, the related IBL signals and the normal traffic signals are established,
and the calculations to obtain the optimal control parameters are derived.
The present work concerns the demonstration project being developed in
Lisbon, in cooperation between our University, the Municipality and the
urban bus operator (CARRIS). The most important methodological steps of
the demonstration project are the following: (1) Selection of Test Locations;
(2) Computer-based traffic micro-simulation; (3) Preparation of the
Technological Prototype and Interfaces; (4) Preparation of the Physical
Conditions for the Demonstration (road structure, traffic detectors,
signals, …); (5) Public Communication of Project Activities;
(6) Demonstration; (7) Evaluation of the IBL, Conclusions and
Recommendations. The demonstration activities started in late September
and will last for about six months.

2.4 Coupled intelligent priority measures for buses and for pedestrians

This project consists of the development of a bus priority measure with
pedestrians being directly included. The goal is to provide a safe
environment for pedestrians to catch the bus and to make all passengers
spend less time both on the buses and waiting for them, by getting and
processing more and better information. Besides the detectors already
operating in the existing Urban Traffic Control systems and bus priority
measures, some new detectors will be installed at suitable places, by which
pedestrians both on the buses and at the roadside can be detected. In the
meantime, besides the general traffic signals, additional signals are also
installed to show a warning message or a green signal if any person
attempts to cross the road and board the bus when it is approaching [9]. The
optimal settings of all installed signals must be continuously adjusted to take
newly detected traffic messages into account, through dedicated models and
a new objective function for flow optimization.
Building on the work done with Intermittent Bus Lanes [10], models
dealing with pedestrians, buses and other traffic flows will be constructed,
establishing the relationships between the traffic signals, some of which are
designed specifically for buses and pedestrians, and the new objective
function, which includes not only the time people spend on buses but also
their waiting time at the bus stop. The optimal signal settings will be
continuously adjusted based on the ongoing detection of vehicles, buses
and pedestrians through different kinds of detectors.
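
In very simplified form, the new objective function can be illustrated as a weighted sum of in-vehicle and waiting times over the detected passengers (the weight and the inputs below are placeholders, not values from the project):

    def passenger_time_cost(in_vehicle_times_s, waiting_times_s, wait_weight=2.0):
        """Generalised passenger time for one evaluation of candidate signal settings.

        in_vehicle_times_s: estimated on-board time for each detected bus passenger.
        waiting_times_s:    estimated waiting time for each detected person at stops.
        wait_weight:        waiting time is usually perceived as more onerous than
                            in-vehicle time; 2.0 is an illustrative weight only.
        """
        return sum(in_vehicle_times_s) + wait_weight * sum(waiting_times_s)

    # A signal-setting optimiser would evaluate this cost for candidate green splits
    # and keep the settings that minimise it, re-running as new detections arrive.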

2.5 Adjusting public transport service to mobility requirements through new fleet and service specification

The evolution of public transport supply towards a more client-driven
attitude from public authorities and operators, despite promoting significant
changes in the configuration of supply in recent decades, has proved
insufficient to satisfy citizens' needs and thus to conquer market share.
Modal choice is influenced by a wide diversity of factors that can only be
detected through the attitudes of the population. Psychological methods of
analysis developed by Ahern are used to support the observation and
understanding of reactions towards new transport modes and services. This
research project studies such behaviours in the light of the theory of planned
behaviour developed by Ajzen and uses Green Quality Function
Deployment to establish the configuration of service supply that is most
appropriate to induce a modal shift towards the new services.

2.6 Information systems for urban mobility management

The Urban Mobility System is a highly complex system. This is due not
only to the great number of elements that compose it (networks, vehicles,
interfaces, stations, etc.), but also to the high number and different nature of
the interacting agents that play a role in it (e.g. operators, regulators,
planners, clients), and to the fact that different users have different
requirements and evaluate the system from different perspectives.
This complexity requires great management efforts, in order for the
system to be managed in an efficient yet accountable way. As an essential
tool for any demanding management task, the Management Information
Systems must play a key role for the entities responsible for managing the
Mobility System – the Transport Authorities. The Information System must
be designed in order to serve all the different tasks (strategic, tactical and
operational) of the Authority, and shall gather, process, store and present the
relevant information for all the managers involved.
This project tries to identify the relations between the Authority and all
the other stakeholders in the Urban Mobility System, and the management
functions underlying each of these relations. This will allow gathering the
information requirements for each function, and thus establish the basic
requirements (e.g. inputs / information sources; technological (hardware and
software) needs; outputs / indicators) for the intended Information System.

2.7 Reduction of road traffic congestion through better utilization of private car capacity – An extension of the carpooling concept

The rising use of private cars, deriving from suburban occupation and car
ownership growth, is making traffic congestion more frequent and more severe in
urban areas. This results in air pollution, energy waste and the unproductive and
unpleasant consumption of people's time. In most Metropolitan Areas the
majority of trips in private cars are single occupant vehicle (SOV) trips. In
1990 approximately 90% of work trips and 58% of the other trips in the US
were made in SOVs [11].
One may conclude that most big cities were not able to implement
effective mobility policies for controlling modal split and traffic congestion,
and thus now need recovery measures.
Automobile use is very attractive. Its universal appeal is
demonstrated by the rapid growth in car ownership levels even in countries with
high fuel prices, good public transport systems and dense land occupation.
Therefore the mobilization of private cars is an option that can be turned into an
advantage through an increase in vehicle occupancy, moving the same
number of people in fewer cars.
Car pooling systems seek that higher occupancy in commuting trips,
associating neighbours who travel to workplaces close to each other and using
the vehicle of one of them. These neighbours form pools that must be stable
for the system to work, but the cost of such schemes is a loss of flexibility in
personal activities, since all the participants must be able to start the return
trip at the same place and time, thus severely restricting extra activities after
the working period. Car pooling experiences have had some success but
mostly succumb after a few months because of that loss of flexibility and,
more importantly, have not been able to reach the scale at which they would
reduce the congestion problem.
The work being developed in our group is based on the concept of a club
of car pools, numbering from several dozen to a few hundred, such that there
is a high probability that the changed pickup point (in time and space) of a
member of a pool can be satisfied by another car pool in the same club. This
includes the study of the acceptability of the system, more specifically, the
attitudinal factors which may hinder a good participation rate (survey in
Lisbon’s Metropolitan Area), the modelling of the system functioning with
different scenarios (different participation rates and personal requirements),
as well as the search for pay and reward schemes that may help launch and
stabilize the wider club.
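
The club idea can be sketched as a simple search over the other pools of the club for one whose pickup point and departure time fall within acceptable tolerances; the data layout and tolerances below are illustrative assumptions, not results of the survey or the modelling work:

    import math

    def find_alternative_pool(request, pools, max_distance_km=1.0, max_shift_min=15):
        """Return the id of a pool in the club able to absorb a changed pickup.

        request: dict with 'x_km', 'y_km' (pickup location) and 't_min' (desired
                 departure time, in minutes after midnight).
        pools:   dict mapping pool id to a dict with the same keys plus 'free_seats'.
        """
        for pool_id, pool in pools.items():
            if pool["free_seats"] < 1:
                continue
            distance = math.hypot(pool["x_km"] - request["x_km"],
                                  pool["y_km"] - request["y_km"])
            if (distance <= max_distance_km
                    and abs(pool["t_min"] - request["t_min"]) <= max_shift_min):
                return pool_id
        return None  # no pool in the club can accommodate the change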

2.8 Mitigation of road traffic congestion problems caused by incidents

Congestion is a big problem for all large cities. Very often congestion
occurs not only because of a known excess of demand over capacity
(peak hours) but also because of sudden reductions of supply caused by incidents
such as accidents, immobilized vehicles or other kinds of temporary obstacles.
There is a large amount of study and discussion on the first case, and rather
less on the second, and that is what we tackle in this project.
When the goal is to solve congestion problems shortly after they occur as
a result of temporary causes, the solution necessarily has to be obtained with
the existing infrastructure. It is known that the time needed to resolve a bottleneck
increases sharply with the time during which the bottleneck was allowed to
accumulate.
Thus, we look for optimal strategies to recover normal flow conditions
following disturbances caused by temporary incidents, trying to act as
quickly as possible both in the removal of the causes/incidents (although that is
the job of the police) and in re-routing vehicles onto other paths of the existing
network, avoiding further accumulation upstream of the critical section.
For re-routing vehicles we want to use parts of the infrastructure that are
not saturated and remove some pressure from the problematic parts. But, as
is done in the evacuation of people from big buildings, only the quantities of
demand that can be adequately served on other paths with normal flows should be
guided to them. The final part of the project is dedicated to developing the
most effective ways of communicating to drivers the recommendations
concerning their modified paths, admitting that not all drivers will have cars
fully equipped with the sophisticated positioning and telecoms units that
would make that communication much easier.
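
The guiding principle stated above – divert only as much demand as the alternative paths can absorb at normal flow – can be sketched as a simple greedy allocation, assuming the spare capacities of the candidate paths are known (all values illustrative):

    def divert_demand(excess_demand_veh_h, alternative_paths):
        """Split the excess demand at a blocked section over alternative paths,
        never exceeding each path's spare capacity (veh/h).

        alternative_paths: dict mapping path id to spare capacity in veh/h.
        Returns the allocation and the demand that still has to wait at the incident.
        """
        allocation, remaining = {}, excess_demand_veh_h
        # Fill the paths with the largest spare capacity first (simple greedy rule).
        for path_id, spare in sorted(alternative_paths.items(),
                                     key=lambda item: item[1], reverse=True):
            diverted = min(remaining, spare)
            allocation[path_id] = diverted
            remaining -= diverted
            if remaining <= 0:
                break
        return allocation, remaining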
We have found that the complexity and variability of existing situations,
as well as different characteristics of networks and flows, lead to problems
of very different degrees of complexity and require solutions adjusted to
each case.

2.9 Repopulation of Lisbon's inner centre: Required conditions and expected effects on mobility

There is evidence that transport and land use systems of cities all over the
world are becoming unsustainable. Many experts believe that a better
integration of urban transport and land-use planning can help to reduce the
need for travel and make the resulting system more sustainable [12–14].
Lisbon's Metropolitan Area (LMA) has experienced an undesirable
development trend characterized by the loss of inner-city population due to
deaths and migration to new settlements in the suburbs, by the degradation
and abandonment of thousands of dwellings in Lisbon, and by a growing
number of commuting trips to the city centre, among other factors.
This project focuses on the relationship between residential location
choices and transportation patterns. Its general purpose is to provide a
better understanding of the conditions that could favour the success of a
policy promoting a relatively concentrated "back-to-the-city" relocation of
residences, as well as of its expected impacts on mobility.
Research in the last decades has shown that land use and transport are parts
of a dynamic system that are linked together by time lagged feedback loops
[15]. Therefore integrated land use and transport models are needed to assess
the performance of urban policy strategies.
The aim is to develop a tool that supports decision making in transport
and urban planning. This tool focuses specifically on the residential location
modelling problem in order to build and assess repopulation strategies for the
city centre. In that sense, an integrated urban location choice and transport
model for the LMA will form the core of this decision support tool.
The policy questions to which the model will contribute answers are:

- Which are the requirements for a successful strategy towards a strong "back-to-the-city" movement to the centre of Lisbon?
- What repopulation strategy should be implemented, regarding the instruments to be used and the intensity level to be applied?
- What are the expected changes deriving from such a strategy, focusing mainly on the mobility system?

2.10 Managing interaction processes between land use and transport

Land use and transport policies have been the two main streams of action
to influence the spatial distribution of activities, and they are often seen as
alternatives to each other. Direct intervention through land use policy and
indirect influence on land use patterns through transport policy are both
common actions, which is a reason why these instruments should rather be
seen as complementary in the development and shaping of sustainable
urban areas.
The political and administrative frameworks of a city or conurbation
influence the level of performance that is required in the interaction between
these two sectors – land use and transport – which, despite a consensual
recognition that coordination between them is needed, are, in most cities all
over the world, under different institutional settings. Despite the intensity of
political statements and the scientific background on the need for good
performance in the interaction between these sectors, a major challenge today
is still to answer the following question: "why does the interaction between
land use and transport not work?"
The object of this project is to analyze the processes of interaction
between transport and land use that occur in the scope of territorial
management, identifying the critical aspects that have contributed to a
deficient interaction between these two sectors. The current situation in
many Brazilian cities, exemplified by case studies, is characterised, among
other reasons, by deficiencies in the institutional structure, the deficient
application of urban legislation and the lack of coordination of most
interventions. The case studies are carried out in three Brazilian cities of
similar dimension: Fortaleza, Belo Horizonte and Curitiba. In the comparison
of the case studies, the processes in the cities are identified, the similarities
and differences among them are analyzed, and the performance of the
processes is evaluated through process indicators.

2.11 Optimization of distribution logistic systems in urban environment

Due to ever growing demand, logistics activities have been placing
constantly higher constraints on other city users, the most significant being
the pollution associated with transportation, the use of infrastructures built
for other purposes (roads, parking lots), and the congestion sometimes caused
by double parking during loading and unloading activities. On the other
hand, these activities are subject to the same restrictions (traffic, lack of
urban space) as other road users.
Despite this, as a result of a lack of knowledge and of appropriate
instruments, local authorities have always denied urban logistics the
attention it needs, relegating it to a secondary role in the urban and transport
planning processes.
In order to reduce the impacts of the distribution of goods inside cities,
while keeping it in pace with the city's needs, this project tries to define
"Logistic Profiles", which we define as the logistic characteristics of
homogenised groups, determined by the features of the territory and by the
specific needs of the goods and the agents. This will allow the definition of a
conceptual organizational model, the quality of which will then be evaluated
and validated. Based on these analyses, the work will produce guidelines for
a master plan in Urban Logistics.

3. ORGANIZATION AND TECHNOLOGY AS INSTRUMENTS FOR EFFICIENCY GAINS IN LARGE SCALE TRANSPORT SYSTEMS

3.1 Driving an effective sustainable transport policy – A new concept for an adaptive and efficient vehicle fleet

Much transport research has been performed over the last 20 years and,
lately, much of it has been dedicated to environmental impacts. After the IPPC,
CAFE and similar policies, research is now even more focused on these issues.
As the car is a prime contributor to atmospheric emissions, much attention is
paid to its technological aspects (particularly energy consumption and
emissions). Although it is commonly accepted that technology can solve many
of the environmental problems caused by road transportation, there is no precise
knowledge of whether, when and how a full solution will be reached.
The diffusion of technological innovation occurs when new models are
introduced in the automotive market, but older vehicles, which do not
incorporate the new technologies (e.g. catalytic converters, the Antilock
Braking System (ABS), particle filters, etc.), are typically kept in operation for
more than 10 years. Logically, old vehicles should be scrapped and
replaced by the new models, in order to foster the technological renewal of the
entire car fleet. However, in terms of life-cycle analysis, the urge for a faster
technological renewal of the car fleet can have negative environmental
impacts deriving from the retirement of old vehicles. In fact, many authors
(Jorgensen and Wentzel-Larson [16], Wee et al. [17], Dill [18], Kim et al.
[19] and Wang [20]) argue that scrapping old vehicles early is not
necessarily the best way to improve the overall environmental performance of
the car fleet. As reported by Kim et al. [21], a lifetime of 18 years for a
standard car minimizes cumulative energy use and CO2 emissions, based on
driving 15 000 km annually. On the other hand, the process of technological
innovation could be restrained if the renewal rates of car fleets decrease,
since the demand for new products would also be reduced. Moreover, under
present market conditions, changing private cars is still expensive. This
could also raise equity issues if any regulation were to impose, directly or
indirectly, restrictions on the use of inefficient vehicles (e.g. restricted access
to the city centre with privileges for zero- or near-zero-emission vehicles).
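
The life-cycle argument can be illustrated with a back-of-the-envelope comparison: buying (or retrofitting to) a more efficient vehicle lowers the per-kilometre emissions at the cost of the emissions embodied in the new vehicle or component. All figures below are placeholders chosen for the sketch, not data from the studies cited above:

    def cumulative_co2_kg(years, annual_km, use_gco2_per_km, embodied_kg):
        """Cumulative CO2 of one ownership option over a given horizon."""
        return embodied_kg + years * annual_km * use_gco2_per_km / 1000.0

    # Illustrative comparison over 10 years at 15 000 km/year (placeholder values):
    keep_old = cumulative_co2_kg(10, 15_000, use_gco2_per_km=180, embodied_kg=0)
    buy_new  = cumulative_co2_kg(10, 15_000, use_gco2_per_km=120, embodied_kg=6_000)
    retrofit = cumulative_co2_kg(10, 15_000, use_gco2_per_km=140, embodied_kg=1_000)
    # Which option ranks best depends on the embodied emissions and on the efficiency
    # gap, which is precisely the trade-off the project sets out to quantify.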
This project looks at the possibility and the conditions that would make it
feasible for anyone to upgrade the (energy and environmental) efficiency of
his/her vehicle at acceptable cost whenever new technologies become available.
Car owners would be given the possibility of 'refurbishing' or 'retrofitting' their
vehicles through the replacement of an older component by a new, more
efficient one. We start by analyzing the potential impact of retrofitting
vehicles as a means to increase the global efficiency of car fleets and as an
alternative to increasing the rates of substitution of old vehicles by new
models, and then try to identify the technological and business model
conditions that would make this a practical possibility.

3.2 Instruments and strategies for development of air cargo

Owing to globalisation and other economic dynamics, the freight
transport paradigm has evolved towards a new stage. Nowadays, transport
services are required to provide door-to-door transport in short and
predictable times, which, for the air transport companies operating on a
point-to-point basis, entails integration with other modes, namely road
transportation. As such, the sustainability of the air cargo business in the
traditional companies can only be assured through intermodality.
Nevertheless, many air transport companies are having trouble adapting, with
some withdrawing or outsourcing their cargo divisions and others
increasingly replacing air by road transport, notably on continental legs.
The aim of this project is to develop instruments to enhance the
attractiveness of multimodal transport chains that include air cargo
transportation. These instruments are meant to provide guidance during the
chain construction process, enabling the best fit of the modal chain to be
achieved. The instruments are likewise intended to underpin the companies'
strategies, since they will help in the definition of strategic alliances and
partnerships with other companies, as well as of the target markets.
In order to accomplish this task, the project has been divided into four
parts: state of the art, theoretical analysis, case study analyses, and
assembling of the instruments. Four case studies of intermodal transport
chains have been identified: sea & air; rail & sea; rail & air; and road & air,
covering all possible combinations.

3.3 Mechanisms for evaluation and improvement of the freight transportation plan by rail

The task of developing the freight transportation plan by rail is a
complex, combinatorial and constructive one, consisting of a set of levels
of scrutiny. Each level requires achieving optimal results for the
organization of the transformations of a set of flows. These flows are:
freight flows; freight car flows; and freight train flows. The flows are
transformed at railway facilities for the sake of the transportation service
provided. These facilities are: terminals for loading and unloading; dispatch
and terminal railway yards; and formation yards. Thus, for the development
of a freight transportation plan there is a set of networks to be defined. This
is imposed by the hierarchical structure of the railway freight system, with
different operating processes at different levels. One therefore defines
several networks, as follows:

x Extended railway network which consists of all the railway facilities


mentioned above and all the railway lines that connect them.
x Basic railway network which consists of all the dispatch, terminal
and formation yards and the railway lines that connect them.
586 J.M. Viegas

x Reduced railway network which consists of all the formation yards


and the railway lines that connect them.

All the tasks for the development of the transportation plan are set within
these defined railway networks.
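
For illustration only, the three network levels can be represented by filtering a single facility graph by node type; the type names and data layout below are assumptions made for the sketch, and links at the lower levels would in practice aggregate routes through the removed facilities rather than keep only direct connections:

    TERMINALS = {"loading_terminal"}
    YARDS = {"dispatch_yard", "terminal_yard"}
    FORMATION = {"formation_yard"}

    def subnetwork(nodes, links, keep_types):
        """Keep only the nodes of the given types and the links joining them.

        nodes: dict mapping node id to its facility type.
        links: iterable of (node_a, node_b) pairs (railway lines).
        """
        kept = {n for n, t in nodes.items() if t in keep_types}
        return kept, [(a, b) for a, b in links if a in kept and b in kept]

    def railway_networks(nodes, links):
        """Return the extended, basic and reduced networks described in the text."""
        extended = subnetwork(nodes, links, TERMINALS | YARDS | FORMATION)
        basic = subnetwork(nodes, links, YARDS | FORMATION)
        reduced = subnetwork(nodes, links, FORMATION)
        return extended, basic, reduced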
A crucial step is that of analyzing the "demand-supply" relationship in the
provision of the freight transportation service by the railway operator, i.e.
the classes and quantities of freight at the points of sourcing and the points
of consumption over the entire railway network. Through such an exercise one
identifies points of strong and weak sourcing and points of strong and weak
consumption. By linking the points of sourcing with the points of consumption
through the existing railway network, one identifies all the possible routes for
the transportation of the originated freight flows; this is designated as the
distribution task. These aspects create a plausible environment for the modelling
and optimization of the operating processes involving freight car and freight
train flows over the railway network.
In the provision of freight transportation service by rail there are clients,
railway operator(s), infrastructure and types of flows. The level of quality
of the service provided depends on how the railway operator organizes the
service of the freight car/train flows, and with what types of rolling stock, over
the railway network in order to transport the originated freight flows from their
points of sourcing to their points of consumption, subject to the clients'
requirements and infrastructure limits.

3.4 Market analysis and policy tools for promoting the motorways of the seas in the European Union

The White Paper on European Transport Policy 2001 forecast that
European economic growth will bring more mobility. One of the goals of the
EU Transport Policy 2001 is to reduce the dependence of EU trade on road
transport, for which one of the options suggested in the White Paper is to
revitalise sea transport for freight. Shipping volumes are growing more or less
in parallel with those of road transport, though with a relatively steady
segmentation of the market, sea being used only for relatively low value
goods, mainly bulk, in non-scheduled services.
Previous attempts to promote Short Sea Shipping (SSS) for intra-European
freight movement have been a commercial failure in spite of subsidies
provided at start-up. It was observed that, though SSS was price competitive,
other requirements had to be fulfilled to make SSS competitive. Frequency of
service, competitive door-to-door time, safety and security of shipped goods
and administrative simplicity are some of the critical service requirements
demanded by shippers. The provision of this combination of services has led
to the concept of Motorways of the Seas (MoS). The aim of the MoS would be
to provide a fluid, available and simple movement of freight on the sea, port
and landside links of the intermodal transport network, similar to the road
motorways.
The study aims to assess the potential for Motorways of the Seas and to
indicate an appropriate path towards achieving a significant transfer of intra-
EU freight transport from road to sea.
On the demand side:

- Identify the potential industries that would (be more likely to) use the Motorways of the Seas.
- Track the movement of their cargo flows within the EU at macro level.
- Find the service requirements of these shippers/freight forwarders along the value chain and prioritise the requirements based on industry and value of transported goods.
- Find out whether it is possible to customise the quality of the intermodal transport service in order to fulfil the users' requirements.
- Find the price the shippers are willing to pay.

On the supply side:

- Assess the infrastructure available at ports close to major European markets (port capacity, adequate hinterland connections, e.g. to TEN-T networks) for implementing the MoS.
- Find the maximum and minimum port capacities that provide the required levels of transport service efficiency to implement the MoS at the identified ports.
- Identify the best policies at various levels (local, national, regional and Community) to actively promote and market intermodal container goods haulage so as to achieve the goals set in the White Paper.

4. QUALITY AND SAFETY IN TRANSPORT INFRASTRUCTURE

4.1 Quality management models applied to road drainage systems

Effective drainage of roads is a key concern in their design, construction
and maintenance. Several approaches and many types of drainage devices
exist, depending on the relief and rainfall patterns of the region where the road
is located, as well as on engineering traditions.
Recognizing that drainage solutions must perform well for long periods
after the construction of the road, a quality management approach is adopted in
this project, starting with users' requirements and their translation into
engineering specifications, proceeding to performance evaluation in the field
and to feedback of the results of that evaluation into the design of those
solutions. Since the performance of drainage systems is not described by a
binary result and normally degrades with time in service, a Bayesian
formulation has been adopted, aiming at the development of a learning
process through which, for each type of drainage system, more efficient rules
for inspection and replacement would become available.
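
A minimal sketch of such a Bayesian learning step, assuming (purely for illustration) that the probability of unsatisfactory performance of a given type of drainage device is given a Beta prior and updated with pass/fail inspection outcomes:

    def update_failure_belief(alpha, beta, failures, passes):
        """Conjugate Beta-Binomial update of the belief about the failure probability.

        (alpha, beta): parameters of the current Beta prior.
        failures, passes: outcomes observed in the latest inspection campaign.
        Returns the posterior parameters and the posterior mean failure probability.
        """
        alpha_post = alpha + failures
        beta_post = beta + passes
        return alpha_post, beta_post, alpha_post / (alpha_post + beta_post)

    # Example: a weakly informative prior Beta(1, 9) (mean 0.10) updated with
    # 2 failures out of 20 inspected devices gives a posterior mean of 3/30 = 0.10.
    alpha_p, beta_p, p_fail = update_failure_belief(1, 9, failures=2, passes=18)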

4.2 Innovative methods for improved specification of road pavements towards higher traffic safety

Traditionally, road pavements are designed according to the required
structural capacity. Nowadays the functional properties of pavements have
gained additional significance due to the wide availability of new materials
and technologies and the strict demands of road users.
The purpose of this project is to develop an innovative methodology to
specify road pavements considering their functional performance in relation
to traffic safety. It is planned to interrelate the functional properties of asphalt
pavements with traffic safety indicators – skid resistance and texture vs.
accident risk – particularly under adverse weather conditions (rain, snow and
frost). Many countries already have guidelines to ensure safe skid resistance
levels on their roads [22, 23]. They have established skid resistance threshold
values that define the lowest acceptable road friction, below which the risk
of an accident may increase. These threshold values have resulted from
research carried out on the relationship between skid resistance and accident
risk.
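
The kind of relationship being sought can be sketched, purely for illustration, as a logistic link between the measured friction and the accident risk, together with a flag for sections below an investigatory threshold; the coefficients and the 0.40 threshold are placeholders, not values from any guideline or from this project:

    import math

    def accident_risk(skid_resistance, intercept=2.0, slope=-8.0):
        """Illustrative logistic model of accident probability for a road section.

        skid_resistance: friction coefficient of the wearing course (0-1 scale).
        intercept, slope: placeholder coefficients; the project calibrates such
        models against monitored experimental stretches.
        """
        return 1.0 / (1.0 + math.exp(-(intercept + slope * skid_resistance)))

    def below_threshold(skid_resistance, threshold=0.40):
        """Flag sections whose measured friction falls below an investigatory level."""
        return skid_resistance < threshold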
The models that will be developed to simulate the performance of
different wearing courses, applied in different road conditions – traffic,
geometry and speed – will be validated and calibrated through the
monitoring and observation of experimental stretches.
The final objective of this work is to provide tools that can be used by the
Road Administration to evaluate the efficiency of alternative road pavement
solutions in increasing traffic safety.

4.3 Road infrastructure management and pavement life cycle analysis

Currently, intervention on road networks tends to focus on maintenance
and rehabilitation actions, since most infrastructures are already built and
should be preserved. But there is a need for a network quality optimization
strategy regarding the available assets. The use of the life cycle cost analysis
(LCCA) tool for pavement-related investments has increased markedly in the
last two decades. Nowadays, concerns about the scarcity of financial resources,
the definition of the most cost-effective investment strategies and the important
role of user-related costs in evaluating optimum decisions have contributed to
the widespread use of LCCA methodologies [24, 25].
This project aims at the development of an LCCA model to support the
choice between different design, maintenance and rehabilitation
alternatives for the Portuguese highway network. The definition of the
framework for this pavement-related LCCA model addressed primarily the
characterization of the performance and cost models. The most adequate
pavement performance models were selected from the literature and will
subsequently be calibrated using data available from a highway
concessionaire's pavement management system. Regarding the cost model,
both agency and user costs were integrated. Besides vehicle operating costs, it
is also desirable to include work-zone-related costs (e.g. user delay) in the
cost model [26]. With this study, several design/preservation strategies can
be evaluated, in order to verify whether the adoption of a life cycle approach
and the inclusion of user costs can lead to different optimum decisions.
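
The core of such a comparison can be sketched as a discounted sum of agency and user costs over each strategy's analysis period; the discount rate and the cost streams are placeholders for illustration:

    def life_cycle_cost(agency_costs, user_costs, discount_rate=0.04):
        """Present value of a pavement design/preservation strategy.

        agency_costs, user_costs: lists indexed by year (year 0 = construction),
        in the same monetary units; user costs would include e.g. work-zone delay.
        """
        return sum((a + u) / (1.0 + discount_rate) ** year
                   for year, (a, u) in enumerate(zip(agency_costs, user_costs)))

    # A cheaper initial design with frequent maintenance and a dearer long-life design
    # can rank differently once user costs and discounting are included, which is the
    # kind of result the model is meant to reveal.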

4.4 Modelling dynamic behaviour of very high-speed railways to evaluate track vibration and deterioration

Some of the new railway high-speed lines (HSL) currently under
development in Europe are designed to enable maximum speeds of 350 km/h
in commercial operation. Running trains at such high speeds presents some
concerns in terms of track geometry deterioration, and thus in track
maintenance costs, due to the known increase of track vibration levels and
consequent ballast deterioration and changes in track quality.
This project addresses this concern by performing calculations with a validated finite element dynamic train-track model, investigating how increasing speed and ballast vibration influence track degradation and helping to establish new high-speed track design recommendations. It is also crucial to have data from experimental measurements on real high-speed lines and from laboratory tests, particularly for the range of very high speeds. To achieve these targets, collaboration is in place with CENIT (Center for Innovation in Transport, Barcelona) and the SNCF Innovation & Research Department (Paris), with support from CEDEX (Madrid).
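For context, a generic formulation (not specific to this project's implementation) of the semi-discrete equations of motion solved by such finite element train-track models is

\[ \mathbf{M}\,\ddot{\mathbf{u}}(t) + \mathbf{C}\,\dot{\mathbf{u}}(t) + \mathbf{K}\,\mathbf{u}(t) = \mathbf{F}(t), \]

where $\mathbf{M}$, $\mathbf{C}$ and $\mathbf{K}$ are the mass, damping and stiffness matrices of the coupled train-track system, $\mathbf{u}(t)$ is the vector of nodal displacements and $\mathbf{F}(t)$ collects the time-dependent loads, notably the wheel-rail contact forces excited by track irregularities. Since the excitation frequency produced by an irregularity of wavelength $\lambda$ scales as $v/\lambda$, higher running speeds shift the excitation upwards in frequency and amplify ballast accelerations.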

The dynamic track model will make it possible to study different types of infrastructure subjected to several types of trains and, together with in situ measurement data, to evaluate the influence of track and train defects and irregularities and to estimate the long-term behaviour of the track (simulating the progression of track settlement in time). The expected results should lead to conclusions in the form of:

• measures to adopt to reduce vibrations and track deterioration;
• reference design values for good dynamic performance of high-speed track;
• an estimation of the consequences for track maintenance planning and processes;
• recommendations for the determination of infrastructure charges for different levels of load, speed and traffic.

5. LABORATORIES

As a result of its relatively late growth with respect to the other areas in the Department of Civil Engineering, our group was not allocated laboratory space when the Department moved into its new building in 1992. Only more recently, in 2004, was the Laboratory for Transport Infrastructure and Systems formally created and a space allocated to it. It is now receiving the first pieces of equipment, acquired since the decision to create the lab.
A bolder move had already been made in 2002, in the framework of the National Program for Scientific Re-equipment launched by the Portuguese Science and Technology Foundation. A very ambitious T-LAB proposal was set up, involving three main scientific areas – transport, mobility and safety; transport infrastructure; transport energy and environmental impacts – and two Departments of our school (Civil and Mechanical Engineering) plus the Engineering school of the Lisbon Polytechnic. After a lengthy process, this proposal has now been approved, opening the possibility of a radical increase in experimental research in our group.

6. NETWORKING

International cooperation was identified long ago as a key component of the strategy for the scientific development of our group and for the recognition of its merit. Since the author came to the Chair of Transportation in 1992, this has been one of the key priorities.

Given the very small size of the group at the time, this process was launched through a private company managed by the author and based on a significant level of participation in EU R&D framework programmes. This led to very high levels of responsibility – with the coordination of a dozen projects in only eight years – and visibility in the European transport research world. This visibility and the associated prestige have led to the growth of the group within the school, which is now capable of direct participation with high-quality performance.
Our group was a founder of, and holds the Directorate-General of, the TRANSPORTNET network of eight European schools with strong activity in transport research and advanced teaching. Besides our school, the members are the Universities of Antwerp, Delft, Lyon II, Karlsruhe, Genoa and the Aegean, and the EPF Lausanne. The network currently runs two Marie Curie projects, one dedicated to the early-stage training of doctoral students and another to short training courses in several areas of transportation.
In parallel, we also participate in the EURNEX network of excellence on
Railway research, funded by the European Commission, and the author has
been appointed as one of the leaders of its Pole on Strategy and Economics.
Besides these two formal networks, high levels of interaction with international colleagues are sustained. As a natural part of that activity, the author is Vice-President of the Scientific Committee of the World Conference on Transport Research Society, and our group is a finalist candidate to host the 2010 World Conference in Lisbon. In September 2005, our group also hosted the Thredbo 9 conference, dedicated to Competition and Ownership in Land Transport, with the Programme Committee chaired by Prof. Rosário Macário.

7. CONCLUSIONS

This concise overview of the research activities being carried out in the Transport Infrastructure and Systems group of CESUR is a clear indication of the vitality of the group and of the sophistication of many of its ongoing projects. Although the group has reached critical mass only very recently, the youth, talent and energy of its members are strong guarantees of high-quality output for a long period.
Clear objectives are defined both for individuals and for the group as a whole, together with the identification of the resources and processes that must be mobilized to achieve them.
The sustained effort being made on networking, and the level of international recognition our work has already achieved, legitimize our goal of being one of the top transport research groups in Europe in the next decade.

REFERENCES
1. Macário R. “Quality Management in Urban Mobility Systems: An Integrated Approach”, PhD dissertation, IST, 2005.
2. You J, Nedovic Z, Kim T. “A GIS-Based Traffic Analysis Zone Design: Technique”,
Transportation Planning and Technology, 21, 1997, pp. 45–68.
3. You J, Nedovic Z, Kim T. “A GIS-Based Traffic Analysis Zone Design: Implementation
and Evaluation”, Transportation Planning and Technology, 21, 1997, pp. 69–91.
4. Horner M, Murray A. “Excess Commuting and the Modifiable Areal Unit Problem”,
Urban Studies, 39 (1), 2002, pp. 131–139.
5. Chang K, Khatib Z, Ou Y. “Effects of zoning structure and network detail on traffic
demand modelling”, Environment and Planning B, 29, 2002, pp. 37–52.
6. Ding C. “Impact analysis of spatial data aggregation on transportation forecasted
demand”, URISA, 1994, pp. 362-375.
7. Viegas, J. “Turn of the century, survival of the compact city, revival of public transport bottlenecks”, in Transportation and the Port Industry (H. Meersman, Ed.), Antwerp, Belgium, 1996.
8. Viegas J, Lu B. "The intermittent bus lane signal settings within an area", Transportation
Research Part C, 12, 2004, pp. 453–469.
9. Houten R V. "ITS animated LED signals alert drivers to pedestrian threats", in TRB 80th
Annual Meeting Proceedings, Washington, 2001.
10. Viegas J, Lu B.,"Widening the scope for bus priority with intermittent bus lanes"
Transportation Planning and Technology, 24, 2001, pp. 87–110.
11. Shaheen S. “Carsharing and Partnership Management – An International Perspective”,
Transportation Research Record, 1666, 1999, pp. 236–257.
12. Wegener M. “Overview of land-use and transport models”, 8th International
Conference on Computers in Urban Planning and Urban Management (CUPUM03),
Center for Northeast Asian Studies, Tohoku University, Sendai, Japan, 2003.
13. Ewing R, Cervero R. “Travel and the Built Environment: A Synthesis”, Transportation
Research Record, 1750, 2001, pp. 87–114.
14. Eliasson J, Mattsson L. “A Model for Integrated Analysis of Household Location and
Travel Choices”, Transportation Research, Part A, 34, 2000, pp. 375–394.
15. Batty M. “Agents, Cells, and Cities: new representational models for simulating
multiscale urban dynamics”, Environment and Planning A, 37 (8), 2005, pp. 1373–1394.
16. Jorgensen F, Wentzel-Larson T. “Forecasting Car Holding, Scrappage and New Car
Purchase in Norway”, Journal of Transport Economics and Policy, 24 (2), 1990, pp.
139–156.
17. Wee, B V, Moll H C, Dirks J. “Environmental impact of scrapping old cars”,
Transportation Research Part D: Transport and Environment, 5, 2000, pp. 137-143.
18. Dill J. “Estimating emissions reductions from accelerated vehicle retirement programs”,
Transportation Research Part D: Transport and Environment, 9, 2004, pp. 87–106.
19. Kim H C, Ross M H, Keoleian G A. “Optimal fleet conversion policy from a life cycle
perspective”, Transportation Research Part D: Transport and Environment, 9, 2004, pp.
229–249.

20. Wang M Q. “Examining cost effectiveness of mobile source emission control measures”,
Transport Policy, 11 (2), 2004.
21. Kim H C, Keoleian G A, Grande D E, Bean J C. “Life Cycle Optimization of
Automobile Replacement: Model and Application”, Environmental Science and
Technology, 37 (23), 2003, pp. 5407–5413.
22. The Highways Agency. “Design Manual for Roads and Bridges: Volume 7 - Pavement
Design and Maintenance: Section 3 - Pavement Maintenance Assessment: Part 1 - Skid
resistance”, London, 2004.
23. Transit New Zealand. “Specification for skid resistance investigation and treatment
selection”, New Zealand, 2002.
24. Haas R. “Reinventing the (pavement management) wheel”, Distinguished Lecture, Fifth
International Conference on Managing Pavements (available for download at
http://www.asphalt.org/graphics/haaslecture.pdf), Seattle, 2001.
25. Hall K T, Correa C E, Carpenter S H, Elliot R P. “Guidelines for Life-Cycle Cost
Analysis of Pavement Rehabilitation Strategies”, 82nd Transportation Research Board
Annual Meeting, Washington DC., 2003.
26. Walls J, Smith M. “Life-Cycle Cost Analysis in Pavement Design – Interim Technical
Bulletin, Report No. FHWA-SA-98-079”. Federal Highway Administration, Washington
D.C., 1998.
HOTEL ARCHITECTURE IN PORTUGAL

Madalena Cunha Matos


Faculdade de Arquitectura, Universidade Técnica de Lisboa, Rua Sá Nogueira. Pólo
Universitário Alto Ajuda, 1349-055 Lisbon, Portugal, E-mail: mcunhamatos@fa.utl.pt

Abstract: Hotel Architecture in Portugal is a two-year project planned to be carried out in the Faculty of Architecture of the Technical University of Lisbon. Its objectives are to investigate the development of hotel architecture in Portugal in the 20th century and to expose the issues that the development of this architectural typology will raise in the near future. The study focuses on the relationship between the cultural, site-specific and even national-identity concerns of production, especially those of the architect-designer, at one end of a spectrum and, at the other end, the dissemination of international models. A hypothetical polar dichotomy is thus established.
Relevance to architectural and urban issues will be treated as the main subject, relating iconography, formal language, functional organization, architectural morphology, site plan and location. The project intends to collect and compare alternatives and finally to sort out the best practices in architectural production. As a work in progress, it is organizing a database of authors, dates, locations, alterations and contributors from fields other than architecture. The analysis of individual architects and case studies of a number of hotels will help create the background for a possible framework of strategies for dealing with the future. The relevance of this issue to the economic health of the country is evident, as it is to many other countries supporting a large tourism industry. As a result of this project's investigations, and in collaboration with colleagues, educational institutions, professional bodies and governmental agencies throughout the country, as well as with consultants in Europe, the USA and Latin America, it aims to develop national frameworks for the encouragement of the highest standards in understanding, assessing and designing leisure-related accommodation, namely hotels and their supporting facilities.

Key words: Hotel, architecture, tourism, history, territorial policies


1. PRESENTATION OF THE SUBJECT: A SHORT HISTORY

At the heart of the urban artifact known nowadays as the ‘hotel’ lies the critical reception of modernity. In Portugal, it evolved from the Grand Palace of the second half of the 19th century to the Grand Hotel, either in traditional bourgeois seaside towns or in the older thermal spring locations. It then underwent a process of multiplication and diffusion into small and medium-scale hotels in the main cities, and finally it reached a turning point.

Figure 1. Hotel in Av. Infante Santo. Arch. Alberto Pessoa, 1957.

This turning point was the major undertaking of the Ritz Hotel in the country's capital, a commission ordered by the dictator himself and carried out by a group of the country's leading financial families. It was built in 1952-59 and designed by a team of exclusively Portuguese architects, led by the foremost architect of State buildings, Porfírio Pardal Monteiro, who did not live to see it completed. Jorge Chaves followed the architectural venture to its end, with many colleagues who formed the first team in the country to study and work on the hotel typology in a specialized way. Many of them centred their subsequent careers on hotel design and construction, on the administrative duties of governmental hotel certification, or on the international promotion of the newly launched Portuguese tourism industry. An impressive number of artists cooperated to create the imposing collection of works of art, decoration and equipment still to be seen.

Figure 2. Study of site position, Hotel Ritz, in Revista Binário. nº 13, Oct.1959.

Figure 3. Hotel Ritz in Lisbon as seen in a contemporary postcard (approx. 1959).

The Ritz epitomized the modern slab at a monumental scale. Its volume and site position are at odds with the hitherto conservative urban fabric of the city, while its architectural language cuts through the customary compromise of simplified 18th-century pastiche that had been imposed throughout the country as an assertion against modernism.
Whether antedating this key enterprise by a few years or immediately following suit, a territorial transfer of hotel-building initiative towards the south then prevailed.
The discovery of the Algarve by the tourism industry led in the 1950s and 1960s to the construction of an increasing number of hotels by the seaside, transferring the focal point of foreign travel away from the traditional island resort of Madeira, initiated mainly by the British on their journeys to India and other colonial outposts during the 19th century. Very soon, the selective clientele preferred by government and opposition alike would give way to charter-transported tourists from colder climates, using the purpose-built airport of Faro.

Figure 4. Hotel Balaia, Atelier Conceição Silva, 1969.

While whole sections of the coast have been overrun by massive construction, a differentiation of products has occurred: since the 1940s, the regional ‘pousadas’ and reused historical buildings such as convents and fortresses and, more recently, private-public initiatives of restored manor houses (‘solares’) and rural constructions.
A lack of definition remains in the foremost seaside locations: their environmental value continues to be shattered by ongoing characterless or degraded folk-style massive construction.

2. A PRIVILEGED TYPOLOGY, AN UNKNOWN SUBJECT MATTER

The main objective of this research is to create new knowledge in a field – academic studies of tourism activities and infrastructures – in which Architecture as a discipline has so far had a very subdued presence in our country. This contradicts the relevance that tourism infrastructure, as built artifacts, has for the attractiveness of particular regions or towns, for management and, ultimately, for the success of an increasingly important yet fragile industry. Hotels are, and were, rarely designed by non-architects, which makes them one of the most special and privileged typologies. They contribute powerfully to the image of the cities or resorts in which they are built – viz. the Ritz or the ‘Grande Hotel da Figueira’ – but are neglected as a subject matter in academic lines of architectural enquiry. This project aims to fill this gap in such a sensitive area of the national, European and international economy. Strengthening the diversity of regional identities and histories related to hotel construction is a further aim.

3. ROLES TO BE PLAYED, MARKS ON THE TERRITORY

A hotel in a foreign country plays the role of a surrogate home. It is the most intimate place for a newcomer in a different environment, whether difficult or attractive. The question being asked is how architects in Portugal respond, have responded and will respond to this specific form of creating a welcoming environment, both inside and outside; how they have blended the volumetric mass of this often imposing type of building into its urban or non-urban context, or turned it into an icon; and how they have helped build the identity of the country or the specific region as hinted at or fostered by architecture. What challenges does the present and near future engender, in the difficult terrain of architectural design, for the site-specific differentiation of the ‘product’ offered in the globalized tourism market? How can the establishment of links to other disciplinary areas help the architectural definition of 21st-century hotels? What can we learn from prior and parallel lines of enquiry worked out in other parts of the world, mainly in the more experienced research centres?

4. STATE OF THE ART: A RESUMÉ

Factual information is fairly well provided in most published works about hotels: their wealthy patrons, chic clientele and unique legends are all well represented in this glossy literature. Such books are usually meant for the general public and are part of a specific marketing effort. Some are particularly well researched and supply precious data on architects and their teammates, such as artists and graphic designers [1], or attempt a panoramic view over a number of great hotels [2]. However, these books do not locate the buildings in their urban setting or socio-political context; neither do they try to define architectural typologies or present an evolutionary reading. On the other hand, scientific production on tourism research is expanding at a very fast rate and breadth, emerging from the economic and sociological areas [3, 4]. The interweaving of tourism infrastructures with issues of heritage, sustainability, and intercultural contacts and conflicts provides a wealth of questions to be dealt with [5]. A most promising approach to tourism as a mass activity deals with present trends in the commoditization of natural or artificial assets and the new, accelerated relationships experienced by the tourist in the landscape [6]. In architecture, however, the reflexive method has been quite silent. The usual manuals on how to plan and build this particular typology continue to appear [7], but a front-line tackling of the problems is rare [8]. In Spain, a great effort to survey and register hotels and other leisure facilities has been made in recent years, as is patent in GRANELL [9]. The architectural history of hotels is still best presented in PEVSNER [10], but it does not cross the borderline of the 1970s. The most enlightened analysis of the presence of modern hotels in cities is authored by WHARTON [11], who also decodes the architectural ambiances and iconographies and initiates a line of enquiry of the utmost importance for this beacon of modernity, after the traces left by Foucault and Benjamin.
In Portugal, the combined efforts of the DGEMN, IPPAR and the Architects Association have managed to classify and protect some important 20th-century specimens. A number of Masters' theses have chosen as subject matter a particular hotel [12], an old seaside town [13] or thermal towns. Before the critics and historians, the architects themselves left their writings [for example, 14]. Many archives are waiting to be classified and analyzed. In 2000, the architectural periodical JA [15] published two thematic issues on ‘As praias de Portugal’, the beaches of Portugal, in remembrance of a first guide for the ‘bather and the traveller’ by the 19th-century literary celebrity Ramalho ORTIGÃO [16]. It presents a number of articles on the architecture and urban planning of seaside resorts, making a fresh start for further enquiry [17]. In 2003, Docomomo Ibérico met in Valencia to study Modern Architecture and Tourism, 1925-1965; the Proceedings include a study on the production of hotels in the 1950s and 1960s in Portugal [18]. CORTÉS [19] showed how seaside architecture may be analyzed in strictly architectural terms and revealed surprising links between architects from very distant parts of the world. The dichotomy between local identity and international aspirations, as interpreted in hotel architecture during the postwar period in Portugal, was more recently examined in a presentation at Columbia University [20], and an overview of the problems and successes of this typology in Portugal was offered at the UTL Symposium last February [21].

5. CONCLUSIONS

The Portuguese reality has yet to be studied, as many architects and their productions remain unknown, and an upgrading of the hotel infrastructure is necessary. Furthermore, over the next few years government policies on hotel building should incorporate more explicit knowledge from the areas of architecture and urban morphology.
The repercussions are two-fold: internal and external. The project is intended as the seed of a research practice in the Faculty of Architecture. It is located in its disciplinary core, the Architectural Department, but bridges into the Social and Human Department. Within the Department, the presence of researchers with strong Visual Communication and Design interests complements the team. Some members of the team have had a very relevant architectural practice over an extended period of time and have valuable knowledge of processes and sources. The younger members, to be selected later, will be offered the opportunity to participate in a project with the responsibility of presenting results. The general theme will be open for Master's students' selection of a dissertation topic, with the support of the team and its production.
The external consultants were chosen and invited for their worth and for the proposed establishment of more permanent links with their academic institutions, both in the country and at an international level.
As the members have no prior experience as a complete team, except in separate inter-individual work tasks, the project is of a relatively small scale and is intended to generate a drive: to push forward the research activity in the Department and create some necessary routines.
As to its external repercussions, the project should allow for a rethinking of built alternatives for the development of tourism. It starts from the observation of reality, paired with reflection on this reality and the reconceptualization of some taken-for-granted models. It will seek to foster publishing and publicity on the supply of high-quality hotels, judged not by richness of materials or other usual indicators but from an architectural point of view. It will put forward to official tourism agencies and private associations guidelines for a better use of both heritage and environmental resources. It will seek to establish a network of collaborative agents in the preservation, refurbishment and modernization of the architectural and urban design assets of the different regions, and relate it to thematic routes and areas. It should also foster the publication of monographs on the reference hotels, including their origins and architectural plans, retracing the history of their construction and pointing out the urban adaptations or impacts they produced. It will finally engage in some recommendations for transformations of the built environment, either in the sense of construction or in its opposite.

ACKNOWLEDGEMENT

This project was approved for funding by the Fundação para a Ciência e a Tecnologia (ref. POCTI/AUR/61470/2004).

REFERENCES
1. Carita, H et al, Ritz, Lisboa: Hotel Ritz, 2001.
2. Guimarães, M & Valdemar, A, Grandes hotéis de Portugal, Lisboa: INAPA, 2001.
3. Kotler, Ph et al, Marketing places, NY: The Free Press, 1993.
4. Seaton, A et al. (ed.), Tourism: The State of the Art, Chichester: J Wiley & Sons, 1994.
5. Alsayyad, N (ed.), Consuming tradition, manufacturing heritage, London: Routledge,
2001.
6. Bell, C & Lyall, J, The accelerated sublime, Westport: Praeger, 2002.
7. Lawson, F, Hotel and resorts, Oxford: Arch. Press, 1995.
8. MVRDV, Costa Ibérica. Barcelona: Actar, 2000.
9. Granell, J et al, Arquitectura del Sol, COA Catalunya I. Balears C. Valenciana Murcia y
Canarias, 2002.
10. Pevsner, N, A History of Building Types, Princeton: Princeton Univ. Press, 1976.
11. Wharton, A, Building the Cold War: Hilton International hotels and modern
architecture, Chicago: Univ. of Chicago Press, 2001.
12. Pires, F, Para uma leitura da arquitectura doméstica temporária: o hotel Avenida Palace.
Dissertação de Mestrado, FAUTL, 2000.
13. Briz, M, A Arquitectura de Veraneio. Os Estoris 1880-1930. Dissertação de Mestrado,
FCSH-UNL, 1989.
14. Monteiro, P, ‘Memória descritiva para o Hotel Ritz’ in Binário nº 13, Out. pp.1-14,
1959.
15. JA, n.º 196-197, ‘As Praias de Portugal 1-2’, 2000.
16. Ortigão, R, As praias de Portugal, Lisboa: Livraria Clássica Ed. [1876], 1966.
17. Matos, M C, ‘Turismo e território: notas sobre uma relação’, JA Jornal Arquitectos n.º 197, Set./Out. 2000 – ‘As Praias de Portugal 2’, pp. 23-30, 2000.
18. Matos, M C, ‘Face ao oceano. Arquitectura portuguesa nos hotéis atlânticos dos anos
Cinquenta e Sessenta’, in ‘Arquitectura Moderna y Turismo 1925-1965, Actas IV
Congresso Docomomo Ibérico Barcelona: Ed. Docomomo Ibérico, pp. 175-179,
2004.
19. Cortés, J. ‘Schindler en Mallorca’, paper presented at the IV Congresso Docomomo Ibérico, Valencia; included in the Proceedings, 2003.
20. Matos, M C, ‘Local Identity and International Aspirations: Postwar Hotel Architecture in Portugal’, poster presented at the VIII International Conference of Docomomo International, ‘Import/Export: Postwar Modernism in an Expanding World, 1945-1975’, New York: Columbia University; DOCOMOMO-USA website, 2006 (forthcoming).
21. Matos, M C, Lameiro, C, ‘Projecto de Arqª: Praxis, metodologias, tendências.
Arquitectura Hoteleira em Portugal’, Simpósio UTL ‘A Investigação na Universidade
Técnica de Lisboa’. Sessão Temática ‘Arquitectura, Urbanismo, Design e Transportes’.
Centro de Congressos IST, Lisboa, 2 -3 Jan. 2006.
INCLUSIVE DESIGN: A NEW APPROACH TO
DESIGN PROJECT

Fernando Moreira da Silva and Rita Almendra


Faculdade de Arquitectura, Universidade Técnica de Lisboa, Rua Sá Nogueira, Alto da
Ajuda, 1349-955 Lisboa, Portugal, email: dasilva@fa.utl.pt, almendra@fa.utl.pt

Abstract: All human beings are entitled to human dignity on equal terms. This principle must dominate the development of a society open to everyone, which leads to the concept of Inclusive Design.
Inclusive design is not an obstacle, it is a challenge: a philosophy based on individual differences. The concept implies the creation of environments, products and services available and usable by the largest possible number of people, of all ages, sizes and abilities, giving them equal opportunity to participate in society, since the physical environment can directly prevent people from participating in desired activities on equal terms with the majority. In Europe almost 25% of the population suffer from some form of functional limitation. In Portugal, more than 1 million persons have some type of disability.
The design project, as the central subject in the designer's education, should be developed with the principles of Inclusive Design in mind and in a sustainable perspective, beyond the mere dissemination of the concept. The idea is to demonstrate how design practice can adopt a routinely inclusive approach if those principles are considered right from the very beginning. Besides the inclusive design contents and practices associated with the design project in the designers' undergraduate education, integrated research projects and post-graduate education are also mainstream forms of approaching the theme at the Lisbon Faculty of Architecture (FA).
Several research projects in inclusive design are under development at FA, integrated in LID – Design Innovation Laboratory, among which: “The Observatory in Inclusive Design”; “Evaluation of the Accessibility and Usability Conditions of ATM Machines”; “Design Ergonomic Project”; and “Accessibility and Inclusion in Graduation Teaching”.
Everyone involved in the current process at FA hopes that this change of attitude may contribute to a better knowledge and application of the rules and standards concerning accessibility and Inclusive Design when developing a design project, so as to integrate a greater number of persons. We should be able to make our choices about the design of spaces, environments, objects and information, and also its policies, not only to reduce barriers but also to include everyone in a sustainable approach, with social responsibility and respect for human rights.

Key words: Inclusive design, Universal Design, Design For All, Product design, Identity,
Stigmatization, Satisfactory products, Denotation, Connotation.

1. WHAT IS INCLUSIVE DESIGN, UNIVERSAL DESIGN AND DESIGN FOR ALL

Inclusive Design shares a similar origin and identical objectives with other design approaches or denominations, such as “universal design”, “design for all”, “lifespan design” and “design for diversity”.
The term is of North American origin and began to be developed out of concerns with disability; it later assumed a more overall position and meaning, and nowadays stands for designing for all.
Formally, the concept is grounded in the “Seven Principles of Universal Design” [1], developed by a group of researchers comprising architects, product designers, engineers and environment designers. Inclusive Design is not an obstacle, it is a challenge: a philosophy based on individual differences.
Inclusive Design is a philosophy materialized in design project processes whose result is the creation of products, services and/or environments available and usable by the largest possible number of people, regardless of age, gender or abilities. The aim of the concept is to make life easier for everyone by making products, means of communication, buildings and urban environments more usable by more people at little or no extra cost (adapted from Shipley 2002) [2].
It is a complex and evolving concept which tries to address not only the above questions but also questions of ethnicity, economic resources, education, culture, etc.
Inclusive Design, or Universal Design, or Design For All, is an overall strategy and philosophy based on giving all people equal opportunities to participate in modern society. This means that our physical surroundings, products and services are planned and designed so that everyone can participate regardless of age or physical ability. All human beings are entitled to human dignity on equal terms, and this principle must dominate the development of a society open to everyone, which leads to the Inclusive Design concept.
However, the physical surroundings also contribute to the creation of physical barriers – e.g. when wheelchair users, persons with walking difficulties, people pushing prams or senior citizens cannot climb the steps to a shop, an office or a means of transportation, and there is no elevator or any other alternative. We must not forget that these people are active agents of their own will.
The physical environment can directly prevent people from participating in desired activities on equal terms with the majority. The physical environment covers many things: in addition to buildings and product design, sign-posting, colour selection, transport and IT are also areas where Inclusive Design is of the utmost importance. An environment filled with obstacles or lacking facilitators will reduce performance, while environments fitted with facilitators will allow performance to improve.
Inclusive Design does not start only from the aspiration of designing for all, but especially from a critical look at the world in which we live. It is based on a holistic and sustainable understanding of the responsibilities of those who act in the built environment.
Inclusive Design is framed by the two central concerns of 21st-century design:

• Demographic change – the ageing of the population, the longer life of the elderly and a reduced birth rate, especially in occidental populations (even if Japan is the most extreme case in statistical terms).
• Sustainability in global terms – understood not only from the point of view of the degradation of natural resources, seeking a better management of these that is consistent over time, creative and without loss of quality of human life, but above all as a social, cultural and relational sustainability, stimulated by a material culture which, beyond ecological considerations, benefits relationships, promotes interaction and communication and treats objects as multifaceted perceptional tools.

Inclusive design is frequently associated with usability and utility. As a result, inclusively designed products are often looked at as tools with which users achieve tasks. However, this approach to inclusive design might be risky: it ignores issues such as people's emotions, values, hopes and fears. Thus it ignores the very essence of what makes us human and might be interpreted as stigmatizing. When designing inclusively, designers need to look beyond usability at other factors that can affect the relationship between person and product. This can be achieved by designing satisfactory products, which are socially accepted and pleasurable products. Satisfactory products call for an understanding of the users and their requirements, and they link product properties to emotional responses. The necessary knowledge on how to design satisfactory products comes from the field of product semiotics. Products designed in this way will be highly usable and will communicate the desired identity of their users. By designing inclusively in this way, we design inclusive, mainstream products, and we thus avoid producing stigmatizing products, because mainstream products cannot be stigmatizing by definition.
The challenge for inclusive design is to move from looking merely at users, products and tasks towards a more holistic view of people, products and their relationships.

2. THE DEMOGRAPHIC ISSUE

2.1 The ageing of the population


According to Coleman (1999) [3], demographic data point to a drastic worldwide change in the age structure: in a few years there will be considerable growth of the elderly population. According to Coleman, this population (nowadays between 50 and 60 years old) will probably be more informed and demanding than the previous “retired” generations. They will be more experienced consumers, with high expectations regarding products and services. Demographic studies estimate that, over the next 50 years, in the OECD countries, the percentage of elderly people, currently about 15%, will reach 20-30%; the number of people over 80 will triple and the number over 65 will double.
In Europe, almost 25 per cent of the population suffer from some form of functional limitation. In Portugal, over 1 million persons have some form of disability. Moreover, the increasing age of the population both inside and outside Europe has made accessibility and user-friendly design a universal demand.
The very need one feels to qualify this inclusion concern with a name – Inclusive Design – gives us the real dimension of the generalized indifference to this type of question.
This almost endemic lack of interest is rooted in different factors, such as:

• Commercial and marketing issues, related to consumer culture, differentiation mechanisms, styling and the epidemic use of design in cosmetic operations, inducing the idea of an “innovation” and/or personalization which does not exist but is communicated.
• Social and cultural issues, involving the valorization by occidental societies of youth, individualism, autonomy, enterprise and competitiveness, in opposition to the experience and knowledge of the elderly, which are devalued, ostracized and forgotten.

3. THE RELATIONSHIP BETWEEN SERVED POPULATION AND INCLUSIVE DESIGN SUCCESS

The absence of a philosophy of inclusiveness serving human diversity becomes apparent early on, as soon as the target populations to be served by products, spaces and services are defined.
It is unrealistic and simplistic to think that the target population for each product and space is “all the people”. It is known that there are limits to the number of persons who wish to use a certain product. Therefore, according to Keates and Clarkson (2003) [4], we can consider different potential populations when we design a certain product or space: the whole population – which is no more than the solution to a utopian problem; the ideal population – which represents the maximum population that can be covered by the product; the negotiable maximum population – the one actually reachable by the product; and the included population – the one effectively served by that product.
But this classification, known as WINI (Whole-Ideal-Negotiable-Included), does not necessarily reflect the way the desired target population is defined, which depends strongly on the choices of the different companies, based on segmentations driven by marketing logic. So it is probable that the desired target population is defined by the company on the basis of the whole population or the ideal population, with its own size and composition, independently of the negotiable maximum population and the included population.
In the scheme proposed in [4] it is possible to observe that the populations decrease in size from the whole population to the included population, with the desired target population established by the company. These five possible classifications of population are named by the authors WINIT (WINI + Target), and they are used as the basis for measuring success in terms of Inclusive Design.

Figure 1. Scheme proposed by Keates & Clarkson (2003) [4].

4. DESIGN PROJECT

When an inclusive project is developed, it is essential to redefine who the user is, eliminating the expressions “them” and “us” [5].

4.1 Project strategies – process and methodologies definition

There are several models which formalize different approaches to inclusive project processes. Among these are: 4.1.1) the top-down vs. bottom-up approach; 4.1.2) the inclusive design cube; and 4.1.3) the design approach in seven levels.

4.1.1 Top-down/bottom-up

In terms of inclusive project strategies, there are two types of approach according to Clarkson and Keates (2003), defined by them as “top-down” and “bottom-up”. These two approaches arise from analyses of the capability pyramid by Benktzon (1993) [6], presented below, which attempts to translate graphically the different capabilities of the population.
With this model in mind, a designer who wishes his product to be as inclusive as possible faces two possible project approaches: either a “top-down” or a “bottom-up” approach. The question he has to deal with is whether to make an assistive product more mainstream-friendly or to make a “normal” product more inclusive.

Figure 2. Capability pyramid by Benktzon (1993).

In the first case, we have a project approach in which the target users are the least functionally capable. Added to this is the fact that, generally and until now, the majority of products developed according to this approach have had limited success, because they are viewed as technical aids. In truth, a great number of these objects serve “medical” needs and are often acquired by health organizations for which aesthetic value is not very important in comparison with functional value, the longevity of the object or the spatial solution.
In the case of a product developed according to a top-down strategy, if we are interested in its dissemination or its commercial success, it is essential that it be attractive to the maximum number of users. Producer and consumer both benefit from the enlargement of the mainstream market, and products accepted by the mainstream market are, by definition, not stigmatizing. There is also another advantage in producing for such a large market: economies of scale, which make a less expensive product and larger profits possible.
With the bottom-up strategy, the designer designs for the majority and tries to develop his projects so as to include special-needs markets. This approach has great potential for commercially profitable products.
The main difficulty presented by this approach is that, as we approach the top of the pyramid, greater disabilities have to be taken into account and the financial return becomes more marginal. This may discourage companies from investing in this type of strategy, especially if the financial effort involved in extending towards the lower-ability sphere is significant. On the other hand, the possibility of progressing to the top of the pyramid is very limited, because there are situations of severe disability which demand the design of a special product.

This suggests that it is probably more sensible to complement bottom-up strategies with top-down strategies, so as to cover the needs of the whole population.

4.1.2 The inclusive design cube

If we look at Benktzon's pyramid, we can see that the different design-for-all approaches fall into three main categories:

• Design concerned with the user: design which tries to extend the boundaries of mass products so as to include the greatest possible number of persons;
• Customizable or modular design: design which tries to minimize the difficulties of adaptation to particular users;
• Design for special uses: design for specific users with very particular needs.

The principle at the origin of the Inclusive Design Cube is that these same design approaches become more visible if formalized in a cube-shaped model, in which the functionally more able part of the population lies in the interior of the cube and those with fewer abilities lie towards the exterior.

Figure 3. “The Inclusive Design Cube (IDC)”, Keates and Clarkson (2003).

This model shows more clearly who is included, and therefore who is excluded, in each design approach.

4.1.3 The approach in seven levels

In a simple approach to the design project process, there are three main steps. But if we wish to include social and practical considerations in the design project process, we have to extend it, as happens in the seven-level approach to design. Although this approach is presented in a sequential way, each of the steps can be thought of as supporting interactions. In fact, interactions within and between the different levels are often truly essential.
Keates and Clarkson (2003) tried to introduce into the inclusive design cube model the concerns present in the seven-level design approach.

Figure 4. “The design approach in 7 levels”, Keates and Clarkson (2003).



If the seven-level design approach is adopted to frame the development of the design process, it can be complemented with the inclusive design cube, so that the latter can be used to monitor the population coverage aimed at by the different project choices. In fact, the seven-level design approach can be seen as the designer considering the three axes of the inclusive design cube. The change necessary to use it, or for the system definition, is to rename the axes so that they reflect levels 3 and 5 of the seven-level design approach (see Figure 4).

4.2 How to translate the information from the user of inclusive products and services

One of the critical problems in the inclusive design of objects, spaces and services is not only the collection of quality data from users, but above all its translation into material terms. Clarkson and Keates (2003) propose the use of the well-known “knowledge loop”, presented below, which is no more than a representation of the necessary information flow and of the activities involved in producing projects genuinely validated as inclusive. This model also has the virtue of presenting the broad spectrum of agents involved in inclusive design, going beyond the stereotypes of the final user as “old and unable” and of designers as passive users of information.

Figure 5. “The knowledge loop”, Keates and Clarkson (2003).



Being such a rich model, it may be used in different ways. It can be read starting from any point but, in the present case, it is important to see it in terms of information users and of final users.
From the perspective of successful inclusive product production, we have to consider that the designer, the information user of the model (point 5), wants to produce an inclusive product. The first step is to collect the necessary information about the final user and the available inclusive design methods (point 6 of the model). The next step is the application of the methods and the data, and the generation of the product concept (point 7). The concept then needs to be verified and compared with the specifications (point 8), so as to ensure that the product corresponds to the envisaged functional requirements. However, to validate the product it is necessary to test it with the final users (point 1). The data from the tests with the final users need to be collected (point 2) and summary representations generated from them (point 3). Only once these representations are available is the product finally validated (point 4). If the product is not validated, the designer should review the initial concept and restart the process, which should be repeated as many times as necessary until an acceptable product, from the inclusive point of view, is generated.
From the perspective of data collection about the final users (point 1), this first step involves the identification of potential data collection techniques (point 2) and the application of these techniques so as to generate adequate representations of the data (point 3). The generated information can be verified to ensure that it is coherent and internally correct (point 4) before being passed to the information user (point 5). To validate the generated information it is necessary to ensure that it is usable by the information users, so these should apply the information in order to test it (point 6). Only by developing a product or service (point 7) adequate to the users' wishes and needs originally observed can the data collection methods be truly validated (point 8). As regards the use of the knowledge loop to develop an acceptable inclusive product, the data collection process should be repeated until the data are validated.
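To make the iterative character of this loop concrete, the following minimal sketch shows, under illustrative assumptions, how concept generation, verification and validation with final users alternate until an acceptable concept emerges; all function names, the simulated data and the pass/fail criterion are hypothetical stand-ins, not part of Keates and Clarkson's model.

import random

def collect_user_data(users):
    # Points 1-2: test with final users and collect data (simulated scores here).
    return {user: random.random() for user in users}

def summarise(data):
    # Point 3: reduce the collected data to a summary representation.
    return sum(data.values()) / len(data)

def generate_concept(representation):
    # Points 6-7: apply inclusive design methods to produce a product concept.
    return {"expected_coverage": min(1.0, representation + 0.2)}

def validated(concept, threshold=0.7):
    # Points 8 and 4: verify against the specification and validate the concept.
    return concept["expected_coverage"] >= threshold

def develop_inclusive_product(users, max_iterations=20):
    # Repeat the loop until an acceptable concept, from the inclusive point of view, emerges.
    for iteration in range(1, max_iterations + 1):
        concept = generate_concept(summarise(collect_user_data(users)))
        if validated(concept):
            return iteration, concept
    return None  # no concept validated: the brief itself should be revised

print(develop_inclusive_product(["user%d" % i for i in range(30)]))

In practice, of course, each of the eight points involves substantive design and data collection work rather than a simulated score.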
As a final note, given the interdependence between the eight points which constitute the knowledge loop, each of them must be resolved efficiently so that the resulting products and services are themselves successful objects.

5. THE TEACHING OF INCLUSIVE DESIGN

5.1 The need to teach inclusive design

In recent years, the issue of inclusive design has acquired significant importance at European level. One of the most relevant points concerns whether or not to include this subject in the curricula of courses directly related to the construction of our material reality.
Among the reasons supporting the inclusion of this subject at university level, and according to a reflection conducted by the “Special Interest Group in Inclusive Design” of the Centre for Education in the Built Environment, coordinated by Ruth Morrow of the School of Architecture, University of Sheffield [7], five are the most decisive: the moral argument, the sustainability argument, the legal argument, the professional argument and the economic argument.

5.1.1 The moral argument

Inclusive design is essentially a value-based process, which takes as its premise the fact that everyone has a right to participate in community life. Consequently, a powerful argument supporting the importance of teaching inclusive design is the need to assist students in developing their own set of values to underpin their future practice as built environment professionals. Inclusive design can fulfil this important function. It is clear that teaching students to administer technical codes or interpret equal rights legislation is an important part of preparing a student for professional practice, but it is even clearer that this type of approach will have no practical reflection without a philosophical underpinning (Lifchez, 1986) [8].
Over the last few decades, academic and professional discourses have given substantial weight to the argument that society has created a disabling environment for many people (Oliver, 1990) [9]. This, in turn, has led to the widespread realization that society has a responsibility to remove obstacles, to promote equal participation for all people and to avoid the creation of new disabling environments (Davis, 1993) [10]. However, within the context of built environment and product design education, the strongest and clearest moral argument for teaching inclusive design is that an inclusive environment is a fundamental human right.

5.1.2 The sustainability argument

A sustainable environment is one that supports a sustainable society or community. Built environments which are inaccessible or which exclude people lead to isolated and poorly interconnected communities, and such communities have been shown to require more external support and resources. Community sustainability is best achieved through the creation of inclusive environments, products and services which combine flexible, usable and formal adaptability with long-term affordability and access to services and to material and functional characteristics.
Such environments encourage neighbourhoods to evolve and flourish, by supporting and facilitating change, growth and responsiveness to the changing needs of the users of built environments, products and services. Flexible, “organic” environments, which grow with their communities, are less likely to become redundant or abandoned. These sustainable environments will enable and encourage interaction and socialization with others in the surrounding community as well as with other communities elsewhere.
Inclusive environments allow people to exercise their right to choice, integration and participation, regardless of their age, ability, gender, culture, etc. Inclusive communities find it easier to develop both formal and informal networks at different levels, which define them as sustainable communities because they are balanced, healthy and less resource-hungry.

5.1.3 The legal argument

Laws that embrace and safeguard health and equality have increased in number over the last 30 years. The built environment, which provides the context for our lives, is framed as much by its hard physical edges and the aims of individuals as it is by legislation.
The basic aim of the legislation is to end any form of discrimination against disabled people, and this includes discriminatory practice in the design and management of built environments.
The nature of what is “reasonable”, in terms of inclusiveness at the legal level, is in permanent evolution. As a consequence, the role of designers, architects and all agents involved in the built environment will be not only to keep abreast of changing views of what is regarded as reasonable in terms of inclusiveness, so as to be able to produce solutions, but also to act as leaders in changing public opinion about what is possible in terms of improving levels of inclusiveness.

5.1.4 The professional argument

"How ethical is it to practice architecture - to be professional licensed to


design buildings and places of assembly - without having first developed an
intellectual and emotional understanding of people?" (Lifchez, 1986)

The substance of the moral argument has been embedded in the codes of
conduct of many built environment professional institutions. It is important
that this argument doesn’t restrict itself to paper and that a pro-active
conscience of this inclusive philosophy, because a professional is as well as
defined and shaped by his knowledge and skills as by his integrated ethical
approach to the profession.

5.1.5 The economic argument

Put simply, if an environment is inclusive, it allows more people to access it, work there, pay taxes and buy consumables and services. Built environment professionals have a duty to show their private and public sector clients that inclusive design offers benefits that directly affect long-term profitability, consumer relations and corporate reputation.
In this approach to inclusive design, students can be encouraged to demonstrate that designing inclusively can be seen not as a burden for clients but as an opportunity to expand markets and increase business profitability. However, it is evident that many people still associate inclusive design with extra costs. There is some evidence that it may have an implied cost, but equally there are case studies where inclusive design has cost more in the short term but less in the long term, increasing profitability by decreasing the lifetime management costs of products, environments and services.
It is worth being aware that there are other financial incentives to designing inclusively. Many funding bodies, such as the Design Council, EQUAL, CABE, etc., have begun to embed the concept of inclusion into their aims and funding criteria. There seems to be a general trend toward a situation where any project that attracts central government funding will only be successful if it demonstrates exemplary practice in relation to inclusive design.
Furthermore, there are many disadvantages associated with failing to design inclusively, such as the cost of bad publicity associated with poor design solutions, the creation of hard-to-let buildings and poor economic viability, the costs associated with the need to undertake remedial works, and even the costs of litigation. In addition, there are costs associated with providing care or support to people who are unable to, or can no longer, use these environments and products independently or safely.

5.2 Teaching inclusive design at Lisbon Faculty of Architecture (FA)

The concept of inclusive design is not yet equally disseminated in the six
graduation courses existent at FA.
The Design graduation course has been ahead of the process, mainly
because of the position and involvement of its teachers and professors about
the inclusivity them.
As we have already seen, Inclusive Design has as it main objective to
evolve a large number of necessities and wishes of the users, well defined in
the development of any type of project.
The needs of the users, which may concern people with or without
limited use of their capacities, have to be known by project makers in
general (architects, designers, engineers, landscape architects, etc.).
A design process (understood here as a project) must take these needs into consideration, based on a detailed knowledge of the users’ actual situation and of the possible options.
In a great number of cases it is necessary to work with the users themselves, or with people from different disability organizations, throughout the whole project process.
Project makers acquire competence based upon knowledge of people
with different characteristics, needs and/or wishes.
At the Lisbon Faculty of Architecture, one of the first concerns in constructing the contents of the subjects related to inclusive design is to give special emphasis to topics considered relevant, such as: equal opportunities for all; disability and legislation (national and European); international recommendations; the nature of disability (physical or motor mobility, sensory and cognitive difficulties, ageing); urban environment/transport: comfort, health and safety; and professional responsibility.
The design project is the central subject in a designer’s education. It is important to keep Inclusive Design principles in mind, within a sustainability perspective, not forgetting the dissemination of the concept, and to always adopt an inclusive approach, taking the inclusive principles into consideration from the initial phase of the design process.
In the Design course curriculum, besides the Design Project subjects, inclusive design principles and practices are approached in other subjects: Ergonomics, Communication Design, and Theory of Design and Object Criticism. Ergonomics is also taught in the Interior Architecture and Fashion Design courses. In contrast, the Architecture, Urban and Territorial Planning, and Urban Management courses do not have subjects in their curricula in which the inclusive design problematic is addressed. However, some teachers with knowledge in the area include it in the contents of their subjects.
There is also another subject, optional and transversal to all courses: Inclusive Design and Sustainability. Its contents reinforce competences specifically oriented towards the promotion and management of programmes that foster inclusivity and sustainability in equipment, built spaces and exterior spaces.
Integrated research projects and post-graduate training are also important ways of approaching the theme at FA. LID (Design Innovation Laboratory) is the main setting for this research, where many research projects are being developed, among them: “The Observatory in Inclusive Design”; “Evaluation of the Accessibility and Usability Conditions of ATM Machines”; “Design Ergonomic Project”; and “Accessibility and Inclusion in Graduation Teaching”.
In global terms, there is a will to continue the process by developing more research in the area, in parallel with inclusive design graduation and post-graduation projects (MSc and PhD projects).

All the people involved in this process at the Lisbon Faculty of Architecture hope that this change of attitude may contribute:

- To the design of new strategies and the introduction of the necessary corrections in the curricula of the different courses taught;
- To the development of a new way of project making, with better knowledge and application of the rules and standards concerning accessibility and Inclusive Design, so as to allow the integration of a wider number of people.

REFERENCES
1. Resource: The Center for Universal Design, North Carolina State University, USA.
2. Andrew, S. “What is Inclusive Design and how can it achieve a built environment to be enjoyed by everyone?” Discussion Report arising from the November 200 Disability Rights Commission’s Round Table Discussion on Inclusive Design, 2002.
3. Coleman, R. “Inclusive Design”, in Human Factors in Product Design – Current Practice and Future Trends, London, Taylor & Francis, pp. 159-170, 1999.
4. Keates, S., Clarkson, P.J. Countering Design Exclusion: An Introduction to Inclusive Design, UK, Springer-Verlag, 2003.
5. Steinfeld, E., Tauke, B. “Universal Designing”, in Universal Design – 17 Ways of Thinking and Teaching, pp. 165-189, Oslo, Husbanken, 2002.
6. Benktzon, M. “Designing for Our Future Selves: the Swedish Experience”, Applied Ergonomics, 24(1), pp. 19-27, London, Butterworth-Heinemann, 1993.
7. Morrow, R. “Building and Sustaining a Learning Environment for Inclusive Design – A Framework for Teaching Inclusive Design within Built Environment Courses in the UK”, Final Report of the Special Interest Group in Inclusive Design for the Centre for Education in the Built Environment, Group Coordinator and Editor: Ruth Morrow, School of Architecture, University of Sheffield.
8. Lifchez, R. Rethinking Architecture: Design Students and Physically Disabled People, Berkeley, California, University of California Press, 1986.
9. Oliver, M. The Politics of Disablement, Basingstoke, Macmillan, 1990.
10. Davis, K. “On the Movement”, in Swain, J., Finkelstein, V., French, S., and Oliver, M. (eds), Disabling Barriers and Enabling Environments, London, Sage Publications in Association with the Open University, 1993.
